As an industry, we often use the word legacy as a pejorative to describe an older system in need of replacement. However, legacy really means an existing system that lives on because it continues to provide business value and the cost of replacement is unreasonable. Managers and software developers are faced with the need to support and maintain legacy systems; but no system is ever completely frozen in time. Changes to business and technical requirements, driven by both internal and external forces, push us to evolve or adapt software over time.
In the course of our consulting practice, we come across a wide variety of software—through audits to identify potential problems or much-needed improvements, and also through legacy code we were engaged to replace. This article outlines an approach to identifying and solving common problems, and presents ideas for implementing solutions to improve your business applications.
Many of the ideas presented below have become the norm for teams both large and small, but we still run into customers with solutions that echo these same problems. It is surprising how often one sees some of these issues, even with very capable teams on larger projects. Thus, whether you are a stand-alone developer or the leader of a team of developers, these practices will lead to better quality code and happier customers. If the practices identified here are things that you are doing today... great! Otherwise, look for ideas you can apply today for some easy wins, and also identify some practices that could take a while to implement but will pay off over time.
1. Make an Inventory
It turns out that many organizations—in essence—do not know what they have or how it works. Documentation is weak or non-existent. Programmers may be working in a break-fix mode without any real context. Start by taking stock of what you have today. Create a comprehensive inventory of system features, existing documentation, the processes used to produce the software, as well as the libraries, components, and tools the software depends on.
Along the way, ask yourself if there are any risks associated with the things you find. Still using SQL Server 2008 R2? That is a red flag, since Microsoft ended extended support for it in July 2019. Or, are you using a third-party library produced by a vendor that went out of business in 2016? That is also a risk since you can no longer obtain support or updates. In any case, start out by figuring out what you have now and what issues or risks might be associated with the tools and libraries being used today.
2. Create a Roadmap
Not all legacy systems are simply static. Some organizations have plans to grow and expand the capabilities of existing software. In these cases, planning needs to take two forms: one, a product roadmap, illustrating what new features will be added over the course of the software’s life; and two, a technical roadmap, outlining how the system will change to meet product requirements and technical challenges.
Of course, your roadmap could simply be to operate your existing system until it grinds to a halt and then replace it with a shiny new system. Even so, what is the plan to get from where you are today to that new thing? Stakeholders typically envision continuity in terms of three-to-five year increments.
First, determine the expected lifespan of the system. And then, ask yourself what you need to do in order to support and maintain this software for that period of time. Also, look at current trends and foreseeable changes on the technology horizon and determine what risks and opportunities exist. In the lifespan of your system, will you need to replace infrastructure or switch to a different cloud provider? Update key libraries or components? Since you now know what you have from your inventory and you’ve looked at the risks associated with those things, you can lay out the plan to migrate and update these dependencies as necessary.
3. Assess the Process
We ask teams “what process model do you follow?” The answer, for quite some time, has been some form of Agile or Scrum (at the same time, there are still people who essentially have no process). When asked the immediate follow-up question “is the process documented?” we are met with blank stares or just a flat “no.” Agile, of course, is a high-level set of principles, more of a philosophy than a software process. Scrum on its own isn’t a complete software process, since it does not prescribe anything about languages, coding standards, source control, build process, and so on.
Many teams have a complete process (and are following it quite well), but in a variety of organizations, none of it is written down. New team members are introduced to the project and its process by another person who walks them through it step-by-step using a kind of show-and-tell approach. If you can have a walk-through, you can write it down. The next person to be brought on board can read about the process and serve as the test subject for validating the documentation. If part of the process isn’t working for you, change the process and update your documentation. If someone is confused over how something is supposed to work and something bad happens, but you do not have anything written down, then no one can be held accountable. Did you follow the process? What process?
4. Document the Standards
Much like the idea of a documented software process, the idea of a coding standard is often an afterthought. Many teams believe they have a standard or follow the implicit industry standard for their particular platform or language. The absence of an agreed-upon and documented standard usually leads to fruitless debates between programmers. Tabs versus spaces. Dynamic versus static. Even if you are not familiar with these examples, the struggle is real… and it can cost you time while your team debates the “right” approach during a code review.
Some teams lack even basic agreement on a standard approach. Also, your system may involve more than one language or framework, so consider that your standard may need to encompass a variety of tools and languages with different conventions for front-end, API, and database development. Lack of a coding standard can lead to poorly structured code, inconsistent naming conventions, and code that is hard to read and maintain. Code that is hard to maintain ultimately costs you and your company money.
Establish and maintain a coding standard. An immediate consequence will be that existing systems have code that does not match the standard—either build in an exemption for old code or agree that even changes to old code will adhere to the standards going forward.
5. Review the Code
Regular code reviews and inspections reveal potential problems and help prevent flaws from going into test or production. Your team may already hold peer-to-peer code reviews, but if you are responsible for the system, and you haven’t looked at it lately, you should. Do not try to catch everything; instead, look for a small list of specific “code smells” that seem to invade every code base: inline or unnamed constant values, abundant data type conversions, and the lack of basic structured coding techniques.
Look for repeated string and numeric values scattered throughout the code, or cases where the same constant appears in multiple files. Many of these constants represent values that are stored in data and may change over time as requirements evolve. It is much easier for integrated development environments or code analysis tools to find repeated uses of a symbol than it is for a person to review every place where a variable is compared to “3”.
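To make the smell concrete, here is a minimal sketch in Python. The order-status rule and the value 3 are invented for illustration; the point is that a named constant documents intent and gives your tools a symbol to search for.

```python
# Hypothetical before/after for the "magic number" smell.

# Before: the meaning of 3 is implicit, and the literal may be
# repeated in many files.
def is_shipped_before(order_status):
    return order_status == 3

# After: a named constant documents intent, and an IDE can find
# every use of the symbol in one query.
STATUS_SHIPPED = 3

def is_shipped(order_status):
    return order_status == STATUS_SHIPPED
```

When the underlying value changes, the constant is updated in one place instead of hunting down every comparison to a bare literal.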
Look for frequent data type conversions, or data type conversions happening when messages are sent across different boundaries. Type conversions that take place when collecting or presenting data are normal, and likely necessary. Beyond those boundaries, however, type conversions are slow and usually redundant; a lot of type conversion is also evidence that the domain model is not a good representation of the data or has not been designed well.
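One common remedy is to convert exactly once, at the boundary, into a typed domain object. The following Python sketch uses an invented `Invoice` type to show the idea; field names and the input format are assumptions for the example.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical domain type: fields are typed, so downstream code
# never needs to re-parse strings.
@dataclass
class Invoice:
    amount: float
    due: date

def parse_invoice(raw: dict) -> Invoice:
    # Single conversion point: text from the wire becomes typed
    # values here, and nowhere else.
    return Invoice(amount=float(raw["amount"]),
                   due=date.fromisoformat(raw["due"]))

inv = parse_invoice({"amount": "125.50", "due": "2024-06-01"})
# Downstream logic works with real types; no further conversions.
overdue = inv.due < date(2024, 7, 1)
```

If you find yourself calling conversion functions deep inside business logic, that is usually a sign the conversion belongs at the boundary instead.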
Look at the overall structure of the code. If you see a small number of classes or modules that appear to be doing all the work, thousands of lines of code in a single code file, or hundreds of lines in a single method, these are signals regarding code quality, maintainability, and testability.
Lastly, look at the domain model and data access approach. In the past, with techniques rooted in ADO.NET or JDBC, you might see repeated code for commands, connections, and data handling; query statements scattered across different layers rather than isolated in a logical data layer; and values concatenated directly into SQL statements.
Code reviews reveal the good, the bad, and the ugly. While the design and implementation of existing code can be improved over time, there is no “quick fix” to remediate basic coding issues. But, if you find the kind of issues identified above, don’t lose hope. Establish a “technical debt” backlog, which may or may not be separate from your primary backlog of features and requirements, and work through them as time permits. If you have someone on the bench, start that developer on the things you found.
6. Reduce the Attack Surface
Unfortunately, one of the most common issues in custom software is poor or inadequate security. We have seen instances in which passwords are stored in clear text, database passwords are embedded in application code, and there is an absence of encryption for sensitive or personally identifiable data. In the age of object-relational mappers, there are still systems vulnerable to SQL injection due to outdated data access architecture.
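The injection risk comes down to whether user input is bound as data or pasted into the SQL text. This sketch uses Python's built-in SQLite driver; the table and the injection string are invented for the example, but the pattern of parameterized queries applies to any database driver.

```python
import sqlite3

# Illustrative sketch: parameterized queries keep user input out of
# the SQL text. Table, columns, and data are invented for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "alice' OR '1'='1"  # a classic injection attempt

# Unsafe: concatenation lets the input rewrite the query itself.
# conn.execute("SELECT role FROM users WHERE name = '" + user_input + "'")

# Safe: the driver binds the value as data, never as SQL.
rows = conn.execute("SELECT role FROM users WHERE name = ?",
                    (user_input,)).fetchall()
# The injection attempt is treated as a literal name and matches nothing.
```

Modern object-relational mappers parameterize queries for you, which is one reason outdated hand-rolled data access code is a disproportionate source of these vulnerabilities.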
Furthermore, as we live in the age of web applications, many sites are not sufficiently protected from cross-site scripting (XSS) or cross-site request forgery (CSRF) attacks. Even for those applications which do not have any of the aforementioned flaws, imagine systems with no defined roles and permissions, or authorization rules applied inconsistently. These types of issues leave systems vulnerable to attack, and the possible legal or financial consequences should be evident.
Assessing security issues requires a closer look at the code and infrastructure than a typical code review would reveal. It is not uncommon to employ third-party tools to scan applications, libraries, and data to expose security issues or engage experienced security experts to perform a detailed assessment. Whatever mechanism you employ to identify security risks, these kinds of issues should immediately jump to the top of your backlog until resolved.
7. Take Some Measurements
Check into what kind of logging, instrumentation, and error handling your system has. This is also a case where some real-world applications turn out to have little or no error handling, error handling at the wrong level, or they confront users with bizarre technical information or platform-default error views (i.e., the old “yellow screen of death”). Such problems make applications hard to use and also make it hard to diagnose and correct errors.
Fortunately, there are a number of resources available at reasonable expense to provide application-level logging and instrumentation. A variety of both old and new platforms and tools have the ability to add global exception handling and logging. A slightly harder issue is providing users with a more streamlined experience when errors occur. Nevertheless, there are remedies, even if you just catch the error and display a generic page or screen explaining that an error occurred. Focus on the addition of instrumentation and logging if it is missing; as the old adage goes, you cannot manage what you cannot measure, and if you do not know the full measure of your application, you cannot manage it.
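As a minimal sketch of both ideas—application-level logging and a global last-resort handler—consider the following Python example using only the standard library. A real system would route log output to files or a log aggregation service; the handler and message text here are assumptions for illustration.

```python
import logging
import sys

# Minimal application-level logging configuration.
logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("app")

def handle_uncaught(exc_type, exc_value, exc_tb):
    # Log full details for the team, but show the user a generic,
    # friendly message instead of a raw stack trace.
    log.error("Unhandled error", exc_info=(exc_type, exc_value, exc_tb))
    print("Sorry, something went wrong. The issue has been logged.")

# Global last-resort handler for any exception nothing else caught.
sys.excepthook = handle_uncaught

log.info("Application started")
```

Every major platform has an equivalent hook—global exception filters in web frameworks, unhandled-exception events in desktop applications—so the same two-part pattern (log the detail, show a generic screen) carries over directly.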
8. Establish Separate Environments
At a minimum, everyone should have a dedicated test environment separate from production. If you think it is unnecessary or redundant to point this out, you should meet one of the dozens of customers we have met over the past few years who did not have a dedicated test environment, or whose test environment was used for both development and testing. On the other hand, some companies have the full gamut of environments: test, user acceptance test, staging, and production. I would debate the utility of having this structure for all types and sizes of applications, but it is a wise approach for enterprise or mission-critical applications.
A common objection is that having separate environments is too costly, but one should always assess the risk of not having a test environment. What is the cost of an issue or bug that causes downtime in production, especially an issue that could have been detected in test? If you do not already have a separate test environment, create one and start using it today.
9. Control the Source
With the variety of low-or-no-cost source control options available, it seems almost laughable when you find out that someone’s code is not under source control at all. We have seen situations where there is no source control in use, or cases in which it is not being used consistently. Operating without source control means you can never be certain you have the latest code or the exact version currently in production, and you have no reliable way to go back to a previous version in case there is an issue. At this step in the discussion, someone will inevitably say, “yes, but we zip archive all of the source before every production release.” There is no scenario in which that is a viable replacement for a stable version control system. Of all the things we come across, this is a relatively simple action item: gather up the most recent code, get it into source control, and get the team trained on how to use it properly.
Another slightly less serious issue is when the source code for a single system is stored in multiple repositories. At least the team has the source code in version control. It is understandable that large organizations have multiple version control systems, supporting multiple teams across different divisions or regions. Having the source in multiple repositories is an issue due to the extra complexity and, therefore, extra time it introduces to develop, test, and deploy a solution. Strive to keep all the source code for a single solution in a single repository whenever feasible.
10. Automate Builds and Deployment
Achieving the celebrated goal of continuous integration and continuous delivery (CI/CD) is merely a daydream for some, and others should be wary of the amount of effort required to produce automated tests that would certify a release to move directly to production. At the same time, there are an incredible number of things you can do to automate at least some of the build and deployment process. Another thing we find when working with customer teams is the absence of automated build and deployment and a lack of documentation regarding how to build and deploy the software. In the not-too-distant past, automating build and test processing required expensive tools, specialized knowledge, and time and energy to create and maintain.
Today, these resources are widely available, inexpensive, and do not require extensive care or supervision. Start by automating execution of unit tests and builds whenever developers check in code; this saves each developer a little bit of time every day and makes even small issues immediately visible to the team. Whenever code is merged up to a well-known trunk (i.e., development) then your build environment should detect that update, build the solution, and deploy to the first-line test environment. And, even when it is neither feasible nor preferable to completely automate releases to production, build artifacts can still be prepared automatically and made ready to deploy, ideally by a final point-and-click.
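At its core, an automated build gate is nothing more than running the test suite and reporting a pass/fail exit code that the CI server acts on. This Python sketch shows that core idea; the command being run is a placeholder, and in practice your CI product wires this step to every check-in for you.

```python
import subprocess
import sys

# Hypothetical sketch of the simplest build gate: run a command (here,
# the test suite) and surface its exit code. A CI server fails the
# build on any non-zero result.
def run_build_gate(command):
    result = subprocess.run(command)
    return result.returncode  # 0 means the build may proceed

# Placeholder command standing in for "run the unit tests".
code = run_build_gate([sys.executable, "-c", "import unittest"])
```

Everything beyond this—deploying passing builds to a first-line test environment, packaging artifacts for a point-and-click production release—is an elaboration of the same gate.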
11. Test, Test, Test
Incredibly, there are teams who do little or no unit testing and have no integration tests, no test plan, and few test cases. Whenever we audit systems for customers, we ask “how do you test?” The answers never fail to dismay. And even when there are unit tests or integration tests, hands-on tests frequently do not happen or are performed only by developers.
Full-dress test-driven development might not be an option for an existing application with no unit tests, but it might be feasible to create unit tests around core business logic or critical features. Admittedly, this depends on the overall software architecture. If there are classes, modules, or components that can be isolated to test business logic separate from other dependencies, then you can begin introducing unit tests right away. Otherwise, it could be challenging to execute unit tests without also running extensive set up and tear down of required environment and components.
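When business logic can be isolated, a first unit test can be this small. The discount rule below is invented for illustration; the point is that the function has no database or UI dependency, so it can be tested in milliseconds with the standard library alone.

```python
import unittest

# Hypothetical business rule, isolated from any infrastructure:
# orders of 100 units or more get a 10% discount.
def volume_discount(quantity, unit_price):
    total = quantity * unit_price
    return total * 0.9 if quantity >= 100 else total

class VolumeDiscountTests(unittest.TestCase):
    def test_no_discount_below_threshold(self):
        self.assertEqual(volume_discount(99, 1.0), 99.0)

    def test_discount_at_threshold(self):
        self.assertAlmostEqual(volume_discount(100, 1.0), 90.0)

# Run the tests programmatically; in practice the CI server or a
# test runner would discover and execute these.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(VolumeDiscountTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Start with a handful of tests like these around the features that would hurt most if they regressed, and grow the suite as the code is touched.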
Two of the greatest methods for testing are also the most challenging to do well: integrated testing and human testing. Integrated testing should happen automatically, and if possible, as part of your integration and deployment process. However, human testing—people using the system in a dedicated test environment—is most likely to expose issues that unit or integration tests may never reveal. Automated tests never report “this is not going to work for us”—that is something only a seasoned user with experience testing the system will know.
There are some developers who know the business, know the requirements, and are devoted to thorough testing. There are also some developers who are not, or who do not see testing as their job. In the best case, the developer team does a great job testing. You should still have some level of user testing—it promotes ownership of the system, gives users a stake in the process, and diminishes the possible “us versus them” mentality that sometimes arises when issues occur.
Back up your human testing with integration tests. Integration tests take more time and expertise to put into place, but are better over time at detecting regression, and exercise a broader range of variations and edge cases. Start creating integration tests which run as part of an automated build and test process; make a few tests of your core features at first and grow them over time. The first time the tests detect regression in a critical feature, the time to put them in place will have already paid for itself.
In this article, you have learned about issues we find when asked to review and assess existing software, and the strategies we recommend for improving legacy code. As I mentioned earlier, the practices presented above are already standard for quite a few development teams, on both new and legacy solutions. But there are also teams producing software that are not doing any of these things. If these ideas are new to you, or your team is not following these practices, don't try to do it all at once. Take the same incremental and iterative approach to improving your existing code that you would take with a new project. As you think about each area, and as tasks or issues bubble to the surface, add those things to your overall product backlog.

Identify your assets and perform an overall risk assessment. Assess security features to limit or prevent potential attacks. Create or update the product roadmap. Document your team's process and standards. Review the code and hold some team code reviews. Put in place logging and instrumentation so you can see what is happening in production. Establish separate environments for development, test, and production in conjunction with source control and automated tests to streamline and strengthen builds, testing, and deployment, and ultimately, the overall quality of your existing code base.