image-blog-qac-legacy-code
November 11, 2024

9 Tips for Working With Legacy Code

Coding Best Practices
Static Analysis

What is legacy code? Legacy code is source code that already exists and needs to be used again. It's not necessarily lousy code, but it often needs some effort to integrate into newer systems. That means you need effective ways of overcoming legacy code's issues. 

What are your chances of working with legacy code? Given that most of the top 10 programming languages on the TIOBE popularity index have been around for at least 20 years, your chances are very high. (The exception being Go, which first appeared in 2009.)

legacy-code-tiobe-index-c++

TIOBE popularity index for C++ over time.

As you can't always rely on creating fresh code that adheres to modern programming practices, we've created this blog to help.

Read along or jump ahead to learn about what is legacy code, best practices for working with legacy code, how to write new code that stands the test of time, and tools that make working with legacy code easier:

➡️ Easily Refactor Legacy Code with Static Analysis

Back to top

What Is Legacy Code?

Legacy code is source code inherited from someone else or inherited from an older version of the software. It can also be any code that you don’t understand or know how to test, and that’s difficult to change.

📕 Related Resource: Explore why working with inherited code is important for software quality.

This is the simple definition. Working with legacy code can have many potential problems that need to be addressed. In his book, Working Effectively With Legacy Code, Michael C. Feathers offers a further definition that speaks to the difficulty in knowing what a piece of legacy code can do: 

"To me, legacy code is simply code without tests."

-Michael C. Feathers

Without tests, you would have no idea what requirements are being met and how to adapt the code to a new project. This means more effort is needed to understand it, adapt it, and write new tests to verify it still works. 

These definitions, however, miss the scenarios where you have older code with tests, but neither are relevant to modern requirements or compliant with newer standards. Or, the code could still be valuable and compliant but you have no idea how to integrate it into a newer system. 

Think of an old interface driver that you must port to a new application stack — you may have written the driver but don't understand the stack.

That is why our definition of legacy code above is more comprehensive and problem-aware, and we will continue to explore ways of addressing these gaps effectively and turning your legacy code into new, usable features. 

📕 Related Resource: ▶️ Interested in modernizing legacy web applications? Watch our webinar on-demand

➡️ Watch Webinar Now: Modernizing Legacy Web Apps

Back to top

Why Is Working With Legacy Code Effectively a Challenge?

The biggest challenge with working with older or unfamiliar code may be your assumptions about it.

You may think the code is bad. Whoever wrote it didn’t know what they were doing. You could have done a better job.

But the truth is, there's usually is a reason why the code is how it is. And, if you didn’t write it, you might not know that reason.

Here are the most common challenges when working with legacy code: 

  • Maintenance headaches — Legacy code often contributes to a system's overall technical debt, making it harder to maintain or modify. This could be due to the code's outdated programming practices, use of an older language version, inefficient algorithms, or lack of documentation. The longer this technical debt goes on, the harder it is to maintain. 
  • Integration issues — The system or the business has moved on from the time when the legacy code was created. Today's world has different types of connectivity, cloud platforms, data formats, functional safety standards, security guidelines, and more. All require careful planning to get integration right. 
  • Performance and scalability concerns — Legacy code may not be optimized for today's hardware nor take advantage of newer software techniques and platform options. Any bottlenecks between legacy code and its environment must be considered and addressed. 
  • Security exposure — Older code was likely developed without modern security practices in mind or knowledge of today's threat landscape. It may also be unsupported by security updates and patches, leaving it vulnerable to known and potential vulnerabilities.
  • Lack of expertise — It's not uncommon for developers to leave organizations, which may leave their code unsupported and undocumented. Vendors may disappear, or the community around an open-source package may dry up. Trying to maintain and upgrade legacy code without this knowledge takes time and effort. 

Care must be taken when making improvements to a legacy codebase. You can't just put a quick fix on one area or run your existing tests and hope they pass. There might be some dependencies you're unaware of or unintended behaviors that you cannot predict. 

That's why it's important to know when to maintain or change legacy code. 

Back to top

9 Tips for Working With Legacy Code and Refactoring Legacy Code

You can’t improve the inherited code overnight. But, you can take gradual steps to improve it.

It helps to first understand why the code must be reused. If it's due to financial or resource constraints, there's not much you can do other than to get on with it. The risk of disrupting a critical system may block progress, but there could be opportunities to plan a slow transition to a newer codebase. 

More often than not, sticking with a legacy system comes down to a resistance to change. Teams are comfortable with the status quo. Especially when things get busy and people are reluctant to migrate to new ways of doing things. In this situation, it's even more important to know how to work with legacy code effectively to dispel myths or concerns. 

Whether you’re just getting started — or you’ve been working on it for a while — here are nine tips you should follow.

1. Test the Legacy Code

One way to understand the code is to create characterization tests and unit tests. You can also use a code quality tool — like a static code analyzer — over your code to identify potential problems.

This will help you understand what the code actually does. And it will reveal any potentially problematic areas in function, performance, security, and coding standards. Once you understand the code, you can quantify the changes, document exceptions, and update with greater confidence.

2. Review Documentation

Reviewing documentation of the original requirements and functionality will help you understand where the code came from. This includes looking at everything from a requirements management tool to inline code comments.

That documentation helps you understand how the code currently works and identify gaps with the new intended functionality. And when it comes time to make changes. this information can help prevent accidental changes that introduce undesirable behavior or compromise a larger system.

3. Only Rewrite Code When It’s Necessary

Rewriting an entire inherited codebase can be tempting. But it’s usually a mistake.

Without complete knowledge of how the codebase works and interacts with the outside world, rewriting code can introduce new bugs and cause dependency issues. It also takes too much time and too many programmers to write everything. 

It's better to take a judicious approach to code rewrites, where select components are modified based on a high level of confidence and having tests to validate behavior.

4. Try Refactoring Legacy Code Instead

More effective than rewriting code is to refactor it. And it’s best to do it gradually.

Refactoring is the process of changing the structure of the code — without changing its functionality or altering its external behavior. Examples include: 

  • Simplifying conditional expressions that have grown too complex over time.
  • Removing hardcoded strings and values.
  • Improving function calls so they are easier to understand.
  • Moving features between functions to eliminate code duplication or avoid one component from having too much responsibility. 

These actions clean the code and make it easier to understand. It also eliminates potential errors and reduces a project's overall techincal debt. 

When refactoring legacy code, it’s best to:

  • Refactor code that has unit tests — so you know what you have.
  • Start with the deepest point of your code — it will be easiest to refactor.
  • Test after refactoring — to make sure you didn’t break anything.
  • Have a safety net — e.g., Continuous Integration — so you can revert to a previous build.

5. Make Changes in Different Review Cycles

Don’t make too many changes at once. It’s a bad idea to refactor legacy code in the same review cycle as functional changes, as they can get too complex to manage, test, and fix.

Plus, limiting the scope of changes makes code reviews easier. Isolated changes are much more obvious to the reviewer than a sea of changes.

6. Collaborate With Other Developers

You may not know the codebase very well. But some of your fellow developers probably do. It’s much faster to ask questions from those who know the codebase best.

So, if it’s possible, collaborate with someone who knows it better than you do. A second set of eyes on the code may help you understand it better.

7. Keep New Code Clean

There’s a way to avoid making the code more problematic. And that’s by ensuring the new code is clean. It ought to be written to adhere to best practices and run through automated tests to ensure compliance with coding standards.

You can’t control the quality of the inherited code. But you can make sure that the code you add is clean.

8. Use AI

One of the newer technical techniques is to use AI to translate legacy code into a newer language, find optimizations, detect unused code, and much more. You can even ask generative AI to refactor legacy code for you. 

The big question is, can you trust AI to code? Refactoring legacy code is a risky operation to begin with, and AI introduces new variables in terms of how it was trained and what it understands.

The good thing is that progress is being made. DARPA's Translating All C to Rust (TRACTOR) program is investigating ways of using large language models (LLM) and other techniques to reduce memory safety vulnerabilities in C code by converting it to Rust. 

"You can go to any of the LLM websites, start chatting with one of the AI chatbots, and all you need to say is 'here's some C code, please translate it to safe idiomatic Rust code,' cut, paste, and something comes out, and it's often very good, but not always." 

-Dr. Dan Wallach, DARPA program manager for TRACTOR

9. Do Further Research

Working with an inherited codebase gets easier with time. A junior developer may not understand why a codebase hasn’t been refactored (and may be keen to refactor it). But a senior developer will know when to leave it alone.

Learning more about the codebase will help you improve it.

A good starting point is Michael C. Feathers's book, which contains some good examples of how to make changes to a legacy codebase. 

Another good source is “Refactoring: Improving the Design of Existing Code” by Martin Fowler. This book offers many tips for effectively refactoring legacy code.

Back to top

How to Prevent Legacy Code Issues 

It's always better to prevent issues before they start, so here are some tips to consider when creating new code: 

  • Adopt a coding standard — Industry-accepted standards provide best practices and help avoid further issues. While they're often mandated for safety-critical applications, such as ISO 26262 and MISRA®, they're invaluable for fostering a "build for the future" mindset in any organization. 
  • Document and comment code — Helpful explanations, tips, and guides make code more accessible and maintainable by future development teams. 
  • Use version controlVersion control is essential for maintaining and tracking changes to a codebase. Having the history behind a piece of legacy code helps you understand it better and makes it easier to coordinate refactoring among the team. 
  • Run automated (continuous) testing — Before releasing code, it's critical to have comprehensive automated testing in place. This ensures that code remains functional and reliable as changes are made over time.
  • Follow code security best practices — Using secure coding techniques and security testing tools and static analysis tools minimizes the risk of compromise and keeps code secure over its lifetime. 
Back to top

Tools for Working With Legacy Code Effectively

You’ll always need to work with inherited code — or work around it. After all, the code is there for a reason. It works. And its results may be good enough that you can let known issues go.

There are good reasons for making changes to code, too. You might be adding a feature, fixing a bug, or improving the design.

In a perfect world, you’d continually rewrite that older or unfamiliar code until it’s fully debugged. But chances are, that won’t be practical.

So, what you need to do is figure out what you can change — and leave the rest alone.

Static Analysis

One way to do this is by using a code quality tool, like a static code analysis tool. You can set a baseline and then run an analysis on the new code to make sure it’s clean. And you can suppress results from your codebase.

Helix QAC and Klocwork, for example, make this very easy to do.

Analyze Legacy Code

Helix QAC can check your codebase against rules, typically from a coding standard. You’ll get diagnostics of violations. And you can prioritize them by severity. This means you can focus your attention on fixing the most error-riddled pieces first.

📕 Related Resource: Review coding standards >>

Set Baselines

You can also set your codebase as a baseline. Maybe the code is fine as-is, and you want to leave it alone. Setting a baseline means that the codebase won’t be pulled into your diagnostics. Instead, you can focus on finding issues in new code — and ensuring that’s clean.

image-blog-qac-legacy-code-baseline
Helix QAC makes it easy to set baselines.

 

Ensure Legacy Code Compliance

In some cases, you may be reusing source code from one project to another. But, some weren’t developed with coding standards. And if you need to achieve compliance (such as with MISRA), this can create problems. By using a Perforce static code analyzer — like Helix QAC for C/C++ or Klocwork for C, C++, C#, Java, JavaScript, and Python — it’s easy to see where the errors in your code are.

Back to top

Ensuring Your Code Legacy 

Navigating the challenges of working effectively with legacy code requires time and effort. But it's a necessary task.

Rather than think of it as "bad code that I have to use," it's better to consider the potential value of legacy code and break the steps down to get it refactored, tested, and working again. There's no sense in reinventing the wheel when it's already been proven to work. 

➡️ start your free static analysis trial

Related Resources About Legacy Code: 

Back to top