The Dark Art of Testing Legacy Code Nobody Understands
“Can you test this feature?”
“Sure, where’s the documentation?”
awkward silence
“There… isn’t any. But it’s been working fine for years. Just make sure it keeps working.”
Or, as the meme goes: “I AM THE DOCUMENTATION”
https://www.instagram.com/reel/DExl6s0NnLb/?hl=en
If you’ve been in QA for more than six months, you’ve had this conversation. Welcome to testing legacy code, the stuff that works, nobody knows why, and everyone’s terrified to touch.
This isn’t your clean, well-documented, test-driven development dream project. This is code that was written by someone who left the company in 2015, runs on a framework that’s been deprecated twice, and somehow processes millions of dollars in transactions every day without exploding.
Your job? Don’t break it. (A hint: it is already broken, so let’s keep it as broken as it is and not a touch more!)
Here’s how to survive.
Accept That You’ll Never Fully Understand It (And That’s Okay)
First things first: stop trying to understand every line of code, every feature, and every user flow before you start testing. You’ll go crazy if you try, and you’ll probably fail while you’re at it.
You’re not refactoring it. You’re not modernizing it. You’re not writing a master’s thesis on its architecture. You’re just trying to make sure that when someone changes one thing, everything else keeps working.
Treat legacy code like a black box with really weird behavior. You don’t need to know why it does what it does. You just need to know what it does, so you can verify it keeps doing that.
The senior dev who wrote it? Gone. The architect who designed it? Also gone. The documentation? Never existed, or it’s a 200-page PDF from 2009 that’s completely outdated.
You’re an archaeologist, not an engineer. Act accordingly.
Find Someone Who Remembers (Before They Quit)
The real documentation isn’t in Confluence, some dusty wiki page, or a 500-page user manual that was started but never finished while you were still in high school (I’ve worked with one of those: 4/10, wouldn’t recommend).
It’s in someone’s head. Yes, there are always one or two people who know all this.
Find that person. Now. Before they give their two weeks’ notice. Or worse, die…
This is usually the person who:
- Has been at the company for 7+ years
- Sighs heavily when this system is mentioned
- Says things like “oh yeah, we tried to rewrite it twice but gave up”
- Knows which parts are “a little quirky” (translation: completely broken, but we’ve learned to work around it)
Buy them coffee. Ask them about the weird parts. Record the conversation if you can. Take notes. Ask about the time things went horribly wrong and what they learned.
These war stories? Those are your test cases.
“Oh yeah, never run a report on the last day of the month, the timezone calculation breaks and refunds everyone.” That’s not in the requirements doc. But it’s CRITICAL information.
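Here’s what that war story could look like as an actual regression test. A minimal pytest sketch, where the `legacy_billing` module, `generate_monthly_report`, and `refunds_issued` are invented for illustration; the point is to pin the story down so it can never silently happen again:

```python
from datetime import datetime, timezone

import pytest

# Hypothetical imports -- stand-ins for whatever your legacy reporting code exposes.
from legacy_billing.reports import generate_monthly_report, refunds_issued

# The dates that historically triggered the bug: the last day of the month,
# late enough in UTC that "tomorrow" has already started in some timezones.
MONTH_END_EDGE_CASES = [
    datetime(2024, 1, 31, 23, 30, tzinfo=timezone.utc),
    datetime(2024, 2, 29, 23, 30, tzinfo=timezone.utc),  # leap-year February
    datetime(2024, 12, 31, 23, 30, tzinfo=timezone.utc),
]

@pytest.mark.parametrize("as_of", MONTH_END_EDGE_CASES)
def test_month_end_report_does_not_refund_everyone(as_of):
    """Running the report on the last day of the month must not issue refunds."""
    refunds_before = refunds_issued()
    generate_monthly_report(as_of=as_of)
    assert refunds_issued() == refunds_before
```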
The clock is ticking. These people are tired. They will leave, or worse, get a better-paying job with the competitor.
Get their knowledge while you can.
Production IS Your Documentation
Here’s the uncomfortable truth: if the legacy system has been running in production for five or ten years, then production behavior is the specification.
I know, I know. “But what if it’s been doing the wrong thing this whole time?”
Maybe. But if users have been using it successfully for years, if money has been flowing correctly, if nobody’s complained… then whatever it’s doing is good enough.
Your job isn’t to decide if the legacy behavior is “correct” by some theoretical standard. Your job is to make sure it keeps doing exactly what it’s been doing.
So watch production:
- Check the logs (if they exist)
- Monitor support tickets
- Look at user behavior analytics
- Ask customer support what weird workarounds users have developed
If the system has been returning dates in MM/DD/YYYY format for 5 years, and suddenly you “fix” it to DD/MM/YYYY because that’s “more correct”… congratulations, you just broke everyone’s integrations.
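If you want to make that contract explicit, a small characterization check can guard it. This is only a sketch: the endpoint URL and the `created_date` field are assumptions standing in for whatever your system actually exposes:

```python
import re

import requests

# Whatever production has been emitting for years is the contract: MM/DD/YYYY.
LEGACY_DATE_FORMAT = re.compile(r"^(0[1-9]|1[0-2])/(0[1-9]|[12]\d|3[01])/\d{4}$")

def test_order_dates_keep_legacy_format():
    # Hypothetical read-only endpoint; point this at whatever environment you trust.
    response = requests.get("https://legacy.example.com/api/orders?limit=5", timeout=10)
    response.raise_for_status()
    for order in response.json():
        # Fail loudly if anyone ever ships the "more correct" DD/MM/YYYY.
        assert LEGACY_DATE_FORMAT.match(order["created_date"]), order
```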
Production is your oracle.
Treat it with respect.
And fear, and more respect.
Document What It Actually Does (Not What It Should Do)
When you start testing legacy code, your first instinct is to test against requirements or best practices.
Stop.
Instead, do characterization testing: document what the system actually does right now, quirks and all.
“When you pass an empty string to the search function, it returns all records sorted by creation date in descending order.”
Is that good design? Probably not. Is that what it does? Yes. Has it been doing that for years? Also yes. Is some downstream system depending on that exact behavior? Almost certainly yes.
Write it down. Test that it keeps doing exactly that.
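In practice, that sentence translates almost word for word into a test. A minimal sketch, assuming hypothetical `search_records` and `all_records_by_created_desc` helpers in place of whatever your legacy search layer really exposes:

```python
# Characterization test: document what the system actually does, quirks and all.
from legacy_app.search import search_records                 # hypothetical import
from legacy_app.records import all_records_by_created_desc   # hypothetical import

def test_empty_search_returns_everything_newest_first():
    """Empty query -> all records, newest first. Weird, but that's the current contract."""
    results = search_records("")
    expected = all_records_by_created_desc()
    assert [r.id for r in results] == [r.id for r in expected]
```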
Once you’ve documented the current behavior, THEN you can have a conversation with the product team about whether that behavior should change. But until someone officially decides to change it, your job is to protect it.
Those quirks aren’t bugs. They’re features that nobody documented. Treat them accordingly.
Know Where the Bodies Are Buried
Not all legacy code is equally dangerous.
Some parts are just old but stable: they process data, they’ve been doing it the same way forever, and they’re fine.
Remember, much of the US banking system still runs on COBOL.
COBOL was developed in 1959.
Those parts?
Smoke test them and move on.
Then there are the danger zones:
- Anything that touches money
- Anything that handles user authentication or personal data
- Integrations with external systems (especially if those systems have been updated and this one hasn’t)
- Code with comments like “TODO: fix this properly” from 2012
- Anything in the “utils” folder (it’s never just utilities)
- The parts that make senior devs go pale when you mention them
Map these areas. Document them. Test them thoroughly every single time.
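One lightweight way to keep that map honest is to write it down as data your test process can consult. A sketch only: the paths, risk labels, and suite names below are purely illustrative:

```python
# Hypothetical risk register: which areas are dangerous, and what must run when they change.
DANGER_ZONES = {
    "billing/":      {"risk": "money",          "suites": ["billing_regression", "smoke"]},
    "auth/":         {"risk": "authentication", "suites": ["auth_full", "smoke"]},
    "integrations/": {"risk": "external APIs",  "suites": ["contract_tests", "smoke"]},
    "utils/":        {"risk": "who knows",      "suites": ["full_regression"]},
}

def suites_for_change(changed_files):
    """Return every suite that must run for this change; default to smoke tests."""
    suites = {"smoke"}
    for path in changed_files:
        for zone, info in DANGER_ZONES.items():
            if path.startswith(zone):
                suites.update(info["suites"])
    return sorted(suites)

# The "just a button color" change that happens to live next to billing code:
print(suites_for_change(["billing/templates/checkout_button.css"]))
# -> ['billing_regression', 'smoke']
```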
When the PM says “we’re just changing the button color,” and that button is in one of these danger zones? You test everything around it. Because in legacy code, everything is connected to everything else in ways nobody understands.
Don’t take this lightly, I am not joking.
When You Can’t Test It, Monitor It
Sometimes you literally cannot figure out if a change will break something. The logic is too convoluted, the dependencies are too hidden, and the impact is too unpredictable.
In those cases, add monitoring first, test in production second.
Yes, I said test in production. With legacy systems, sometimes that’s your only option.
Ok, let me rephrase that last one.
Test in Production, but Do It Smart
- Use feature flags to roll out to 1% of users first (there’s a minimal sketch after this list)
- Add detailed logging around the changed functionality
- Set up alerts for unusual behavior
- Have a rollback plan ready
- Watch it like a hawk for the first 48 hours
- Have someone clone the entire production environment so it can be tested (you might get screamed out of the office for asking this, but if you get it, it is WORTH IT!)
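Here is roughly what the combination of a percentage rollout, detailed logging, and an easy rollback can look like. This is a sketch, not a specific feature-flag product: the flag name, the hash-based bucketing, and the `new_report_path`/`legacy_report_path` functions are all assumptions:

```python
import hashlib
import logging

logger = logging.getLogger("legacy_rollout")

ROLLOUT_PERCENT = 1  # start tiny; widen only after the first quiet 48 hours

def in_rollout(user_id: str, flag: str = "new_report_engine") -> bool:
    """Deterministically bucket a user so the same user always gets the same answer."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100 < ROLLOUT_PERCENT

def generate_report(user_id: str):
    if in_rollout(user_id):
        logger.info("new report path: user=%s", user_id)  # detailed logging around the change
        return new_report_path(user_id)                   # hypothetical new implementation
    return legacy_report_path(user_id)                    # untouched legacy path = rollback plan
```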
Better to catch a problem when it affects 1% of users than during a full rollout. And with legacy code, you’re often choosing between “test it perfectly in staging” (impossible) or “test it carefully in prod” (terrifying but realistic).
Your staging environment probably doesn’t even match production anyway. Half the integrations are mocked. The data is sanitized. The configuration is slightly different. Testing in staging gives you false confidence.
Production is the only environment that matters. Treat it accordingly.
The Uncomfortable Reality
Testing legacy code isn’t about best practices. It’s about survival.
Your goal isn’t to make the code better.
Your goal is not to make it worse.
Every piece of legacy code that’s still running is a small miracle. It’s been patched, worked around, and held together with duct tape and hope. It survived database migrations, framework updates, infrastructure changes, and countless “quick fixes” over the years.
It’s probably terrible code. The original developers would be embarrassed by it. It violates every principle in Clean Code.
But it works.
And “works” is better than “theoretically perfect but completely broken.”
Your job is to keep it working. Document its quirks. Protect its weird behavior. Test the dangerous parts thoroughly. Make sure that when someone inevitably has to change it, you catch the explosions before users do.
Is it satisfying? Not really. Will you get credit for it? Probably not. Will you save the company from a production disaster? Absolutely, yeah, no, it’s more like probably :).
One Last Thing
If you’re testing legacy code, you’re already doing the hard work nobody else wants to do. The developers don’t want to touch it. The product managers don’t want to think about it. The architects want to rewrite it (but never will).
You’re the one keeping the lights on.
So be kind to yourself. You won’t understand everything. You’ll miss things. Something will break eventually, legacy code breaks, that’s what it does.
But you’re doing your best with an impossible situation, armed with no documentation, no tests, and no safety net.
That takes skill. That takes courage. That takes a special kind of patience.
And honestly? Companies don’t appreciate it enough.
But I do. Every tester who’s ever stared at a 20-year-old application with no docs and no idea where to start, and thought “How do I even test this? Why is this still a project? Why is this still alive?” knows exactly what you’re dealing with.
You’re not alone. We’ve all been there. And we’re all doing our best to make sure these Frankenstein systems keep shuffling along.
What’s your worst legacy code horror story? I want to hear it. Drop a comment or message me on LinkedIn, misery loves company, and I guarantee I’ve seen worse!
Grab the free ebook:
"Software Testing for Beginners" is packed with real-world tips on writing bug reports, test cases, and surviving chaotic projects.
💡 What's inside:
- Smart test case templates
- Bug reports devs actually respect
- Tools, tips, and tactics that work
No fluff. No BS. Just stuff every tester should know.