Rewrite Over Refactor: When to Start From Scratch
Context
Aspira v1 'worked': it played legal chess moves (though without castling or en passant) and beat random players. But it had fundamental architectural problems: move generation was inefficient, the board representation caused unnecessary allocations, and the search algorithm had subtle bugs. Should I refactor incrementally or rewrite from scratch?
Decision
A complete rewrite of Aspira, throwing away hard-won working code to build on a better foundation.
Alternatives Considered
Incremental Refactoring
Pros:
- Keeps working code running
- Lower risk - changes are small
- Continuous progress visible
- Can ship improvements gradually
Cons:
- Architectural problems persist
- Each fix reveals more fundamental issues
- Technical debt compounds
- Some bugs unfixable without major changes
Selective Rewrite
Pros:
- Keep working parts, rewrite broken parts
- Lower risk than full rewrite
- Faster than rewriting everything
Cons:
- Boundaries between old and new code create friction
- Old assumptions leak into new code
- Still carrying forward fundamental design issues
Reasoning
The core architecture had fundamental issues that couldn't be patched. Move generation, board representation, and search were all interconnected; fixing one exposed issues in the others. Refactoring would have meant fighting the architecture continuously. A rewrite incorporating the lessons learned would be faster and produce cleaner, more maintainable code. The goal isn't just 'working' code, it's correct, fast, understandable code. Better to invest in a solid foundation now than patch forever.
The Painful Realization
Aspira v1 worked. I could play it against other engines. It made legal moves and evaluated positions.
But deep down, I knew the architecture was wrong.
The Problems
Move Generation: Poorly designed and very inefficient, with wasteful allocations. It could have been far better.
Board Representation: An array of ints (which can be efficient, but mine was messy). It led to poor performance and bugs.
Make/Unmake: More bad allocation patterns, plus subtle bugs in restoring position state.
Search: The alpha-beta implementation itself wasn't that bad, but coupled with the issues above it suffered from poor performance and correctness problems.
No Abstractions: Everything was tangled. I couldn't change one thing without breaking three others.
The Refactoring Attempt
I tried incremental refactoring first. I spent weeks trying to fix move generation without breaking everything else.
Every fix revealed deeper issues:
- Fix move generation → expose bugs in board representation
- Fix board representation → break make/unmake
- Fix make/unmake → expose search bugs
- Fix search → more confusion on what was actually broken
It was like pulling on a thread and watching the whole sweater unravel. Working on it stopped being enjoyable.
The Decision Moment
After weeks of fighting the architecture, I asked myself:
“How long would it take to rewrite this from scratch with everything I’ve learned?”
The honest answer: “Probably less time than continuing to patch this mess.”
That’s when I made the decision: complete rewrite.
What Made This Hard
Throwing away weeks of work hurts. You have functioning code. It does things. People can use it.
Starting over feels like failure.
But sunk cost fallacy is real. Past time invested shouldn’t determine future decisions. The question is:
“What’s the fastest path to good code from here?”
Sometimes that’s refactoring. Sometimes it’s rewriting.
The Rewrite Process
Phase 1: Core Representation (1 week)
Built solid foundation:
- Clean bitboard representation
- Proper abstractions for pieces, squares, moves
- No premature optimization
- Extensively tested
Phase 2: Move Generation (2 weeks)
Implemented from scratch with correctness as primary goal:
- Keeping it simple (naive move generation first)
- Proper castling, en passant, promotion
- Perft testing at every step
- Only optimized after correctness proven
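Perft ("performance test") counts the leaf nodes of the legal-move tree to a fixed depth and compares the totals against published reference values; it catches move-generation and make/unmake bugs that casual play never surfaces. The recursion itself is tiny. A sketch, using a toy stand-in for the board (the `legal_moves`/`make`/`unmake` interface here is hypothetical, not Aspira's actual API):

```python
def perft(board, depth: int) -> int:
    """Count leaf nodes of the move tree to the given depth."""
    if depth == 0:
        return 1
    nodes = 0
    for move in board.legal_moves():
        board.make(move)
        nodes += perft(board, depth - 1)
        board.unmake(move)
    return nodes

class ToyBoard:
    """Stand-in with exactly 3 legal moves in every position,
    so perft(d) must equal 3**d -- enough to test the recursion."""
    def legal_moves(self): return range(3)
    def make(self, move): pass
    def unmake(self, move): pass

print(perft(ToyBoard(), 4))  # 81 == 3**4
```

Against a real board, the same function is checked against known counts (e.g. the standard starting position has 20 nodes at depth 1 and 400 at depth 2), which is what makes perft such a sharp correctness tool.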
Phase 3: Search & Evaluation (2 weeks)
Clean alpha-beta implementation:
- Simple, correct code first
- Added features incrementally
- Each addition tested thoroughly
- Performance tuning only after correctness
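The shape of a "simple, correct first" alpha-beta search, written in negamax form. This is a generic sketch over a toy game tree (nested lists, with leaf scores from the perspective of the side to move), not Aspira's actual search:

```python
INF = float("inf")

def negamax(node, alpha=-INF, beta=INF):
    """Alpha-beta in negamax form. A node is either a leaf score
    (from the side to move's perspective) or a list of child nodes."""
    if isinstance(node, (int, float)):
        return node
    best = -INF
    for child in node:
        score = -negamax(child, -beta, -alpha)  # flip perspective
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # the opponent won't allow this line: prune
    return best

# Two-ply toy tree: the opponent at depth 1 minimizes our score.
tree = [[3, 5], [2, 9]]
print(negamax(tree))  # 3
```

Keeping the core this small is what makes later additions (transposition tables, move ordering, extensions) safe to layer on one at a time.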
Phase 4: Optimization (ongoing)
With solid foundation, optimization became straightforward:
- Magic bitboards for sliding pieces
- Profile to find hot paths
- Optimize specific bottlenecks
- Measure improvements
- No premature optimization
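"Profile to find hot paths, then optimize specific bottlenecks" can be as mechanical as running the search under a profiler and reading the top of the report. A minimal sketch using Python's built-in cProfile (the function names are stand-ins, not Aspira's):

```python
import cProfile
import io
import pstats

def generate_moves():
    """Stand-in hot path: deliberately does some busywork."""
    return sum(i * i for i in range(20_000))

def search(iterations: int = 50):
    """Stand-in search loop that calls the hot path repeatedly."""
    return [generate_moves() for _ in range(iterations)]

profiler = cProfile.Profile()
profiler.enable()
search()
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())  # generate_moves dominates: optimize that, not everything
```

The point of the exercise is the discipline: measure, change one bottleneck, measure again, rather than guessing where the time goes.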
What Made the Rewrite Succeed
1. Learned from Mistakes
I knew what problems to avoid:
- Don’t mix representation styles
- Proper abstractions from the start
- Test extensively before optimizing
- Keep concerns separated
2. Focused on Correctness First
v1 prioritized "getting it working", which led to shortcuts and technical debt.
v2 prioritized correctness and good code, which in turn led to better performance (fewer bugs, and simpler code that the JIT can optimize).
3. Better Design Upfront
Having built v1, I understood the problem space better:
- What components needed isolation
- Where complexity actually lived
- What abstractions were needed
- What premature optimization to avoid
4. Incremental Validation
After each phase, extensive testing before moving forward:
- Unit tests for core functions
- Perft suites for move generation
- Position tests for evaluation
- Performance benchmarks
The Results
Performance
- v1: ~1M nodes/sec with bugs
- v2: ~10M nodes/sec at first stable release
The speed came from simpler code, not clever tricks.
Correctness
- v1: Failed various perft suites, subtle bugs in edge cases
- v2: Passes extensive perft suites, no known correctness issues
Maintainability
- v1: Adding features meant fighting the architecture
- v2: Adding features is straightforward
Development Speed
After the initial rewrite investment, development is much faster. Adding new features or optimizations doesn’t require untangling architectural issues.
When to Rewrite vs Refactor
Rewrite When:
- Fundamental architectural issues that can’t be fixed incrementally
- You’ve learned enough that you’d design it completely differently
- Refactoring costs more than rewriting (time and complexity)
- The codebase fights you on every change
- Bugs are systemic rather than localized
Refactor When:
- Architecture is sound, just specific implementations are wrong
- System is in production and rewrite risk is too high
- Team doesn’t have deep understanding yet
- Changes are localized and don’t cascade
- Time pressure makes rewrite infeasible
Lessons Learned
1. Sunk Cost Fallacy is Real
Past time invested doesn’t make bad code worth keeping. Judge based on future cost, not past investment.
2. “Working” Isn’t Enough
Code that “works” but fights you on every change is technical debt that compounds.
3. Sometimes Fast = Slow
The “fast” path of patching v1 would have been slower long-term than the “slow” path of rewriting v2.
4. Experience Compounds
You can’t write v2 without learning from v1. The rewrite embodied months of learning.
5. Foundation Matters
Solid architecture makes everything easier. Bad architecture makes everything harder. Time invested in foundation pays dividends forever.
The Hardest Part
The hardest part wasn’t writing the code. It was making the decision to throw away months of work.
Once I decided, the rewrite felt liberating. No fighting against bad abstractions. Clean slate with clear design.
Would I Do It Again?
Absolutely.
The v2 codebase is:
- Faster
- More correct
- More maintainable
- Easier to extend
- Actually pleasant to work with
That’s worth the time investment.
The Bottom Line
Don’t be afraid to rewrite when the architecture is fundamentally wrong.
But make sure you’re rewriting because you’ve learned, not just because you’re bored with the current code.
The test: “If I started from scratch today, would I design it completely differently?”
If yes, rewrite. If no, refactor.
“The current version is not ‘v1 with patches’, it’s a full rewrite with everything I learned the hard way baked in.”