What’s it all about
#

This is a six-part series of articles where I explore how Claude can be used to create code using RSpec and Test Driven Development (TDD):

Part 1: I talk about the importance of automated testing and Test Driven Development
Part 2: Before going full TDD I get Claude to write some RSpec tests for existing code and find some interesting surprises there.
Part 3: I define the Problem and Solution Spaces that I want to address using new code written in a TDD style
Part 4: I create the tests and code for the utility class defined in my Solution Space using TDD
Part 5: I create the first tests and code for the core class defined in my Solution Space using TDD
Part 6: I complete the tests and code for the core class defined in my Solution Space using TDD

Why Claude Code & TDD ?
#

I’ve been coding for some 20 years now and worked with LLMs way before most had even heard the phrase. Despite that, when it came to using AI coding tools I struggled to find a good way to start, and this wasn’t helped by all the “You just have to trust me it’s freakin’ awesome” fanboys on LinkedIn.

I messed around with GitHub Copilot for a while but it didn’t give me the feeling that anything amazing was about to happen anytime soon, though my Visual Studio autocomplete began to impress me rather than just annoy me - it still does annoy me from time to time.

Not quite knowing where to start may also be down to my tool of choice - Ruby & Ruby on Rails. Rails is a web framework already optimised for rapid development with minimal boilerplate so there wasn’t much in the way of easy wins for an AI tool. Moreover, Copilot often gave me code that didn’t make best use of the language or framework. It was Ruby code “but not as we know it” and it worked in its own oddly artificial way.

Tinkering with Claude I found that it gave me Ruby code mostly in a style similar to my own so I put more of my efforts there. That said, I was still a little unsure of the best way to start until it occurred to me that Test Driven Development is not unlike using AI to develop code. A prompt is a way of telling the AI model what you want the code to do. Test Driven Development starts with a statement of the problem the code needs to solve.

I’m reluctant to allow agents to write more code than I can effectively check. I can’t prove this yet - though maybe I’ll try one day - but I’d bet good money that if an agent writes me 10 lines of code that take me 2 minutes to check, it will take me a lot longer than 4 minutes to check 20 lines. So to me it feels faster to write less in one go. Writing AI code in a TDDish style felt like a good way to keep things small and ensure:

I’m able to review and critique both tests & production code quickly and effectively.
I don’t become an LLM zombie. My hope is that I can use LLMs to do the dull, laborious code that Claude has already seen a million times while giving me more time to focus on the more novel code solutions it’s unlikely to have seen before.

TLDR
#

In case you’re not familiar with the term it means ‘Too Long; Didn’t Read’. This article comes in six parts so if you don’t have the time to read them all then I thought I’d just share my conclusions up front.

The Good
#

First and foremost I really enjoyed the experience, especially since Claude handled some laborious tasks on large AWS documents that I know I would have found tedious and so probably wouldn’t have done as good a job on.
I learned a few eloquent railsisms that I didn’t know existed and this feels like a nice aspect of AI. I learned my core skills over a certain time so haven’t always adopted newer things if the way I already knew worked.
For sure it saved me a significant amount of time but required more time up-front thinking about the problem and what I wanted the code to do. This is good - “weeks of coding can save you hours of planning” and I’m confident the end result contributed less to technical debt than had I just gone in feet-first & muddled through, writing tests along the way.
Generally Claude wrote code in a nice Rubyesque way. Occasionally it would do something that could be better but never so much better that I’d feel the need to change it.
I was very impressed when I forgot to give a name to a method when prompting it to generate code and it came up with an appropriate name all by itself.
It came up with some effective and novel (novel to me at least) ways of testing that some complex and highly structured data no longer existed by first checking it did exist, running the production code then checking it didn’t exist.
It wrote some nice utility methods to keep the tests DRY.
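
The "check it exists, run the code, check it's gone" pattern mentioned above can be sketched roughly as follows. This is only an illustration in plain Ruby so it runs without any gems: `ReportStore`, `purge` and the sample data are hypothetical names invented for this sketch, not the code Claude actually wrote.

```ruby
# Hypothetical store holding nested, highly structured data.
class ReportStore
  def initialize
    @reports = {}
  end

  def add(id, report)
    @reports[id] = report
  end

  # Returns the value at the nested key path, or nil if any step is missing.
  def dig(id, *path)
    @reports.dig(id, *path)
  end

  # The "production code" under test: removes a report and everything in it.
  def purge(id)
    @reports.delete(id)
  end
end

store = ReportStore.new
store.add(:q1, { totals: { revenue: 100 }, lines: [{ sku: "A1", qty: 2 }] })

# 1. Prove the data is there first, so its later absence actually means something.
before = store.dig(:q1, :lines, 0, :sku)

# 2. Run the production code.
store.purge(:q1)

# 3. Prove the very same lookup now finds nothing.
after = store.dig(:q1, :lines, 0, :sku)
```

In an RSpec spec the same idea is usually written with the `change` matcher, along the lines of `expect { store.purge(:q1) }.to change { store.dig(:q1, :lines, 0, :sku) }.from("A1").to(nil)`.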

The not so Good
#

In all honesty there was nothing bad in the production code that I felt I had to change.
I had to tell it to use ‘subject’ in my tests instead of assigning another variable name - using ‘subject’ is important to me as it makes it clear what is being tested.
Some tests were really testing the underlying framework more than the code.
I had to instruct it to use fixtures and factories rather than create its own test data in the spec, but I only had to do that once so it learned.
Some tests were testing the same thing in different ways and occasionally the underlying framework. I kept them in but would have been equally happy just deleting them.

Let’s Get Started - Talk About Testing
#

Test Code is more important than Production Code
#

I’ve always valued test code over production code.

Well written specs tell you in plain human language what each and every part of the application should do and also help you think about edge cases.
The test code ensures that each and every part of the production code conforms to that specification so even if the code is a bit pants you can live with it knowing it works. Not all code has to be match fit.
Test code is a highly effective enemy of technical debt. When requirements change or you just can’t bear to look at your pants-code any more, having tests enables you to cut out anything that’s no longer fit for purpose quickly and at low cost.

What does an automated test look like ?
#

In case you’re not familiar with automated tests here’s a simple example in Ruby in which a fictitious class returns either laughter or nothing depending on what is passed to the Joke#deliver method.


describe Joke do

  let(:gag){ "Two fish in a tank, one says to the other - you drive I'll man the guns." }
  let(:lecture){ "When I was your age we never used to..." }

  subject{ described_class.new }

  it "returns laughter when the argument is funny" do
    expect(subject.deliver(gag)).to eq "laughter"
  end

  it "returns nil when the argument is NOT funny" do
    expect(subject.deliver(lecture)).to be nil
  end
  
  it 'raises an ArgumentError whenever nil is passed as an argument' do
    expect{
      subject.deliver(nil)
    }.to raise_error(ArgumentError)
  end
  
end
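
For completeness, here is one possible `Joke` class that would satisfy the spec above. The article only shows the tests, so treat this as a minimal sketch: the rule for what counts as "funny" (anything mentioning fish) is purely an assumption made up for illustration.

```ruby
# Hypothetical implementation; only the spec appears in the article.
class Joke
  # Assumption for illustration: material is funny if it mentions fish.
  FUNNY_PATTERN = /fish/i

  # Returns "laughter" for funny material, nil otherwise.
  # Raises ArgumentError when given nil.
  def deliver(material)
    raise ArgumentError, "material must not be nil" if material.nil?

    "laughter" if material.match?(FUNNY_PATTERN)
  end
end
```

Note that the trailing `if` modifier is what gives the nil result for unfunny material: when the condition is false the expression evaluates to nil, which is exactly what the second test expects.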

What is Test Driven Development ?
#

TDD is a process where a human defines the problem that code should solve before writing that code. For anyone outside software development that may seem painfully obvious but it’s something surprisingly few software developers actually do - it’s all too easy to start writing code without the faintest clue what finished looks like and I’m as guilty as the next person.

The risks associated with unplanned coding can be mitigated by writing automated tests. I always tell developers “I don’t care how you do it as long as you write the f*cking tests” and I do live by that myself. I tend to dip my toe into the problem by writing a little bit of production code, then a few tests, then back to production code, all the time wobbling forward in some three-legged race with my tests on one side and production code on the other. From my limited experience I do feel it likely that I would have been a lot quicker, and my code more elegant, had I the discipline to practice TDD.

Done well TDD should be an elegant flow of human ideas into code:

Define the test: “returns laughter when given a gag”
Write the test: expect(subject.deliver(gag)).to eq("laughter")
Write the production code.

It occurred to me that this isn’t so different to using an LLM to write code.

Prompt the LLM by telling it what you want the code to achieve.
The LLM writes the test code.
The LLM writes the production code.

What I like about AI coding is the way that it FORCES me to describe what I want rather than giving me the option. I suspect this is a fundamental differentiator. Devs that like to think about what they want to build will find AI adoption easier than those that start writing code and work it out as they go along.