In this part I write tests for existing code rather than attempt the full TDD cycle of problem statement > test > code. If you want to jump straight to TDD then go to the next section. Things did get interesting even with this simple start so do read if you can.
Baby Steps - Testing without TDD #
When it comes to writing tests I’m an unashamed pedant but that doesn’t mean I always write them. Tests are at their best when you know what you want to build but when you’re exploring something new they get in the way:
- How do you write a test for something when you have no idea what that something looks like?
- As you discover what that something does look like, you’ll probably have to go back and change things quite often, so tests can get in the way.
I’ve been working on a side-hustle, expensehound.com. It’s a Ruby on Rails 8.1 app that takes bulk uploads of images, runs them through Amazon Textract, sends the output to ChatGPT, then cleans the data before putting it all through our own home-baked, Python-based text classifier.
This functionality doesn’t sit naturally within a web framework like Rails, which is designed to handle a single HTTP request rather than orchestrate a series of long-running data tasks. I had to put much of the code into background jobs chained together by events. Rails 8.1 really is amazing for rapid development though, so my intention was to create a POC in Rails and then move the heavy data lifting out.
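For readers who haven’t seen jobs chained by events, here is a minimal plain-Ruby sketch of the idea. It is not the expensehound.com code: the EventBus class, the event names and the "record-1" payload are all assumptions made for illustration, and everything runs in-process rather than in real Rails background jobs.

```ruby
# Toy in-process event bus (hypothetical); a real app would enqueue
# background jobs instead of calling the handlers directly.
class EventBus
  def initialize
    @handlers = Hash.new { |hash, event| hash[event] = [] }
  end

  # Register a block to run whenever the named event is published.
  def subscribe(event, &handler)
    @handlers[event] << handler
  end

  # Run every handler registered for the named event.
  def publish(event, payload)
    @handlers[event].each { |handler| handler.call(payload) }
  end
end

bus = EventBus.new
log = []

# Each step does its work, then announces completion so the next can start.
bus.subscribe(:image_downloaded) do |record|
  log << "ocr #{record}"
  bus.publish(:ocr_complete, record)
end

bus.subscribe(:ocr_complete) do |record|
  log << "llm #{record}"
  bus.publish(:llm_complete, record)
end

bus.subscribe(:llm_complete) do |record|
  log << "classify #{record}"
end

bus.publish(:image_downloaded, "record-1")
puts log
```

No step knows about the one after it; each only publishes an event, which is what lets the stages be moved or replaced independently.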
What started as a spike became a system that was surprisingly robust even under heavy load. I was faced with a common software paradox: my prototype was suddenly production ready but lacking essential practices like tests.
If you’re wondering why that’s such a problem: a lack of tests makes it hard to change anything without the risk of breaking everything, and breaking code that real users depend on risks serious reputational damage.
The Problem #
I have a class, CsvUpload, that represents a csv file uploaded to the site by a customer. Each row in the csv generates a CsvRecord that has an attached image. Each image is downloaded by its own background job, so I wrote a method that tells me whether all of the images have been downloaded: basically, does the number of csv records with a download match the total number of csv records?
This is the method:
has_many :csv_records, dependent: :destroy

def receipts_downloaded?
  csv_records.receipt_downloaded.count == csv_records.count
end
and in the child csv_record class there is a scope:
scope :receipt_downloaded, -> { joins(:expense_image_attachment) }
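To make the comparison concrete, here is a minimal plain-Ruby sketch of the same logic. FakeCsvRecord and FakeCsvUpload are stand-ins assumed for illustration, not the real ActiveRecord models, and image_attached is a hypothetical flag standing in for the attachment join. The sketch also shows an edge case worth knowing about: an upload with no records counts as downloaded, because 0 == 0.

```ruby
# Stand-in for CsvRecord (hypothetical): true when an image is attached.
FakeCsvRecord = Struct.new(:image_attached)

# Stand-in for CsvUpload (hypothetical): holds records in a plain array.
class FakeCsvUpload
  attr_reader :csv_records

  def initialize(csv_records = [])
    @csv_records = csv_records
  end

  # Same comparison as the real method: records with an image vs. all records.
  def receipts_downloaded?
    csv_records.count(&:image_attached) == csv_records.count
  end
end

all_done  = FakeCsvUpload.new([FakeCsvRecord.new(true), FakeCsvRecord.new(true)])
some_done = FakeCsvUpload.new([FakeCsvRecord.new(true), FakeCsvRecord.new(false)])
empty     = FakeCsvUpload.new

puts all_done.receipts_downloaded?  # => true
puts some_done.receipts_downloaded? # => false
puts empty.receipts_downloaded?     # => true, because 0 == 0
```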
Prompt to write a spec for the existing receipts_downloaded? Method #
My prompt was deliberately simple just to make Claude think:
write a spec for the csv upload model that tests the receipts_downloaded? method
The results weren’t bad at all. They weren’t perfect though; I’ve listed the issues below the code:
# == Schema Information
#
# Table name: csv_uploads
# Database name: primary
#
# id :integer not null, primary key
# llm_complete :boolean
# ocr_complete :boolean
# created_at :datetime not null
# updated_at :datetime not null
# user_id :integer
#
require 'rails_helper'

RSpec.describe CsvUpload, type: :model do
  describe '#receipts_downloaded?' do
    ######################################################################
    # BAD: it hasn't used
    #   subject { described_class.new }
    # to create an instance of the class being tested. Instead it has
    # created an instance from the csv_upload factory:
    #   let(:csv_upload) { create(:csv_upload) }
    ######################################################################
    let(:csv_upload) { create(:csv_upload) }

    ######################################################################
    # GOOD: it's picked up that I have a csv_record factory and used that:
    #   create(:csv_record, csv_upload: csv_upload)
    # GitHub Copilot seemed to prefer mocks over test fixtures
    ######################################################################
    context 'when all csv_records have expense_image attached' do
      it 'returns true' do
        csv_record_1 = create(:csv_record, csv_upload: csv_upload)
        csv_record_2 = create(:csv_record, csv_upload: csv_upload)
        csv_record_3 = create(:csv_record, csv_upload: csv_upload)
        csv_record_1.expense_image.attach(
          io: StringIO.new('fake image data 1'),
          filename: 'receipt1.jpg',
          content_type: 'image/jpeg'
        )
        csv_record_2.expense_image.attach(
          io: StringIO.new('fake image data 2'),
          filename: 'receipt2.jpg',
          content_type: 'image/jpeg'
        )
        csv_record_3.expense_image.attach(
          io: StringIO.new('fake image data 3'),
          filename: 'receipt3.jpg',
          content_type: 'image/jpeg'
        )
        expect(csv_upload.receipts_downloaded?).to be true
      end
    end

    ########################################################################
    # GREAT: it's got the mocking of attachments spot-on:
    #   csv_record_1.expense_image.attach(
    #     io: StringIO.new('fake image data'),
    #     filename: 'receipt1.jpg',
    #     content_type: 'image/jpeg'
    #   )
    # it took me AGES to work that one out for myself the first time I did it.
    #########################################################################
    context 'when only some csv_records have expense_image attached' do
      it 'returns false' do
        csv_record_1 = create(:csv_record, csv_upload: csv_upload)
        csv_record_2 = create(:csv_record, csv_upload: csv_upload)
        csv_record_3 = create(:csv_record, csv_upload: csv_upload)
        csv_record_1.expense_image.attach(
          io: StringIO.new('fake image data'),
          filename: 'receipt1.jpg',
          content_type: 'image/jpeg'
        )
        csv_record_2.expense_image.attach(
          io: StringIO.new('fake image data'),
          filename: 'receipt2.jpg',
          content_type: 'image/jpeg'
        )
        # csv_record_3 has no attachment
        expect(csv_upload.receipts_downloaded?).to be false
      end
    end

    context 'when no csv_records have expense_image attached' do
      it 'returns false' do
        create(:csv_record, csv_upload: csv_upload)
        create(:csv_record, csv_upload: csv_upload)
        expect(csv_upload.receipts_downloaded?).to be false
      end
    end

    context 'when csv_upload has no csv_records' do
      it 'returns true' do
        expect(csv_upload.csv_records.count).to eq(0)
        expect(csv_upload.receipts_downloaded?).to be true
      end
    end

    context 'verifying the receipt_downloaded scope behavior' do
      it 'correctly counts records with attachments using joins' do
        csv_record_with_image = create(:csv_record, csv_upload: csv_upload)
        csv_record_without_image = create(:csv_record, csv_upload: csv_upload)
        csv_record_with_image.expense_image.attach(
          io: StringIO.new('fake image data'),
          filename: 'receipt.jpg',
          content_type: 'image/jpeg'
        )
        expect(csv_upload.csv_records.count).to eq(2)
        expect(csv_upload.csv_records.receipt_downloaded.count).to eq(1)
        expect(csv_upload.receipts_downloaded?).to be false
      end
    end

    context 'with multiple file types' do
      it 'returns true when all receipts are downloaded regardless of file type' do
        csv_record_1 = create(:csv_record, csv_upload: csv_upload)
        csv_record_2 = create(:csv_record, csv_upload: csv_upload)
        csv_record_1.expense_image.attach(
          io: StringIO.new('fake pdf data'),
          filename: 'receipt.pdf',
          content_type: 'application/pdf'
        )
        csv_record_2.expense_image.attach(
          io: StringIO.new('fake png data'),
          filename: 'receipt.png',
          content_type: 'image/png'
        )
        expect(csv_upload.receipts_downloaded?).to be true
      end
    end
  end
end
- A csv record will always have an attached image (Claude wasn’t to know that).
- I feel strongly about using subject to describe the instance of the class being tested, NOT let(:csv_upload) { create(:csv_upload) }. When I look at a test and see subject, rather than just another variable name, I know what is being tested.
Prompt to fix the subject issue and clarify that every csv row will have a receipt image. #
A csv_record will always have an image attached, use subject instead of the variable csv_record
This is where it started to get really interesting:
The first thing it did was suggest an update to my FactoryBot csv_record factory which generates test data. The suggestion was to add an after(:create) that ensured every object had an attached image. Good practice that makes sense and something I had overlooked:
# == Schema Information
#
# Table name: csv_records
# Database name: primary
#
# id :integer not null, primary key
# amount :string
# amount_cents :decimal(, )
# category :string
# country :string
# currency_code :string
# date :date
# employee_name :string
# expense_text :string
# expense_type :string
# job_status :string
# llm_complete :boolean
# merchant :string
# ocr_complete :boolean
# status :string
# url :string
# warnings :text
# created_at :datetime not null
# updated_at :datetime not null
# csv_upload_id :integer
# employee_id :string
# expense_id :string
# job_id :string
#
FactoryBot.define do
  factory :csv_record do
    csv_upload
    amount { "123.45" }
    category { "Travel" }
    country { Faker::Address.country }
    currency_code { Faker::Currency.code }
    date { Date.civil(2025, 11, 11) }
    employee_id { Faker::Code.asin }
    employee_name { Faker::Name.name }
    expense_id { Faker::Code.asin }
    expense_text { Faker::Lorem.paragraph }
    expense_type { "Business" }
    job_id { Faker::Internet.uuid }
    job_status { "SUCCEEDED" }
    merchant { Faker::Company.name }
    status { "Approved" }
    url { "https://example.com/some-receipt.pdf" }
    ocr_complete { true }
    llm_complete { nil }
    association :customer

    ##########################################################################
    # GREAT: once I'd told it that a csv record always has an attached image
    # it added this to the factory
    ##########################################################################
    after(:create) do |csv_record|
      csv_record.expense_image.attach(
        io: StringIO.new('fake receipt image data'),
        filename: 'receipt.jpg',
        content_type: 'image/jpeg'
      )
    end
  end
end
It then updated the csv_upload spec it had created, removing all tests that assumed some or all csv_records had no attached image and reducing it to something much more succinct and to the point. It also replaced csv_upload with subject as I’d asked:
# == Schema Information
#
# Table name: csv_uploads
# Database name: primary
#
# id :integer not null, primary key
# llm_complete :boolean
# ocr_complete :boolean
# created_at :datetime not null
# updated_at :datetime not null
# user_id :integer
#
require 'rails_helper'

RSpec.describe CsvUpload, type: :model do
  describe '#receipts_downloaded?' do
    subject { create(:csv_upload) }

    ########################################################################################
    # GREAT: it has removed the explicit creation of attachments now they are in the factory
    ########################################################################################
    context 'when all csv_records have expense_image attached' do
      it 'returns true' do
        create(:csv_record, csv_upload: subject)
        create(:csv_record, csv_upload: subject)
        create(:csv_record, csv_upload: subject)
        expect(subject.receipts_downloaded?).to be true
      end
    end

    context 'when csv_upload has no csv_records' do
      it 'returns true' do
        expect(subject.csv_records.count).to eq(0)
        expect(subject.receipts_downloaded?).to be true
      end
    end

    context 'when csv_upload has multiple csv_records' do
      it 'correctly counts all downloaded receipts' do
        create_list(:csv_record, 5, csv_upload: subject)
        expect(subject.csv_records.count).to eq(5)
        expect(subject.csv_records.receipt_downloaded.count).to eq(5)
        expect(subject.receipts_downloaded?).to be true
      end
    end
  end
end
###################################################################################################
# GREAT: it has removed all of the tests based on an assumption that a csv record has no attachment
#
# NOT SO GREAT: it's used 'magic numbers' e.g.
#   expect(subject.csv_records.count).to eq(5)
# I'd prefer a variable 'csv_record_count' that at least describes what 5 means:
#   let(:csv_record_count) { 5 }
#   expect(subject.csv_records.count).to eq(csv_record_count)
# But I can live with 5 so will let it go.
###################################################################################################
Lastly, it ran the tests for me, ensuring they all passed.
The good & the bad #
I’d give Claude 9 out of 10. The good is mostly very good and the bad isn’t that bad.
Good #
- Claude detected that I had a FactoryBot factory for my csv record class and used it rather than making up its own test data.
- Claude created some good test code to mock an attached image - it took me ages to work that out for myself a few weeks ago.
- When I told Claude that every csv record will have an image it moved the mock image attachment code into the csv record factory and removed the tests that assumed there was no image.
Bad #
- It didn’t use the standard ‘subject’ name for the instance of the class being tested, but did fix that when prompted to do so.
- It seems fond of magic words and numbers, e.g. expect(subject.csv_records.count).to eq(5). What does 5 mean?
Up Next #
In the next article I start the TDD experiment by describing the Problem & Solution Space we need to solve and design.