Previously #
In the previous article we started our TDD adventure with a simple utility class for encapsulating the value and context of a string extracted by AWS Textract Expense so that we don’t just get the text value but understand what that value means eg. Name, Address.
Create the Expense Document class #
Now lets build the Expense Document class. In a nutshell this class wraps 8000+ lines of json code and will enable us to access that data in a clean, DSL kind of way.
More details on the why can be found in part 3 of this series.
Prompt to create the Expense Document spec #
Create an rspec test file for a class called ExpenseDocument. The spec file should have a subject that takes an initializer with the data from spec/fixtures/document.json parsed into a hash as the argumant. The argument should be called document and be exposed as a read only attribute
Unsurprisingly it jumped the gun and created an expense document class too. I’m OK with that.

require 'rails_helper'
RSpec.describe ExpenseDocument, type: :model do
let(:fixture_path) { Rails.root.join('spec/fixtures/document.json') }
let(:document_hash) { JSON.parse(File.read(fixture_path), symbolize_names: true) }
describe '#initialize' do
subject(:expense_document) do
described_class.new(document: document_hash)
end
##################################################################
# NOT SO GOOD: this test is a little unneccessary, it's really
# testing the test - i.e. creation of the subject.
##################################################################
it 'initializes with a document argument' do
expect(expense_document).to be_an_instance_of(ExpenseDocument)
end
it 'exposes document as a read-only attribute' do
expect(expense_document.document).to eq(document_hash)
end
it 'has the document attribute' do
expect(expense_document).to respond_to(:document)
end
it 'does not allow setting the document attribute' do
expect(expense_document).not_to respond_to(:document=)
end
end
end
class ExpenseDocument
attr_reader :document
def initialize(document:)
@document = document
end
end
Prompt to create the remove_geometry method! #
This is where we get into actual code and logic not just simple getters/setters. A large part of any TextractDocument is geometry data that describes the location of the block on the receipt that was removed. I don’t need this and it’s bloating the size of the files I store and use so I want to remove it.
Create a test and method ExpenseDocument#remove_geometry! The method must remove every instance of geometry from ExpenseDocument#document created by the fixture spec/fixtures/document.json

describe '#remove_geometry!' do
subject(:expense_document) do
described_class.new(document: document_hash)
end
################################################################
# GREAT: this isn't any easy thing to test for and I think it does
# a great job. It's created a geometry_count_before that iterates
# through the document looking for 'geometry' keys. It checks these
# keys exist before the method is called and none after.
################################################################
it 'removes all geometry keys from the document' do
# Verify geometry exists before removal
geometry_count_before = count_geometry_keys(expense_document.document)
expect(geometry_count_before).to be > 0
expense_document.remove_geometry!
# Verify all geometry keys are removed
geometry_count_after = count_geometry_keys(expense_document.document)
expect(geometry_count_after).to eq(0)
end
################################################################
# GOOD: this is a test to ensure that the method returns
# self for method chaining. I didn't ask it to do this though
# a similar class in my existing app does the same thing...
################################################################s
it 'returns self for method chaining' do
result = expense_document.remove_geometry!
expect(result).to eq(expense_document)
end
###################################################################
# NOT SO GOOD: these next two test are overkill. Effectively they're testing
# the same thing as the 'removes all geometry keys from the document' above
# but in a more explcit way.
# Normally I would favour a more explicit test but the
# 'removes all geometry keys from the document' test comprehensively
# covers the entire document and there is always a chance that geometry
# data exists outside blocks, summary_fields and line_items
###################################################################
it 'removes geometry from blocks' do
expense_document.remove_geometry!
expense_document.document[:blocks]&.each do |block|
expect(block[:geometry]).to be_nil
end
end
it 'removes geometry from nested structures' do
expense_document.remove_geometry!
# Check summary fields
expense_document.document[:summary_fields]&.each do |field|
expect(field[:geometry]).to be_nil
expect(field.dig(:value_detection, :geometry)).to be_nil
expect(field.dig(:label_detection, :geometry)).to be_nil
end
# Check line item groups
expense_document.document[:line_item_groups]&.each do |group|
group[:line_items]&.each do |item|
item[:line_item_expense_fields]&.each do |field|
expect(field[:geometry]).to be_nil
expect(field.dig(:value_detection, :geometry)).to be_nil
expect(field.dig(:label_detection, :geometry)).to be_nil
end
end
end
end
##################################################################
# GREAT: it's created a method to iterate through every key in the
# hash counting geometry keys
##################################################################
def count_geometry_keys(obj, count = 0)
case obj
when Hash
count += 1 if obj.key?(:geometry)
obj.each_value { |v| count = count_geometry_keys(v, count) }
when Array
obj.each { |item| count = count_geometry_keys(item, count) }
end
count
end
end
But wait, I spotted a problem… #
The count_geometry_keys method is looking for a key :geometry and I’m sure my Textract hash has string keys.
def count_geometry_keys(obj, count = 0)
case obj
when Hash
count += 1 if obj.key?(:geometry)
obj.each_value { |v| count = count_geometry_keys(v, count) }
when Array
obj.each { |item| count = count_geometry_keys(item, count) }
end
count
end
The problem starts here where the test chooses to symbolize_names: true not sure why it did that. It makes the code very brittle as every time a document object gets passed to ExpenseDocument the caller has to remember to symbolize the keys.
RSpec.describe ExpenseDocument, type: :model do
let(:fixture_path) { Rails.root.join('spec/fixtures/document.json') }
let(:document_hash) { JSON.parse(File.read(fixture_path), symbolize_names: true) }
I could:
- Change the code so that it doesn’t symbolize_names
- Set the document as a Rails HashWithIndifferentAccess which doesn’t care whether the key is a string of a symbol.
Option 2 is the more robust so I will tell Claude to make the change - TBH normally I would do this by hand by I’m testing how much Claude can do.
Prompt to fix by telling Claude to use the HashWithIndifferentAccess #
Change the assignment of the document argument in the ExpenseDocument initializer so that the document is a HashWithIndifferentAccess

################################################################################
# GOOD: I'm glad I asked Claude to do this rather than make the change by hand
# I was unaware that Hash in Rails has the #with_indifferent_access method.
# I would have created a less elegant
#
# @document = HashWithIndifferentAccess.new(document)
################################################################################
def initialize(document:)
@document = document.with_indifferent_access
end
It also upated the specs. TBH the second test is a bit irrelevant as it’s basically testing the HashWithIndifferentAccess class which isn’t part of our code. I’ll leave it in but it could easily be deleted.
it 'exposes document as a read-only attribute' do
expect(expense_document.document).to be_a(ActiveSupport::HashWithIndifferentAccess)
end
it 'allows accessing document keys with both symbols and strings' do
expect(expense_document.document[:blocks]).to eq(expense_document.document['blocks'])
end
In a nutshell #
- NOT SO GOOD: Claude created a test that was effectively testing its own test code.
- GREAT: Claude impressed me with a test that ensured there were geometry objects present in the hash, ran the remove_geometry! method then ensured there were none using the same code.
- GREAT: It created a utility method that iterated through all the keys in the hash counting the number of geometry objects and re-used this - see above.
- GOOD: it created a separate test to ensure the return value from the method was the subject. By making this separate Claude ensured a single test only tested for one thing.
- NOT SO GOOD: it created a couple of tests that, logically, were testing the same thing as a previous test.
- GOOD: I spotted a flaw where Claude expected the test hash to have symbolic keys when the reality is they are likely to be strings. When prompted to fix this using a HasWithIndifferentAccess it used a neat solution that I didn’t know existed.
Up Next #
I used Claude & TDD to create tests and methods to implement the #tables and #fields methods.