What is Claude Code? Tested on a Real 14,000-Line SaaS

Every "What is Claude Code" tutorial shows a snake game or a todo list. That's not the test. The test is whether the tool can walk into a real codebase — one with paying customers, tax law, and 87 files written over two years — and not break anything. I ran that test on Fakturko, my Serbian invoicing platform. Here's what worked, what didn't, and the three questions that tell you everything.
The mental model: it's not autocomplete, it's a junior dev who already read the repo
Claude Code is a terminal-based agent that lives inside your project folder. It reads files, edits files, and runs commands. No chat window, no copy-pasting stack traces, no "here's my file, what do you think?" workflow.
The shift that matters: stop thinking of it as autocomplete on steroids. Start thinking of it as a junior developer who has already read your entire repository, remembers everything, and can execute shell commands on your behalf. That framing changes how you talk to it.
Autocomplete edits where your cursor is. An agent edits where the change actually needs to happen. Those are different products solving different problems.
Install is one npm command; then start it from the project root:
npm install -g @anthropic-ai/claude-code
cd ~/projects/fakturko
claude
That's it. There's no separate indexing step to wait for: you land in an interactive prompt sitting inside the codebase, and the agent reads files as it needs them.
The test bed: Fakturko, ~14,000 lines, 87 files, real customers
For this to be a fair test, the codebase has to be one I didn't write last week. Fakturko is my invoicing SaaS for Serbian small businesses. Quick stats:
- ~14,000 lines of code across 87 files
- Laravel backend, Blade + Alpine frontend, MySQL
- Live VAT calculation against Serbian tax rules (PDV 20% standard, 10% reduced)
- Paying customers — if VAT breaks, somebody files wrong taxes
I'm going to ask Claude Code three questions, escalating in difficulty. Read, edit, execute.
Question 1 — "Where is Serbian VAT calculated?" (the reading pass)
I open the terminal in the repo root and type:
> where is Serbian VAT calculated in this codebase?
Nine seconds later, the response comes back with file paths and line numbers:
- app/Services/VatCalculator.php:42: main calculate() function
- app/Helpers/TaxRates.php:18: rate constants
- app/Helpers/RoundingRules.php:7: Serbian-specific rounding
- app/Http/Controllers/InvoiceController.php:156: the call site
Four files. Correct. With line numbers.
For context: a junior developer I onboarded last month took 40 minutes to find the same answer, and they missed RoundingRules.php entirely on the first pass.
That's the reading capability. That alone is worth installing the tool — onboarding new contributors, returning to your own code after six months, auditing where business logic lives. You ask in English, you get file paths.
What this replaces
- grep -r "vat" . followed by 20 minutes of reading
- Asking a teammate who wrote it 18 months ago
- Opening 12 files in your editor and scanning them
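To make the manual baseline concrete, here is a minimal sketch of the keyword search that the grep step amounts to. It is written in Python only to keep it short and runnable on its own; `find_keyword` and its parameters are names I made up for illustration, not part of Fakturko or of Claude Code.

```python
from pathlib import Path


def find_keyword(root: str, keyword: str, suffix: str = ".php") -> list[tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for every
    case-insensitive keyword match in files ending with `suffix`."""
    hits = []
    needle = keyword.lower()
    root_path = Path(root)
    for path in sorted(root_path.rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                relative = path.relative_to(root_path).as_posix()
                hits.append((relative, number, line.strip()))
    return hits
```

Calling `find_keyword(".", "vat")` from the repo root gives a list of matching lines, not an answer. A file that does the rounding without ever mentioning the keyword would not show up at all, which is one plausible way a file like RoundingRules.php gets missed on a first pass.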
Question 2 — "Add a new reduced VAT rate for digital goods" (the editing pass)
Now I want to add a new rate for a specific product category. Watch what an agent does differently from autocomplete.
> add a new reduced VAT rate of 8% for digital goods category
It doesn't immediately open the constants file and append a line. It searches the whole repo first. Here's what it flagged before writing anything:
- app/Helpers/TaxRates.php: constants
- app/Services/VatCalculator.php: switch statement that routes by category
- database/migrations/2023_04_12_add_vat_rates.php: needs new migration
- tests/Unit/VatCalculatorTest.php: fixture file with hardcoded expected values
That fourth one I genuinely forgot existed. The test fixture had three years of expected output values hardcoded. Any edit to rates without touching that file would have broken CI silently — or worse, passed locally and failed in production.
That's the difference. Autocomplete edits where your cursor is. An agent finds every place the change needs to land before it touches anything.
It then proposed a unified diff across all four files. I reviewed, accepted three, rejected one (more on that in the caveats).
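Why one new rate touches several files is easier to see in miniature. The sketch below is a Python stand-in, not Fakturko's PHP: the dictionary names, the category names, and `rate_for` are all hypothetical, and the migration is left out. Only the 20%, 10%, and 8% figures come from this article.

```python
from decimal import Decimal

# Hypothetical stand-in for the rate constants file.
VAT_RATES = {
    "standard": Decimal("0.20"),  # PDV standard rate
    "reduced": Decimal("0.10"),   # PDV reduced rate
    "digital": Decimal("0.08"),   # the new rate being added
}

# Hypothetical stand-in for the switch statement that routes by category.
# The category names are invented for this example.
CATEGORY_TO_RATE = {
    "services": "standard",
    "books": "reduced",
    "digital_goods": "digital",
}

# Hypothetical stand-in for the hardcoded expectations in the test fixture.
EXPECTED_RATES = {
    "services": Decimal("0.20"),
    "books": Decimal("0.10"),
    "digital_goods": Decimal("0.08"),
}


def rate_for(category: str) -> Decimal:
    """Look up the VAT rate for a product category."""
    try:
        return VAT_RATES[CATEGORY_TO_RATE[category]]
    except KeyError:
        raise ValueError(f"no VAT rate configured for category {category!r}") from None


def fixture_matches() -> bool:
    """True when every hardcoded expectation agrees with the live lookup."""
    return all(rate_for(cat) == rate for cat, rate in EXPECTED_RATES.items())
```

Add the rate to the first table but not the second and `rate_for("digital_goods")` raises. Add it to both but not the third and the code is right while the fixture says nothing about the new category. That is the same shape as the fixture file I had forgotten.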
Question 3 — "Run the tests" (the execution pass)
This is where most tutorials stop. Claude Code doesn't.
> run the test suite and report what broke
It executes php artisan test. It watches the output. It reports back:
Tests: 47 passed, 1 failed
Failed: VatCalculatorTest::test_digital_goods_reduced_rate
Assertion: expected 108.00, got 108.01
Root cause: rounding applied before VAT instead of after
Suggested fix: swap order in VatCalculator::calculate() lines 51-54
It executed, observed, diagnosed, and proposed — in one loop. I did not copy a stack trace into a chat window. I did not paste error output. The agent ran the command, read the result, and connected it back to the code it had just changed.
This is the loop that matters in real work:
- Edit
- Run
- Observe
- Diagnose
- Re-edit
For years, AI tools could only do step one. Now the whole loop happens without me.
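The loop above can be sketched in a few lines. This is only the shape of the cycle, not how Claude Code is implemented: `run_tests`, `edit_run_loop`, and the `apply_fix` callback are names I invented, and the diagnosing and re-editing is whatever that callback does, whether it is an agent or a person.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunResult:
    passed: bool
    output: str


def run_tests(command: list[str]) -> RunResult:
    """Run a test command and capture stdout and stderr together."""
    completed = subprocess.run(command, capture_output=True, text=True)
    return RunResult(completed.returncode == 0, completed.stdout + completed.stderr)


def edit_run_loop(
    command: list[str],
    apply_fix: Callable[[str], None],
    max_rounds: int = 3,
) -> bool:
    """Run, observe, hand the output to apply_fix, and run again.

    Returns True as soon as the command exits with status 0, or False
    if it still fails after max_rounds fixes.
    """
    for attempt in range(max_rounds + 1):
        result = run_tests(command)      # run + observe
        if result.passed:
            return True
        if attempt < max_rounds:
            apply_fix(result.output)     # diagnose + re-edit
    return False
```

For a Laravel project the command would be `["php", "artisan", "test"]`. The cap on rounds is a deliberate design choice in this sketch: a loop that re-edits until green with no limit is how a wrong fix turns into a long, expensive session.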
The honest caveats — where it cost me money and almost shipped a bug
Two things you need to know before installing this on a client's codebase.
It costs real money. Token usage adds up fast when an agent is reading 87 files to answer one question. My session above — three questions, four file edits, one test run — cost roughly $1.80 in API tokens. That's fine for high-value work. It's not fine if you're using it as autocomplete on every keystroke. Budget accordingly. A full afternoon of heavy refactoring on Fakturko has run me $15–25.
It is not magic, and it does not know your domain. On the VAT change, the agent proposed a solution that was syntactically perfect, passed the linter, and used the wrong rounding rule for Serbian tax law. Serbian VAT rounds to two decimals after the rate is applied to the line total, not before. The agent's first proposal rounded the rate constant itself, which gives off-by-one-cent errors on certain invoice totals.
I caught it because I know the domain. If I didn't, it would have shipped, and a customer's tax filing would have been off by a few cents — small enough to slip through review, large enough to get a phone call from an accountant.
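Here is a minimal sketch of how rounding order alone can move a total by one cent. It is Python rather than Fakturko's PHP, the function names are hypothetical, and half-up rounding is my assumption for the example. The quantity and price are made up; the 20% rate is the standard PDV rate.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
STANDARD_RATE = Decimal("0.20")  # PDV standard rate


def to_cents(amount: Decimal) -> Decimal:
    """Round to two decimals (half-up is an assumption of this sketch)."""
    return amount.quantize(CENT, rounding=ROUND_HALF_UP)


def gross_round_after(quantity: Decimal, unit_price: Decimal, rate: Decimal) -> Decimal:
    """Apply the rate to the unrounded line total, then round once."""
    net = quantity * unit_price
    return to_cents(net * (1 + rate))


def gross_round_before(quantity: Decimal, unit_price: Decimal, rate: Decimal) -> Decimal:
    """Round the line total first, then apply the rate and round again."""
    net = to_cents(quantity * unit_price)
    return to_cents(net * (1 + rate))


quantity, unit_price = Decimal("1.5"), Decimal("8.23")  # line total 12.345
print(gross_round_after(quantity, unit_price, STANDARD_RATE))   # 14.81
print(gross_round_before(quantity, unit_price, STANDARD_RATE))  # 14.82
```

Both functions are valid code and both would pass a linter. Only one matches the rule, and nothing in the code itself tells you which.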
The lesson: the agent is a fast, well-read junior. You are still the senior. You still own correctness.
When I trust it, when I don't
- Trust: navigation, finding code, refactoring across files, writing tests, running commands
- Verify carefully: anything touching money, tax, dates, timezones, or regulatory rules
- Don't outsource: business-logic decisions, schema design, security boundaries
Why bizflowai.io helps with this
A lot of the automation work I do for clients at bizflowai.io starts exactly here — walking into a codebase or operations stack somebody else built, mapping what's there, and making surgical changes without breaking what works. Claude Code is one of the tools in that workflow now, alongside the custom agents we deploy for invoicing, lead-gen, and Gmail-Telegram operations. The reading capability in particular is what makes it possible to take on legacy projects without weeks of discovery overhead.
Bottom line
For navigating, understanding, and making safe edits across files you didn't write, there's nothing else close right now. Cursor, Copilot, ChatGPT with file uploads — all useful, all toys by comparison once you've worked with an actual agent that can read your repo, edit across files, and run your test suite in one loop.
This is the first AI coding tool I've installed on a paying client's codebase. It will not be the last, but it is the one I'd start with today.
Frequently asked questions
What is Claude Code?
Claude Code is a terminal-based AI agent that reads, edits, and runs commands inside your codebase. Unlike a chat window or autocomplete tool, it operates directly in your project folder, navigating files on its own. Its strongest feature is reading and understanding existing code — locating functions, helpers, controllers, and tests across a repository — not just generating new code at your cursor position.
How is Claude Code different from autocomplete tools?
Autocomplete edits code where your cursor is. Claude Code edits where the change actually needs to happen. When asked to add a new VAT rate, it searched the entire repo and identified the rate constants file, the calculator's category switch, the migrations folder, and the test fixture before writing any code. It can also run commands such as the test suite, then diagnose failures and propose fixes in a single loop.
Why does code reading matter more than code writing for AI agents?
On production projects, the real bottleneck isn't typing — it's understanding what already exists before changing it. In a 14,000-line invoicing codebase, Claude Code located the Serbian VAT calculation across four files with line numbers in nine seconds. A junior developer onboarded the prior month took 40 minutes to find the same answer. Reading speed is what makes safe edits possible in unfamiliar code.
What are the limitations of using Claude Code on a real project?
Two main caveats: cost and domain knowledge. First, API tokens add up quickly on large codebases, so budget accordingly. Second, it's not infallible: on a Serbian VAT change, it produced syntactically perfect code that used the wrong rounding rule for local tax law. You still need a human who understands the business domain to catch logic errors the agent can't recognize.
When should I use Claude Code instead of other AI coding tools?
Use Claude Code when you need to navigate, understand, and make safe edits across files you didn't write — especially in large or unfamiliar codebases. It's suited for production work where finding all affected locations matters, including tests and migrations. For simple line-by-line completions inside a file you already know, traditional autocomplete tools may be cheaper and sufficient.
Want more like this?
I publish practical AI automation, GenAI engineering, and faceless content workflows on YouTube every week.
Subscribe to bizflowai.io on YouTube — never miss a new tutorial.
Planning an AI automation project or need a second opinion on your architecture?
Connect with me on LinkedIn — Lazar Milicevic, GenAI Engineer & bizflowai.io Founder.
Visit bizflowai.io for our services, case studies, and AI consulting.