
“Our global audit is due in three months. Nobody has done a bank reconciliation in the last year in any of our seven countries where we employ 3,000 employees. Oh, and we may be missing some inventory.”
This was the introduction from the CFO when I took my first job at an international NGO coming out of my Master’s in Finance. I approached it with the excitement of a fresh-faced graduate out in the world for the first time. Little did I know the seven-year journey that awaited.
Over a long career, I realized this wasn’t unique to NGOs. I’ve noticed data quality patterns across SaaS migrations, inventory turnarounds, and ERP implementations across corporates, tech, and mid-size companies. And similarly, how a company can implement automations quickly using AI to move on to greener pastures.
First, let’s get to the different levels of data issues. I call these the five circles of data integrity hell.
Identifying the Five Circles of Data Integrity Hell
Circle 1 - Incomplete Data: You open up your computer and log into… an Access database with only 10% of the dimensions filled out and many missing a debit or credit. Not a true accounting system.
Circle 2 - Best Guess Data: The accounting system is there. But the data is completely jumbled and each transaction has a description like: CREDIT INCOMING TRANSFER, TRANSFER SPEI TO: J. GARCIA.
Circle 3 - No Subledgers: Just one big G/L.
Circle 4 - Lurking Holding Accounts: Everything looks okay, until you run a trial balance and it only balances on account of a HOLDING BALANCE ACCOUNT.
Circle 5 - Opaque Numbers: The numbers balance but you have no idea what makes them up. You can’t get a report to drill down without running separate queries that require insider knowledge or arcane tables and exceptions to every rule.
You can’t make decisions in any of these stages.
In this article, we walk through how today’s companies escape this scourge by leveling up their AI automation.
This is the first part of our series on building finance tools to go From No Data to a Top 10% FP&A team. What took many years in the past can be done in months with proper AI use. We will continue building upon our previous guide focusing on AI with tactical advice on using tools now. “AI won’t solve everything,” but it can make many tasks quicker now.
Let’s get to work.
Getting out of the Five Circles of Data Integrity Hell
Circle 1 - Incomplete Data
When transactional data misses key information you are in this stage. This is the stage at which I arrived at my first company. So, most transactions lacked departments, countries, etc. This means you need to check whether the data has:
Missing dimension classifications
Discontinued data dimensions
Negatives where there should be positives
N/As
Duplicate transactions
When I usually start looking at a data mess, of course, the answer to this analysis is yes. I used to spend months cleaning it up.
AI Quick Win: You take your information and feed it to one of the big LLMs (Claude, ChatGPT, etc.) and check before closing. Make separate queries for each of the questions above and then fix the transactions manually.
Next Step: Once you have a platform in place, each one of these checks should be automated on the frontend through data validation and outliers should be checked daily through an AI review that connects directly into your system.
Circle 2 - Best Guess Data
When people are just coding data off bank statements, you are in this stage. The descriptions aren’t matching the purpose or giving you any recognizable information (much less vendor). People are just guessing at dimensions and purpose.
Practically, this means manual follow-ups. Chased receipts. Excel filters. Sorted top to bottom, side to side. Just trying to get as close as possible to have any confidence in transactions.
To fix this, I would rinse and repeat until the entire G/L lines had dimensions that matched descriptions that matched support. That’s your starting point. Sure, you want to match sub-ledgers, and we will get to this next, but correct transaction coding saves an incredible amount of time going forward.
AI Quick Win: You gather the receipts in your system. Then, you run the standard LLM queries to see if the receipts match the transaction. Of course, this is easier if you have a tool to facilitate this.
Next Step: All of your transactions have attachments that are checked upon upload by an AI tool to verify accuracy. Many modern A/P and expense tools have this feature.
Circle 3 - No Subledgers
When you don’t have subledgers, you are stuck in the third layer. This means you have to build pseudo-subledgers manually through Pivot Tables by carving the G/L into AP, AR, Inventory/COGS, Fixed Assets, and Payroll/Tax activity. To do this, I would create a new “counterparty/vendor” field and produce rollups and aging-style reports so we could support the balance sheet (who we owed, who owed us, what inventory we had, etc.). It was slow, but it was the only way to create tie-outs without a real subledger.
AI Quick Win: You work with one of the many coding agents, and ask it to build you code in Python or Excel that does this automatically. Tell it what you want, iterate. You want to make this a Python script or VBA code so that it produces consistent data over time.
Next Step: You use AI to code transactions (invoices, POs, etc) on the front-end to a true subledger. Many of the cutting edge invoicing/expense systems already offer this. (Or skip-level to an upgraded ERP such as Light)
Circle 4 - Lurking Holding Accounts
When the only reason your financial statements balance is because they are in accounts nobody understands, doesn’t want to look at, or will deal with later, you are stuck here. I would have to monitor these accounts like a hawk and discard all the movements in and out to isolate the dangerous transactions. Only by going to this detail can you really check if the activity deserves its status as temporary.
AI Quick Win: Provide an automated summary of your holding accounts each day and what conditions need to be met to clear the account. Act accordingly.
Next Step: Set up notifications to remind you when to clear each transaction in the holding account and then clear them out. You don’t necessarily need AI for this, but it can help you set up the system to do so.
Circle 5 - Opaque Numbers
You are here when the numbers are a decent quality, but it takes forever to understand what lurks beneath. I used to take the information out of the system and pivot it around in an Excel spreadsheet, clicking on each holding account to see the transactions that were underneath. This worked for datasets at smaller companies. But once I was dealing with hundreds of thousands of transactions a month, it required painful data stitching.
Reconciliation: every view tied back to the trial balance (same totals, same periods).
Quality & consistency: drill-down transactions matched the label, with vendors and transaction types coded consistently and dimensions fully populated.
Exceptions: outliers, negatives, wrong-period activity, and clearing/suspense balances that should have been shrinking.
Eventually, the cycle would get tighter. Build the report, drill down, fix coding, refresh, repeat. Each iteration was less work than the previous.
AI Quick Win: Automatically generate the reporting pack with drill-downs baked in, ensure every report ties back to the trial balance, and flag exceptions with a prioritized fix list.
Final Step: You have one system that automatically reviews your entire G/L, notifies you, and summarizes the details underneath in clear language.
In conclusion.
You can’t evolve without a strong data foundation. It took me years to get there at my first company. Now it can be done in months.
The next task is to use these numbers to create effective management reports. See you next week.









