We are no longer in the POC phase; 2026 is the year of AI in production. According to Anthropic's recent research, 57% of enterprises deployed AI agents to production in 2025, and 81% plan to tackle more complex use cases in 2026.
And when you look at where organizations are seeing the highest impact, data analysis tops the list at 60%, followed by process automation at 48%. This makes sense: AI agents excel at synthesizing information, identifying patterns, and executing workflows that would take humans hours or days.
Additionally, data analysis is a fundamental building block. Think about agents that need to make decisions, operate on their own, and perform tasks across the organization. Like humans, they need data and insights to act. High-quality AI data analysis will power countless valuable agentic workflows.
The potential is clear. But here's the question too few organizations are asking: what happens when AI agents inherit your data problems?
Because they will inherit them. Every single one.
AI Is New. The Problems Aren't.
If you've worked in data for any length of time, you've heard the same challenges described under different names: data fragmentation, low data adoption, or my favorite so far, "the analytics graveyard". Organizations invest heavily in data infrastructure (warehouses, pipelines, transformation layers, BI tools) but struggle to get ROI from that investment.
The root causes are familiar:
Low trust. People don't believe the data is accurate. They've been burned before by reports that didn't match reality, so they build their own spreadsheets, run their own queries, double-check everything manually.
Poor discovery. The data exists somewhere, but people can't find it. Tribal knowledge determines who knows what's available. New team members spend weeks figuring out which tables to use.
Silos. Different teams have different "truths." Marketing's revenue number doesn't match Finance's. The data warehouse has three customer tables, and nobody remembers why.
Persona gaps. The data platform was built by engineers, for engineers. Business users can't access what they need without filing a ticket and waiting.
These problems have persisted for years. Organizations learned to work around them. Analysts developed intuition about which data to trust. Stakeholders accepted that getting answers takes time. Data teams became skilled at navigating the mess.
But here's the critical point for your AI strategy: Humans can work around the mess. AI agents can't.
When an analyst encounters a suspicious number, they pause. They investigate. They ask a colleague, "Does this look right to you?" They apply judgment born from experience.
AI agents don't do that. They take the data at face value and act on it. Fast, and at scale.
What Changes with AI
The problems aren't new. But AI fundamentally changes their impact.
Consider the amplification:
Bad data used to mean a wrong report. A dashboard showed incorrect numbers. Someone caught it in a meeting. The data team fixed it. Embarrassing, but contained.
Bad data now means hundreds of wrong AI decisions. An agent automating customer communications sends messages based on stale data. An agent managing inventory reorders products based on corrupted demand signals. An agent pricing deals uses metrics that haven't been updated. Each decision compounds. Each error propagates.
Discovery gaps used to mean slow answers. A business user had a question. It took them time to find the right dataset, or they depended on the data team for an answer. Frustrating, but manageable.
Discovery gaps now mean hallucinating agents. When an AI agent can't find the authoritative data source, it doesn't wait patiently for help. It makes something up. It infers. It confidently presents fabricated answers as fact. Or it uses the wrong table entirely, one that's been deprecated for months but still exists in the warehouse.
Governance holes used to mean compliance risk. Somewhere in your data platform, PII exists in places it shouldn't. Access controls are inconsistent. A future audit might flag it. A problem for later.
Governance holes now mean automated privacy violations. An AI agent, given broad data access to do its job, inadvertently incorporates sensitive information into its outputs. It shares customer data it shouldn't have access to. It distributes protected information. At scale. Automatically. With no human reviewing each action.
The formula is simple: same problems, but at scale, machine speed, and higher stakes.
AI doesn't create new data problems. But it dramatically amplifies the ones you already have.
The Evidence Is Already Clear
This isn't theoretical. Anthropic's research on enterprise AI adoption reveals that the top barriers organizations face are exactly what you'd expect:
- Data integration challenges: 46%
- Data quality issues: 42%
Data problems are the primary obstacle. Not model capabilities. Not compute costs. Not talent gaps. The real challenge is bringing together information from across the organization into a coherent, reliable whole that AI agents can use.
These numbers tell a story: organizations are racing to deploy AI agents, then hitting walls their data infrastructure wasn't built to handle.
The shift from the analytics era to the AI era isn't just about new technology. It's about new consumers of your data. In the analytics era, data flowed to humans: analysts, executives, business users. Humans could interpret, question, and compensate for imperfections. In the AI era, data flows to agents: systems that will consume it literally, act on it immediately, and scale its implications exponentially.
The chief of software at Vercel put it bluntly: "If your data layer is a mess of legacy naming conventions and undocumented joins, giving Claude raw file access won't save you. You'll just get faster bad queries."
Faster bad queries. Faster bad decisions. Faster bad outcomes.
That's the amplification effect.
The Responsible Path Forward
Let me be direct: deploying AI agents without strong data foundations is irresponsible.
This isn't about being conservative or slow. Organizations should absolutely pursue AI. The competitive advantages are too significant to ignore. But pursuing AI means taking data foundations seriously in a way that "nice to have" dashboards never demanded.
What does "strong data foundations" actually mean? At minimum:
Availability. Your data is consolidated, discoverable, and accessible to the systems that need it. AI agents can't work across fragmented silos any more than humans can; they just fail faster.
Quality. Freshness is monitored. Issues are detected and resolved quickly. The agents get live visibility into data health, not just data existence.
Governance. Access controls are defined and enforced programmatically. Sensitive data is classified and protected. There are audit trails for what data was used, when, and by whom.
Modeling for AI. Naming conventions are clear. Relationships between entities are documented. Business terms have definitions that agents can understand. Your semantic layer, whether that's dbt models, a metrics layer, or well-documented tables, is rich enough that an AI can navigate it.
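The gap between "data exists" and "data an agent should use" can be made concrete. Below is a minimal sketch of a pre-flight gate that checks freshness, deprecation, and sensitivity flags before an agent touches a table. Everything here is hypothetical for illustration: the `CATALOG` dictionary, the table names, and the `table_is_safe_for_agent` helper are invented; in practice this metadata would come from your data catalog or warehouse information schema.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical catalog metadata. In a real platform this would be read from
# your data catalog, warehouse information schema, or observability tool.
CATALOG = {
    "analytics.orders": {
        "last_updated": datetime.now(timezone.utc) - timedelta(hours=2),
        "deprecated": False,
        "contains_pii": False,
    },
    "legacy.orders_v1": {
        "last_updated": datetime.now(timezone.utc) - timedelta(days=90),
        "deprecated": True,
        "contains_pii": True,
    },
}


def table_is_safe_for_agent(table: str, max_staleness: timedelta) -> tuple[bool, str]:
    """Gate an agent's use of a table on freshness, deprecation, and PII flags."""
    meta = CATALOG.get(table)
    if meta is None:
        # An unknown table is refused outright, rather than letting the agent guess.
        return False, "unknown table"
    if meta["deprecated"]:
        # Deprecated tables still exist in the warehouse but must not be used.
        return False, "deprecated table"
    if meta["contains_pii"]:
        # Sensitive data requires explicit policy approval, not default access.
        return False, "contains PII"
    age = datetime.now(timezone.utc) - meta["last_updated"]
    if age > max_staleness:
        return False, "stale data"
    return True, "ok"


ok, reason = table_is_safe_for_agent("legacy.orders_v1", timedelta(hours=24))
print(ok, reason)  # the deprecated legacy table is blocked
```

The point of the sketch is the default-deny posture: the agent acts only when the metadata affirmatively says the table is fresh, current, and permitted, which is exactly the judgment a human analyst applies informally today.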
This isn't an exhaustive list, and getting there isn't a weekend project. But the organizations succeeding with AI agents share a common trait: they invested in these foundations before scaling AI, not after.
Not Slowing Down. Setting Up to Succeed.
The message here isn't "pump the brakes on AI." The message is "don't set AI up to fail."
The enterprises that will win with AI agents aren't necessarily the ones that deploy fastest. They're the ones whose agents produce reliable results. Whose automated decisions can be trusted. Whose AI-powered workflows actually deliver the promised value instead of creating new categories of problems.
Those enterprises will have addressed their data foundations. Not perfectly, but sufficiently. They'll have moved from tolerating data problems to resolving them. From working around the mess to cleaning it up.
Because AI agents amplify everything: the data ROI and the data problems. The question is: what do you want amplified?
This is the first post in a series on data foundations for AI. Next: what "AI readiness" actually means, a practical framework for preparing your data for autonomous agents.