It’s Tuesday, June 16th: Welcome to another edition of The Byte.

Editor’s Note

In this essay, Sumon Saha and Balaji Sundara argue that the main reason enterprise AI agents fail is not weak models, poor prompts, or insufficient guardrails, but missing context. For Saha and Sundara, agents produce unreliable or generic outputs when they cannot access the full range of organizational knowledge spread across CRMs, support systems, product analytics, call transcripts, Slack threads, and spreadsheets. The essay frames connected context as the real foundation for enterprise AI success: without it, agents reason from partial signals; with it, they can support faster decisions, improve trust, and create compounding value across workflows.

The Real Reason Your AI Agents Keep Getting It Wrong

Introduction

When an AI agent gives a wrong answer, the instinct is to blame the model. Swap in a newer one, tune the prompt, add guardrails. Sometimes that helps. But in enterprise settings, it usually doesn't fix the underlying problem, because the underlying problem isn't the model.

Most enterprise AI failures trace back to something more mundane: the agent was asked a question it couldn't actually answer, because the information needed to answer it was sitting in a system it couldn't reach. A Salesforce record, a Zendesk thread, a Slack message from last Tuesday's incident review, a spreadsheet someone emailed after the last renewal call. The model did exactly what it was built to do. It just worked with incomplete inputs and produced incomplete outputs.

This is the context problem. And it's worth understanding clearly, because it changes what organizations should actually be investing in.

Why Agents Fail When Data Is Siloed

Enterprise knowledge is distributed by design. Sales uses one CRM. Support uses another ticketing system. Finance has its own warehouse. Product analytics lives somewhere else. When those systems don't talk to each other, the people who need to synthesize across them spend most of their time doing exactly that, manually pulling and reconciling data instead of doing the higher-order work they were hired for.

Writing SQL queries against relational databases, or working through the API- and JSON-based interfaces of NoSQL stores, becomes a chore. Agents can ease that work by running these queries themselves, retrieving current and relevant information, and assembling it into context they can reason over.

AI agents inherit that same constraint. An agent that only sees Salesforce data will reason like someone who only reads the Salesforce notes. It'll miss the churn signal buried in a support transcript. It'll draft a renewal pitch that ignores what the customer actually complained about in the last QBR.

Consider a concrete scenario: a customer success rep asks an agent why account X is churning. Without connected context, that question triggers a partial answer: maybe the agent surfaces CSAT scores and open tickets. But it misses the usage drop that product analytics caught two months ago, the sales note about executive turnover, and the transcript from a call where the customer mentioned a competitor. The agent isn't wrong because the model is bad. It's wrong because it's reasoning from a fraction of the available signal.
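
To make the scenario concrete, here is a minimal sketch of what assembling context across two kinds of systems might look like, and of how an agent could at least know which signals it is missing. Everything in it is an assumption for illustration: the `accounts` table, the JSON payload shape, the field names, and the sample values are hypothetical and do not come from any real CRM or analytics product.

```python
import json
import sqlite3


def load_crm(conn, account_id):
    """Query a hypothetical relational CRM table with plain SQL."""
    row = conn.execute(
        "SELECT csat, open_tickets FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return None if row is None else {"csat": row[0], "open_tickets": row[1]}


def load_usage(raw_json, account_id):
    """Read a hypothetical JSON payload, as a document store or API might return."""
    return json.loads(raw_json).get(account_id)


def build_context(account_id, conn, usage_json):
    """Merge whatever each source returns and record which sources had nothing."""
    sources = {
        "crm": load_crm(conn, account_id),
        "product_usage": load_usage(usage_json, account_id),
    }
    signals = {name: data for name, data in sources.items() if data is not None}
    missing = sorted(name for name, data in sources.items() if data is None)
    return {"account_id": account_id, "signals": signals, "missing": missing}


def demo():
    """Build context for one known and one unknown account (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, csat REAL, open_tickets INTEGER)"
    )
    conn.execute("INSERT INTO accounts VALUES ('acct-x', 3.1, 4)")
    usage_json = json.dumps({"acct-x": {"weekly_active_users_change_pct": -38}})
    return (
        build_context("acct-x", conn, usage_json),
        build_context("acct-y", conn, usage_json),
    )
```

The `missing` list is the useful part of the design: an agent that can say "I had no product usage data for this account" is easier to trust than one that silently answers from the CRM alone.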

Multiply that across hundreds of workflows, dozens of agents, and thousands of decisions per day, and you get the pattern most enterprises are actually living with: AI that requires constant human correction, produces generic outputs, and gradually loses organizational trust.

The Infrastructure Trap and Why It's Costing More Than Expected

Most organizations know they have a data fragmentation problem. Many have been trying to solve it for years through ETL pipelines, data lakes, and migrations to platforms like Snowflake or Databricks. These are legitimate investments. They're also slow. Multi-year programs don't align well with the pressure to show AI ROI in the next two quarters.

This creates a trap: organizations delay AI deployments until the data infrastructure is "ready," but the infrastructure readiness goalposts keep moving, and the business need keeps compounding. Teams build agents against whatever data they can access quickly, those agents underperform, and leadership concludes that AI isn't delivering when the real issue is that the agents were deployed against an incomplete foundation.

The actual tradeoff here is between precision and speed. Waiting for a unified data layer delivers cleaner context but defers value. Deploying against connected but imperfect context delivers early wins and organizational learning at the cost of needing to manage edge cases and improve incrementally. Neither choice is obviously right. But the second path is often undervalued because its downsides are visible (agents that sometimes get it wrong) while its upsides are slower to surface (compounding returns as context improves).

One countervailing argument: better models, better prompting, and better retrieval can compensate for context fragmentation. To a point, this is true. Retrieval-augmented generation (RAG) helps. Chain-of-thought prompting reduces hallucination on structured tasks. But these are force multipliers on whatever context the agent can access. They don't create information that isn't there. An agent with a better retrieval system and no access to the renewal call transcript still doesn't know what was said on the renewal call.
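
A toy sketch can illustrate why retrieval multiplies existing context rather than creating it. The retriever below is a deliberately simple keyword-overlap ranker, not a real RAG pipeline, and the document IDs, texts, and stopword list are hypothetical. The point is only that a document outside the index can never be returned, however good the ranking is.

```python
# Hypothetical stopword list; a real system would use embeddings or a search engine.
STOPWORDS = {"the", "a", "on", "about", "what", "did", "in", "at", "of", "to"}

# Made-up documents standing in for the systems the agent can reach.
CORPUS = {
    "crm_note": "Executive turnover at the customer last quarter",
    "ticket_114": "Customer reported slow exports in the dashboard",
}

# A made-up transcript that lives in a system the agent cannot reach yet.
TRANSCRIPT = (
    "On the renewal call the customer mentioned a competitor offering lower pricing"
)

QUERY = "What did the customer say on the renewal call about a competitor?"


def tokenize(text):
    """Lowercase, strip basic punctuation, and drop stopwords."""
    words = {word.strip(".,?!").lower() for word in text.split()}
    return words - STOPWORDS


def retrieve(query, corpus, min_overlap=1):
    """Rank document IDs by how many query terms each document shares."""
    query_terms = tokenize(query)
    scored = []
    for doc_id, text in corpus.items():
        overlap = len(query_terms & tokenize(text))
        if overlap >= min_overlap:
            scored.append((overlap, doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Without the transcript indexed, only loosely related documents come back.
before = retrieve(QUERY, CORPUS)

# Once the transcript is connected, it becomes the top result.
after = retrieve(QUERY, {**CORPUS, "renewal_call_transcript": TRANSCRIPT})
```

In this sketch `before` contains only the CRM note and the support ticket, while `after` ranks the transcript first. Swapping in a stronger retriever would change the ordering of what is indexed, not the set of documents available to rank.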

What Connected Context Actually Delivers

The organizations getting the most from AI in 2025 aren't the ones with the most agents. They're the ones that invested early in making their organizational knowledge coherent and accessible. That investment compounds in a specific way: every new workflow automation inherits the same context foundation, which means each new agent starts better than the last one, and improvements to the shared context layer make all existing agents smarter simultaneously.

The scale of the underlying problem helps explain why the gains are so pronounced when context is connected. A 2012 McKinsey Global Institute study found that knowledge workers spend roughly 19% of their workweek searching for and gathering information, time that has nothing to do with the judgment calls they were actually hired to make. Salesforce's State of Sales research has consistently found that sales representatives spend less than 30% of their time on actual selling, with the rest absorbed by data entry, research, and internal coordination. These aren't AI-era problems. They're structural inefficiencies that AI agents can eliminate, but only if those agents can reach the systems where the relevant data lives.

One deployment that illustrates the gap: a 680-person B2B SaaS company serving mid-market financial services clients ran a 90-day pilot connecting their CRM, support platform, product analytics, and deal room notes into a shared context layer, then deployed agents across customer success and sales operations. Before the pilot, their customer success team averaged 47 minutes to assemble a complete account brief ahead of a renewal call, pulling from four separate systems and often finding conflicting data. After, the same brief took under four minutes and included signals the team had previously missed entirely, including product usage trends and unresolved support threads that correlated strongly with churn. Renewal call preparation time dropped by 91%. More concretely: their Q3 gross retention improved by 6 percentage points against the prior year's same cohort, which the CS leadership attributed primarily to earlier and better-informed intervention on at-risk accounts.
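
The 91% figure follows directly from the two preparation times quoted above. As a quick check, assuming the "under four minutes" figure is treated as exactly four minutes (so the true reduction is at least this large):

```python
def percent_reduction(before_minutes, after_minutes):
    """Relative drop from the earlier value to the later one, as a percentage."""
    if before_minutes <= 0:
        raise ValueError("before_minutes must be positive")
    return (before_minutes - after_minutes) / before_minutes * 100


# 47 minutes before the pilot; 4 minutes is an upper bound on the time after it.
reduction = percent_reduction(47, 4)
print(round(reduction))  # prints 91
```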

The same pattern appeared in their sales org. Reps reclaimed an average of 1.8 hours per day that had previously gone to assembling pre-call research. Pipeline coverage didn't change, but conversion at the proposal stage improved by 14% over the following two quarters, a result the team tracked directly to reps entering late-stage calls with richer account context than competitors who relied on manual prep.

These figures are from a single deployment. Results vary by context quality, workflow complexity, and how mature an organization's underlying data hygiene is. But they're not outliers in direction; they're consistent with what the broader research on information retrieval costs would predict. The table below summarizes the pattern across workflow categories.

The MTTR and compliance numbers in the table deserve particular attention because they're not just efficiency metrics. They're trust signals. When agents consistently route incidents correctly and catch compliance issues systematically, operations teams stop auditing every output by hand. That shift in posture is what actually enables broader automation: not a policy decision, but an earned confidence in the system's reliability. Organizations that reach it find their deployment velocity accelerates: each new workflow moves faster because the skepticism that slowed the earlier ones has been replaced by a documented track record.

The Compounding Returns Argument and Its Limits

There's a case to be made that context infrastructure is the highest-leverage investment an AI-forward organization can make right now. The argument: agent capabilities are improving rapidly and will continue to improve regardless of what any individual company does. Context, by contrast, is proprietary. Your customer history, your institutional knowledge, and your workflow-specific signals compound over time and can't be replicated by a competitor deploying the same model.

This is mostly right, but it comes with a real caveat. Context infrastructure is only as useful as the agents that can consume it. An organization that builds an excellent context layer but doesn't invest in agent quality, human-in-the-loop design, or escalation pathways will hit a ceiling quickly. A well-connected context foundation paired with agents that have no structured escalation logic produces a different failure mode: confident wrong answers, delivered fast. The two investments aren't either/or; they are sequentially dependent, and skipping the second because the first went well is a common and costly mistake.
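
What "structured escalation logic" means will differ by organization, but a minimal sketch might look like the following. The `AgentAnswer` type, the confidence score, the source names, and the 0.7 threshold are all hypothetical choices made for illustration, not a description of any particular agent framework.

```python
from dataclasses import dataclass, field


@dataclass
class AgentAnswer:
    """Hypothetical container for an agent's output and its provenance."""
    text: str
    confidence: float  # assumed to be a score between 0 and 1
    sources_used: set = field(default_factory=set)


def should_escalate(answer, required_sources, min_confidence=0.7):
    """Return (escalate, reasons) so a human reviewer can see why a hand-off happened."""
    reasons = []
    missing = sorted(required_sources - answer.sources_used)
    if missing:
        reasons.append("missing sources: " + ", ".join(missing))
    if answer.confidence < min_confidence:
        reasons.append(
            f"confidence {answer.confidence:.2f} below {min_confidence:.2f}"
        )
    return bool(reasons), reasons


# A confident answer that never saw the call transcripts still gets escalated.
draft = AgentAnswer(
    text="Account X is at risk mainly because of open support tickets.",
    confidence=0.9,
    sources_used={"crm", "support"},
)
escalate, reasons = should_escalate(draft, {"crm", "support", "call_transcripts"})
```

Here `escalate` is true even though the confidence score is high, because a required source was never consulted. Checking coverage separately from confidence is one way to catch the "confident wrong answers, delivered fast" failure mode.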

Where to Start

The practical implication isn't to wait for perfect data infrastructure before deploying agents. It's to identify the two or three workflows where the context requirements are well-understood, the data is already accessible even if imperfect, and the cost of a wrong answer is visible enough to motivate fast iteration. Those conditions matter more than workflow size or organizational prestige.

Customer success and sales operations are common starting points for exactly this reason. The context requirements are relatively well-defined (account history, product usage, support interactions, call transcripts), and the value of getting it right shows up in metrics the business already tracks: retention, pipeline conversion, time to close. That creates a tight feedback loop that's useful for improving context quality quickly.

Compliance review is a different but equally valid anchor. The cost of missing an issue is concrete, the agent's systematic coverage is genuinely hard to replicate manually at scale, and the before/after measurement is straightforward. Organizations that start there often find that the compliance use case earns internal trust faster than productivity-focused workflows, because the stakes are legible to stakeholders who might otherwise be skeptical of AI-driven decisions.

From those anchors, the context layer expands. Each new system connected to the shared foundation contributes signal that benefits every agent already running on it. That compounding dynamic, where investments in context infrastructure pay dividends across the entire agent fleet, is what separates organizations building durable AI capability from those chasing point solutions. The model is already good enough. The constraint is always the context, and it always has been.

The AI Collective is built by volunteers across 180+ chapters in 40 countries.

Thank you to the thousands of volunteers around the world who make this work possible. We truly could not do this without you.

🧑‍💻 About the Authors

Sumon Saha is the Founder and CEO of FlowGenX AI, an agentic integration platform for no-code and low-code data and application orchestration. He has 20+ years of product and technology leadership experience across iPaaS, APIs, streaming, cloud, and security. Previously, he held leadership roles at Axway, Informatica, MuleSoft, Intel, and IBM.

About Balaji Sundara

Balaji Sundara is a B2B product and GTM leader with over two decades of experience in SaaS, partnerships, and enterprise technology. He currently serves as GTM and Evangelist at FlowGenX AI, focusing on agentic AI workflows and data-centric productivity tools. He also advises Composio and helps turn emerging technologies into scalable market strategies.

✍️ About the Editorial Team

About Josh Evans

Josh is a Managing Editor at The AI Collective Newsletter and leads content for The Byte. Outside of AIC, Josh works in Content Protection at Spotify.
