Alex Salazar
Contributor

The talent blueprint for getting AI agents to production

Opinion
Sep 18, 2025 | 13 mins

Cool demos aren’t enough — your team needs ML chops and context skills to actually get AI agents into production.


I’ve been having the same conversation with enterprise CIOs over and over. Recently, it happened again during a panel discussion at a Fortune 500 roundtable. A CIO raised her hand and asked the question I’ve come to expect: “I have my team. What skills do they need? Who do I train? Who do I hire?” Earlier that day, I’d spoken with a CTO at a large retailer. He told me he was personally owning all AI initiatives since he was not sure who on his team could handle them. These are not isolated occurrences. CTOs and CIOs across industries are becoming accidental AI program managers because they don’t know how to staff these projects.

Invariably, I offer the same advice. The skills gap is the biggest barrier stopping your AI pilots from reaching production. It’s not the technology and, frankly, it’s not even the budget; building AI agents is a completely different paradigm from building traditional software. From my experience, around 90% of the work happens between demo and production, and the challenge is that most teams don’t have the right people to navigate that gap for their agents.

Why traditional software teams hit a wall with AI agents

The pattern is clear. AI agent teams without real machine learning expertise inevitably stall out during implementation, and their projects never reach production. To paraphrase Donald Rumsfeld, these teams don’t know what they don’t know. The problem starts from the get-go since it’s so easy to create brilliant demos. Developers can stand up impressive prototypes in days. Then the spiral starts. The demo works great in the conference room, and everyone gets excited, only for reality to sink in later.

The question isn’t whether your agent works. Rather, it’s how often it works. This is a fundamentally different challenge, and one that’s much harder to solve. “Working” means something entirely different with AI agents because they are non-deterministic. Traditional software developers are used to deterministic systems: input A always produces output B. But AI agents are probabilistic. The same input can produce different outputs. This trips up even experienced developers who haven’t worked with ML systems.

This is where evaluations become critical. You can’t just test an agent once and call it done. You need systematic evaluation frameworks that test hundreds or thousands of scenarios to understand your agent’s true reliability rate. Most traditional developers have never built evaluation systems like this. They’re used to unit tests that either pass or fail, not statistical analysis of success rates across edge cases. Without proper evals, teams have no idea if their agent works 90% of the time or 60% of the time, and that difference determines whether you have a production system or an expensive demo.
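
To make that concrete, here is a minimal sketch of what a scenario-level eval harness might look like. Everything in it (the scenario format, the stubbed agent, the pass checks) is a hypothetical placeholder rather than a prescribed framework; the point is simply to measure a success rate across repeated runs instead of a single pass/fail.

# A minimal sketch of a scenario-level eval harness. The Scenario format, the
# stubbed agent and the pass checks are hypothetical placeholders.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Scenario:
    name: str
    prompt: str
    passes: Callable[[str], bool]  # checks the agent's output for this case


def run_evals(run_agent: Callable[[str], str], scenarios: list[Scenario], trials: int = 5) -> float:
    """Run each scenario several times and report the overall success rate,
    because a non-deterministic agent can pass once and fail on the next run."""
    successes, total = 0, 0
    for scenario in scenarios:
        for _ in range(trials):
            output = run_agent(scenario.prompt)
            total += 1
            if scenario.passes(output):
                successes += 1
            else:
                print(f"FAIL {scenario.name}: {output[:80]!r}")
    return successes / total


if __name__ == "__main__":
    # Stub agent so the sketch runs end to end; swap in a real agent call.
    def fake_agent(prompt: str) -> str:
        return "refund approved" if "refund" in prompt else "I am not sure"

    scenarios = [
        Scenario("simple refund", "Customer asks for a refund on order 123",
                 lambda out: "refund" in out),
        Scenario("edge case: missing order id", "Customer wants a refund but gives no order number",
                 lambda out: "which order" in out or "order number" in out),
    ]
    print(f"Reliability: {run_evals(fake_agent, scenarios):.0%}")

Even a harness this small forces the conversation onto the right question: not whether the agent can work, but how often it does.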

Building your modern AI agent team doesn’t need to be hard

After hundreds of conversations with teams shipping agents to production, I’ve identified the core roles teams need to make everything work. The foundation is an experienced AI/ML engineer who becomes your agent lead. This person doesn’t need to own everything. Instead, they need to be the expert who guides the team through non-deterministic challenges. The ideal AI/ML engineer deeply understands how models actually work, has a clear grasp of the balance between determinism and flexibility, and knows how to build and interpret evaluations (evals for short). Experience with agent orchestration frameworks like LangGraph and flagship models from providers like OpenAI and Anthropic is a plus.

Teams also need a standard complement of backend and frontend developers. Outside the agent-specific features, it’s all just user applications and workflow automation software after all. It needs to run on containerized infrastructure, have a web or mobile interface and handle security properly. Luckily, most organizations have these engineers on staff.

Beyond the core team, we’re seeing entirely new skill requirements that didn’t exist in traditional software development. The most obvious example is what we used to call prompt engineering, but is evolving into something more sophisticated: context engineering. While most people know the term prompt engineering, the field has advanced beyond just writing better prompts. Context engineers design the entire information environment that surrounds the AI model – the prompts, yes, but also the data retrieval systems, the tool selection logic and the reasoning frameworks that help agents make decisions.

While this shouldn’t be a separate job title, every engineer working on agents needs to master this skill. Think of it as the new “writing SQL queries”: a fundamental capability that spans roles. These engineers need to build context that works reliably across thousands of variations, not just the happy-path scenarios they tested in development. They’re designing how agents think, not just what they say.
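
As a rough illustration of what that means in code, here is a minimal sketch of assembling an agent’s context: retrieval, tool selection and reasoning instructions combined into one information environment. The helper functions and tool names are hypothetical stand-ins, not a real retrieval stack.

# A minimal sketch of context engineering: assembling the information
# environment around a request before it ever reaches the model. The retrieval
# and tool-selection helpers are hypothetical stand-ins, not a real stack.
import re


def retrieve_documents(query: str, top_k: int = 3) -> list[str]:
    # Placeholder for a real retrieval system (vector search, keyword search, etc.).
    return [f"knowledge base article {i} about {query!r}" for i in range(1, top_k + 1)]


def select_tools(query: str, available_tools: dict[str, str]) -> dict[str, str]:
    # Naive tool selection: only expose tools whose description overlaps with the
    # request, so the model is not handed every tool on every call.
    words = set(re.findall(r"\w+", query.lower()))
    return {name: desc for name, desc in available_tools.items()
            if words & set(re.findall(r"\w+", desc.lower()))}


def build_context(query: str, available_tools: dict[str, str]) -> str:
    documents = retrieve_documents(query)
    tools = select_tools(query, available_tools)
    return "\n".join([
        "You are a support agent. Reason step by step and cite the documents you use.",
        "Relevant documents:",
        *[f"- {doc}" for doc in documents],
        "Tools you may call:",
        *[f"- {name}: {desc}" for name, desc in tools.items()],
        f"Request: {query}",
    ])


tools = {
    "lookup_order": "look up an order by id",
    "issue_refund": "issue a refund for an order",
    "reset_password": "reset a user password",
}
print(build_context("Where is my order?", tools))

In production, this assembly logic is what gets evaluated and iterated on, not just the wording of the system prompt.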

These aren’t just nice-to-have skills. Without them, your agents will work in demos but fail unpredictably in production, exactly like the pattern we see with teams that skip the ML expertise.

The technical skills alone won’t get you there. You also need people who understand the actual work being automated. A finance expert designing an accounting agent will consistently outperform a brilliant AI engineer who doesn’t understand debits and credits. This isn’t about having domain experts review the final product – they need to be involved in designing how the agent thinks through problems and what tools it uses.

Finally, and this might be the most important point: you need people who thrive in ambiguity. Unlike web development or mobile apps, there’s no established playbook for production agents. Your team will spend months figuring out why an agent works perfectly on Tuesday but fails on Wednesday, or why it handles simple requests flawlessly but breaks on edge cases you never considered. The engineers who succeed in this environment are the ones who get energized by unsolved problems rather than frustrated by them. They’re willing to experiment, measure results and iterate based on what they learn, even when that means throwing away weeks of work.

Most importantly, you need iterators and tinkerers. The playbook for production agents barely exists, so your team is going to need to learn on the job. To solve for this, I look for “learn-it-alls” who are genuinely excited about working on agents: people who are eager to tinker, try new things, fail and happily try again.

Without buy-in and support from the broader organization, even the best team will struggle. I’ve seen brilliant agent teams get stuck because they can’t access the data they need, or because security teams block their API integrations, or because business stakeholders won’t participate in the iterative testing process that agents require. Unlike traditional software projects, where you can build in isolation and deploy when ready, agents need ongoing collaboration with the people whose work they’re automating.

You need legal teams willing to review new AI policies, security teams who understand the unique risks of LLM-based systems, and business users who will provide honest feedback during the messy early phases when your agent gets things wrong half the time. You need engineers who can tackle new problems, like how to delegate authorization or how to safely manage the tools LLMs can utilize. But most importantly, you need executive air cover when things don’t go smoothly. And, rest assured, they won’t always go smoothly. The organizations succeeding with agents have leadership that treats early failures as learning opportunities rather than reasons to shut down the program.
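
On the tool-safety point, here is one minimal way to think about gating what an agent can do: a per-role allowlist plus an explicit approval hook for sensitive actions. The roles, tool names and approval logic are illustrative assumptions, not a real product API.

# A minimal sketch of gating the tools an agent can call: a per-role allowlist
# plus an explicit approval hook for sensitive actions. The roles, tool names
# and approval logic are illustrative assumptions, not a real product API.
from typing import Callable

ALLOWED_TOOLS = {
    "analyst": {"read_report", "search_tickets"},
    "admin": {"read_report", "search_tickets", "update_record", "issue_refund"},
}
REQUIRES_APPROVAL = {"update_record", "issue_refund"}


class ToolDenied(Exception):
    pass


def call_tool(role: str, tool_name: str, args: dict, approve: Callable[[str, dict], bool]) -> str:
    """Check the allowlist and the approval policy before the tool ever runs."""
    if tool_name not in ALLOWED_TOOLS.get(role, set()):
        raise ToolDenied(f"role {role!r} may not call {tool_name!r}")
    if tool_name in REQUIRES_APPROVAL and not approve(tool_name, args):
        raise ToolDenied(f"{tool_name!r} was not approved for {args}")
    return f"executed {tool_name} with {args}"  # stand-in for the real tool call


# The agent proposes a call; the gate decides whether it can actually run.
print(call_tool("admin", "issue_refund", {"order": 123}, approve=lambda tool, args: args["order"] == 123))
try:
    call_tool("analyst", "issue_refund", {"order": 123}, approve=lambda tool, args: True)
except ToolDenied as err:
    print(f"blocked: {err}")

The design point is that authorization lives outside the prompt, so a clever input can’t talk the agent into a tool it was never allowed to call.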

Your team is ready to be upskilled for the AI era

You don’t need to hire an entirely new team. Repeat that. I’ve seen successful transitions happen in 3-6 months when organizations approach upskilling strategically. Start with your best software developers and any existing ML engineers. Skip the formal third-party training courses. Let’s face it: with the speed AI is moving, they’re outdated before you even start them. Instead, bring in the teams actually building in your domain, who understand the business process or consumer experience you’re trying to automate, and then give them the space they need to learn and iterate towards a working agent.

My team has run workshops where practitioners share what’s actually working in production; internal workshops where teams share what they’ve learned are a great way to upskill for this transition. Internal hackathons focused on agent problems are another option. Ideally, you can have your team build actual prototypes that integrate with your existing systems. There is no reason why learning these new skills can’t produce something tangible and helpful. Get out of the PowerPoint presentation mindset and teach your team through experimentation.

Successful role transitions follow predictable patterns. Data engineers transition well to ML engineering since they already understand how machine learning, models and non-determinism work. Software engineers can become great agent engineers with the right support because they already know how to build scalable, stable production software. Business analysts often excel at prompt design and workflow planning because they understand the processes you’re trying to automate. Consider the fact that 57% of enterprises are planning targeted upskilling for AI initiatives, according to McKinsey’s latest AI survey. The ones succeeding are doing hands-on learning and focusing less on theoretical courses.

Why the no-code/low-code shortcut leads nowhere

The biggest trap I see is teams starting with no-code platforms like n8n or Zapier, thinking they can prototype quickly and then transition to real code later. This never works. These platforms are great for simple internal workflows, but they can’t handle the production requirements that matter: accuracy optimization, security controls and cost management. Worse, the underlying technologies are so different that “transitioning” means starting over completely.

Teams get stuck in what I call the demo cycle. They build impressive workflows that work sometimes, assume making them work consistently is just more of the same effort, and then spend months trying to bridge an unbridgeable gap. Even teams adopting modern protocols like the Model Context Protocol (MCP) encounter significant limitations in enterprise environments, as MCP’s tool connectivity standard lacks the granular, post-prompt authorization capabilities required for production-ready agents. This gap between current protocol capabilities and enterprise security requirements has prompted active industry collaboration, with teams like ours working to evolve MCP’s specification based on real-world deployment learnings. For CIOs developing agent strategies, this evolution highlights a critical timing consideration: organizations must either wait for these emerging standards to mature or partner with teams actively shaping the future of enterprise agent security protocols.

In the SaaS era, going from Figma to production was straightforward. With agents, the real work starts after you have a functional prototype. Skip the platforms. Start with custom development using established frameworks like LangGraph. It requires more upfront investment in talent, but it’s the only path to agents that actually work in production and deliver ROI.
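
For a sense of what custom development looks like, here is a minimal sketch built on LangGraph’s StateGraph API (assuming langgraph is installed); the state fields, node names and review logic are hypothetical, and the model call is stubbed out.

# A minimal sketch of a custom agent loop on LangGraph's StateGraph API.
# Assumes langgraph is installed; the state fields, node names and review
# logic are hypothetical, and the model call is stubbed out.
from typing import TypedDict

from langgraph.graph import END, StateGraph


class AgentState(TypedDict):
    request: str
    draft: str
    approved: bool


def plan(state: AgentState) -> dict:
    # In a real agent this node would call an LLM; stubbed for the sketch.
    return {"draft": f"proposed action for: {state['request']}"}


def review(state: AgentState) -> dict:
    # Deterministic guardrail: only approve drafts that pass a simple policy check.
    return {"approved": "delete" not in state["draft"].lower()}


def route(state: AgentState) -> str:
    return "done" if state["approved"] else "replan"


builder = StateGraph(AgentState)
builder.add_node("plan", plan)
builder.add_node("review", review)
builder.set_entry_point("plan")
builder.add_edge("plan", "review")
builder.add_conditional_edges("review", route, {"done": END, "replan": "plan"})
graph = builder.compile()

print(graph.invoke({"request": "update the customer record", "draft": "", "approved": False}))

The value of owning the graph is that the guardrails, retries and eval hooks live in your own code rather than inside a platform you can’t inspect.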

The ROI math of getting talent right

Let’s look at the talent investment math. I’ve worked with dozens of enterprise agent deployments, and the pattern is consistent: teams with proper ML expertise from day one ship agents that work. Teams without that expertise get stuck in endless iteration cycles.

Here’s a concrete example from the financial services sector. A team without an ML background built what seemed like a working loan processing agent. It handled straightforward applications fine, but failed on edge cases about 40% of the time. Each failure created cascading work: human reviewers had to understand what the agent attempted, figure out where it went wrong and then complete the original task. The “automation” actually increased total processing time.

When they brought in an ML engineer who understood how to build proper evaluation frameworks and tune model behavior, the same use case achieved 85% reliability. The difference? The ML engineer knew how to systematically identify failure modes, design tests for edge cases and implement the monitoring needed to catch problems before they reached customers.

This talent gap creates a multiplier effect. Teams without ML expertise don’t just build worse agents. They make fundamental architecture mistakes like routing simple data transformations through expensive LLM calls, or failing to build proper MCP servers that handle authentication and external integrations reliably. They build agents that create more work than they eliminate. That’s why executives conclude “agents don’t work” when the real issue is they didn’t staff the project correctly from the beginning.
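
To illustrate that first mistake, here is a minimal sketch of the routing decision involved: handle structured, predictable transformations with plain code and only fall back to a model call when the input is genuinely ambiguous. The extraction task and the llm_extract stub are hypothetical.

# A minimal sketch of the routing decision described above: handle structured,
# predictable transformations with plain code and only fall back to a model
# call for genuinely ambiguous input. The llm_extract stub is hypothetical.
import re


def llm_extract(text: str) -> dict:
    # Stand-in for an expensive, non-deterministic LLM call.
    return {"source": "llm", "text": text}


def extract_order_id(message: str) -> dict:
    # Deterministic path: a regex handles the common "order #12345" pattern
    # instantly, for free, with no variance between runs.
    match = re.search(r"order\s*#?(\d{4,})", message, re.IGNORECASE)
    if match:
        return {"source": "rule", "order_id": match.group(1)}
    return llm_extract(message)  # the model only sees what the rules can't parse


print(extract_order_id("Please cancel order #48210"))          # handled by the regex
print(extract_order_id("It's the one I bought last Tuesday"))  # falls back to the model

The cheap path handles the bulk of the volume deterministically; the model only sees what the rules can’t parse.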

How to build long-term agentic capabilities

The agent landscape changes monthly, and it will continue at this pace for the foreseeable future. New frameworks are constantly emerging as models improve and reasoning capabilities advance. Teams that build constant learning into their DNA are best positioned to succeed in this climate. Capitalizing on that learning climate doesn’t require a crystal ball.

Success can come from simple weekly architecture reviews where teams share what’s working and what isn’t, or more detailed documentation of lessons learned, especially failures. Whatever your process is: share, document, test and adapt. If budget allows, spin up multiple small teams working on different agent problems. This portfolio approach lets you see what works and adjust accordingly. Different team compositions will excel at different types of problems.

Boards aren’t going to stop asking about AI agents. The pressure will only intensify as competitors ship successful agents. CIOs who solve the talent equation first will build agents that make it to production. These agents deliver real value and justify continued investment. Those who don’t solve it will watch their pilots fail while competitors pull ahead.

Take solace in the fact that the blueprint exists and the frameworks are already mature. The only question is whether you’ll invest in the people needed to execute it. Your board is asking about agents. Your competitors are building them. The choice isn’t whether to build agents. It’s whether you’ll have the right people to build them.

This article is published as part of the Foundry Expert Contributor Network.

Alex Salazar

Alex Salazar is the CEO and co-founder of Arcade.dev, the unified agent action platform that makes AI agents production-ready. Previously, Salazar co-founded Stormpath, the first authentication API for developers, which was acquired by Okta. At Okta, he led developer products accounting for 25% of total bookings and launched a new auth-centric proxy server product that reached $9M in revenue within a year. He also managed Okta's network of over 7,000 auth integrations. Alex holds a computer science degree from Georgia Tech and an MBA from Stanford University.