Success with customer service AI agents hinges on how thoughtfully you design and manage your pilot. But not all pilots prove the value of an AI agent. At a recent Customer Response Summit event, ASAPP teamed up with PTP to provide practical guidance on why generative AI pilots fail—and how to increase your chances for success.
Lance Child, ASAPP Account Director, shared the session with Strategic Consultant Crystal Collier and Principal AI and CX Consultant Chance Whitley, both of PTP. Here are the key takeaways from their session.
Why most generative AI pilots fail
Despite the raft of generative AI solutions being built and deployed, the majority of generative AI initiatives don’t make it past the pilot phase. A recent MIT report found that a staggering 95% of pilots fail to make a measurable impact on the bottom line. But the reasons for these failures aren’t just technical—operational choices and cultural conditions often deserve the lion’s share of the blame.
Three core obstacles stand out:
- Learning gap in tools and teams: Spinning up an AI agent demo is easy. But successfully deploying one that learns, adapts, and consistently improves customer experience is much harder. Mismatches between chosen tools and the actual problem, paired with unprepared teams, lead to disappointing results.
- Resource misalignment: Far too many organizations chase high-profile, revenue-generating use cases. But more substantive gains are often realized in functions with less visibility, like the contact center.
- Avoiding friction: Many pilots are designed as “happy path” demos, rather than real-world trials meant to tease out failures and tough issues early, before scaling up. But identifying sources of friction early in the process is key to creating agents that eventually work at scale.
What sets successful AI pilots apart
ASAPP’s Lance Child noted that the ASAPP team has helped a number of iconic brands realize value with a generative AI agent pilot, setting the stage for even greater returns with expansion. Based on that experience, he highlighted some consistent best practices:
- Tool and partner alignment: According to the MIT report, two-thirds of the successful pilots partnered with specialized external experts, rather than attempting to build a solution in-house. These partners bring essential perspective, mature frameworks, and robust capabilities for live simulation, testing, and ongoing tuning.
- Empowering line managers: Success with an AI deployment is far more likely when those closest to the workflows take the lead. When a central AI team drives the rollout without crucial input from line managers, the result is blind spots and missed insights that put success out of reach.
- Deep integration and adaptation: Successful pilots go beyond basic adoption to absorption. They aren’t focused on simply adding a new tool to your kit. They embed new capabilities into everyday workflows, training, and continuous improvement processes. Adoption is just a checkbox of whether you have AI. Absorption means the AI agent is actually changing how work gets done.
The innovation tolerance model
The PTP team introduced a helpful framework for guiding your strategy and decision-making with an AI agent pilot. As they explained, before launching your pilot, it’s important to assess your organization’s innovation tolerance. In other words, what’s your capacity for change and risk? The answer is rooted primarily in your company culture.
You’ll need to consider a few key elements:
- Culture: How comfortable is your organization with change, experimentation, and occasional failure? Are your executives open to incremental, iterative gains, or are they more likely to expect big wins from the outset? Tuning the pilot’s scale and ambition to match culture minimizes resistance and increases your ability to demonstrate true value.
- Expectation management: It’s better to land small wins that build momentum than to aim too high and fail publicly. Right-sizing the pilot requires more than technical planning. You’ll also need to secure internal buy-in and clear alignment on what success looks like.
- Organizational readiness: How much appetite is there in your organization to tackle the challenges that will inevitably arise? Are the teams that will be involved prepared technically, operationally, and culturally to learn and adjust as failures occur?
If you’ve honestly assessed your organization’s innovation tolerance, and you’re disappointed with the results, don’t be overly concerned. You can still deploy an AI agent and realize significant value by aligning your pilot with your organization’s innovation tolerance profile. That alignment is a crucial lever for maximizing learning and ROI.
Use case selection: Start small, but think big
The session emphasized the critical importance of selecting the right use case for your AI pilot program. While it’s tempting to automate simple, low-risk tasks, those don’t always deliver the learning or measurable ROI needed for scale.
Key criteria for smart use case selection:
- Low risk, high impact: Choose customer intents the AI agent can realistically solve, in scenarios where pilot errors won’t be catastrophic but where success will be noticeable, such as basic FAQs that still drive operational costs and customer satisfaction.
- Alignment on outcomes: Ensure agreement across executive and operational levels on how you’ll measure the success of the pilot. Success metrics should reflect contact center and business KPIs, like first contact resolution (FCR), customer satisfaction (CSAT), and cost to serve, rather than AI performance benchmarks.
- Commitment to data quality: Many pilots falter because underlying data and knowledge bases aren’t up to par. Disparate, outdated, or competing data can undermine an AI agent’s ability to provide accurate answers or perform tasks necessary to help the customer. Broad commitment to ongoing data cleanup and maintenance is essential.
- Cross-functional buy-in: Avoid pilots led solely by engineering, even if that includes a team of AI experts. You’ll need broader input and oversight to keep the project aligned with business goals. Engage relevant line managers, customer experience teams, and change management experts to ensure the pilot addresses the full range of needs and concerns.
Testing, measurement, and iteration: Musts, shoulds, and nevers
Rigorous, context-sensitive testing can make the difference between a pilot bound for production and one that’s doomed to fail. Testing a generative AI solution is very different from testing traditional software. So, it’s useful to approach it with a different mindset. Getting a generative AI agent ready to serve your customers is a bit like getting a new human agent ready. You’ll need to feed the AI agent relevant training material, run live simulation tests based on real customer conversations, and layer in continuous tuning.
The practical framework described by PTP’s Chance Whitley divides the process of ongoing testing and optimization into three categories:
- Musts: Critical compliance, accuracy, and safety requirements. For instance, an AI agent handling billing inquiries must never give a customer the wrong total or account number. These are “do not harm” criteria, and failures require immediate intervention.
- Nevers: Actions that absolutely cannot happen, such as sharing protected personal information or providing advice beyond compliance guidelines.
- Shoulds: Improvement opportunities, things that are desirable but don’t reach the critical level of a must. These include clarity in responses, conversational nuance, and improved empathy, which would improve the customer experience. But they’re not showstoppers if they don’t happen yet.
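To make the framework above concrete, here is a minimal sketch, in Python with hypothetical names, of how a pilot team might gate a launch on musts and nevers while routing shoulds into an improvement backlog. The tiers and record format are illustrative assumptions, not part of any specific testing tool:

```python
from dataclasses import dataclass

# Hypothetical severity tiers from the musts/shoulds/nevers framework.
BLOCKING = {"must", "never"}   # failures here require intervention before launch
ADVISORY = {"should"}          # failures here feed the post-launch improvement backlog

@dataclass
class Finding:
    tier: str          # "must", "never", or "should"
    description: str

def launch_gate(findings):
    """Return (ready_to_launch, blockers, backlog) for a set of test findings."""
    blockers = [f for f in findings if f.tier in BLOCKING]
    backlog = [f for f in findings if f.tier in ADVISORY]
    return (len(blockers) == 0, blockers, backlog)

# Example: one "never" violation blocks launch; the "should" goes to the backlog.
findings = [
    Finding("never", "Shared an account number outside compliance guidelines"),
    Finding("should", "Responses could acknowledge customer frustration more directly"),
]
ready, blockers, backlog = launch_gate(findings)
# ready stays False until the "never" finding is resolved
```

The design choice this sketch reflects is the one from the session: musts and nevers are pass/fail gates, while shoulds accumulate as a prioritized list rather than blocking anything.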
Prior to launch, you should focus on the musts and nevers. Those are about ensuring your AI agent operates safely and accurately, and stays compliant with regulations and company policy. Once you’ve addressed the musts and nevers, you can turn your attention to the shoulds. The shoulds are about enhancing the customer experience, improving resolution rates, and increasing the AI agent’s ability to handle exceptions.
The process of identifying and addressing the shoulds will continue post-launch. It’s an ongoing effort. As you continue to monitor and measure your AI agent’s performance, you’ll want to pay close attention to these key metrics:
- First contact resolution (FCR)
- Conversation quality scores
- Containment rates
- Transfer and abandonment rates
- Repeat customer interactions
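As a minimal sketch of how two of these metrics might be computed from conversation logs, consider the following. The record format is hypothetical; real contact center platforms expose these figures through their own reporting tools:

```python
def rates(conversations):
    """Return (containment_rate, transfer_rate) as fractions of all conversations.

    A conversation counts as contained when the AI agent resolved it
    without handing off to a human agent.
    """
    total = len(conversations)
    transferred = sum(1 for c in conversations if c["transferred"])
    contained = sum(1 for c in conversations if c["resolved"] and not c["transferred"])
    return contained / total, transferred / total

# Illustrative log: two contained, one transferred, one abandoned.
logs = [
    {"resolved": True,  "transferred": False},
    {"resolved": False, "transferred": True},
    {"resolved": True,  "transferred": False},
    {"resolved": False, "transferred": False},  # abandoned
]
containment, transfer = rates(logs)
# containment = 0.5, transfer = 0.25
```

Even a simple breakdown like this makes trends visible week over week, which is what the continuous monitoring described above depends on.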
Keep in mind that monitoring customer feedback through surveys might not be enough, especially as customer behavior changes with automated service. You’ll also need to mine conversation data and continuously refine agent performance through real-time review and cross-functional input.
Change management and customer enablement
Beyond technical deployment, pilots only succeed when everyone impacted is engaged. That includes your front-line human agents and even your customers. That means change management can’t be an afterthought. The viability and value of your AI agent depend on it.
Here are a few best practices to keep in mind:
- Ongoing communication: Regular updates and training for internal teams will keep processes aligned and expectations realistic.
- Customer enablement: When launching an agent externally, it can be helpful to treat it as a new product launch. Announce changes, guide users, and smooth transitions as you would with significant branding or website updates.
- Iterative improvement: The most impactful pilots treat shoulds as fuel for iteration, folding feedback into ongoing tuning rather than relying exclusively on lagging indicators like customer surveys.
Friction is your friend: Designing pilots for improvement
When it comes to the customer experience, you might be aiming for frictionless CX. But for an AI agent pilot, friction is good. You should be actively seeking it out. Challenges uncovered during deployment, testing, and even close monitoring immediately after launch all open the door to rapid improvement. Problems that surface only after you scale up are riskier. They could damage customer relationships and create costly problems for your customer service operation.
Our best advice is to steer your AI agent pilot toward use cases aligned to your organization’s innovation tolerance, surface issues early before they become costly public problems, and treat each iteration cycle as an investment instead of a setback. With that approach, your pilot has a good chance of delivering significant value that you can build on through scaling and expansion.
Final takeaways
Generative AI agent pilots aren’t run-of-the-mill technology evaluations. At their best, they’re also cultural and operational inflection points. Success relies on understanding your organization’s innovation tolerance, taking a pragmatic approach to choosing use cases, focusing on data readiness, and rigorously measuring musts, shoulds, and nevers. Engage your whole team, invest in change management and customer enablement, and actively pursue sources of friction. That’s how you achieve continuous improvement and real returns.
By Stefani Barbero, Writer, ASAPP
ASAPP creates AI solutions that solve the toughest problems in customer service. With native AI at our core, our solutions get beyond basic automation to dramatically increase contact center capacity. At the center of our approach is GenerativeAgent®—an AI agent platform that autonomously and safely resolves complex customer interactions over voice and chat.
PTP delivers innovative self-service and contact center solutions that cut costs, enhance investments, and improve customer satisfaction. We offer comprehensive support from strategy and design to integration and implementation, with deep implementation expertise to tackle the most complex challenges in enterprise contact centers.