Introduction
Welcome to 2024: conversational AI has finally arrived. With it, a host of new companies, including Bland AI, Retell AI, Vapi AI, and Air AI, have entered the scene, offering a range of services to companies looking to automate their phone calls. For organizations navigating these offerings, understanding the differences between providers, as well as their strengths and weaknesses, is crucial for making an informed build-or-buy decision.
With that in mind, we present the following guide to picking an AI phone agent provider. This comprehensive article focuses on the top four providers: Bland AI, Vapi.ai, Retell AI, and Air.ai, and provides a breakdown of each. Read on to learn more!
Summary of the Conversational AI Landscape
In the emerging conversational AI space, companies can be divided into three segments: infrastructure providers, middleware, and SaaS.
Bland AI is the only company building at the infrastructure level. Bland manages the end-to-end phone agent experience - from transcription, inference, and text-to-speech, to building a decision-making graph and configurable function calls and API requests. Bland does the heavy lifting to make the AI phone agent performant while providing full configurability for developers and enterprises to customize the agent and make it effective for their use case. Bland also self-hosts its entire fine-tuned model stack in clusters next to each other, keeping call latency low by not relying on OpenAI and other external services. Learn more about deploying enterprise-grade phone agents with Bland, or build your first Bland phone agent for free, right now.
At the middleware layer, companies like Vapi and Retell force developers to bring their own models and connect to their ‘internal pipes’. The flexibility provided is a double-edged sword. To build on either platform developers need to train their own LLM, host it, and inherently deal with higher latency because the LLM won’t be hosted near the rest of the infra. Additionally, out of the box, the agent’s capabilities will be limited – rather than having an infrastructure provider create functionality for function calling, dynamic data, etc., your organization has to build that all from scratch. This creates unreliability and additional work for organizations to deliver excellent call experiences.
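To make the latency argument concrete, here is a minimal sketch of a per-turn latency budget. It assumes the three stages named earlier (transcription, language model, text-to-speech) run strictly in sequence, which real streaming pipelines may not do. The `Stage` class, the function names, and every millisecond value are hypothetical placeholders chosen for illustration, not measurements of any provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of a single conversational turn."""
    name: str
    compute_ms: int  # time spent running the model itself
    network_ms: int  # time spent reaching the service that hosts the model


def turn_latency_ms(stages):
    """Total delay before the caller hears a reply, assuming sequential stages."""
    return sum(s.compute_ms + s.network_ms for s in stages)


def network_overhead_ms(stages):
    """Portion of the turn spent purely on network hops between services."""
    return sum(s.network_ms for s in stages)


# Placeholder values for illustration only; NOT measurements of any provider.
external = [
    Stage("transcription", 300, 150),
    Stage("language model", 700, 150),
    Stage("text-to-speech", 300, 150),
]

# Same compute cost, but with the models hosted next to each other.
colocated = [Stage(s.name, s.compute_ms, 10) for s in external]

print(turn_latency_ms(external))   # 1750
print(turn_latency_ms(colocated))  # 1330
```

With identical model compute times, the only difference between the two totals is the summed network overhead, which is the part that hosting models near each other is meant to shrink.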
Finally, at the application level, companies like Air AI build the actual SaaS around the phone agent, letting companies send calls from within a dashboard. The lack of API access, the inability to configure function calling, and the inability to fully personalize the phone agent turn off many potential customers. Additionally, the high entry-point pricing (rather than the pay-as-you-go model of infrastructure and middleware-layer companies) prevents many teams from starting.
Now, let’s dive into the specific companies, their product offerings, and how they compare.
#1: Bland AI, The Platform for AI Phone Agents
Bland is a Y Combinator-backed platform for AI phone calling that’s built for developers and the world’s largest enterprises. Rather than outsourcing model infrastructure, Bland hosts the entire model stack in-house, providing reliably low phone call latency and higher call quality at lower cost.
Using Bland you can build AI phone agents to automate any task, from answering tier-one customer support calls to qualifying inbound leads and setting appointments. Unlike platforms like Retell and Vapi that connect external APIs, introducing unreliability and latency, Bland handles the entire infrastructure to enable the best phone calls.
When to Choose Bland.ai over Vapi, Retell, and Air AI
Here’s when Bland might be a better choice than Vapi, Retell, and Air AI:
- You need to deploy your phone agent into an enterprise-grade setting (where call quality and reliability are of the utmost importance)
- You want an agent that by default is low latency, has an incredibly human-sounding voice, and comes with built-in capabilities to integrate with other APIs, pull in live data, and do function calls
- You’d prefer to have rigorous control over your AI phone agent’s outputs and want pre-built guardrails plus a graph interface to quickly program effective phone calls
- You want easy abstractions and heuristics - like first_sentence, transfer_phone_number - and tooling for analyzing calls built into the platform you build on
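As a rough illustration of what abstractions like `first_sentence` and `transfer_phone_number` might look like in practice, the sketch below assembles a call configuration as a plain dictionary. Only those two field names come from this article. The `build_call_config` helper and the `phone_number` and `task` fields are hypothetical, as are the example phone numbers, so check the provider's API reference before relying on any of them.

```python
import json


def build_call_config(phone_number, task, first_sentence=None,
                      transfer_phone_number=None):
    """Assemble a call configuration dictionary (hypothetical helper).

    `first_sentence` and `transfer_phone_number` are the field names mentioned
    in the article; `phone_number` and `task` are assumed for illustration.
    """
    config = {"phone_number": phone_number, "task": task}
    # Optional fields are only included when the caller supplies them.
    if first_sentence is not None:
        config["first_sentence"] = first_sentence
    if transfer_phone_number is not None:
        config["transfer_phone_number"] = transfer_phone_number
    return config


config = build_call_config(
    phone_number="+15555550100",  # fictional number
    task="Qualify the inbound lead and offer to book an appointment.",
    first_sentence="Hi, thanks for calling! How can I help today?",
    transfer_phone_number="+15555550199",  # fictional human fallback line
)
print(json.dumps(config, indent=2))
```

The point of the sketch is that the agent's opening line and its human hand-off number are set declaratively, rather than being implemented as custom logic.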
Key Info on Bland AI
Scalability: Bland’s phone agent infrastructure can dispatch and receive hundreds of thousands of calls per minute, enabling the largest enterprises in the world to easily handle large volumes of calls.
Functionality: Bland’s phone agent has incredible functionality provided out of the box to connect data sources, take live actions, and build the best possible agents to deploy in enterprise settings.
Model quality: Bland owns the entire end-to-end model stack, self-hosting and fine-tuning the transcription, language, and text-to-speech models to deliver fantastic outputs with low latency.
Host your own LLM: Bland will host your own language model, and will provide fine-tuning capabilities, ensuring you understand and
#2: Vapi AI, Middleware Option One
Vapi is a middleware layer designed for companies that have their own transcription, language, and text-to-speech models that they want to integrate with their phone agent. Vapi provides the models at cost, while also taking a $0.05/minute premium on top, providing an expensive alternative to other platforms that own the end-to-end infra.
Most companies don’t care about the specific models at play and care significantly more about the underlying quality, speed, and reliability of their phone agent. For those willing to pay the premium, however, Vapi can be a good option.
When to Use Vapi.ai over Other Options:
- You have your own transcription model, language model, and text-to-speech model, and care more about flexibility than call quality and consistency
- You want to build your own home-grown functionality (that may be less reliable) instead of using a more functional alternative
- You’re building an application that isn’t customer-facing, where guardrails around your phone agent’s behavior are less consequential
Key Figures on Vapi.ai
Cost: aside from forcing you to pay for your own telephony, transcription, language, and text-to-speech models, Vapi also charges a $0.05/minute premium. Without optimized infrastructure, costs can scale as high as $0.30/minute, reaching the same level as human callers (which many times defeats the purpose of building an AI phone agent in the first place).
Latency: because Vapi connects a wide set of external API providers, the built-in network latency between services can be significantly higher, especially during periods of high traffic when OpenAI and other LLMs experience heavy loads. Three or even four-second latency can completely ruin call quality, hurting the experience of customers and losing their trust.
Functionality: the lack of built-in functionality creates additional engineering time and cost for anyone building on top of Vapi. Your organization could spend anywhere from 1-2 sprints to build functionality into your phone agent that otherwise would exist out of the box on other platforms.
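The cost figures above can be worked through with a little arithmetic. The sketch below uses only the two numbers quoted in this article (the $0.05/minute premium and the $0.30/minute upper bound) and works in whole cents to avoid floating-point rounding. The function names and the 10,000-minute monthly volume are hypothetical, chosen purely for illustration.

```python
VAPI_PREMIUM_CENTS = 5   # $0.05/minute platform premium, as quoted in the article
VAPI_CEILING_CENTS = 30  # $0.30/minute upper bound, as quoted in the article


def all_in_rate_cents(upstream_cents, premium_cents):
    """Per-minute rate: what the underlying providers charge plus the premium."""
    return upstream_cents + premium_cents


def monthly_cost_dollars(minutes, rate_cents):
    """Total spend in dollars for a given number of call minutes."""
    return minutes * rate_cents / 100


# Upstream spend (telephony, transcription, language model, text-to-speech)
# implied by the article's upper bound.
implied_upstream_cents = VAPI_CEILING_CENTS - VAPI_PREMIUM_CENTS

# Hypothetical volume of 10,000 call minutes per month.
print(implied_upstream_cents)                            # 25
print(monthly_cost_dollars(10_000, VAPI_CEILING_CENTS))  # 3000.0
print(monthly_cost_dollars(10_000, VAPI_PREMIUM_CENTS))  # 500.0
```

At that hypothetical volume, the premium alone would account for $500 of a $3,000 monthly bill at the upper-bound rate.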
#3: Retell AI, Middleware Option Two
Retell, like Vapi, is a middleware layer – they do transcription and voice but fail to handle the underlying model infrastructure. Once again, for hobbyists and early-stage companies looking for a quick solution, Retell makes it easy to connect an LLM to do phone calls. However, the latency and unreliability created by not owning the end-to-end infra make it a difficult choice for enterprises evaluating solutions.
Similar to Vapi, Retell charges its own premium, except it’s even more expensive (starting at $0.14/minute).
When to choose Retell over Bland, Vapi, and Air
- You have a quick project to put up and already have an Assistants API setup or other LLM that you want to connect
- You’re not putting your phone agent in front of customers, and you’re happy to have your LLM generate outputs directly instead of using a guard-railed model that produces higher-quality outputs
- You’re less price sensitive and willing to pay the premium for connecting your own LLM
Key figures on Retellai.com
New company: Retell is an early-stage startup and is only a few months old. As the feature set matures they might eventually provide the solution enterprises require, but for now the product has room for refinement.
Expensive middleware: even though Retell fails to provide a comprehensive end-to-end solution, they simultaneously charge a premium over other providers, charging $0.14/minute at the time of this writing.
#4: Air.ai’s SaaS Platform
Air is a SaaS application providing out-of-the-box AI phone agents. While Air was one of the earliest phone agents to launch to market, they’ve faced heavy criticism from customers for charging large up-front license fees and failing to provide refunds, leading customers to leave negative reviews and even to consider suing them.
Air’s platform has limited API functionality. Small business owners tend to enjoy its no-code tools and pre-built integrations, while larger enterprises and developers have shied away due to the lack of customizability and deep integration capability.
When to use Air.AI over Bland AI, Retell, and Vapi
- You’re a small business that lacks a proper tech team to build integrations and connect APIs
- You’re doing a relatively small level of calls, and don’t require comprehensive monitoring and quality assurance to ensure high reliability
- You’d rather pay a high upfront fee than pay as you go with other providers
Key figures on Air.ai
Pricing: Air sometimes charges agency license holders and customers anywhere from $25,000 to $100,000 to get started on their platform.
Reliability: because Air wraps around other companies’ infrastructure (e.g. OpenAI), call quality can sometimes be lower than that of providers who self-host.
Customer satisfaction: Air’s customer reviews often skew negative, with customers describing frustrating experiences throughout their customer journey.
Conclusion
The AI phone agent landscape is filled with players delivering value at different parts of the stack. While most potential builders and buyers prefer the end-to-end infrastructure, quality, and reliability of Bland AI, other platforms like Vapi, Retell, and Air AI provide intriguing offerings that may appeal to some buyers.
Get started on Bland today by visiting the developer portal and joining the Discord community. Until next time!