Hume AI has unveiled a new conversational AI called the Empathic Voice Interface (EVI), which incorporates emotional intelligence into interactions with users. EVI distinguishes itself by discerning the user's tone of voice and tailoring its responses accordingly, creating an experience that closely resembles human conversation.
EVI represents a breakthrough in AI technology: it can comprehend and generate expressive speech, the result of training on millions of human conversations. Developers can integrate EVI into their own applications through Hume's API, giving users a distinctive voice-interface experience.
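As a rough illustration of what such an integration might look like, here is a minimal Python sketch that streams audio to an EVI-style WebSocket endpoint and prints the replies. The endpoint URL, query-string authentication, and message fields are illustrative assumptions rather than Hume's documented API; consult Hume's developer documentation for the actual contract.

```python
# Minimal sketch of a voice-interface client over WebSockets.
# NOTE: the endpoint, auth scheme, and message schema below are
# illustrative assumptions, not Hume's documented EVI API.
import asyncio
import base64
import json

import websockets  # pip install websockets

# Assumed endpoint and query-string auth; check Hume's docs for the real contract.
EVI_URL = "wss://api.hume.ai/v0/evi/chat?api_key=YOUR_HUME_API_KEY"

async def chat(audio_chunks):
    async with websockets.connect(EVI_URL) as ws:
        # Stream base64-encoded audio to the server.
        for chunk in audio_chunks:
            await ws.send(json.dumps({
                "type": "audio_input",                     # assumed message type
                "data": base64.b64encode(chunk).decode(),  # raw audio bytes, encoded
            }))
        # Print whatever the interface says back.
        async for raw in ws:
            msg = json.loads(raw)
            if msg.get("type") == "assistant_message":     # assumed response type
                print("EVI:", msg.get("text"))

if __name__ == "__main__":
    # Example: send one pre-recorded audio chunk instead of a live microphone.
    asyncio.run(chat([open("hello.raw", "rb").read()]))
```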
EVI's distinguishing empathic capabilities include responding in tones that resemble human expression, adapting its language to the user's expressions to better address their needs, detecting the end of a conversational turn from the user's tone, and handling interruptions gracefully before picking up where it left off. Moreover, EVI continuously learns from user reactions, improving satisfaction over time through self-improvement.
In addition to its empathic features, EVI offers fast, reliable transcription and text-to-speech, making it adaptable to diverse scenarios. It can also work alongside any large language model (LLM), adding to its versatility and utility.
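The claim that EVI can sit in front of any LLM suggests a simple relay pattern: EVI handles speech-to-text and expressive text-to-speech at the edges, while an external model supplies the reply text. The sketch below shows that pattern, with a placeholder generate_reply function standing in for whichever LLM a developer chooses; the message field names are assumptions for illustration, not a documented schema.

```python
# Relay pattern: EVI handles the voice I/O, an external LLM supplies the text.
# The message fields below are illustrative, not a documented schema.

def generate_reply(user_text: str) -> str:
    """Placeholder for any LLM call (hosted API, local model, etc.)."""
    return f"You said: {user_text}"

def handle_evi_message(msg: dict) -> dict | None:
    """Turn an incoming transcript into an outgoing assistant message."""
    if msg.get("type") == "user_message":           # transcribed user speech
        reply = generate_reply(msg["text"])         # external LLM does the thinking
        return {"type": "assistant_input", "text": reply}  # EVI speaks this aloud
    return None

# Example round trip:
incoming = {"type": "user_message", "text": "What's the weather like?"}
print(handle_evi_message(incoming))
```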
EVI is slated to be publicly available in April, offering developers an innovative tool to create immersive and empathetic voice interfaces. Developers eager for early access to the EVI API can express their interest by filling out a form on Hume’s website.
Founded in 2021 by Alan Cowen, a former researcher at Google AI, Hume is a research lab and technology company with a mission to ensure that artificial intelligence serves human goals and emotional well-being, advancing AI technology while prioritizing human needs.
Alan Cowen emphasized the significance of voice interfaces, stating that speech is faster and carries more information than typing, making it the preferred mode of interaction with AI. Cowen highlighted EVI’s ability to understand voice beyond words, enabling it to predict when to speak, what to say, and how to say it based on the user’s voice.
Hume recently secured a $50 million Series B funding round from EQT Group, Union Square Ventures, Nat Friedman, Daniel Gross, Northwell Holdings, Comcast Ventures, LG Technology Ventures, and Metaplanet, indicating strong support and interest in its innovative AI technology.
Meanwhile, OpenAI is making its own strides in voice AI with the development of its Voice Engine, which includes features such as voice and speech recognition, processing voice commands, and converting between text and speech. OpenAI is also reportedly working on GPT-5, a model that emphasizes multimodality and personalization, enabling it to process video input and generate new videos while customizing responses based on user preferences.
Last year, OpenAI introduced a voice-assistant feature in the ChatGPT app on Android and iOS, enabling users to hold spoken conversations. OpenAI has also partnered with Figure AI to develop generative-AI-powered humanoid robots, showcasing advancements in AI technology across various applications.
The integration of emotional intelligence into conversational AI represents the future of human-computer interaction, with potential implications for business success. Pushpak Bhattacharyya, an IIT Bombay professor and computer scientist, emphasized the importance of chatbots that understand sentiment and emotion, suggesting that they can lead to better businesses and increased commercial profits.