OpenAI Announces ChatGPT’s New Abilities: Speaking, Listening, and Image Processing!


OpenAI has introduced a significant update to ChatGPT, equipping it with the ability to understand spoken words, respond with a synthetic voice, and process images. This marks the most substantial enhancement since the introduction of GPT-4. Users can now hold voice conversations in ChatGPT’s mobile app, choosing from five synthetic voices for the bot’s responses. Users can also share images with ChatGPT and highlight specific areas for analysis, such as asking it to identify types of clouds.

These updates are set to be gradually rolled out to paying users over the next two weeks. While voice functionality will be limited to the iOS and Android apps, image processing capabilities will be accessible on all platforms.

This development comes in the midst of intensifying competition in the chatbot arena, with industry leaders like OpenAI, Microsoft, Google, and Anthropic vying to enhance their offerings. Major tech companies are not only launching new chatbot applications but also introducing new features. For instance, Google has unveiled numerous updates to its Bard chatbot, and Microsoft has integrated visual search into Bing.


Earlier this year, Microsoft significantly bolstered its investment in OpenAI with an additional $10 billion, marking it as the largest AI investment of the year. In April, OpenAI reportedly concluded a $300 million share sale, valuing the company between $27 billion and $29 billion, with backing from firms like Sequoia Capital and Andreessen Horowitz.

Experts have raised concerns about AI-generated synthetic voices. While they enhance the user experience, they could also make deepfakes more convincing. Cyber threat actors and researchers alike have explored deepfakes as a means of breaching cybersecurity systems.

OpenAI has acknowledged these concerns in its announcement, clarifying that the synthetic voices were “created with voice actors we have directly worked with,” rather than sourced from unknown individuals.

However, the announcement provides few details about how OpenAI plans to use consumer voice inputs, or how it intends to secure that data if it is used. OpenAI did not immediately respond to requests for further information. According to the company’s terms of service, consumers own their inputs “to the extent permitted by applicable law.”