ChatGPT update enables AI chatbot to ‘see, hear, and speak’

OpenAI has announced new voice and imagery capabilities for ChatGPT
OpenAI's ChatGPT chatbot will soon be able to hold voice conversations with users and interact using images, the company has said

ChatGPT, OpenAI's wildly popular AI chatbot built on a large language model, will soon be able to have voice conversations with users and interact using images, the company has revealed.

The company’s release of ChatGPT last year has rapidly accelerated interest in generative AI, with the tool capable of interacting conversationally, answering follow-up questions, admitting its mistakes, challenging incorrect premises, and rejecting inappropriate requests. 

In March, OpenAI announced the launch of GPT-4, the latest iteration of its deep learning model, which it says ‘exhibits human-level performance’ on various professional and academic benchmarks, from the US bar exam to SAT school exams.

“We are beginning to roll out new voice and image capabilities in ChatGPT,” OpenAI said in a blog post. “They offer a new, more intuitive type of interface by allowing you to have a voice conversation or show ChatGPT what you’re talking about.”

ChatGPT voice and image capabilities

According to OpenAI, users of ChatGPT will soon be able to engage in a back-and-forth conversation with the chatbot. The new voice capability, the company says, is powered by a new text-to-speech model, capable of generating human-like audio from just text and a few seconds of sample speech. OpenAI collaborated with professional voice actors to create each of the voices, and uses Whisper, the company’s open-source speech recognition system, to transcribe spoken words into text.

Meanwhile, with image support, users can take pictures of things around them and ask the chatbot to "troubleshoot why your grill won't start, explore the contents of your fridge to plan a meal, or analyse a complex graph for work-related data".

Image understanding is powered by multimodal GPT-3.5 and GPT-4. These models, OpenAI says, apply their language reasoning skills to a wide range of images, such as photographs, screenshots, and documents containing both text and images.

According to OpenAI, these voice and image capabilities in ChatGPT will be rolled out to Plus and Enterprise users over the coming weeks.

“OpenAI’s goal is to build AGI that is safe and beneficial,” it said. “We believe in making our tools available gradually, which allows us to make improvements and refine risk mitigations over time while also preparing everyone for more powerful systems in the future. This strategy becomes even more important with advanced models involving voice and vision.”
