ChatGPT Can Now See, Hear, and Speak

Papercut
October 19, 2023

Greetings, fellow AI apprentices

Breaking Boundaries: ChatGPT's game-changing multimodal capabilities have been unveiled. With all that potential unlocked, let's get into it...

In this update:

ChatGPT's latest update: voice and image capabilities
Weekly productivity hack
2 quick snips of AI news

Read time: 2 minutes

                       FOCUS OF THE WEEK

ChatGPT can now see, hear, and speak

Users can now hold voice conversations with ChatGPT, enabled by a text-to-speech model, and share images to discuss. Voice capabilities will roll out to iOS and Android users, while image input will be available on all platforms.

Voice:

Whisper, OpenAI's open-source speech recognition system, transcribes spoken words into text.
The text-to-speech model offers five different voice options for chats, created in collaboration with professional voice actors.
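
For the curious, here is a rough sketch of how the two voice pieces above might chain together in a single conversational turn: speech goes in, gets transcribed, the chat model answers in text, and the answer is turned back into speech. This is only an illustration, not OpenAI's implementation. The names `voice_turn` and `VoiceTurn`, the `"voice-1"` label, and the three stand-in stages are all hypothetical placeholders, so the example runs without any model or network access.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    transcript: str     # what the speech recognizer heard
    reply_text: str     # the chat model's text answer
    reply_audio: bytes  # synthesized speech for that answer


def voice_turn(
    audio_in: bytes,
    transcribe: Callable[[bytes], str],       # speech-to-text stage (e.g. Whisper)
    chat: Callable[[str], str],               # text-in, text-out chat stage
    synthesize: Callable[[str, str], bytes],  # text-to-speech stage (text, voice)
    voice: str = "voice-1",                   # hypothetical voice label
) -> VoiceTurn:
    """Run one spoken turn: audio -> text -> reply text -> reply audio."""
    transcript = transcribe(audio_in)
    reply_text = chat(transcript)
    reply_audio = synthesize(reply_text, voice)
    return VoiceTurn(transcript, reply_text, reply_audio)


# Stand-in stages (assumptions for illustration only, not real models).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_chat(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str, voice: str) -> bytes:
    return f"[{voice}] {text}".encode("utf-8")


turn = voice_turn(b"hello", fake_transcribe, fake_chat, fake_synthesize)
print(turn.reply_text)
```

Passing each stage in as a function keeps the sketch honest about what it does not know: any real speech recognizer, chat model, or voice could be swapped in without changing the flow.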

Image:

ChatGPT can now apply its language reasoning skills to images, including photographs, screenshots, and documents that contain text.

Insights: Visual Question Answering (VQA) and image captioning applications are useful for enhancing customer support in several domains, such as social media, e-commerce, and news. Multimodal content creation, which combines text, image, and voice, is bringing content generation to the next stage.

                          QUICK SNIPS

AI startup AlphaSense valued at $2.5 billion after latest funding round

Amazon and Anthropic announce strategic collaboration to advance generative AI