Papercut AI
ChatGPT Can Now See, Hear, and Speak
Greetings, fellow AI apprentices
Breaking Boundaries: ChatGPT's game-changing multimodal capabilities have been unveiled. With all that potential unlocked, let's get into it...
In this update:
ChatGPT's latest update: voice and image capabilities
Weekly productivity hack
2 quick snips for AI updates
Read time: 2 minutes
FOCUS OF THE WEEK
ChatGPT can now see, hear, and speak
Users can now engage in voice conversations with ChatGPT, enabled by a text-to-speech model. Images can also be shared for discussions. Voice capabilities will roll out for iOS and Android users, while images will be available on all platforms.
Voice:
Whisper, OpenAI's open-source speech recognition system, transcribes spoken words into text.
The text-to-speech model has five different voice options for chats, created in collaboration with professional voice actors.
Image:
ChatGPT can now apply its language reasoning skills to images, including photographs, screenshots, and text documents.
Insights: Visual Question Answering (VQA) and image captioning can enhance customer support in domains such as social media, e-commerce, and news. Multimodal content creation, which combines text, images, and voice, takes content generation to the next stage.
QUICK SNIPS