ChatGPT Can Now See, Hear, and Speak

Greetings, fellow AI apprentices!

Breaking Boundaries: ChatGPT's game-changing multimodal capabilities have been unveiled. With all that new potential unlocked, let's get into it...

In this update:

  • ChatGPT's latest update: voice and image capabilities

  • Weekly productivity hack

  • 2 quick snips of AI updates

Read time: 2 minutes

                       FOCUS OF THE WEEK

ChatGPT can now see, hear, and speak

Users can now hold voice conversations with ChatGPT, enabled by a text-to-speech model. They can also share images and discuss them in the chat. Voice capabilities will roll out for iOS and Android users, while images will be available on all platforms.

Voice:

  • Whisper, OpenAI's open-source speech recognition system, transcribes spoken words into text.

  • The text-to-speech model has five different voice options for chats, created in collaboration with professional voice actors.
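For the curious, the voice flow described above (speech in, text reply, speech out) can be sketched in a few lines of Python. This is only an illustration of the three-step idea, not OpenAI's implementation: the function names, the `VoiceTurn` class, and the `voice-1` label are all hypothetical, and the three "fake" functions are stand-ins where a real speech recognizer, chat model, and text-to-speech model would go.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoiceTurn:
    transcript: str     # what the speech recognizer heard
    reply_text: str     # what the chat model answered
    reply_audio: bytes  # synthesized speech for the answer


def voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    chat: Callable[[str], str],
    synthesize: Callable[[str, str], bytes],
    voice: str = "voice-1",  # hypothetical voice label
) -> VoiceTurn:
    """Run one spoken exchange: speech -> text -> reply -> speech."""
    transcript = transcribe(audio)               # step 1: speech recognition
    reply_text = chat(transcript)                # step 2: text reply
    reply_audio = synthesize(reply_text, voice)  # step 3: text-to-speech
    return VoiceTurn(transcript, reply_text, reply_audio)


# Stand-ins so the sketch runs without any model or network access.
def fake_transcribe(audio: bytes) -> str:
    # Pretend the audio bytes are already the spoken words.
    return audio.decode("utf-8")


def fake_chat(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str, voice: str) -> bytes:
    # Pretend the "audio" is just the text tagged with the chosen voice.
    return f"[{voice}] {text}".encode("utf-8")


if __name__ == "__main__":
    turn = voice_turn(b"hello", fake_transcribe, fake_chat, fake_synthesize)
    print(turn.transcript)
    print(turn.reply_text)
```

Passing the three steps in as functions keeps the sketch independent of any particular provider, so each stand-in could be swapped for a real model without changing `voice_turn` itself.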

Image:

  • ChatGPT can now apply its language reasoning skills to images, including photographs, screenshots, and text documents.

Insights: Visual Question Answering (VQA) and image captioning can improve customer support in domains such as social media, e-commerce, and news. Multimodal content creation, which combines text, images, and voice, takes content generation to the next stage.

                          QUICK SNIPS