GPT-4 will arrive next week according to Microsoft. And the big news is that it will be “multimodal”

This week, four Microsoft engineers from the company's German division held an event dedicated to the revolution that LLMs (Large Language Models) like GPT represent for businesses. During that conference, they surprised attendees with details about the long-awaited new version of OpenAI's model.

GPT-4. When GPT-3 appeared in 2020, it did so as a private beta. That kept the model from demonstrating its full capabilities, but in 2022 the arrival of ChatGPT, based on an iteration of GPT-3, changed everything. We have been talking for months about what to expect from GPT-4, and according to Heise Online, Andreas Braun, CTO of Microsoft in Germany, said that this engine will arrive next week.


Kosmos-1. The arrival of GPT-4 seemed especially imminent after Microsoft announced the launch of Kosmos-1 at the beginning of March. Kosmos-1 is a Multimodal Large Language Model (MLLM) that responds not only to text prompts but also to images. This makes it behave somewhat like Google Lens: it can extract information and context from an image.

Bigger, better. One of the clear characteristics expected of GPT-4 is that it will be larger than GPT-3. While GPT-3 has 175 billion parameters, GPT-4 is rumored to have 100 trillion, a figure that Sam Altman, CEO of OpenAI, dismissed as complete nonsense. Even so, the model is expected to be bigger, which should allow it to handle more complex situations and generate even more “human” responses.

Multimodal? This is one of the great innovations of GPT-4, if not the biggest. Like Kosmos-1, it will be a multimodal model, meaning that the input can come from diverse sources or “modalities”: text (which is what ChatGPT uses), images, video, spoken voice or other formats.

Give me data, and I'll do the analysis. These models use deep learning and natural language processing to understand the relationships and correlations between these different types of data. By combining multiple “modalities”, an artificial intelligence model can improve its accuracy and carry out more complex data analysis.

An example: video. One immediate practical application of these models is video. With GPT-4, it is theoretically possible to input a video and its associated audio so that the engine can understand the conversation, including the emotions of those involved in it. It can also recognize objects (or people) and extract information. One could therefore obtain a summary of a film or a YouTube video, just as we now obtain summaries of meetings.

Saving time. One of the Microsoft engineers explained how this type of engine would be a great help in customer service centers, where GPT-4 could transcribe calls and then summarize them, something that human agents normally have to do. According to his estimates, this could save 500 hours of work a day for a Microsoft client in the Netherlands that receives 30,000 calls a day. The prototype was created in two hours, a developer spent a couple of weeks on it, and the result was apparently a success.
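To put that estimate in perspective, here is a quick back-of-the-envelope check of what 500 hours a day spread over 30,000 calls means per call. The function name is our own and purely illustrative; only the two figures come from the talk.

```python
def minutes_saved_per_call(hours_saved_per_day: float, calls_per_day: int) -> float:
    """Average minutes saved on each call, given a daily total (illustrative helper)."""
    return hours_saved_per_day * 60 / calls_per_day


# Figures quoted by the Microsoft engineer: 500 hours saved, 30,000 calls per day.
print(minutes_saved_per_call(500, 30_000))  # 1.0
```

In other words, the estimate amounts to roughly one minute saved per call, which is plausible for the time an agent spends writing up a summary.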

GPT-4 will still make mistakes. Although the new model will undoubtedly be more powerful, Microsoft wanted to make it clear that the artificial intelligence will not always answer correctly and that its answers will need to be validated.

Just in case, let's be cautious. Expectations for GPT-4 are enormous. In fact, Sam Altman, CEO of OpenAI, made it clear a few weeks ago that the industry and users should lower their expectations because “people are begging to be disappointed, and they will be”.

