On February 1, Samsung held its Unpacked 2023 event, and I didn’t want to spend an hour watching it. So what did I do? I took advantage of the new wave of artificial intelligence services and tools so that they would “attend” the event for me and act as my assistants.
The test was a success. Using Bing with ChatGPT and a couple of additional services, I obtained a detailed and very clear summary of everything that was said at the event. It showed me that this technology has remarkable practical uses, and this is only the beginning.
Artificial intelligence as a weapon for winning back time
We live in the era of haste. We have time for almost nothing, and because we don’t have it, we steal it wherever we can. We watch series and listen to podcasts at 1.5x, we pay to have everything delivered so we can keep doing things at home, and TikTok and its short videos have revolutionized this segment.
The obsession with saving time to gain it for other things (even if, ironically, that means watching more short videos on TikTok) is evident in this era, in which so much technology is geared toward exactly that.
Artificial intelligence promises to help us in this battle, and at Xataka we wanted to test one of its promises: not having to sit through long meetings and talks, because it can recognize, transcribe and summarize the content and then present it to us.
That’s why we wanted to run this experiment with a talk like the one Samsung gave a few days ago. I must confess that I actually told a lie: I did watch the event.
After all, it’s part of my job and that of the whole Xataka team, which followed the coverage closely in order to report on what was presented there, which was a lot and very relevant. However, that was actually an advantage for this experiment, because it let me verify that what these applications gave me in the final summary was faithful to what Samsung said at its event.
How AI boils a one-hour talk down to ten paragraphs
The first step was to transcribe the event and convert it into text. There are various tools that help with this, but I wanted to try one I discovered recently that claims to make use of modern artificial intelligence systems.
That tool is none other than Gladia, which is actually an API that performs various content conversions, among them, of course, audio to text. After a brief registration, I had access to the so-called “tasks” that let you run these conversions with some variants.
For the conversion I first needed the audio file of the event. It is easy to extract it from the original YouTube video with online services or locally installed applications. Once I had done that, I had an MP3 file of just under 57 MB and almost an hour long. Perfect.
After importing it into Gladia, the service started the transcription. Other solutions are usually either more sophisticated or severely limited in the length of audio they can transcribe. Otter.ai, one of my favorites, allows files of up to 30 minutes in its free version, for example, so I ruled it out for this quick test.
The problem with Gladia is that it doesn’t produce a plain text transcript directly; instead it generates a text full of metadata. That information is very useful in certain scenarios, such as subtitling the event (it includes timestamps marking the beginning and end of each fragment of the transcription), but I wanted the raw text, without metadata. How could I get it?
By asking, of course. As a longtime Linux user, it was clear to me that tools like ‘sed’ or ‘awk’ could help me get what I was looking for, but the problem is that to use them you need some (or a lot of) fluency with regular expressions and pattern handling.
That was not my case (I almost never use them), and the normal route until now would have been to search Google for the solution or go to forums such as Superuser or Reddit to seek help from an expert. But this was about looking for artificial intelligence solutions, and that’s exactly what I did, with Bing with ChatGPT as the protagonist.
The new conversational engine performed spectacularly here. After a brief conversation in which I explained what I wanted, told it I thought the problem could be solved with ‘sed’, and gave it an example, Bing gave me the answer. Curiously, I asked in English and it kept answering in Spanish, but it didn’t matter: the answer was perfect.
I had all the text in a file called “grabacion.txt” and generated a new file called “resultado.txt” with the command indicated in its response, which I entered in a terminal on my Mac mini M1 with macOS. Time invested? One or two minutes.
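Bing’s exact command is not reproduced here, but the idea behind it can be sketched in a few lines of Python instead of ‘sed’. Treat this as a minimal sketch under loud assumptions: the timestamp format below is hypothetical (it is not taken from Gladia’s documentation), and the function names are made up for illustration. Only the two file names come from the test described above.

```python
import re

# ASSUMPTION: each transcript line looks like the example below. This format is
# hypothetical and is NOT taken from Gladia's documentation.
#   [00:00:01.000 --> 00:00:04.500] Welcome to Unpacked.
TIMESTAMP_PREFIX = re.compile(
    r"^\s*\[\d{2}:\d{2}:\d{2}\.\d{3} --> \d{2}:\d{2}:\d{2}\.\d{3}\]\s*"
)


def strip_metadata(lines):
    """Return only the spoken text, dropping timestamp prefixes and blank lines."""
    cleaned = []
    for line in lines:
        text = TIMESTAMP_PREFIX.sub("", line).strip()
        if text:
            cleaned.append(text)
    # Join the fragments into one running text, separated by single spaces.
    return " ".join(cleaned)


def convert_file(src="grabacion.txt", dst="resultado.txt"):
    """Read the transcript with metadata from src and write plain text to dst."""
    with open(src, encoding="utf-8") as infile:
        plain = strip_metadata(infile)
    with open(dst, "w", encoding="utf-8") as outfile:
        outfile.write(plain + "\n")
```

Calling `convert_file()` with its defaults would read “grabacion.txt” and write “resultado.txt”. If the real metadata looks different, only the regular expression should need to change.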
Now came the last step: getting the summary I was looking for generated from that text. Both ChatGPT and Bing with ChatGPT can summarize texts, but they have a problem: their character limits, which in Bing, for example, is 2,000 input characters. This text was much larger, so I needed an alternative.
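To give a sense of what that limit means in practice, here is a minimal sketch of how a long text could be cut into pieces that each fit in 2,000 characters. It is not the route taken in this experiment, and the function name `split_into_chunks` is hypothetical. Summarizing each piece and then merging the partial summaries would be a separate step that is not shown.

```python
def split_into_chunks(text, limit=2000):
    """Split text into pieces of at most `limit` characters, breaking on whitespace."""
    chunks = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is cut into hard slices.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = word if not current else current + " " + word
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this word would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

For an hour-long transcript this produces dozens of pieces, which helps explain why a tool that accepts the whole text at once was the more convenient choice.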
This is where new alternatives have emerged. In recent days, various tools have appeared that summarize everything from video conferences to scientific studies. I needed an option for summarizing long texts, and among them I found Casper AI, a Chrome extension that offers summaries of the websites we visit and has some other appealing options, such as generating tweets with an attractive headline based on the content visited.
For it to work, though, the text of Unpacked 2023 had to be on some website, so I copied and pasted it into a new post on my other blog (experiments are best run somewhere harmless) that I didn’t even need to publish: previewing it in the Chrome browser was enough for Casper AI to do its job.
In just 10 seconds, the sidebar generated by Casper AI showed a summary of the event, split into short paragraphs with the main points covered during that hour of talk.
The text was in English and so was the summary, but it is reasonable to think that there are similar alternatives in our language, or that this one (like others) will support it before long. Casper AI worked impeccably in this respect, but the question is: was it a good summary of the event?
The truth is that the summary was practically perfect, something that really left me in awe. The system ran through the brief introduction by Samsung’s CEO and then summarized the main features of the Galaxy S23 Ultra and options such as its “nightography” (it understood “nitography”) for low-light photos. It also included details about QuickShare and Samsung’s mention of its new ultrabooks.
It is true that along the way it left out some details about the rest of the devices in the Galaxy S23 range and about those ultraportables. It also got some figures wrong, such as the new Snapdragon in those phones, but the summary was still spectacular in its accuracy.
What does this demonstrate? That tools of this kind can certainly be very useful for saving time that we can then invest in tasks that matter more to us. Without a doubt, a fantastic practical demonstration of what these solutions can do.
Image: Priscilla Du Preez