ChatGPT may be the benchmark in the world of chatbots, but competition in this segment is increasingly lively. In addition to Bing Chat, which is based on GPT-4, and Google Bard, more and more striking alternatives are appearing, and one of the most notable is Claude, a chatbot created by Anthropic.
This startup wants to become the next OpenAI, and it certainly has what it takes to stand up to it. Among other things, because its founders are former OpenAI employees. Dissatisfied with that company's strategic vision, they founded their own and in January 2023 launched Claude, their chatbot.
We have already gained access to this initial version, which is currently available by invitation (you can request one here), and after a few hours asking it all sorts of questions we wanted to analyze its behavior, its performance and its chances of becoming a true rival to today's most relevant chatbots.
A careful chatbot
Claude works the same way as other chatbots such as ChatGPT: after logging in, you are presented with a conversational interface with a sidebar on the left showing the conversation history, and a dominant central area for writing text (prompts), which the chatbot then answers as best it can.
As soon as you start talking with Claude, it becomes clear that this is a fairly formal chatbot that does not try to strike a closer tone with the user, as Bing Chat does.
When we introduced ourselves as Xataka, Claude asked how it could help us, and we immediately asked whether it knew what Xataka was. The response, short and concise, was adequate.
Like other similar chatbots, Claude is able to follow the conversation. That is, after asking “do you know what Xataka is?”, one could ask “how much monthly traffic does it have?” without needing to say that we were referring to Xataka: it correctly understands that this is the case.
The answers do not appear all at once as in Bard, and they are not generated as fast as in ChatGPT (GPT-3.5), but neither are they as slow as in ChatGPT Plus (GPT-4). The speed at which the text is generated falls somewhere between those two models, which is acceptable unless we expect a very long output, in which case the response can take a little longer to complete.
We took the opportunity to quickly verify that Claude also shares ChatGPT's limitations: it cannot tell us the date and time, and it does not offer references (for example, links) to the sources from which it obtains its information.
When Bard appeared we compared its behavior with that of Bing Chat and GPT-3/GPT-4, and here we wanted to do the same and check Claude's answers to those same questions.
We first asked it to explain the theory of relativity to us as if we were a 5-year-old child. Claude gave an answer that was initially adequate, but then it mentioned the equation E=mc^2 and explained it briefly, something a five-year-old would hardly understand. None of the chatbots was especially good at working out what such a young child could understand, but the response was relatively clear.
When we asked Claude how it was, it answered “I'm fine, thanks for asking” but quickly made it clear that “I do not experience moods or emotions as humans do”.
We also wanted to check whether it could act as a Linux terminal, which it could indeed do. We entered some basic commands, and it responded to all of them with data on the hardware configuration of the (probably virtual) server on which the service was running.
Claude's answers to philosophical questions (“Why are we here?”) are also surprisingly coherent. The same was true of its rivals, which in one way or another tried to offer more and more clues toward answering this impossible question, without giving a concrete answer.
In this battery of small tests we asked Claude to create a table with the 10 countries with the most titles and runner-up finishes in the football World Cup. Curiously, we asked the question in English after having asked several others in Spanish, and Claude initially answered in English but then quickly switched to Spanish.
Here its answer was not formatted in a way that was easy for the user to read, but the data was correct except for one detail: it did not take into account the latest World Cup, Qatar 2022, which Argentina won and in which France was runner-up. Those results were not part of the statistics Claude produced.
We also wanted to see how it handled writing a brief four-paragraph review of the iPhone 14 Pro Max, and the result was decent. The tone was somewhat flat and it limited itself to describing the main hardware features, something its competitors had also done in our previous tests. There were a few adjectives that might sound exaggerated, but the writing was impeccable and the technical data was correct almost throughout (the optical zoom is 3x, not 2.5x, for example).
Claude accepts super long texts
We asked Claude how it had been trained, and it gave only generic answers to that question and to others about the number of parameters used (the more there are, the more powerful the model usually is) or about its context window (the maximum number of tokens it accepts as input).
GPT-4, for example, has a context window of up to 32K tokens, and although Claude does not directly reveal the figure, Anthropic recently announced that it had expanded its original window to an impressive 100K tokens, which (in theory) makes it possible, for example, to enter a good-sized novel (about 75,000 words) as input and then ask anything about it.
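To put those figures in perspective, here is a minimal Python sketch that turns a word count into a rough token estimate. The ratio it uses is only the one implied by the numbers above (100K tokens for about 75,000 words); it is an assumption, not Anthropic's actual tokenizer, and the real count depends on the language and content of the text. The function names `estimate_tokens` and `fits_in_window` are hypothetical helpers made up for this illustration.

```python
import math

# Assumption: ratio implied by the figures quoted in the article
# (about 75,000 words for 100,000 tokens). Real tokenizers vary.
WORDS_PER_TOKEN = 75_000 / 100_000


def estimate_tokens(word_count, words_per_token=WORDS_PER_TOKEN):
    """Return a rough token estimate for a text of `word_count` words."""
    return math.ceil(word_count / words_per_token)


def fits_in_window(word_count, window_tokens=100_000):
    """Return True if the estimated token count fits in the context window."""
    return estimate_tokens(word_count) <= window_tokens


# A 75,000-word novel against a 100K window and against a 32K window.
print(estimate_tokens(75_000))
print(fits_in_window(75_000))
print(fits_in_window(75_000, window_tokens=32_000))
```

An estimate like this is only a first filter: a text that fits on paper may still be slow or impractical to process in a given interface.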
We wanted to test those limits by telling Claude that we were going to paste a long text and then ask questions about it. What better than to use our Quixote, available in plain text in a GitHub repository, a good chunk of which we passed to Claude. We overdid it, in fact, because the first fragments were approximately 98,000 and 51,000 words long and Claude remained “thinking” indefinitely.
Things changed when we shortened the fragment a little, although it was still enormous at 32,500 words. After taking it in and studying it for a few seconds, Claude wrote with apparent surprise “Wow, there's a lot of content here” and then briefly summarized the chapter (without our having asked for it).
We asked Claude to briefly describe the four main characters of the chapter, and after a few seconds (each request seems to take a little longer when the starting text is so long) it gave an apparently coherent response.
So, in our brief tests Claude certainly demonstrated its ability to “ingest” large amounts of data and then analyze that information. It is an especially interesting option for summarizing documents and extracting conclusions and important data from them, and without a doubt it is one of the most striking features of this chatbot.
A good rival for ChatGPT
In view of the results, we are looking at an excellent rival for ChatGPT, Bard or Bing Chat. The response speed is adequate, though not dazzling, and although it makes mistakes from time to time, like its competitors, its behavior is notable.
It is also capable of resolving problems that it may itself create. We verified this by asking it to create a small Python program to generate snowflakes drawn in a window. It generated code, but when we tried to run it in a terminal (on macOS) an error appeared: we did not have a library (Tkinter) installed that its code used.
After we mentioned it, Claude showed us how to install that library, and after we did what it suggested, the Python code it had proposed in the first place worked perfectly.
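For readers curious about what a program of this kind can look like, here is a minimal sketch of our own. It is not the code Claude generated, which is not reproduced in this article; the function names `snowflake_segments` and `draw_snowflakes` are hypothetical, and the flakes are simple six-armed stars. Tkinter is imported inside the drawing function, so the geometry helper still works on a machine where that library is missing.

```python
import math
import random


def snowflake_segments(cx, cy, radius, arms=6):
    """Return (x1, y1, x2, y2) line segments for a star-shaped flake."""
    segments = []
    for i in range(arms):
        angle = 2 * math.pi * i / arms
        x = cx + radius * math.cos(angle)
        y = cy + radius * math.sin(angle)
        segments.append((cx, cy, x, y))
    return segments


def draw_snowflakes(count=30, width=600, height=400):
    """Open a window and draw `count` flakes at random positions."""
    # Imported here so a missing Tkinter only affects the drawing step.
    import tkinter as tk

    root = tk.Tk()
    root.title("Snowflakes")
    canvas = tk.Canvas(root, width=width, height=height, bg="black")
    canvas.pack()
    for _ in range(count):
        cx = random.uniform(0, width)
        cy = random.uniform(0, height)
        radius = random.uniform(5, 20)
        for x1, y1, x2, y2 in snowflake_segments(cx, cy, radius):
            canvas.create_line(x1, y1, x2, y2, fill="white")
    root.mainloop()
```

Calling `draw_snowflakes()` opens the window. On a system without Tkinter, that call fails at the import, the same kind of error described above.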
So we are waiting for this chatbot to become available at large scale, but Claude is certainly a worthy rival in a segment that is getting livelier all the time and in which the proposals are really striking.