Thursday, December 7, 2023

Iceland has spent years fighting to keep its language from dying out. Now it has found a solution: GPT-4

A 2018 article in The Guardian predicted a bleak future for Icelandic as a language: a “fight against the threat of ‘digital extinction’”. The report explained how Iceland's mother tongue and cultural identity are stagnating on an Internet that is almost entirely in English. The government itself warned a few years ago that Icelandic could disappear completely within a few generations if it could not remain the country’s default language in the face of rapid digitization.

To protect it, the country even has a Department of Language Planning that coins Icelandic terms for new concepts instead of borrowing words from other languages. “Computer”, for example, is tölva, a blend of tala (number) and völva (prophetess). The goal is for the language to remain linguistically “pure” and to keep the essence of its Old Norse roots.

Despite that, Icelandic is used today by only about 340,000 people. Neither Siri nor Alexa supports it. And at a time when Netflix, YouTube and voice assistants have become part of daily life in a globalized world, Icelanders are drowning in an ocean of English. This is what happens when a majority language in the real world becomes a minority language in the digital world.

So the Government had an idea: GPT-4.

A few hours ago, in an unexpected announcement, OpenAI launched the long-awaited GPT-4 model, an update of the technology behind its popular ChatGPT, the fastest-growing application in history. The company claims that GPT-4 is its “most advanced system, producing safer and more useful responses”. On its blog, it says the model is “superior to ChatGPT in its advanced reasoning capabilities” and that it can “take advantage of more data and more computing to create more sophisticated and capable language models”.

Young Icelanders are giving up Icelandic. The culprit? The Internet

In this new launch, Iceland has seen a solution to its problems. At the initiative of the country's president, Guðni Th. Jóhannesson, Iceland has partnered with OpenAI to use GPT-4 to preserve Icelandic. “We have to introduce our language into the software and applications that people use every day,” explained Jóhanna Vigdís Guðmundsdóttir, executive director of Almannarómur, a non-profit language technology center.

And how can AI help Iceland?

To answer this question, it helps to understand that OpenAI’s GPT models are trained largely on text from the Internet. That means most of the technology's knowledge and capability is in English (because most of the Internet is in English). That, in turn, means GPT does not have the same comprehension skills in smaller languages. And although it has improved over time, it does not always produce clear and correct Icelandic. The following example shows some basic reasoning errors:

Prompt in Icelandic and in English:
What is Donald Duck in Icelandic?
What is Donald Duck called in Icelandic?

GPT-3 answer in Icelandic and English:
Donald Duck is called Donaldi Kjáni in Icelandic
Donald Duck is called Donaldi the Fool in Icelandic

ChatGPT’s response in Icelandic and English:
Donald Duck has the same name in Icelandic and English
Donald Duck has the same name in Icelandic and English

GPT-4 answer in Icelandic and English:
Donald Duck is called Andrés Önd in Icelandic
Donald Duck is called Andrés Önd in Icelandic


Although GPT-4 performs much better than its predecessors, it still makes some grammatical, “translation” and cultural errors. To solve this, the Icelandic language technology company Miðeind ehf has assembled a team of 40 volunteers to train GPT-4 on proper Icelandic grammar and cultural knowledge.

How? As detailed on its website, with a process called “Reinforcement Learning from Human Feedback”, or RLHF. Humans give GPT-4 a prompt, and it generates four possible completions. The evaluators then select the best of the four answers and edit it to create the one that would be ideal. The data from this process is then used to further train GPT-4 to produce better answers in the future.
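The data-collection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Miðeind's or OpenAI's actual tooling: the names FeedbackRecord, collect_feedback and preference_pairs are hypothetical, the fourth candidate answer and the edited answer are made up for the example, and the training step that would consume these records is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FeedbackRecord:
    """One human-feedback example (hypothetical structure, for illustration)."""
    prompt: str
    completions: List[str]   # the four candidate answers shown to the evaluator
    best_index: int          # position of the candidate the evaluator preferred
    ideal_answer: str        # the preferred candidate after the evaluator's edits

    def preference_pairs(self) -> List[Tuple[str, str]]:
        """Return (preferred, rejected) pairs, one per losing candidate."""
        chosen = self.completions[self.best_index]
        return [(chosen, other)
                for i, other in enumerate(self.completions)
                if i != self.best_index]


def collect_feedback(prompt: str,
                     completions: List[str],
                     best_index: int,
                     edited_answer: Optional[str] = None) -> FeedbackRecord:
    """Record an evaluator's choice among four completions plus an optional edit."""
    if len(completions) != 4:
        raise ValueError("expected exactly four candidate completions")
    if not 0 <= best_index < len(completions):
        raise ValueError("best_index must point at one of the candidates")
    # If the evaluator made no edit, the chosen completion is already the ideal one.
    ideal = edited_answer if edited_answer is not None else completions[best_index]
    return FeedbackRecord(prompt, list(completions), best_index, ideal)


# Illustrative data: three answers come from the article's example,
# the fourth candidate and the edited answer are invented.
record = collect_feedback(
    prompt="What is Donald Duck called in Icelandic?",
    completions=[
        "Donald Duck is called Donaldi Kjáni in Icelandic",
        "Donald Duck has the same name in Icelandic and English",
        "Donald Duck is called Andrés Önd in Icelandic",
        "I am not sure what Donald Duck is called in Icelandic",
    ],
    best_index=2,
    edited_answer="In Icelandic, Donald Duck is called Andrés Önd.",
)
print(record.ideal_answer)
print(len(record.preference_pairs()))
```

Each record keeps both signals the paragraph mentions: the ranking (which of the four was best) and the edited ideal answer. How those signals are weighted during training is up to the training pipeline and is not specified in the source.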

The Icelandic team had already tried this with GPT-3, fine-tuning it on 300,000 examples in Icelandic, but the results were very poor. Generating correct, grammatical Icelandic simply did not work with GPT-3. Now, however, they can do what previously required a great deal of manual work, data preparation and resource gathering for each use case.

In fact, GPT-4 is already capable of giving a different answer to the same question according to the language in which the question is asked and the cultural context.

Prompt in Icelandic:
Hver er forseti núna?

Answer in Icelandic:
Forseti Íslands núna (2021) er Guðni Th. Jóhannesson.

Prompt in English:
Who is president now?

Answer in English:
As of 2021, the President of the United States is Joe Biden.

The task now is therefore to give GPT-4 enough examples so that the model can be used for more complex and creative applications in Icelandic or other minority languages, rather than defaulting to English. In addition, with GPT-4 as a backend, Embla, Miðeind's voice assistant, and assistants for other languages will be able to hold more fluid conversations with users in their own languages.

In fact, Duolingo is already incorporating GPT-4 into its language-learning application to create two new AI-powered features, “Role play” and “Explain my answer”. With GPT-4, it lets students converse freely on various topics in niche contexts in Spanish and French, and it plans to expand to more languages.

Images: Unsplash

More info: OpenAI

In Xataka | GPT-4 has just made traditional exams obsolete (and that includes university level ones)

Neil Barker

