Everything was exciting and very open when, a few years ago, practical artificial intelligence systems began to make real progress. Companies like OpenAI not only unveiled amazing engines like DALL-E 2 or GPT-3, but also published extensive studies detailing how they had been created. That's over.
GPT-4. This week OpenAI announced the new version of its conversational AI model, GPT-4, and although the official announcement showcased its advantages and capabilities, no internal details of its development were given. It is not known how much data the system was trained on, what the energy cost was, or what hardware and methods were used to create it.
I think we can call it shut on ‘Open’ AI: the 98 page paper introducing GPT-4 proudly declares that they’re disclosing *nothing* about the contents of their training set. pic.twitter.com/dyI4Vf0uL3
— Ben Schmidt / @[email protected] (@benmschmidt) March 14, 2023
OpenAI is anything but 'Open'. Ben Schmidt, an engineer at an AI mapping company called Nomic, pointed out an important detail in the 98-page technical report on GPT-4: its authors declared that they were not going to reveal anything about how they had trained the model.
The reason? Competition. On the second page of the report, OpenAI addressed its scope and limitations, warning that although it would provide data on the model's capabilities, it would not disclose internal details of the development, so as not to make it easier for competitors to match it:
"Given both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar."
Companies are getting serious. Ilya Sutskever, one of the co-founders of OpenAI, explained to The Verge that "it's competitive out there. […] Many companies want to do the same thing, so from a competitive standpoint, you can see this as a maturation of the field".
But there is a danger. The problem with this new secretive stance is the safety of these models, which without transparency lose the ability to be audited by other experts or independent organizations. For Sutskever, the reasoning runs precisely the other way: these models can end up causing "a great deal of harm", and as their capabilities improve, "it makes sense that you don't want to disclose [how they work internally]" so that bad actors cannot take advantage of them.
"We were wrong". OpenAI's attitude in this regard has changed completely: in the past it shared ample information about its models. According to Sutskever, "we were wrong". In his view, if AI ends up being powerful, "it doesn't make sense to open-source it. It's a bad idea… I expect that in a few years it's going to be completely obvious to everyone that open-sourcing AI is just not wise".
Opinions for all tastes. While OpenAI is closing up with GPT-4, Meta has just launched LLaMA, an essentially open competitor that can actually be installed on a laptop. That open approach is the one OpenAI itself originally took, but for others the decision by the creators of GPT-4 makes sense from a business point of view. That is what William Falcon, creator of the PyTorch Lightning tool, argued: in VentureBeat he explained that "if this model goes wrong, and it will, you've already seen it with hallucinations and giving you false information, how is the community supposed to react?"
And then there's copyright. There is also a legal side to all of this: the datasets used to train these models are gigantic, and much of that information is collected from the web. It is likely that part of that content is protected by copyright, so not revealing how the models were trained protects OpenAI (initially) against potential lawsuits for copyright infringement.
In Xataka | I have no idea how to program, but thanks to GPT-4 I created a Flappy Bird clone