Not even halfway through the year, we have already seen tech giants release one Large Language Model (LLM) after another.
A month ago, Google’s 540-billion-parameter Pathways Language Model (PaLM) came out, within days of DeepMind’s Chinchilla.
Now, Meta AI has released Open Pre-trained Transformer (OPT-175B), a language model with 175 billion parameters trained on publicly available data sets.
Meta says that with this release, it aims to build more community engagement in understanding LLMs.
In terms of performance, Meta claims that OPT-175B is comparable to GPT-3 but was developed with only one-seventh of the carbon footprint.
Is it just another LLM or does it set itself apart in some way? OPT is generating a lot of buzz globally because the model is being released under a noncommercial license.
GPT-NeoX-20B by EleutherAI – It is a 20-billion-parameter autoregressive language model whose weights, training code and evaluation code are open source.
When tech leaders release such LLMs, most prefer to give access to their innovations only to “selected” people, organisations and big research labs.
“Training a single BERT base model (without hyperparameter tuning) on GPUs was estimated to require as much energy as a trans-American flight,”