Today we release OLMo 2 32B, the largest and most capable model in the OLMo 2 family, scaling up the OLMo 2 training recipe used for our 7B and 13B models released in November. It is trained on up to 6T tokens and post-trained using Tulu 3.1. OLMo 2 32B is the first fully open model (all data, code, weights, and details are freely available) to outperform GPT-3.5 Turbo and GPT-4o mini on a suite of popular, multi-skill academic benchmarks. It is comparable to the leading open-weight models while requiring only a fraction of their training compute; for example, training OLMo 2 32B cost only about one third of the compute of training Qwen 2.5 32B while reaching similar performance. The OLMo 2 family is now available in 7B, 13B, and 32B parameter sizes; all of these models can be fine-tuned on a single H100 GPU node, and all are available in the Ai2 Playground.
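
Since the weights are openly released, a minimal sketch of loading the model with Hugging Face `transformers` might look like the following. The repository id `allenai/OLMo-2-0325-32B` and the generation settings are assumptions for illustration; check Ai2's Hugging Face organization for the exact checkpoint names.

```python
# Minimal sketch: loading an OLMo 2 checkpoint with Hugging Face transformers.
# The repo id below is an assumption; verify it against Ai2's model listings.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "allenai/OLMo-2-0325-32B"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# device_map="auto" shards the 32B weights across available GPUs (requires accelerate).
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

inputs = tokenizer("Language modeling is ", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
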