
Apple released OpenELM, an efficient language model based on an open-source training and inference framework

Author: IT Home

IT Home reported on April 24 that, ahead of WWDC24, Apple released an "efficient language model with an open-source training and inference framework," called OpenELM, on the Hugging Face platform.


Notably, this is an open-source language model: its source code, pre-trained model weights, and training recipes are available in Apple's GitHub repository.
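As an illustration (not part of Apple's announcement), the released checkpoints can be loaded through the Hugging Face transformers library. The sketch below assumes the apple/OpenELM-1_1B repo id and the Llama 2 tokenizer that the model cards reference; adjust the size variant as needed.

```python
# Minimal sketch: loading an OpenELM checkpoint via Hugging Face transformers.
# Repo id and tokenizer choice follow the published model cards; treat the
# exact details as assumptions and check the cards for your version.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/OpenELM-1_1B"  # other sizes: 270M, 450M, 3B

# OpenELM ships custom modeling code, so trust_remote_code must be enabled.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# OpenELM reuses the Llama 2 tokenizer rather than shipping its own.
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```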


IT Home translates the official description as follows:

The reproducibility and transparency of large language models are critical to advancing open research, ensuring the trustworthiness of results, and investigating data and model biases as well as potential risks. To that end, we release OpenELM, a state-of-the-art open-source language model.

OpenELM uses a layer-wise scaling strategy that efficiently allocates parameters within each layer of the Transformer model, improving accuracy. For example, with a budget of about 1 billion parameters, OpenELM is 2.36% more accurate than OLMo while requiring only half as many pre-training tokens.
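For intuition, here is a minimal sketch of what a layer-wise scaling rule looks like: per-layer attention-head count and feed-forward width grow linearly with depth instead of being uniform. The α/β bounds and dimensions are illustrative placeholders, not OpenELM's actual hyperparameters.

```python
# Sketch of layer-wise scaling: allocate attention heads and FFN width
# non-uniformly across depth. All numbers here are illustrative.

def layerwise_config(num_layers: int, d_model: int, head_dim: int,
                     alpha: tuple[float, float] = (0.5, 1.0),
                     beta: tuple[float, float] = (0.5, 4.0)):
    """Return (num_heads, ffn_dim) per layer, interpolated linearly with depth."""
    configs = []
    for i in range(num_layers):
        t = i / max(num_layers - 1, 1)  # 0.0 at the first layer, 1.0 at the last
        a = alpha[0] + t * (alpha[1] - alpha[0])  # attention width factor
        b = beta[0] + t * (beta[1] - beta[0])     # FFN width factor
        num_heads = max(1, int(a * d_model / head_dim))
        ffn_dim = int(b * d_model)
        configs.append((num_heads, ffn_dim))
    return configs

# Early layers get fewer heads and narrower FFNs; later layers get more,
# spending the fixed parameter budget where it contributes most to accuracy.
for layer, (heads, ffn) in enumerate(layerwise_config(8, 1024, 64)):
    print(f"layer {layer}: {heads} heads, FFN dim {ffn}")
```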


Unlike previous practices that provide only model weights and inference code and pre-train on private datasets, our release includes the complete framework for training and evaluating language models on publicly available datasets, including training logs, multiple checkpoints, and pre-training configurations.

We have also released code to convert models to the MLX library for inference and fine-tuning on Apple devices. This comprehensive release aims to empower and strengthen the open research community and pave the way for future open research efforts.
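As a hedged usage sketch, a converted checkpoint could be run on Apple silicon with the open-source mlx-lm package; the repo id below is an assumed community conversion, not an official path, and the API reflects recent mlx-lm versions.

```python
# Hypothetical sketch: running a converted OpenELM checkpoint on Apple
# silicon with mlx-lm (pip install mlx-lm). Substitute the path produced
# by the conversion script for the assumed repo id below.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/OpenELM-1_1B")  # assumed repo id
text = generate(model, tokenizer, prompt="Once upon a time", max_tokens=32)
print(text)
```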
