Produced by Open Source China
A UAE research team recently announced Jais, an open-source Arabic large language model.
Jais is a 13-billion-parameter bilingual Arabic-English large language model, pretrained on a dataset containing 72 billion Arabic tokens and 279 billion English and code tokens. The model was developed in collaboration between Cerebras, the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), and Inception, a subsidiary of G42.
Jais is named after Jebel Jais, the UAE's highest peak. Timothy Baldwin, a professor at MBZUAI, explained that because there is not enough Arabic data to train a model of Jais's size, the computer code included in the English training data helps develop the model's reasoning ability.
The model is now open source and can be obtained from Hugging Face.
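As a minimal sketch of what obtaining the model might look like with the `transformers` library (the repository id below is an assumption and should be verified on the Hugging Face Hub; note that the 13B-parameter weights are a very large download):

```python
# Hedged sketch: one way to load Jais via Hugging Face's transformers library.
# The repo id "inception-mbzuai/jais-13b" is an assumption, not confirmed by the article.

MODEL_ID = "inception-mbzuai/jais-13b"  # assumed Hugging Face repository id

def load_jais(model_id: str = MODEL_ID):
    """Fetch the tokenizer and model weights from the Hugging Face Hub."""
    # Imported here so the heavy dependency is only required when actually loading.
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Models with custom architectures generally require trust_remote_code=True.
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load_jais()
    prompt = "عاصمة دولة الإمارات العربية المتحدة هي"  # "The capital of the UAE is"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=30)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In practice a 13B model typically also needs a GPU and a reduced-precision dtype (for example `torch_dtype=torch.float16`) to fit in memory.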