BLOOM (language model)

BigScience Large Open-science Open-access Multilingual Language Model (BLOOM[1]) is a transformer-based large language model. It was created by over 1,000 AI researchers to provide a freely available large language model for large-scale public access. Trained on around 366 billion tokens from March to July 2022, BLOOM has 176 billion parameters and is considered an alternative to OpenAI's GPT-3. It uses a decoder-only transformer architecture modified from Megatron-LM GPT-2.
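
The model weights are distributed through the Hugging Face Hub and can be used with the transformers library. The sketch below is illustrative only: it assumes transformers is installed and uses the small published bigscience/bloom-560m checkpoint, since the 176-billion-parameter model shares the same architecture but needs far more memory. It shows the autoregressive, decoder-only usage pattern described above.

    # Illustrative sketch: load a small published BLOOM checkpoint and generate text.
    # Assumes the Hugging Face transformers library is installed; the
    # "bigscience/bloom-560m" variant is used because the full 176B-parameter
    # model does not fit on a typical machine.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "bigscience/bloom-560m"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    prompt = "BLOOM is a multilingual language model that"
    inputs = tokenizer(prompt, return_tensors="pt")

    # Decoder-only generation: the model predicts the next token repeatedly.
    output_ids = model.generate(**inputs, max_new_tokens=40, do_sample=False)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))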

The BLOOM project[2] was started by a co-founder of Hugging Face. Six main groups were involved: Hugging Face's BigScience team, the Microsoft DeepSpeed team, the NVIDIA Megatron-LM team, the IDRIS/GENCI team, the PyTorch team, and volunteers in the BigScience engineering workgroup.[2] BLOOM was trained on data covering 46 natural languages and 13 programming languages. In total, 1.6 terabytes of pre-processed text were converted into 350 billion unique tokens to form BLOOM's training dataset.[3][4]
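
The text-to-tokens step mentioned above can be illustrated with the published BLOOM tokenizer. The sketch below is a hypothetical example rather than part of the actual training pipeline; it assumes the tokenizer is available from the Hugging Face Hub under bigscience/bloom, and shows that one shared vocabulary covers both natural languages and source code.

    # Illustrative sketch of converting raw text into tokens. Assumes the BLOOM
    # tokenizer can be downloaded from the Hugging Face Hub; the example strings
    # are arbitrary and not taken from the training corpus.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom")

    samples = [
        "The quick brown fox jumps over the lazy dog.",   # English
        "Le modèle a été entraîné sur 46 langues.",       # French
        "def add(a, b):\n    return a + b",               # Python source code
    ]

    # One shared vocabulary handles every language in the corpus.
    for text in samples:
        token_ids = tokenizer.encode(text)
        print(len(token_ids), "tokens:", token_ids[:10])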

References

  1. "BigScience Large Open-science Open-access Multilingual Language Model". Retrieved 1 October 2022.
  2. "The Technology Behind BLOOM Training". Retrieved 1 October 2022.
  3. Teven Le Scao; Wang, Thomas; Hesslow, Daniel; Saulnier, Lucile; Bekman, Stas; M Saiful Bari; Biderman, Stella; Elsahar, Hady; Muennighoff, Niklas; Phang, Jason; Press, Ofir; Raffel, Colin; Sanh, Victor; Shen, Sheng; Sutawika, Lintang; Tae, Jaesung; Zheng Xin Yong; Launay, Julien; Beltagy, Iz (2022). "What Language Model to Train if You Have One Million GPU Hours?". arXiv:2210.15424.
  4. Arteaga, Cristian (2022-08-25). "Understand BLOOM, the Largest Open-Access AI, and Run It on Your Local Computer". Medium. Retrieved 2023-07-24.
