Bigscience Workshop’s scientific contributions

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (3)


BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
  • Article
  • Full-text available

June 2023 · 206 Reads · 16 Citations

Bigscience Workshop · Teven Le Scao · Angela Fan · [...]

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
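
The released checkpoints can be loaded with standard open-source tooling. As a minimal sketch (not from the paper itself), the following assumes the Hugging Face transformers library and uses bigscience/bloom-560m, one of the smaller published variants; the full 176B model is published on the Hub as bigscience/bloom and requires multi-GPU hardware.

# Minimal sketch: loading a released BLOOM checkpoint with Hugging Face
# transformers (assumes `pip install transformers torch`). The 560M variant
# is used so the example runs on a single machine; swapping in
# "bigscience/bloom" would load the full 176B model.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom-560m")

# BLOOM is a decoder-only causal LM, so generation is plain left-to-right
# continuation of the prompt.
inputs = tokenizer("BigScience is a collaboration of", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))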


Figure 1: Organization of BigScience working groups.
Figure 2: Creation Pipeline of the ROOTS Corpus. The purple-colored sourcing stage of the pipeline and the yellow-colored processing stage are described respectively in Section 3.1.2 and Section 3.1.3.
Figure 5: The BLOOM architecture. The head slope parameters for ALiBi are taken as 2^(−8i/n) for head i of n heads.
Figure 6: DP+PP+TP combination leads to 3D parallelism.
Figure 7: Performance of various LLMs on a subset of tasks from the SuperGLUE benchmark in zero- and one-shot prompt-based settings.
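
The Figure 5 caption gives the ALiBi slope schedule: with n attention heads, head i is assigned the slope 2^(−8i/n), a geometric sequence with ratio 2^(−8/n). A minimal sketch of that computation (the function name is illustrative, not taken from the BLOOM codebase):

# Illustrative sketch of the ALiBi slope schedule from the Figure 5 caption:
# head i (1-indexed) of n_heads gets slope 2**(-8 * i / n_heads), a geometric
# sequence with ratio 2**(-8 / n_heads). The function name is hypothetical.
def alibi_slopes(n_heads: int) -> list[float]:
    return [2 ** (-8 * i / n_heads) for i in range(1, n_heads + 1)]

# For example, with 4 heads the slopes are 1/4, 1/16, 1/64, 1/256.
print(alibi_slopes(4))  # [0.25, 0.0625, 0.015625, 0.00390625]
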
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

March 2023 · 822 Reads



BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

November 2022 · 679 Reads · 119 Citations


Citations (2)


... The OpenGPT-X project initially adopted the Megatron-DeepSpeed codebase, developed by NVIDIA, extended by Microsoft researchers, and further adapted during the BigScience research workshop [47]. Other codebases, such as Meta's Open Pretrained Transformer (OPT) [31], have also emerged, promising potential advantages in abstraction and usability. ...

Reference:

Training LLMs on HPC Systems: Best Practices from the OpenGPT-X Project
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

... It is a popular choice for research and development projects, supported by a large developer community. Its flexibility and customizability make it applicable to a wide variety of projects [24]. ...

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model