Must-Reads on Language Models
Dive into the world of generative AI with some prominent language model papers, unlocking the secrets of natural language processing.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper • 1810.04805 • Published • 14
Note: BERT is one of the pioneering models, developed by Google.
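As a quick illustration of BERT's masked-language-modeling objective, here is a minimal sketch using the Hugging Face `transformers` fill-mask pipeline (assumes `transformers` and a PyTorch backend are installed; the example sentence is illustrative only):

```python
# Minimal sketch: querying BERT's masked-language-modeling head
# via the Hugging Face `transformers` pipeline.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT predicts the token hidden behind the [MASK] placeholder.
for prediction in fill_mask("The goal of a language model is to [MASK] the next token."):
    print(prediction["token_str"], round(prediction["score"], 3))
```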
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Paper • 1907.11692 • Published • 7
Note: The model was developed by Meta AI (formerly known as Facebook AI).
Language Models are Few-Shot Learners
Paper • 2005.14165 • Published • 11
Note: The GPT-3 model was developed by OpenAI.
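The paper's central idea is in-context (few-shot) learning: the model is conditioned on a handful of demonstrations inside the prompt, with no gradient updates. A hedged sketch of what such a prompt can look like (the task and demonstrations are made up for illustration, not taken from the paper):

```python
# Sketch of a few-shot prompt in the style of GPT-3's in-context learning:
# a few labeled demonstrations followed by the query, with no fine-tuning.
demonstrations = [
    ("I loved this movie, it was fantastic!", "positive"),
    ("The plot was dull and the acting was worse.", "negative"),
    ("An absolute masterpiece from start to finish.", "positive"),
]
query = "The soundtrack was forgettable and the pacing dragged."

prompt = "Classify the sentiment of each review.\n\n"
for text, label in demonstrations:
    prompt += f"Review: {text}\nSentiment: {label}\n\n"
prompt += f"Review: {query}\nSentiment:"

print(prompt)  # This string would be sent to the language model as-is.
```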
OPT: Open Pre-trained Transformer Language Models
Paper • 2205.01068 • Published • 2
Note: The model was developed by Meta AI (formerly known as Facebook AI).
CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
Paper • 2203.13474 • Published • 1
Note: The model code was released publicly by Salesforce.
GPT-NeoX-20B: An Open-Source Autoregressive Language Model
Paper • 2204.06745 • Published • 1
Note: The model code was released publicly by EleutherAI.
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Paper • 2211.05100 • Published • 28
Note: The model was developed by the BigScience workshop.
Llama 2: Open Foundation and Fine-Tuned Chat Models
Paper • 2307.09288 • Published • 242
Note: The model was developed by Meta AI. The Llama 1 paper is "LLaMA: Open and Efficient Foundation Language Models".
Mixtral of Experts
Paper • 2401.04088 • Published • 157
Note: The sparse Mixture-of-Experts (MoE) architecture is the core of Mixtral 8x7B, a model developed by Mistral AI. Its predecessor is Mistral 7B (the paper shares the model's name). A minimal routing sketch follows below.
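To make the MoE note concrete, here is a minimal, hedged sketch of the top-2 expert routing that a Mixtral-style sparse MoE feed-forward layer performs. This is simplified PyTorch with illustrative layer sizes and names, not the actual Mixtral implementation:

```python
# Simplified sketch of a sparse Mixture-of-Experts feed-forward layer
# with top-2 routing, in the spirit of Mixtral 8x7B (not the real code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # scores each token per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        weights, chosen = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)  # normalize over the chosen experts only
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = chosen[:, slot] == e  # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(4, 64)            # 4 tokens, d_model=64
print(SparseMoELayer()(tokens).shape)  # torch.Size([4, 64])
```

Each token activates only its top-2 experts, which is how Mixtral keeps per-token compute far below what its total parameter count would suggest.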