Constitutional AI
A collection of datasets and models that accompany the Constitutional AI recipe. See hf.co/blog/constitutional-ai for more details.
mistralai/Mistral-7B-v0.1
Text Generation • Updated • 374k • 3.45k
Note: The base model we aligned with Constitutional AI.
HuggingFaceH4/mistral-7b-grok
Text Generation • Updated • 918 • 43
Note: A fine-tuned version of Mistral 7B that was aligned to mimic the style of xAI's Grok assistant.
HuggingFaceH4/mistral-7b-anthropic
Text Generation • Updated • 1.59k • 8
Note: A fine-tuned version of Mistral 7B that was aligned with Anthropic's constitution to mimic the style of their assistants.
mistralai/Mistral-7B-Instruct-v0.1
Text Generation • Updated • 183k • 1.53k
Note: The chat model we used to generate the Constitutional AI datasets via self-critique.
Anthropic/hh-rlhf
Viewer • Updated • 169k • 9.2k • 1.21k
Note: The source of the prompts used to generate the Constitutional AI datasets.
HuggingFaceH4/cai-conversation-harmless
Viewer • Updated • 44.8k • 183 • 14
Note: The SFT and preference dataset that was generated with Anthropic's constitution.
HuggingFaceH4/grok-conversation-harmless
Viewer • Updated • 44.8k • 87 • 21
Note: The SFT and preference dataset that was generated by tweaking Anthropic's constitution to produce responses similar to Grok's.
Constitutional AI: Harmlessness from AI Feedback
Paper • 2212.08073 • Published • 2
Note: The original Constitutional AI recipe from Anthropic. They used PPO for alignment, whereas we chose DPO because we find it much simpler to use in practice.
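
To give a rough sense of why DPO is simpler than PPO in practice, here is a minimal sketch of the standard per-example DPO loss in plain Python. It is not the training code used for the models in this collection. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions. The inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio))).

    All names and the default beta are assumptions made for illustration.
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model, in log space.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy favors the chosen response.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed without overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# When the policy equals the reference, the margin is zero and the loss is log(2).
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)

# Raising the chosen response's likelihood relative to the rejected one lowers the loss.
improved = dpo_loss(-5.0, -12.0, -8.0, -8.0)
```

The loss needs only log-probabilities from two forward passes per preference pair, with no separate reward model or on-policy sampling loop, which is the practical simplification the note refers to.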