ben burtenshaw

burtenshaw

AI & ML interests

None yet

Articles

Low Code Large Language Model Alignment

Argilla 2.4: Easily Build Fine-Tuning and Evaluation datasets on the Hub — No Code Required

How to build a custom text classifier without days of human labeling

How to optimize your data labelling project with custom interfaces

⚗️ 🔥 Building High-Quality Datasets with distilabel and Prometheus 2

⚗️ 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together

Organizations

Posts 1

SFT + Quantisation + Unsloth is a super easy way of squeezing extra performance out of an LLM at low latencies. Here are some handy resources to bootstrap your projects.

Here's a filtered dataset from Helpsteer2 with the most correct and coherent samples: burtenshaw/helpsteer-2-plus
This is an SFT fine-tuned model: https://huggingface.co/burtenshaw/gemma-help-tiny-sft
This is the notebook I use to train the model: https://colab.research.google.com/drive/17oskw_5lil5C3jCW34rA-EXjXnGgRRZw?usp=sharing
Here's a load of Unsloth notebooks on fine-tuning and inference: https://docs.unsloth.ai/get-started/unsloth-notebooks
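
For a rough idea of what "most correct and coherent" filtering could look like, here is a minimal sketch in plain Python. It assumes each sample carries integer `correctness` and `coherence` scores, as in HelpSteer2-style annotations. The thresholds and the `filter_samples` helper are hypothetical, since the post does not say which cut-offs were used to build burtenshaw/helpsteer-2-plus.

```python
from typing import Dict, List

# Hypothetical cut-offs: the post does not state the thresholds actually used.
MIN_CORRECTNESS = 4
MIN_COHERENCE = 4


def filter_samples(
    rows: List[Dict],
    min_correctness: int = MIN_CORRECTNESS,
    min_coherence: int = MIN_COHERENCE,
) -> List[Dict]:
    """Keep only rows whose correctness and coherence scores meet both thresholds."""
    return [
        row
        for row in rows
        if row["correctness"] >= min_correctness
        and row["coherence"] >= min_coherence
    ]


# Toy rows standing in for real samples (made-up values, for illustration only).
rows = [
    {"prompt": "a", "response": "x", "correctness": 4, "coherence": 4},
    {"prompt": "b", "response": "y", "correctness": 2, "coherence": 4},
    {"prompt": "c", "response": "z", "correctness": 4, "coherence": 3},
]

kept = filter_samples(rows)
print([row["prompt"] for row in kept])
```

The same predicate could be passed to a dataset library's filter method; the list comprehension just keeps the sketch dependency-free.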

Collections 3

Papers 1

arxiv:2408.16961

Spaces 24

Martinique

Create Dataset Ui

My Argilla

Argilla Fosllms

Argilla UI Demo Space (login: argilla/1234)

Argilla Llamaindex Monitor

Models 10

burtenshaw/code-llama-3-2-1b-commerce

Text Generation • Updated about 23 hours ago

burtenshaw/code-smol2-text-to-sql

Updated 3 days ago • 4

burtenshaw/Qwen2.5-3B-Instruct-GGUF

Updated 22 days ago • 5

burtenshaw/gemma-help-tiny-sft

Text Generation • Updated Aug 9 • 17 • 1

burtenshaw/Qwen1.5-0.5B-dpo-mix-7k

Text Generation • Updated Apr 3 • 5

burtenshaw/notus-merged-with-code-mistral-so-its-better-at-coding

Updated Apr 2 • 3

burtenshaw/Qwen1.5-0.5B-dpo-mix-7k-GGUF

burtenshaw/Qwen1.5-0.5B-dpo-mix-7k-5000

Text Generation • Updated Mar 29 • 14

burtenshaw/Qwen1.5-0.5B-dpo-mix-7k-3000

Text Generation • Updated Mar 29 • 13

burtenshaw/setfit_food_annotated

Text Classification • Updated Mar 2, 2023 • 7

Datasets 20

burtenshaw/dataset-diff-test-changed

Viewer • Updated 23 days ago • 3 • 28

burtenshaw/dataset-diff-test

Viewer • Updated 23 days ago • 3 • 25

burtenshaw/most_used_models

Viewer • Updated 29 days ago • 250 • 43 • 1

burtenshaw/exam_questions

Viewer • Updated 30 days ago • 7 • 34

burtenshaw/pc-components-reviews-vectors

Viewer • Updated Oct 17 • 200 • 44

burtenshaw/fosllms-week-1-demo

Viewer • Updated Oct 16 • 12 • 33

burtenshaw/yahoo_answers_topics

Viewer • Updated Oct 3 • 100 • 41

burtenshaw/image-search-queries

Viewer • Updated Sep 10 • 199 • 38

burtenshaw/document-similarity

Viewer • Updated Sep 5 • 20 • 35

burtenshaw/helpsteer-2-plus

Viewer • Updated Sep 2 • 8.88k • 44 • 2