Arthur Zucker

ArthurZ

AI & ML interests

None yet

Recent Activity

reacted to Xenova's post with 🔥 about 19 hours ago

reacted to davidberenstein1957's post with 👀 about 19 hours ago

reacted to LukeNeumann's post with 🤯 about 19 hours ago

Articles

Fixing Gradient Accumulation

Improving Hugging Face Training Efficiency Through Packing with Flash Attention

Fine-Tuning Gemma Models in Hugging Face

Code Llama: Llama 2 learns to code

Organizations

ArthurZ's activity

New activity in mistralai/Pixtral-Large-Instruct-2411 2 days ago

Upload transformers version

#3 opened 3 days ago by

New activity in huggingface/documentation-images 6 days ago

Upload Meta-Llama-3-8B-Instruct, seqlen = 512, python, w_ compile.png

#392 opened 6 days ago by

New activity in mistral-community/pixtral-12b about 1 month ago

Update model weight

#13 opened about 1 month ago by

Update hidden_act to silu

#14 opened about 1 month ago by

New activity in rhymes-ai/Aria about 1 month ago

llama.cpp support

#1 opened about 1 month ago by

New activity in google/gemma-2-2b-jpn-it about 2 months ago

tokenizer_config.json is different from gemma-2-2b-it

#8 opened about 2 months ago by

New activity in mistral-community/pixtral-12b about 2 months ago

How can i use the full 24GB model instead of this separated safetensors files?

#8 opened about 2 months ago by

New activity in meta-llama/Llama-3.2-11B-Vision-Instruct about 2 months ago

hidden_activation vs hidden_act in config.json

#10 opened 2 months ago by

New activity in mistral-community/pixtral-12b-240910 2 months ago

How to use safetensors?

#13 opened 2 months ago by

New activity in mistral-community/pixtral-12b 2 months ago

lamma cpp ht to gguf not working

#2 opened 2 months ago by

New activity in meta-llama/Llama-3.1-405B-Instruct-FP8 3 months ago

8-kv-heads

#14 opened 3 months ago by

New activity in meta-llama/Llama-3.1-405B-FP8 3 months ago

Update config.json

#17 opened 3 months ago by

Config KV Heads should be 8 now?

#16 opened 3 months ago by

New activity in meta-llama/Llama-3.1-405B-Instruct-FP8 3 months ago

8 kv heads

#13 opened 4 months ago by

New activity in meta-llama/Llama-3.1-405B-FP8 3 months ago

8-kv-heads

#15 opened 4 months ago by

New activity in meta-llama/Llama-3.1-405B 3 months ago

8-kv-heads

#21 opened 4 months ago by

New activity in meta-llama/Llama-3.1-405B-Instruct 3 months ago

8-kv-heads

#17 opened 4 months ago by

New activity in meta-llama/Llama-3.1-405B-FP8 4 months ago

Updated eos_token to include multiple IDs

#14 opened 4 months ago by

Update tokenizer to prepend special token

#12 opened 4 months ago by

New activity in meta-llama/Llama-3.1-70B 4 months ago

Update tokenizer to prepend special token

#11 opened 4 months ago by