research title suggestion
#40 opened about 1 month ago by AbdulsamadW
Can togethercomputer/LLaMA-2-7B-32K be used for zero-shot classification?
#39 opened 6 months ago by Gonzalomoreno01
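For questions like #39: the base model has no classification head, so zero-shot classification is usually done by prompting and reading back the generated label. A minimal sketch, assuming the `transformers` library; the labels and prompt template are illustrative, not from the model card:

```python
# Sketch: zero-shot classification by prompting the base model.
# The labels and prompt wording below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/LLaMA-2-7B-32K"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

text = "The screen arrived cracked and support never replied."
prompt = (
    "Classify the following text as positive, negative, or neutral.\n"
    f"Text: {text}\nLabel:"
)
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=3)
# Decode only the newly generated tokens (the predicted label).
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```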
Adding `safetensors` variant of this model
#38 opened 7 months ago by SFconvertbot
Adding `safetensors` variant of this model
#36 opened 10 months ago by SFconvertbot
Adding Evaluation Results
#34 opened 12 months ago by leaderboard-pr-bot
How to use on GPU
#33 opened about 1 year ago by p2991459
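For #33, a minimal sketch of loading the model onto a GPU, following the model card's `trust_remote_code`/fp16 pattern; `device_map="auto"` additionally assumes `accelerate` is installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/LLaMA-2-7B-32K"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# device_map="auto" (requires `accelerate`) places weights on available GPUs;
# fp16 halves memory versus fp32.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
inputs = tokenizer("Hello, world!", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```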
Could you please share more details about fine-tuning LLaMA-2-7B into LLaMA-2-7B-32K, such as the fine-tuning steps and batch size? Thanks!
#32 opened about 1 year ago by Mooler
Adding `safetensors` variant of this model
#30 opened about 1 year ago by efy9002
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
#29 opened about 1 year ago by shubhamagarwal92 · 4 comments
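#29's error is the classic symptom of running fp16 weights on CPU, where PyTorch's half-precision matmul kernels aren't implemented. A sketch of the usual workaround, move the model to a GPU or fall back to fp32 on CPU:

```python
import torch
from transformers import AutoModelForCausalLM

model_id = "togethercomputer/LLaMA-2-7B-32K"
if torch.cuda.is_available():
    # fp16 matmuls are supported on GPU.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, trust_remote_code=True
    ).to("cuda")
else:
    # On CPU, use float32 -- half-precision matmuls raise the addmm error.
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float32, trust_remote_code=True
    )
```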
Using the Accelerate API to train models on multiple GPUs
#28 opened about 1 year ago by ajash · 8 comments
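For #28, a minimal sketch of the Accelerate training-loop pattern; the toy model and data stand in for the actual fine-tuning setup, and the script would be started with `accelerate launch`:

```python
# Sketch of the multi-GPU training pattern behind #28.
# The linear model and random data are placeholders for the LLM setup.
import torch
from accelerate import Accelerator
from torch.utils.data import DataLoader, TensorDataset

accelerator = Accelerator()

model = torch.nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(256, 16), torch.randn(256, 1))
loader = DataLoader(dataset, batch_size=32)

# prepare() moves everything to the right devices and wraps the model for DDP.
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for x, y in loader:
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    accelerator.backward(loss)  # replaces loss.backward()
    optimizer.step()
```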
Keep getting an error while loading `tokenizer = AutoTokenizer.from_pretrained("togethercomputer/LLaMA-2-7B-32K")`
#27 opened about 1 year ago by AIHero123 · 5 comments
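The root cause in #27 isn't stated, so the following is only a sketch of common workarounds for tokenizer load failures with LLaMA-family repos (a missing `sentencepiece` dependency, or a fast-tokenizer conversion problem):

```python
# Sketch of common workarounds; the actual cause of #27 is unknown.
# 1) Ensure the SentencePiece backend is installed: pip install sentencepiece
# 2) If the fast tokenizer fails to convert, try the slow one:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "togethercomputer/LLaMA-2-7B-32K", use_fast=False
)
```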
[AUTOMATED] Model Memory Requirements
#26 opened about 1 year ago by model-sizer-bot
Installing `!pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/rotary` but flash_llama still errors out
#25 opened about 1 year ago by ajash · 4 comments
Adding `safetensors` variant of this model
#24 opened about 1 year ago by brigs
Quantizations for llama.cpp
#23 opened about 1 year ago by rozek · 4 comments
Endpoint configuration on AWS SageMaker
#21 opened about 1 year ago by NABARKA · 1 comment
Adding `safetensors` variant of this model
#20 opened about 1 year ago by efy9002
protofile.proto: A file with this name is already in the pool
#19 opened about 1 year ago by surya-narayanan · 1 comment
Is LLaMA-2-7B-32K already fine-tuned for answering questions from long text?
#18 opened about 1 year ago by MathewOpt · 1 comment
Fix RuntimeError: pad attn scores back to original query sequence length, instead of unpadded sequence length (i.e. no change).
#17 opened about 1 year ago by Birchlabs · 1 comment
How can specific information be eliminated in an LLM?
#16 opened about 1 year ago by kiopuy
`!pip install flash-attn --no-build-isolation`
#15 opened over 1 year ago by NivYO · 3 comments
Instead of `flash_attn` it should be `flash_attn_2_cuda`. This is causing a deployment issue in TGI/DJL
#14 opened over 1 year ago by monuminu · 1 comment
RoPE scaling and max_position_embeddings
#12 opened over 1 year ago by ag0 · 2 comments
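Context for #12: this model extends LLaMA-2's 4,096-token context to 32,768 via position interpolation, which corresponds to a linear RoPE scaling factor of 32768 / 4096 = 8. A sketch for inspecting the relevant config fields; the exact values should be verified against the repo's config.json:

```python
# Sketch: inspect RoPE scaling via the standard HF config fields.
# Expected values are hedged guesses -- confirm against config.json.
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "togethercomputer/LLaMA-2-7B-32K", trust_remote_code=True
)
print(config.max_position_embeddings)           # expected: 32768
print(getattr(config, "rope_scaling", None))    # e.g. {"type": "linear", "factor": 8.0}
```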
Getting strange tokens after fine-tuning with QLoRA
#11 opened over 1 year ago by monuminu · 2 comments
Training diverges when used with Llama 2 70B and 4-bit QLoRA
#10 opened over 1 year ago by alyssavance · 3 comments
Do you have plans to make a chat model (LLaMA-2-7B-32K-Chat)? If so, any idea when it would come out?
#9 opened over 1 year ago by barpy · 3 comments
How to train a LLaMA-2-7B-32K from LLaMA-2-7B?
#8 opened over 1 year ago by Sayoyo · 7 comments
Upload introduction_cn.ipynb
#7 opened over 1 year ago by DORA1222
Problem with generating anything
#6 opened over 1 year ago by wempoo · 3 comments
Plans for a 13B version?
#5 opened over 1 year ago by rombodawg · 3 comments
GGML Version
#4 opened over 1 year ago by s3nh · 8 comments
Model on your API Playground
#3 opened over 1 year ago by 1littlecoder · 7 comments
How to fine-tune with PEFT QLoRA and SFTTrainer?
#2 opened over 1 year ago by NickyNicky · 12 comments
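For #2, a hedged sketch of the common QLoRA recipe combining `BitsAndBytesConfig`, a PEFT `LoraConfig`, and TRL's `SFTTrainer`; the dataset, hyperparameters, and target modules are illustrative assumptions, and the keyword arguments match older `trl` releases (newer ones move them into `SFTConfig`):

```python
# Sketch: QLoRA fine-tuning with PEFT + TRL's SFTTrainer.
# Dataset, hyperparameters, and target modules are illustrative assumptions.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTTrainer

model_id = "togethercomputer/LLaMA-2-7B-32K"
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb, device_map="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

peft_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=load_dataset("imdb", split="train[:1%]"),  # placeholder dataset
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=1024,
    tokenizer=tokenizer,
)
trainer.train()
```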
What is the VRAM requirement of this model?
#1 opened over 1 year ago by Said2k · 5 comments
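On #1, a back-of-envelope answer: weights alone take roughly parameter count × bytes per parameter, and the KV cache grows with context length on top of that, which matters at the full 32K. A quick calculation:

```python
# Back-of-envelope VRAM for weights alone (excludes KV cache and activations).
params = 7e9
for name, bytes_per_param in [("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name}: ~{params * bytes_per_param / 2**30:.1f} GiB")
```

That gives roughly 13 GiB in fp16 for the weights alone; leave headroom for the KV cache and activations, especially at long context lengths.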