Which dataset?
Just wondering, does this use my dataset?
https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered
Yeah, I used your dataset to train the model
Oh, awesome. Could you share the method or notebook you used to train? I tried training the 13B model a few days ago, but I ran into errors I couldn't fix. Here is what I saw: https://github.com/lm-sys/FastChat/issues/263
I used the following docker image: ghcr.io/coreweave/ml-containers/torch-nccl:7ed4925
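For reference, one way to launch that image (a minimal sketch; it assumes the NVIDIA Container Toolkit is installed, and you'll probably want to add volume mounts for your setup):
# start an interactive shell in the training container with all GPUs visible
docker run --gpus all -it --ipc=host \
  ghcr.io/coreweave/ml-containers/torch-nccl:7ed4925 bash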
And here are some notes I wrote while running:
git clone https://github.com/huggingface/transformers.git && cd transformers && git checkout cae78c46 && pip install .
pip3 install --upgrade pip && \
git clone "https://github.com/lm-sys/FastChat" && cd FastChat && pip install -e . && \
pip3 install einops && \
mkdir checkpoints && \
pip3 install flash-attn
wget https://raw.githubusercontent.com/oobabooga/text-generation-webui/main/download-model.py && \
mkdir models && \
wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_unfiltered_cleaned_split.json && \
python download-model.py decapoda-research/llama-7b-hf
torchrun --nnodes=1 --nproc_per_node=4 --master_port=21001 \
FastChat/fastchat/train/train.py \
--model_name_or_path models/decapoda-research_llama-7b-hf \
--data_path ShareGPT_unfiltered_cleaned_split.json \
--bf16 True \
--output_dir ./checkpoints \
--num_train_epochs 3 \
--per_device_train_batch_size 4 \
--per_device_eval_batch_size 4 \
--gradient_accumulation_steps 8 \
--evaluation_strategy "no" \
--save_strategy "steps" \
--save_steps 1200 \
--save_total_limit 100 \
--learning_rate 2e-5 \
--weight_decay 0. \
--warmup_ratio 0.03 \
--lr_scheduler_type "cosine" \
--logging_steps 1 \
--fsdp "full_shard auto_wrap" \
--fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
--tf32 True \
--model_max_length 2048 \
--gradient_checkpointing True \
--lazy_preprocess True
This one worked for me just fine.
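For reference, with these flags the effective global batch size is 4 per device × 4 GPUs × 8 accumulation steps = 128. Once training finishes, a quick smoke test is FastChat's interactive CLI (a sketch; it assumes the final weights end up in the --output_dir above):
# chat with the freshly trained model from the terminal
python3 -m fastchat.serve.cli --model-path ./checkpoints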
Here is the run timestamp from wandb: April 3rd, 2023 at 12:18:29 PM.
So you probably want the FastChat code as of that date.
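If it helps, one way to check out FastChat as it was on that date (a sketch; it assumes the repo's default branch is main, and you may need to adjust the cutoff for your timezone):
cd FastChat
# find the last commit before the wandb run time and check it out
git checkout $(git rev-list -n 1 --before="2023-04-03 12:18" main)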
Awesome, thanks a lot!
Thank you for this project! Oddly, using this model I'm still getting all kinds of responses starting with:
- "As a language model, I strongly advise against..."
- "As a language model, I must remind you..."
- "As a language model, I do not condone or support..."
- "I'm sorry, but I cannot continue..."
It's strange because I thought all of that was removed from the unfiltered dataset.
Yup, all instances of the assistant saying these things were removed... however, it seems the model is somehow picking up that behavior from the user side of the conversations.
Is it possible those guardrails are intrinsic to the Llama model in some way?
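For what it's worth, a quick way to check where those phrases still occur in the dataset (a rough sketch; it assumes jq is installed and the usual ShareGPT layout of entries with a "conversations" list of {"from": "human"/"gpt", "value": ...} turns):
# human turns that still contain typical refusal phrasing
jq '[.[] | .conversations[] | select(.from == "human") | select(.value | test("As a language model|I cannot continue"))] | length' ShareGPT_unfiltered_cleaned_split.json
# assistant ("gpt") turns with the same phrasing (should be near zero after filtering)
jq '[.[] | .conversations[] | select(.from == "gpt") | select(.value | test("As a language model|I cannot continue"))] | length' ShareGPT_unfiltered_cleaned_split.json
If the human-side count is high, that would support the theory that the model is imitating the users rather than the removed assistant replies.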
Hi, I can't find the docker image ghcr.io/coreweave/ml-containers/torch-nccl:7ed4925
@cultuu
Hey, you are right. This image does not exist anymore. Use the following image instead: ghcr.io/coreweave/ml-containers/torch:ceeb8c2-nccl-cuda12.0.1-nccl2.18.1-1-torch2.0.1-vision0.15.2-audio2.0.2
Thanks! And the link https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_unfiltered_cleaned_split.json has changed too. Maybe it moved to https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/blob/main/ShareGPT_V3_unfiltered_cleaned_split.json?
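If that's the right file, note that wget needs the /resolve/ URL rather than the /blob/ page, so the download step above would become something like:
# fetch the renamed V3 dataset file directly
wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json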