
Pretrained LM

Training Dataset

Prompt

  • Template:

      prompt = f"Translate this from {src_lang} to {tgt_lang}\n### {src_lang}: {src_text}\n### {tgt_lang}:"
    
      >>> # src_lang can be 'English', '한국어'
      >>> # tgt_lang can be '한국어', 'English'
    

    Note that there is no trailing space ("_") at the end of the prompt; a trailing space makes the model emit an unpredictable first token.
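
    For example, with src_lang='English', tgt_lang='한국어', and src_text="Hello" (an illustrative input, not from this card), the rendered prompt is:

      Translate this from English to 한국어
      ### English: Hello
      ### 한국어: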

  • Issue: The model's tokenizer tokenizes the multi-line prompt below differently from the single-line prompt above, so make sure to use the prompt format proposed above (a quick check is sketched after the example).

      >>> # DO NOT USE the prompt like this
      prompt = f"""Translate this from {src_lang} to {tgt_lang}
      ### {src_lang}: {src_text}
      ### {tgt_lang}:"""
    
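    If you want to verify the difference yourself, comparing token ids makes it visible (a minimal sketch, reusing the tokenizer and the src_lang/tgt_lang/src_text variables configured in the Usage section below):

      single_line = f"Translate this from {src_lang} to {tgt_lang}\n### {src_lang}: {src_text}\n### {tgt_lang}:"
      multi_line = f"""Translate this from {src_lang} to {tgt_lang}
      ### {src_lang}: {src_text}
      ### {tgt_lang}:"""

      # inside indented code, the triple-quoted literal keeps the source indentation
      # on its continuation lines, so the two strings (and their token ids) differ
      print(tokenizer(single_line)['input_ids'] == tokenizer(multi_line)['input_ids'])  # False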

Training

  • Trained with QLoRA
    • PLM: NormalFloat 4-bit
    • Adapter: BrainFloat 16-bit
    • LoRA applied to all the linear layers (around 2.2% of the total parameters)
  • Adapters merged into the PLM, and the merged model upscaled to BrainFloat 16-bit precision (see the configuration sketch below)
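
For reference, this setup roughly corresponds to the following transformers/peft configuration (a minimal sketch under assumptions: the base model name is a hypothetical placeholder, and the LoRA rank/alpha are not stated in this card):

      import torch
      from transformers import AutoModelForCausalLM, BitsAndBytesConfig
      from peft import LoraConfig, get_peft_model

      # PLM quantized to NormalFloat 4-bit; compute in BrainFloat 16-bit
      bnb_config = BitsAndBytesConfig(
          load_in_4bit=True,
          bnb_4bit_quant_type="nf4",
          bnb_4bit_compute_dtype=torch.bfloat16,
      )
      base = AutoModelForCausalLM.from_pretrained(
          "base-plm-name",  # hypothetical placeholder for the actual PLM
          quantization_config=bnb_config,
      )

      # adapters on all the linear layers of a Llama block
      lora_config = LoraConfig(
          r=16, lora_alpha=32,  # assumed hyperparameters, not the card's actual values
          target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                          "gate_proj", "up_proj", "down_proj"],
          task_type="CAUSAL_LM",
      )
      model = get_peft_model(base, lora_config)
      model.print_trainable_parameters()  # roughly ~2% of parameters trainable

After training, calling merge_and_unload() on the resulting PeftModel folds the adapters back into the base weights, which can then be saved in bfloat16.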

Usage (IMPORTANT)

  • You should remove the EOS token (<|endoftext|>, id=46332) if it appears at the end of the tokenized prompt; the snippet below does this before generation.
      import torch
      from transformers import LlamaForCausalLM, LlamaTokenizer

      max_length = 768

      # MODEL
      model_name = 'traintogpb/llama-2-enko-translator-7b-qlora-bf16-upscaled'
      model = LlamaForCausalLM.from_pretrained(
          model_name,
          max_length=max_length,
          torch_dtype=torch.bfloat16
      )

      # TOKENIZER
      # (the original snippet loaded it from an undefined `plm_name`;
      #  the merged model repo's tokenizer is assumed to be equivalent)
      tokenizer = LlamaTokenizer.from_pretrained(model_name)
      tokenizer.pad_token = "</s>"
      tokenizer.pad_token_id = 2
      tokenizer.eos_token = "<|endoftext|>"  # must be differentiated from the PAD token
      tokenizer.eos_token_id = 46332
      tokenizer.add_eos_token = False
      tokenizer.model_max_length = max_length

      # INFERENCE
      src_text = "NMIXX is the world-best female idol group, who came back with the new song 'DASH'."
      src_lang, tgt_lang = 'English', '한국어'
      prompt = f"Translate this from {src_lang} to {tgt_lang}\n### {src_lang}: {src_text}\n### {tgt_lang}:"

      inputs = tokenizer(prompt, return_tensors="pt", max_length=max_length, truncation=True)
      # REMOVE THE EOS TOKEN AT THE END OF THE PROMPT, IF PRESENT
      if inputs['input_ids'][0][-1] == tokenizer.eos_token_id:
          inputs['input_ids'] = inputs['input_ids'][0][:-1].unsqueeze(dim=0)
          inputs['attention_mask'] = inputs['attention_mask'][0][:-1].unsqueeze(dim=0)

      outputs = model.generate(**inputs, max_length=max_length, eos_token_id=tokenizer.eos_token_id)

      # decode only the newly generated tokens, skipping the prompt
      input_len = len(inputs['input_ids'].squeeze())
      translated_text = tokenizer.decode(outputs[0][input_len:], skip_special_tokens=True)
      print(translated_text)
    
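The reverse direction works the same way; only the prompt roles are swapped (the Korean sample sentence below is an illustrative assumption; everything else reuses the objects configured above):

      # KOREAN -> ENGLISH
      src_lang, tgt_lang = '한국어', 'English'
      src_text = "오늘 날씨가 참 좋네요."  # illustrative input ("The weather is really nice today.")
      prompt = f"Translate this from {src_lang} to {tgt_lang}\n### {src_lang}: {src_text}\n### {tgt_lang}:"
      # ...then tokenize, strip the trailing EOS token, and generate exactly as above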