README.md · webbigdata/C3TR-Adapter_gptq at 25c55178a7daeca54dfa776b99e0ad0197d6b681

metadata

library_name: gptq
base_model: google/gemma-7b
language:
  - ja
  - en
tags:
  - translation
  - gptq
  - gemma
  - text-generation-inference
  - nlp

Model card

This is the GPTQ-quantized version of C3TR-Adapter, a model for English-to-Japanese and Japanese-to-English translation.

Install

See the official AutoGPTQ page for installation instructions.

In my environment, it only worked after installing AutoGPTQ from source:

git clone https://github.com/PanQiWei/AutoGPTQ.git && cd AutoGPTQ
pip install -vvv --no-build-isolation -e .
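
After installing, it can help to confirm that the packages are importable before loading the model. The following is a minimal sketch, not part of the original instructions: `missing_packages` is a hypothetical helper, and the module names `torch`, `transformers`, and `auto_gptq` are assumed from the sample code and the AutoGPTQ project.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed top-level module names; adjust if your setup differs.
required = ["torch", "transformers", "auto_gptq"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

This only checks that the modules can be located; it does not verify that the CUDA kernels were built correctly.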

Sample code

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
model_name = "webbigdata/C3TR-Adapter_gptq"

# Workaround from tk-master: disable the ExLlama kernels when loading. See
# https://github.com/AutoGPTQ/AutoGPTQ/issues/406
config = AutoConfig.from_pretrained(model_name)
config.quantization_config["use_exllama"] = False
config.quantization_config["exllama_config"] = {"version":2}

# Cap the memory used per device; adjust to your hardware. Key 0 is the first GPU.
max_memory = {0: "12GiB", "cpu": "10GiB"}

quantized_model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,  # use torch.float16 on GPUs without bfloat16 support (e.g. free Colab)
        device_map="auto",
        max_memory=max_memory,
        config=config)
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.unk_token

prompt_text = """You are a highly skilled professional Japanese-English and English-Japanese translator. Translate the given text accurately, taking into account the context and specific instructions provided. Steps may include hints enclosed in square brackets [] with the key and value separated by a colon:. Only when the subject is specified in the Japanese sentence, the subject will be added when translating into English. If no additional instructions or context are provided, use your expertise to consider what the most appropriate context is and provide a natural translation that aligns with that context. When translating, strive to faithfully reflect the meaning and tone of the original text, pay attention to cultural nuances and differences in language usage, and ensure that the translation is grammatically correct and easy to read. After completing the translation, review it once more to check for errors or unnatural expressions. For technical terms and proper nouns, either leave them in the original language or use appropriate translations as necessary. Take a deep breath, calm down, and start translating.

### Instruction:
Translate English to Japanese.
When translating, please use the following hints:
[writing_style: web-fiction]
[Madoka: まどか]
[Madoka_first_person_and_ending: だね, よね]
[Mami: マミ]
[Mami_first_person_and_ending: 私, わね]
[Sayaka: さやか]
[Sayaka_first_person_and_ending: 私, かな]
[Kyubey: キュゥべぇ]
[Kyubey_first_person_and_ending: 僕, てよ]

### Input:
Madoka: "Thank you all for watching! You might've seen a bit of my dark side, but... don't mind that, okay?"
Sayaka: "Well, thanks! Did my cuteness come across 100%?"
Mami: "I'm glad you watched, but it's a bit embarrassing..."
Kyubey: "Make a contract with me, and become a magical girl."
### Response:
"""

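# Tokenize the prompt and move the input IDs to the first GPU.
tokens = tokenizer(prompt_text, return_tensors="pt",
        padding=True, max_length=1600, truncation=True).to("cuda:0").input_ids

# Beam search combined with sampling.
output = quantized_model.generate(
        input_ids=tokens,
        max_new_tokens=800,
        do_sample=True,
        num_beams=3, temperature=0.5, top_p=0.3,
        repetition_penalty=1.0)
# The decoded string contains the prompt followed by the generated translation.
print(tokenizer.decode(output[0]))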
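
The prompt above follows a fixed layout: a system paragraph, an `### Instruction:` block with optional bracketed hints, an `### Input:` block, and a trailing `### Response:` marker. The sketch below shows one way to assemble that layout and to pull the translation out of the decoded output. `build_prompt` and `extract_translation` are hypothetical helpers, not part of the model's API, and passing a direction other than "English to Japanese" assumes the model accepts the same instruction wording for the reverse direction.

```python
def build_prompt(system_prompt, direction, text, hints=None):
    """Assemble a prompt in the Instruction/Input/Response layout of the sample."""
    lines = [system_prompt, "", "### Instruction:", f"Translate {direction}."]
    if hints:
        lines.append("When translating, please use the following hints:")
        lines.extend(f"[{key}: {value}]" for key, value in hints.items())
    # The trailing empty string makes the prompt end with a newline after the marker.
    lines += ["", "### Input:", text, "### Response:", ""]
    return "\n".join(lines)


def extract_translation(decoded, marker="### Response:"):
    """Return the text after the first response marker, or the whole text if absent."""
    _, found, tail = decoded.partition(marker)
    return tail.strip() if found else decoded.strip()


# Example with a shortened, made-up system prompt and hint set.
prompt = build_prompt(
    "You are a professional translator.",
    "English to Japanese",
    'Mami: "I\'m glad you watched, but it\'s a bit embarrassing..."',
    hints={"writing_style": "web-fiction", "Mami": "マミ"},
)
print(prompt)
```

With the sample code, `extract_translation(tokenizer.decode(output[0], skip_special_tokens=True))` would return only the generated part, provided the input text itself does not contain the `### Response:` marker.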
