Apply for community grant: Personal project (gpu)

#2
by cofeg - opened

This project converts modern Chinese sentences into classical Chinese and makes them more elegant. Many people want to try it out, but the demo's inference time is currently too long. It would be great if we could run it on a GPU to speed up generation and let more people enjoy this interesting project!

Hi @cofeg , we've assigned ZeroGPU to this Space. Please check the compatibility and usage sections of this page so your Space can run on ZeroGPU.
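
In short, the pattern described there is: load the model at startup, move it to CUDA, and decorate every function that needs the GPU with @spaces.GPU. A minimal sketch (the model name and function are placeholders, not your actual code):

import spaces
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"  # placeholder
# Loading and moving to CUDA at startup is fine: ZeroGPU defers the actual
# GPU work until a @spaces.GPU-decorated function runs.
model = AutoModelForCausalLM.from_pretrained(model_id).to("cuda")
tokenizer = AutoTokenizer.from_pretrained(model_id)

@spaces.GPU  # a GPU is attached only while this function executes
def generate(text):
    inputs = tokenizer(text, return_tensors="pt").to("cuda")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

demo = gr.Interface(fn=generate, inputs="text", outputs="text")
demo.launch()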

Hello! I saw the project is running on ZeroGPU now, but there seem to be some compatibility issues, and the conversion results look like this:

[Screenshot: convert.png, showing the incorrect conversion output]

The problematic version of app.py is as follows (it works well on my local GPU):

import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer
import re
import spaces

model_path = "cofeg/Finetuned-Xunzi-Qwen2-1.5B-for-ancient-text-generation"
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

@spaces.GPU
def generate_answer(prompt, tokenizer, model):
    model_inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
    generated_ids = model.generate(
        model_inputs.input_ids,
        max_new_tokens=128
    )
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response

def split_and_generate(modern_text, progress=gr.Progress()):
    progress(0, desc="开始处理")  # "开始处理" = "start processing"
    # Split the input into sentences, since the model was trained on sentence pairs
    sentences = re.findall(r'[^。!?]*[。!?]', modern_text)
    
    # If no sentences are found, treat the entire input as one sentence
    if not sentences:
        sentences = [modern_text]
    
    responses = ""
    for sentence in progress.tqdm(sentences, desc="生成中……"):  # "生成中" = "generating"
        # Prompt format: 现代文 ("modern Chinese") + sentence + 古文 ("classical Chinese")
        prompt = "现代文:" + sentence + " 古文:"
        response = generate_answer(prompt, tokenizer, model)
        responses += response
    return responses

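# Note for non-Chinese readers: the Gradio UI strings below are Chinese.
# 现代文 = "modern Chinese", 古文 = "classical Chinese". The description says:
# enter the modern Chinese text on the left, click "Submit", wait ten-odd seconds,
# and the converted classical Chinese appears on the right; keep sentences short,
# and split long text into several sentences (the model converts sentence by sentence).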
demo = gr.Interface(fn=split_and_generate,
                    inputs=[gr.Textbox(label="现代文", lines=10)],
                    outputs=[gr.Textbox(label="古文", lines=10)],
                    title="现代文转古文大模型",
                    description="请在左边对话框输入你要转换的现代文并点击“Submit”按钮,耐心等待十几秒钟,右边的对话框将显示转换后的古文。<br>一个句子不要太长,如果文本很长,可多分几个句子,模型会逐句转化。<br>详情请访问本项目[GitHub主页](https://github.com/JianXiao2021/ancient_text_generation_LLM)。"
)
demo.launch()

I tried some fixes but couldn't get it to work, and now I'm over my quota and can't test any further. Could you undo the assignment so I can revert my code and run the original version without ZeroGPU? Thank you.

OK, I've switched the hardware back to cpu-basic. BTW, the @spaces.GPU decorator does nothing in non-ZeroGPU environments, so you don't have to revert your code.

I tried some fixes but couldn't get it to work, and now I'm over my quota and can't test any further.
Sorry to hear that you exceeded the quota. The quota is halved every two hours, and you can switch the hardware yourself on the settings page. :)

Hi @cofeg , I added a requirements.txt and the ZeroGPU part is working now. You can find it in my Space: https://huggingface.co/spaces/xianbao/ancient_Chinese_text_generator_1.5B/blob/main/app.py
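
The gist of the file is something like this (an illustrative sketch, not the exact contents; check the Space above for the real file — my assumption is that accelerate is the key entry, since device_map="auto" requires it):

# requirements.txt (illustrative; see the Space linked above for the actual file)
transformers
accelerate    # required by device_map="auto"
torch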

Unfortunately the output is still not as desired; I think this is related to how you call the model via transformers. You can test that on your local machine, and the same code will work on ZeroGPU. :)
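
One guess (untested): generate() is currently called with only input_ids, so no attention mask is passed and the pad token is left implicit; transformers usually warns about this, and output quality can suffer. A sketch of the adjusted call inside generate_answer:

model_inputs = tokenizer([prompt], return_tensors="pt").to(model.device)
generated_ids = model.generate(
    **model_inputs,                       # forwards attention_mask along with input_ids
    max_new_tokens=128,
    pad_token_id=tokenizer.eos_token_id,  # make the pad token explicit
)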
