Issue with using the codellama-7b model
#13 · opened by RyanAX
I have set up the codellama-7b model locally and used the official example, but the final result does not meet expectations. Here is the code:
```python
import time

import torch
from transformers import CodeLlamaTokenizer, LlamaForCausalLM

codeLlama_tokenizer = CodeLlamaTokenizer.from_pretrained("./CodeLlama-7b-hf", padding_side='left')
codeLlama_model = LlamaForCausalLM.from_pretrained("./CodeLlama-7b-hf")
codeLlama_model.to(device='cuda:0', dtype=torch.bfloat16)

text = '''def remove_non_ascii(s: str) -> str:
    """ <FILL_ME>
    return result
'''

start_time = time.time()
input_ids = codeLlama_tokenizer(text, return_tensors="pt")["input_ids"]
input_ids = input_ids.to('cuda')
generated_ids = codeLlama_model.generate(input_ids, max_new_tokens=200, do_sample=True, top_p=0.9, temperature=0.1, num_return_sequences=1, repetition_penalty=1.05, eos_token_id=codeLlama_tokenizer.eos_token_id, pad_token_id=codeLlama_tokenizer.pad_token_id)
filling = codeLlama_tokenizer.batch_decode(generated_ids[:, input_ids.shape[1]:], skip_special_tokens=True)[0]
print(filling)
```
The output of the code is:
```
Remove non-ascii characters from a string. """
    result = ""
    for c in s:
        if ord(c) < 128:
            result += c
}
public void setId(String id) {
    this.id = id;
}
public String getName() {
    return name;
}
public void setName(String name) {
    this.name = name;
}
public String getDescription() {
    return description;
}
public void setDescription(String description) {
    this.description = description;
}
public String getType() {
    return type;
}
```
There are two issues with the generated code that don't meet expectations:
1. It doesn't take the suffix into account; everything after `<FILL_ME>` seems to be ignored.
2. After completing the desired part of the code, it appends a lot of unnecessary, unrelated code.
Is this behavior normal? Is there any way to improve it?
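
For reference, one sanity check I plan to run is printing the encoded prompt, to confirm that `<FILL_ME>` is actually expanded into an infill prompt that includes the suffix. This is only a sketch; the `▁<PRE>` / `▁<SUF>` / `▁<MID>` token names assume the default special tokens of transformers' `CodeLlamaTokenizer`:

```python
# Inspect how the tokenizer encodes the <FILL_ME> prompt. For an
# infill-capable checkpoint the sequence should contain the control tokens
# ▁<PRE>, ▁<SUF>, and ▁<MID> (names assume the default CodeLlamaTokenizer
# configuration), with the suffix ("return result") encoded between ▁<SUF>
# and ▁<MID>. If these tokens are missing, the suffix never reaches the
# model, which would explain issue 1.
tokens = codeLlama_tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
print(tokens)
```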
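
As a possible mitigation for issue 2, I am also considering stopping generation at the end-of-infill marker instead of the regular EOS token. Again just a sketch, assuming the tokenizer's default `▁<EOT>` special token:

```python
# Stop generation at the infill end marker ▁<EOT> rather than </s>, so the
# model does not keep sampling unrelated code after finishing the middle
# part. (Token name assumes the default CodeLlamaTokenizer configuration.)
eot_id = codeLlama_tokenizer.convert_tokens_to_ids("▁<EOT>")
generated_ids = codeLlama_model.generate(
    input_ids,
    max_new_tokens=200,
    do_sample=True,
    top_p=0.9,
    temperature=0.1,
    eos_token_id=eot_id,
    pad_token_id=eot_id,
)
filling = codeLlama_tokenizer.batch_decode(
    generated_ids[:, input_ids.shape[1]:], skip_special_tokens=True
)[0]
print(filling)
```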