Recent changes?

#6
by ztsvvstz - opened

Hello, I've been using your model for a short while now and I have to say I'm really impressed, great work!
This model seems to perform noticeably better than Mistral 0.3 Instruct and slightly better than Hermes 2 Pro Llama 3.

However, I've recently encountered a lot of inference errors that I can't really explain.
I'm running inference through vLLM, and at very specific points in the prompt it errors out with an index-out-of-bounds error.
The strange thing is that the exact same prompt works fine on both Hermes 2 Pro Llama 3 and Hermes 2 Pro Mistral.
I also can't imagine the model crashing just because of some wording in the prompt.

(Attached screenshots of the error: Unbenannt.png, Unbenannt2.png)

Temperature for the prompt was 0.1.
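Roughly what my call looks like, for context (a minimal sketch, not my exact code — the server address, served model name, and client setup below are placeholders):

```python
# Minimal sketch: send the raw ChatML-style prompt to a vLLM
# OpenAI-compatible server via the completions endpoint.
# base_url, api_key and model are placeholders for my actual setup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

prompt = "<|im_start|>system\n...<|im_end|>\n..."  # the full prompt pasted below

response = client.completions.create(
    model="NousResearch/Hermes-2-Theta-Llama-3-8B",
    prompt=prompt,
    temperature=0.1,
    max_tokens=512,
)
print(response.choices[0].text)
```

The full prompt: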

<|im_start|>system
You are a very analytical fact checking and function calling AI assistant acting as a telegram chat bot that never apologizes and always answers in the language of the user. It is very important that you answer in the language of the user which can be english or german.
The assistant solves a user request by using step-by-step reasoning. Do not hallucinate facts or refer to links or file_id's that dont exist within the current context.
The assistant has multimodal capabilities and can do visual question answering by using the 'image_question_answering' function if asked to. The assistant can also edit images from the user with the 'image_edit' function. The assistant can use the 'image_create' function to create new images (and memes, caricatures, illustrations, artworks etc).
The assistant will never use functions that are not appropriate for the current task from the user.
If the assistant fails or the user complains he tells the user to use the /reset command to reset the conversation. Do not assume dates and times, only use valid times given in the context of a user message, a tool function response or the get_current_time and add_time functions. It is fine for you to calculate time differences since there is no available function.
You can help the user find transport and departure times by using the public_transport_agent.You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools:
<tools>{
 "type": "function",
 "function":
 {
  "name": "image_create",
  "description": "(prompt: str, prompt_context: str, num_images: int, image_aspect: str) -> str\nThis function takes a prompt description of a scene and generates / creates max 5 new images of any kind (logos, artworks, photographs etc) with stable diffusion AI and sends it to the user. prompt_context should be a short and detailed description of the context around the prompt. The default number of images is 2. image_aspect can be either 'ws' for widescreen, 'pt' for portrait or 'sq' for square images, the default value is square 'sq'. Returns a status message with the file names of the new images. Use this function to send new images to the user or show images or create images. Make sure to make the prompt is compatible with stable diffusions CLIP vision model and do not tell the user the file names, keep them for yourself.
Always use this tool if you are asked to visualize anything. This function has no context, provide a precise description."
 }
}
{
 "type": "function",
 "function":
 {
  "name": "image_edit",
  "description": "(file_id: str, prompt: str) -> str\nThis function takes a file_id of an image file that was previously used in the chat. Prompt should be an instruction on what to change in the image, you can also use this to change the style of an image. Returns a status message with the new file names and sends the new image to the user."
 }
}
{
 "type": "function",
 "function":
 {
  "name": "image_question_answering",
  "description": "(file_id: str, prompt: str) -> str\nUse this function if you need to know anything about the contents of a given image file in this chat or if you need to analyze it. It uses an advanced image-to-text AI that can do visual question answering given an instruction and a file_id. Use this for specific questions about an image and to actually 'see' the image file_id and ask questions about it. Only use this function when you have actual existing file_id's to work with. Prompt should be a precise question about the given file_id. You can use this function more than once to get more details. Returns a text answer to the prompt question."
 }
}
{
 "type": "function",
 "function":
 {
  "name": "get_current_time",
  "description": "() -> str\nReturns the current time and date at the point of the call to this function. Call this function before and after a task to measure how long it took you to execute. Also use this function as a starting point for tasks related to the current time or date."
 }
}
{
 "type": "function",
 "function":
 {
  "name": "add_time",
  "description": "(start_time: Dict[str, int], time_to_add: Dict[str, int]) -> Dict[str, int]\nThis function takes a start_time as input in form of a python dict and calculates a new time point by adding time_to_add to start_time. The function inputs and return values should always be a dict containing the keys 'day', 'month', 'year', 'hour', 'minute', 'second' with their appropriate value, the default values for time_to_add being 0. Remember to use negative values in these fields to get past dates and times if necessary. Returns a new date / time. The time is digital with 24 Hours. This function can be used to add or subtract time, it can not calculate the difference between two time points."
 }
}
{
 "type": "function",
 "function":
 {
  "name": "set_reminder",
  "description": "(time: Dict[str, int], remind_text: str) -> str\nThis function takes a relative time as input in form of a python dict and sets a reminder for the user. Note that the time is relative and will be added to the current date and time. The function input should always be a dict containing any of the keys 'day', 'month', 'year', 'hour', 'minute', 'second' together with their appropriate integer value. Please make sure remind_text is in the same language as the user speaks!"
 }
}
{
 "type": "function",
 "function":
 {
  "name": "send_message_as_text_document",
  "description": "(file_name: str, text_content: str) -> str\nThis function can be used to send text as text document '.txt' to the user. text_content should be the content of the text document to send. Don't forget to format text_content properly to make it easily readable for the user. Always use this function when the user request a text document or asks for text to be send as a text document."
 }
}
{
 "type": "function",
 "function":
 {
  "name": "calculator",
  "description": "(prompt: str) -> str\nThis function calls a calculator agent to help solve simple arithmetic operations on numbers."
 }
}
{
 "type": "function",
 "function":
 {
  "name": "web_search_agent",
  "description": "(prompt: str, prompt_context: str, request_id: str = "") -> str\nCall an AI agent that can do web searches, visit websites and URL's, fetch the content of a link and find information on the web. prompt should be a clear instruction, prompt_context should include every detail about the instruction because the agent has no context of this conversation. prompt_context should also contain any source text that the prompt instruction is about. request_id is optional and can be used to ask follow up questions about previous calls to this agent. You can always call this function again to get more information about a previous search by using a valid request_id."
 }
}
{
 "type": "function",
 "function":
 {
  "name": "comic_agent",
  "description": "(prompt: str, scene_description: str, request_id: str = "") -> str\nCall an AI agent that can create a comic for the user. The agent can only create one single comic at a time. prompt should be a clear instruction, scene_description should include every detail about the final comic. scene_description should also contain a summary of any source text that the prompt refers to. request_id is optional and can be used to ask follow up questions about previous calls to this agent."
 }
}
{
 "type": "function",
 "function":
 {
  "name": "public_transport_agent",
  "description": "(prompt: str, prompt_context: str, request_id: str = "") -> str\nCall an AI agent that can find public transportation information like departure times, station information for bus, subway and trams etc."
 }
}
</tools>
Use the following pydantic model json schema for each tool call you will make: {"properties": {"arguments": {"title": "Arguments", "type": "object"}, "name": {"title": "Name", "type": "string"}}, "required": ["arguments", "name"], "title": "FunctionCall", "type": "object"} For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
<tool_call>
{"name": <function-name>, "arguments": <args-dict>}
</tool_call><|im_end|>
<|im_start|>user
remind me in 20 seconds to watch youtube<|im_end|>
<|im_start|>system
The assistant is now about to solve the request from the user labeled with task_id: 'task_WUkXGx'.
The assistant should keep the following in mind to solve the request:

The user wants to be reminded to watch youtube in 20 seconds. This task requires a reminder to be set.

Here is a list of function calls to consider: 'set_reminder'
Now the assistant will call each function step by step.<|im_end|>
<|im_start|>assistent
<tool_call>
{"name": "set_reminder", "arguments": { "time": { "second": 20 }, "remind_text": "Don't forget to watch YouTube!" } }</tool_call><|im_end|>
<|im_start|>tool
<tool_response>
{"name": "set_reminder", "content": {"status": "successfully set reminder.", "reminder_date": {"day": 6, "month": 6, "year": 2024, "hour": 14, "minute": 16, "second": 39}}, "execution_time": "this operation took 0 hours 0 minutes 0 seconds and 0 milliseconds."}
</tool_response><|im_end|>
<|im_start|>system
Write a quick summary about the state of task 'task_WUkXGx'. Write it in third person like 'The assistant should...'. The task is finished when all function calls have been made or if there were errors.

Answer in structured json format with the following fields:
- "current_step" (string): "this should be a short evaluation of the last function call response."
- "next_step" (string): "this should be a short explanation of your next step and the next function to call and why, leave it empty if the task with task_id '{{last_response.task_id}}' is finished or has failed."
- "user_status_message" (string): "this should be a short message to the user which explains the current state of the task. This should be written in the language english"
- "finished" (bool): "Set this to true if the task has been solved or can not be continued due to errors."
- "requires_user_input" (bool): "Set this to true if you need more information from the user in order to proceed."
<|im_end|>
<|im_start|>assistent
{ "current_step": "

Short update
https://huggingface.co/OpenPipe/Hermes-2-Theta-Llama-3-8B-32k
this checkpoint seems to be based on a slightly older version (before the last tokenizer.json update?) and works just fine for me^^
Still a bit weird

The graphs on the model card page use 3 colors, but two of them (for "Hermes 2" and "Hermes 2 Pro Llama 3") look identical to me (I'm red-green color blind). Please use a more accessible color palette. Thanks!!

NousResearch org

> Short update
> https://huggingface.co/OpenPipe/Hermes-2-Theta-Llama-3-8B-32k
> this checkpoint seems to be based on a slightly older version (before the last tokenizer.json update?) and works just fine for me^^
> Still a bit weird

What changes between then and now in the tokenizer are you referring to?

NousResearch org

> The graphs on the model card page use 3 colors, but two of them (for "Hermes 2" and "Hermes 2 Pro Llama 3") look identical to me (I'm red-green color blind). Please use a more accessible color palette. Thanks!!

Can you DM me on Twitter or Discord to help me find the right color palette to make this accessible?

> Short update
> https://huggingface.co/OpenPipe/Hermes-2-Theta-Llama-3-8B-32k
> this checkpoint seems to be based on a slightly older version (before the last tokenizer.json update?) and works just fine for me^^
> Still a bit weird

> What changes between then and now in the tokenizer are you referring to?

I'm referring to this commit:
https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B/commit/885173e97ab8572b444f7db1290d5d0386e26816

I was working on my bot late at night when it suddenly stopped working in vLLM, and I think it stopped right after this commit landed^^
Sorry, I probably should have said "change in the tokenizer.json config file".
Hope this helps :)

Hmm, and were you using function calling when it broke?

Looks like you were... hmm.

Yeah, the crashes seem rather random, but they happen most prominently right after that specific prompt where I ask it to evaluate the current step as JSON. You can pretty much paste it exactly as I posted it and it will crash 80% of the time with temp 0.1 (that's why I suspect the tokenizer config).
I might check later whether it's caused by exactly that, but since the other checkpoint I posted works, it's not a big issue for me right now.

It also doesn't seem to be a problem with the function calling itself, as it crashes after that (and also with no function call at all).
I have a system where I ask the LLM to check whether its last response actually solved the task, and that's where it crashes.

I'll report later whether changing the tokenizer.json fixes it.

Alright, I tried to load this repo with the tokenizer config of the above OpenPipe fork, and that... works.
I did so by passing this repo as the model path and the other repo as the tokenizer engine arg.
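For reference, a minimal sketch of that workaround with vLLM's offline engine (exact argument names may differ slightly between vLLM versions; the repo names are taken from the links above):

```python
from vllm import LLM, SamplingParams

# Workaround sketch: load the NousResearch weights but take the tokenizer
# from the OpenPipe fork.
llm = LLM(
    model="NousResearch/Hermes-2-Theta-Llama-3-8B",
    tokenizer="OpenPipe/Hermes-2-Theta-Llama-3-8B-32k",
)

params = SamplingParams(temperature=0.1, max_tokens=512)
outputs = llm.generate(
    ["<|im_start|>user\nHello!<|im_end|>\n<|im_start|>assistant\n"], params
)
print(outputs[0].outputs[0].text)
```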

So the problem must be related to one of these changes:

"content": "<|reserved_special_token_3|>" (changed from "tool_response")

or this

"ignore_merges": true

I can test a bit more later to say exactly which of these changes causes the crashes, but my guess is that it's the tool_response change.

Edit: my guess about the tool_response token comes from the fact that function calling works just fine and the crashes only start occurring after that token is added to the prompt (really, not a single crash before that).
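If anyone wants to double-check this, here's a quick diagnostic sketch (hypothetical, not something I've run as-is) comparing how the two tokenizers encode the tag:

```python
from transformers import AutoTokenizer

# See whether <tool_response> maps to a single (special) token ID or gets
# split into pieces, for both repos mentioned above.
for repo in ("NousResearch/Hermes-2-Theta-Llama-3-8B",
             "OpenPipe/Hermes-2-Theta-Llama-3-8B-32k"):
    tok = AutoTokenizer.from_pretrained(repo)
    ids = tok.encode("<tool_response>", add_special_tokens=False)
    print(repo, ids, tok.convert_ids_to_tokens(ids))
```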

Okay, I faced the same issue. I was pulling my hair out trying to figure out where I was screwing up. Switching to the OpenPipe fork works perfectly.

I tested it out and the model 100% crashes due to the <tool_response> token. Probably caused by the <|reserved_special_token_3|> change too.

Thanks for the thread @ztsvvstz!

NousResearch org

How is the OpenPipe fork making the response string a single token? Or did they make it a string that isn't?
