---
license: apache-2.0
---

# Model Card for MediaTek Research Breeze-7B-FC-v1_0

## 🏆 Performance

| Models | #Parameters | Organization | License | 🧰 Function Calling? | 💬 Instruction Following? |
|---|---|---|---|---|---|
| [Breeze-7B-Instruct-v1_0](https://huggingface.co/MediaTek-Research/Breeze-7B-Instruct-v1_0) | 7B | MediaTek Research | Apache 2.0 | ❌ | ✅ |
| [**Breeze-7B-FC-v1_0**](https://huggingface.co/MediaTek-Research/Breeze-7B-FC-v1_0) | 7B | MediaTek Research | Apache 2.0 | ✅ | ✅ |
| [Gorilla-OpenFunctions-v2](https://huggingface.co/gorilla-llm/gorilla-openfunctions-v2) | 7B | Gorilla LLM | Apache 2.0 | ✅ | ❌ |
| [GPT-3.5-Turbo-0125](https://openai.com) | | OpenAI | Proprietary | ✅ | ✅ |

**Evaluate function calling on the EN benchmark** (Berkeley function-calling leaderboard)

| Models | ↑ Overall | Irrelevance Detection | AST/Simple | AST/Multiple | AST/Parallel | AST/Parallel-Multiple | Exec/Simple | Exec/Multiple | Exec/Parallel | Exec/Parallel-Multiple |
|---|---|---|---|---|---|---|---|---|---|---|
| **Breeze-7B-FC-v1_0 (FC)** | 86.01 | 74.58 | 90.00 | 93.00 | 82.00 | 83.00 | 98.00 | 92.00 | 88.00 | 75.00 |
| Gorilla-OpenFunctions-v2 (FC) | 85.95 | 60.00 | 94.25 | 95.50 | 86.50 | 86.00 | 97.00 | 96.00 | 80.00 | 75.00 |
| GPT-3.5-Turbo-0125 (FC) | 72.77 | 4.58 | 87.75 | 90.50 | 88.50 | 82.50 | 91.00 | 82.00 | 78.00 | 52.50 |

![](misc/radar_chart_en.png)

**Evaluate function calling on the ZHTW benchmark** (function-calling-leaderboard-for-zhtw)

| Models | ↑ Overall | Irrelevance Detection | AST/Simple | AST/Multiple | AST/Parallel | AST/Parallel-Multiple | Exec/Simple | Exec/Multiple | Exec/Parallel | Exec/Parallel-Multiple |
|---|---|---|---|---|---|---|---|---|---|---|
| **Breeze-7B-FC-v1_0 (FC)** | 77.70 | 71.67 | 82.00 | 86.50 | 76.00 | 65.50 | 87.00 | 88.00 | 80.00 | 57.50 |
| Gorilla-OpenFunctions-v2 (FC) | 75.68 | 53.75 | 84.75 | 86.50 | 72.50 | 68.00 | 92.00 | 92.00 | 62.00 | 72.50 |
| GPT-3.5-Turbo-0125 (FC) | 66.15 | 7.50 | 83.75 | 83.50 | 73.00 | 65.50 | 88.00 | 84.00 | 72.00 | 40.00 |

![](misc/radar_chart_zhtw.png)

**Evaluate instruction following on the EN benchmark** (MT-Bench)

| | Win | Tie | Lose |
|---|---|---|---|
| **Breeze-7B-FC-v1_0** *vs.* Breeze-7B-Instruct-v1_0 | 25 (15.6%) | 72 (45.0%) | 63 (39.4%) |

**Evaluate instruction following on the ZHTW benchmark** (MT-Bench-TC)

| | Win | Tie | Lose |
|---|---|---|---|
| **Breeze-7B-FC-v1_0** *vs.* Breeze-7B-Instruct-v1_0 | 36 (22.5%) | 81 (50.6%) | 43 (26.9%) |

## 👩‍💻 How to use

**Dependency**

Install the `mtkresearch` package:

```
git clone https://github.com/mtkresearch/mtkresearch.git
cd mtkresearch
pip install -e .
```

**Hosting with vLLM**

```python
from vllm import LLM, SamplingParams

num_gpu = 1  # number of GPUs to shard the model across

llm = LLM(
    model='MediaTek-Research/Breeze-7B-FC-v1_0',
    tensor_parallel_size=num_gpu,
    gpu_memory_utilization=0.7
)

instance_end_token_id = llm.get_tokenizer().convert_tokens_to_ids('<|im_end|>')
params = SamplingParams(
    temperature=0.01,
    top_p=0.01,
    max_tokens=4096,
    repetition_penalty=1.1,
    stop_token_ids=[instance_end_token_id]
)

def _inference(prompt, llm, params):
    return llm.generate(prompt, params)[0].outputs[0].text
```

**Instruction following**

```python
from mtkresearch.llm.prompt import MRPromptV2

sys_prompt = 'You are a helpful AI assistant built by MediaTek Research. The user you are helping speaks Traditional Chinese and comes from Taiwan.'

prompt_engine = MRPromptV2()

conversations = [
    {"role": "system", "content": sys_prompt},
    {"role": "user", "content": "請問什麼是深度學習?"},  # "What is deep learning?"
]

prompt = prompt_engine.get_prompt(conversations)

output_str = _inference(prompt, llm, params)
result = prompt_engine.parse_generated_str(output_str)

print(result)  # parsed assistant response
```

**Function Calling**

```python
from mtkresearch.llm.prompt import MRPromptV2

sys_prompt = 'You are a helpful AI assistant built by MediaTek Research. The user you are helping speaks Traditional Chinese and comes from Taiwan.'

functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"]
                }
            },
            "required": ["location"]
        }
    }
]

prompt_engine = MRPromptV2()

# stage 1: query
conversations = [
    {"role": "user", "content": "台北目前溫度是攝氏幾度?"},  # "What is the current temperature in Taipei, in Celsius?"
]

prompt = prompt_engine.get_prompt(conversations, functions=functions)

output_str = _inference(prompt, llm, params)
result = prompt_engine.parse_generated_str(output_str)

print(result)  # parsed function call(s) proposed by the model

# stage 2: execute called functions

# stage 3: put executed results
```
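The example above stops where the model proposes a function call. Below is a minimal sketch of stages 2 and 3, assuming the parsed `result` follows an OpenAI-style schema with a `tool_calls` list (each entry carrying an `id`, a `function.name`, and JSON-encoded `function.arguments`) and that tool outputs are fed back as `role: "tool"` messages; check the `mtkresearch` documentation for the exact message format expected by `MRPromptV2`.

```python
import json

# stage 2: execute the called functions
# (hypothetical local implementation standing in for a real weather API)
def get_current_weather(location, unit='celsius'):
    return {'temperature': 30, 'unit': unit}

function_map = {'get_current_weather': get_current_weather}

# stage 3: put executed results back into the conversation
# assumption: `result` uses OpenAI-style `tool_calls`; adapt the keys if
# MRPromptV2.parse_generated_str returns a different structure
if result.get('tool_calls'):
    conversations.append(result)
    for tool_call in result['tool_calls']:
        func_name = tool_call['function']['name']
        arguments = json.loads(tool_call['function']['arguments'])
        called_result = function_map[func_name](**arguments)
        conversations.append({
            'role': 'tool',
            'tool_call_id': tool_call.get('id'),
            'name': func_name,
            'content': json.dumps(called_result)
        })

    # let the model turn the tool output into a final natural-language answer
    prompt = prompt_engine.get_prompt(conversations, functions=functions)
    final_str = _inference(prompt, llm, params)
    print(prompt_engine.parse_generated_str(final_str))
```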