YC-Chen committed on
Commit
c9b58fa
1 Parent(s): e9794a0

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -17,7 +17,7 @@ license: apache-2.0
 
  **Evaluate function calling on EN benchmark**
 
- Berkeley function-calling leaderboard
+ [Berkeley function-calling leaderboard](https://gorilla.cs.berkeley.edu/blogs/8_berkeley_function_calling_leaderboard.html)
 
  | Models | ↑ Overall | Irrelevance<br/>Detection | AST/<br/>Simple | AST/<br/>Multiple | AST/<br/>Parallel | AST/<br/>Parallel-Multiple | Exec/<br/>Simple | Exec/<br/>Multiple | Exec/<br/>Parallel | Exec/<br/>Parallel-Multiple |
  |-----------------------------------|----------|---------------------|------------|--------------|--------------|------------------------|--------------|---------------------|---------------------|-------------------------------|
@@ -31,7 +31,7 @@ Berkeley function-calling leaderboard
 
  **Evaluate function calling on ZHTW benchmark**
 
- function-calling-leaderboard-for-zhtw
+ [function-calling-leaderboard-for-zhtw](https://github.com/mtkresearch/function-calling-leaderboard-for-zhtw)
 
  | Models | ↑ Overall | Irrelevance<br/>Detection | AST/<br/>Simple | AST/<br/>Multiple | AST/<br/>Parallel | AST/<br/>Parallel-Multiple | Exec/<br/>Simple | Exec/<br/>Multiple | Exec/<br/>Parallel | Exec/<br/>Parallel-Multiple |
  |-----------------------------------|----------|---------------------|------------|--------------|--------------|------------------------|--------------|---------------------|---------------------|-------------------------------|
@@ -50,7 +50,7 @@ MT-Bench
 
  | | Win | Tie | Lose |
  |---|---|---|---|
- | **Breeze-7B-FC-v1_0** *v.s.* Breeze-7B-Instruct-v1_0 | 27 (16.9%) | 63 (39.4%) | 70 (43.8%) |
+ | **Breeze-7B-FC-v1_0** *v.s.* Breeze-7B-Instruct-v1_0 | 29 (18.1%) | 55 (34.3%) | 76 (47.5%) |
 
 
  **Evaluate instrustion following on ZHTW benchmark**
@@ -59,7 +59,7 @@ MT-Bench-TC
 
  | | Win | Tie | Lose |
  |---|---|---|---|
- | **Breeze-7B-FC-v1_0** *v.s.* Breeze-7B-Instruct-v1_0 | 40 (25.0%) | 69 (43.1%) | 51 (31.9%) |
+ | **Breeze-7B-FC-v1_0** *v.s.* Breeze-7B-Instruct-v1_0 | 35 (21.9%) | 73 (45.6%) | 52 (32.5%) |
 
 
  ## 👩‍💻 How to use