File size: 8,485 Bytes
2f92d85 cf419d1 2f92d85 01666d6 cf419d1 01666d6 1f18ffa 55a8f50 01666d6 ffdc02f 01666d6 9b994ca 01666d6 1f18ffa 01666d6 e199d90 01666d6 2037238 01666d6 3e9c897 01666d6 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 |
---
license: bigcode-openrail-m
---
# starcoder-toolbench
<!-- Provide a quick summary of what the model is/does. -->
starcoder-toolbench is a 15 billion parameter model used for api based action generation. It is instruction tuned from [starcoder](https://huggingface.co/bigcode/starcoder) on api based action generation datasets.
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** [SambaNova Systems](https://sambanova.ai/)
- **Model type:** Language Model
- **Language(s):** English
- **License:** bigcode-openrail-m
- **Finetuned from model:** [starcoder](https://huggingface.co/bigcode/starcoder)
### Basic Information
<!-- Provide the basic links for the model. -->
- **Paper**: [link](https://arxiv.org/abs/2305.16504)
- **Github**: [link](https://github.com/sambanova/toolbench)
## Uses
<details>
<summary>Click to expand</summary>
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
### Direct Use
<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
This model is intended for commercial and research use.
### Out-of-Scope Use
<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
starcoder-toolbench should NOT be used for purpose other than API based action generation.
</details>
---
## How to Get Started with the Model
<details>
<summary>Click to expand</summary>
### Loading in model with Huggingface
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("sambanovasystems/starcoder-toolbench")
model = AutoModelForCausalLM.from_pretrained("sambanovasystems/starcoder-toolbench", device_map="auto", torch_dtype="auto")
```
### Example Prompts To Try in GPU Tutorial
Prompt 1:
```
I have the following set of API:\n\n# To set the maximum commute time in minute to your office location, assuming the office location is already defined\nAPI.set_max_commute_time(value: int)\n\n# To set the maximum home size in square feet\nAPI.set_max_square_feet(value: int)\n\n# To set the minimum home price in dollars\nAPI.set_min_price(value: int)\n\n# To set the number of garage(s)\nAPI.set_num_garages(value: int)\n\n# To set home types for search. For home buying, home_types choices are: \"House\", \"Townhouse\", \"Condo\", \"Land\", \"Multi-family\", \"Mobile\", \"Co-op\"; for home renting, home_types choices are: \"House\", \"Townhouse\", \"Condo\", \"Apartment\".\nAPI.select_home_type(home_types: List[str])\n\n# To set the number of balconies\nAPI.set_num_balconies(value: int)\n\n# Submit criterion to get search results. This function should be called after setting all the criterion.\nAPI.search()\n\n# To set the floor number\nAPI.set_floor_number(value: int)\n\n# To set the number of bedroom(s)\nAPI.set_num_beds(value: int)\n\n# To set the number of swimming pool(s)\nAPI.set_num_swimming_pools(value: int)\n\n# To set the maximum home price in dollars\nAPI.set_max_price(value: int)\n\n# To specify whether to search homes for buying or renting. 'value' can be chosen from ['buy', 'rent']. This function must be called after setting the location and before setting any other criteria.\nAPI.set_buy_or_rent(value: str)\n\n# To set the number of bathroom(s)\nAPI.set_num_baths(value: float)\n\n# To set the location for the search area. This function must be called before setting any criteria.\nAPI.set_location(value: string)\n\n# To set the minimum home size in square feet\nAPI.set_min_square_feet(value: int)\n\n-------------\n\nTask: Looking for homes to rent in Santa Clarita with a price range between $110000 and $1753000, a minimum of 1700 square feet, at least 2 balconies, and 3.5 bathrooms.\nAction:\n
```
Prompt 2:
```
I have the following set of API:\n\n# To set the location for hotel search, given a Loc object. This function must be called if booking type is 'hotels' or 'both'.\nAPI.set_hotel_location(Loc)\n\n# To set the number of hotel rooms to book.\nAPI.set_num_rooms(value)\n\n# To set the location for departure, given a Loc object. This function must be called if booking type is 'trip tickets' or 'both'.\nAPI.set_origin(Loc)\n\n# To select the transportation type from ['flight', 'train', 'bus', 'cruise']. This function must be called if booking type is 'trip tickets' or 'both'.\nAPI.select_transportation(transportation_type)\n\n# To set the return date of the trip, given a Date object. If booking type is 'both' and this function is not called explicitly, 'return_date' will be set to 'hotel_checkout_date' implicitly.\nAPI.set_return_date(Date)\n\n# To set the hotel check-in date, given a Date object. This function must be called if booking type is 'hotels' or 'both'.\nAPI.set_checkin_date(Date)\n\n# To define a date.\ndate = Date(month, day, year)\n\n# To set the departure date of the trip, given a Date object. This function must be called if booking type is 'trip tickets'. If booking type is 'both' and this function is not called explicitly, 'departure_date' will be set to 'hotel_checkin_date' implicitly.\nAPI.set_departure_date(Date)\n\n# To set the location for arrival, given a Loc object. This function must be called if booking type is 'trip tickets' or 'both'.\nAPI.set_destination(Loc)\n\n# To define a location of a given city 'City'.\nlocation = Loc('City')\n\n# To set maximum hotel room price.\nAPI.set_max_room_price(value)\n\n# To set minimum ticket price.\nAPI.set_min_ticket_price(value)\n\n# To select the booking type from ['hotels', 'trip tickets', 'both']. This function must be called before setting any criteria.\nAPI.select_booking_type(booking_type)\n\n# To set minimum hotel room price.\nAPI.set_min_room_price(value)\n\n# To set the number of child tickets to purchase.\nAPI.set_num_children(value)\n\n# To set the number of adult tickets to purchase.\nAPI.set_num_adults(value)\n\n# To select the hotel room type from ['King Bed', 'Queen Bed', 'Double', 'Luxury'].\nAPI.select_room_type(room_type)\n\n# To set maximum ticket price.\nAPI.set_max_ticket_price(value)\n\n# Submit criterion to get search results. This function should be called after setting all the criterion.\nAPI.search()\n\n# To set the hotel check-out date, given a Date object. This function must be called if booking type is 'hotels' or 'both'.\nAPI.set_checkout_date(Date)\n\n-------------\n\nTask: Looking to book 2 adult and 4 child tickets from Stockton to Baltimore by cruise, on 2023-07-29.\nAction:\n
```
</details>
---
## Training Details
<details>
<summary>Click to expand</summary>
### Training Data
<!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
The training data is curated for the 8 tasks in ToolBench. See Appendix A of the [paper](https://arxiv.org/abs/2305.16504) for task details and Appendix C.1 for the training data curation details. In total, there are 9704 training samples, organized in all-shot format as described in Appendix C.2. Here is the [download link](https://drive.google.com/file/d/1lUatLGnSVhfy1uVIPEQ7qCoLtnCIXi2O/view?usp=sharing) to the training data.
### Training Procedure
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
We trained starcoder-toolbench on 4 80GB A100 gpu's. We started from [starcoder](https://huggingface.co/bigcode/starcoder) and finetuned it on the dataset mentioned above.
### Hyperparameters
- Hardware: A100 GPU
- Optimizer: AdamW
- Grad accumulation: 1
- Epochs: 8
- Global Batch size: 16
- Batch tokens: 16 * 2048 = 32,768 tokens
- Learning Rate: 1e-5
- Learning Rate Scheduler: Fixed LR
- Weight decay: 0.1
</details>
## Acknowledgment
We would like to express our gratitude to the great work done in [StarCoder: may the source be with you!](https://arxiv.org/abs/2305.06161)
## Cite starcoder-toolbench
```
@misc{xu2023tool,
title={On the Tool Manipulation Capability of Open-source Large Language Models},
author={Qiantong Xu and Fenglu Hong and Bo Li and Changran Hu and Zhengyu Chen and Jian Zhang},
year={2023},
eprint={2305.16504},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
``` |