cuierfei committed on
Commit
a7311dd
1 Parent(s): 5aa4844

Upload folder using huggingface_hub

Files changed (2)
  1. README.md +6 -20
  2. config.json +4 -1
README.md CHANGED
@@ -47,17 +47,13 @@ This article comprises the following sections:
 Trying the following codes, you can perform the batched offline inference with the quantized model:
 
 ```python
-from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig
+from lmdeploy import pipeline, TurbomindEngineConfig
 from lmdeploy.vl import load_image
 
 model = 'OpenGVLab/InternVL2-2B-AWQ'
-system_prompt = '我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。'
 image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
-chat_template_config = ChatTemplateConfig('internvl-internlm2')
-chat_template_config.meta_instruction = system_prompt
 backend_config = TurbomindEngineConfig(model_format='awq')
-pipe = pipeline(model, chat_template_config=chat_template_config,
-                backend_config=backend_config))
+pipe = pipeline(model, backend_config=backend_config, log_level='INFO')
 response = pipe(('describe this image', image))
 print(response.text)
 ```
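
The example above sends a single request; since the section is about batched offline inference, here is a minimal batched sketch, assuming (per LMDeploy's VLM pipeline documentation) that the pipeline accepts a list of (prompt, image) tuples and returns one response per input:

```python
# Batched variant of the snippet above: submit several (prompt, image) pairs
# in one call. That the VLM pipeline accepts a list of tuples is an assumption
# based on LMDeploy's pipeline documentation.
from lmdeploy import pipeline, TurbomindEngineConfig
from lmdeploy.vl import load_image

pipe = pipeline('OpenGVLab/InternVL2-2B-AWQ',
                backend_config=TurbomindEngineConfig(model_format='awq'))
image = load_image('https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg')
prompts = [('describe this image', image),
           ('what animal is shown here?', image)]
for r in pipe(prompts):  # one response per input prompt
    print(r.text)
```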
@@ -66,20 +62,10 @@ For more information about the pipeline parameters, please refer to [here](https
 
 ### Service
 
-To deploy InternVL2 as an API, please configure the chat template config first. Create the following JSON file `chat_template.json`.
-
-```json
-{
-    "model_name":"internvl-internlm2",
-    "meta_instruction":"我是书生·万象,英文名是InternVL,是由上海人工智能实验室及多家合作单位联合开发的多模态大语言模型。",
-    "stop_words":["<|im_start|>", "<|im_end|>"]
-}
-```
-
-LMDeploy's `api_server` enables models to be easily packed into services with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below are an example of service startup.
+LMDeploy's `api_server` enables models to be easily packed into services with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below is an example of service startup:
 
 ```shell
-lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --model-name InternVL2-2B-AWQ --backend turbomind --server-port 23333 --model-format awq --chat-template chat_template.json
+lmdeploy serve api_server OpenGVLab/InternVL2-2B-AWQ --backend turbomind --server-port 23333 --model-format awq
 ```
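
Once the server is up, a quick sanity check (a sketch assuming the default port 23333 used above and the OpenAI-compatible `/v1/models` route that `api_server` exposes):

```python
# Smoke test for the freshly started server: list the served model IDs.
# Assumes port 23333 from the command above and the OpenAI-compatible
# /v1/models endpoint.
import requests

resp = requests.get('http://0.0.0.0:23333/v1/models')
resp.raise_for_status()
print([m['id'] for m in resp.json()['data']])
```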
 
 To use the OpenAI-style interface, you need to install OpenAI:
@@ -96,7 +82,7 @@ from openai import OpenAI
 client = OpenAI(api_key='YOUR_API_KEY', base_url='http://0.0.0.0:23333/v1')
 model_name = client.models.list().data[0].id
 response = client.chat.completions.create(
-    model="InternVL2-2B-AWQ",
+    model=model_name,
     messages=[{
         'role':
         'user',
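
The hunk above ends mid-call, so for reference here is a complete request against this endpoint, sketched under the assumption that the server accepts OpenAI-style `image_url` content parts for vision models; the sampling parameters are illustrative:

```python
# Full OpenAI-style multimodal request: text plus an image URL in one message.
# The image_url content-part format and the temperature value are assumptions
# for illustration.
from openai import OpenAI

client = OpenAI(api_key='YOUR_API_KEY', base_url='http://0.0.0.0:23333/v1')
model_name = client.models.list().data[0].id
response = client.chat.completions.create(
    model=model_name,
    messages=[{
        'role': 'user',
        'content': [
            {'type': 'text', 'text': 'describe this image'},
            {'type': 'image_url', 'image_url': {
                'url': 'https://raw.githubusercontent.com/open-mmlab/mmdeploy/main/tests/data/tiger.jpeg'}},
        ],
    }],
    temperature=0.8)
print(response.choices[0].message.content)
```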
@@ -118,7 +104,7 @@ print(response)
 
 ## License
 
-This project is released under the MIT license, while InternLM is licensed under the Apache-2.0 license.
+This project is released under the MIT license, while InternLM2 is licensed under the Apache-2.0 license.
 
 ## Citation
 
config.json CHANGED
@@ -84,7 +84,10 @@
   "return_dict": true,
   "return_dict_in_generate": false,
   "rms_norm_eps": 1e-05,
-  "rope_scaling": null,
+  "rope_scaling": {
+    "factor": 2.0,
+    "type": "dynamic"
+  },
   "rope_theta": 1000000,
   "sep_token_id": null,
   "suppress_tokens": null,