:pencil: [Doc] New model, and prettify formats
README.md CHANGED
@@ -10,15 +10,17 @@ app_port: 23333
 ## HF-LLM-API
 Huggingface LLM Inference API in OpenAI message format.
 
+Project link: https://github.com/Hansimov/hf-llm-api
+
 ## Features
 
-- Available Models (2024/01/…
-  - `mixtral-8x7b`, `…
-- Adaptive prompt templates for different models
+- Available Models (2024/01/22): [#5](https://github.com/Hansimov/hf-llm-api/issues/5)
+  - `mistral-7b`, `mixtral-8x7b`, `nous-mixtral-8x7b`
+- Adaptive prompt templates for different models
 - Support OpenAI API format
 - Enable API endpoint via official `openai-python` package
 - Support both stream and no-stream response
-- Support API Key via both HTTP auth header and env variable (https://github.com/Hansimov/hf-llm-api/issues/4)
+- Support API Key via both HTTP auth header and env variable [#4](https://github.com/Hansimov/hf-llm-api/issues/4)
 - Docker deployment
 
 ## Run API service
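The API-key feature listed above can be exercised from the client in either of the two ways the bullet names. A minimal sketch, assuming the service accepts the standard `Bearer` scheme implied by the OpenAI API format; the `HF_TOKEN` variable name is borrowed from the token comments later in this diff, not a documented setting:

```py
import os

# Option 1: take the token from an environment variable
# (HF_TOKEN name is an assumption; falls back to the non-auth placeholder)
api_key = os.environ.get("HF_TOKEN", "sk-xxx")

# Option 2: send it yourself in the HTTP auth header
# (Bearer scheme assumed from the OpenAI API format)
requests_headers = {"Authorization": f"Bearer {api_key}"}
```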
@@ -60,7 +62,7 @@ sudo docker run -p 23333:23333 --env http_proxy="http://<server>:<port>" hf-llm-
 
 ### Using `openai-python`
 
-See: [examples/chat_with_openai.py](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_openai.py)
+See: [`examples/chat_with_openai.py`](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_openai.py)
 
 ```py
 from openai import OpenAI
@@ -69,6 +71,8 @@ from openai import OpenAI
 base_url = "http://127.0.0.1:23333"
 # Your own HF_TOKEN
 api_key = "hf_xxxxxxxxxxxxxxxx"
+# use below as non-auth user
+# api_key = "sk-xxx"
 
 client = OpenAI(base_url=base_url, api_key=api_key)
 response = client.chat.completions.create(
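The hunk above ends mid-call; the full invocation lives in [`examples/chat_with_openai.py`](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_openai.py). A minimal sketch of how the call might continue, assuming the standard `openai-python` chat parameters: the prompt text is illustrative, `mixtral-8x7b` comes from the payload later in this diff, and the streaming loop matches the `for chunk in response:` context line of the next hunk:

```py
from openai import OpenAI

base_url = "http://127.0.0.1:23333"
api_key = "hf_xxxxxxxxxxxxxxxx"  # or "sk-xxx" as non-auth user

client = OpenAI(base_url=base_url, api_key=api_key)
response = client.chat.completions.create(
    model="mixtral-8x7b",  # from the payload shown later in this diff
    messages=[{"role": "user", "content": "Hello, who are you?"}],  # illustrative prompt
    stream=True,
)
for chunk in response:
    # each streamed chunk carries an incremental piece of the reply
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)
print()
```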
@@ -93,7 +97,7 @@ for chunk in response:
 
 ### Using post requests
 
-See: [examples/chat_with_post.py](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_post.py)
+See: [`examples/chat_with_post.py`](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_post.py)
 
 
 ```py
@@ -104,7 +108,11 @@ import re
 
 # If running this service with proxy, you might need to unset `http(s)_proxy`.
 chat_api = "http://127.0.0.1:23333"
-
+# Your own HF_TOKEN
+api_key = "hf_xxxxxxxxxxxxxxxx"
+# use below as non-auth user
+# api_key = "sk-xxx"
+
 requests_headers = {}
 requests_payload = {
     "model": "mixtral-8x7b",
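The payload above is cut off after the model field; the complete request is in [`examples/chat_with_post.py`](https://github.com/Hansimov/hf-llm-api/blob/main/examples/chat_with_post.py). A minimal end-to-end sketch under stated assumptions: the `/chat/completions` path and the `data:`-prefixed event stream follow the OpenAI wire format (not confirmed by this diff), and `re` handles the prefix stripping that the hunk's `import re` context suggests:

```py
import json
import re

import requests

chat_api = "http://127.0.0.1:23333"
api_key = "hf_xxxxxxxxxxxxxxxx"  # or "sk-xxx" as non-auth user

requests_headers = {"Authorization": f"Bearer {api_key}"}
requests_payload = {
    "model": "mixtral-8x7b",
    "messages": [{"role": "user", "content": "Hello, who are you?"}],  # illustrative prompt
    "stream": True,
}

# `/chat/completions` is assumed from the OpenAI message format
response = requests.post(
    chat_api + "/chat/completions",
    headers=requests_headers,
    json=requests_payload,
    stream=True,
)
for line in response.iter_lines():
    if not line:
        continue
    # server-sent events arrive as `data: {...}` lines
    data = re.sub(r"^data:\s*", "", line.decode("utf-8"))
    if data == "[DONE]":
        break
    delta = json.loads(data)["choices"][0].get("delta", {})
    print(delta.get("content", ""), end="", flush=True)
print()
```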