Akash Kundu nsarrazin HF staff committed on
Commit
2272dad
1 Parent(s): bed356a

[DOCS] Minor fixes in README.md (#532)


* [DOCS] Minor fixes in README.md

minor fixes

* lint

---------

Co-authored-by: Nathan Sarrazin <[email protected]>

Files changed (1)
  1. README.md +13 -11
README.md CHANGED
@@ -41,7 +41,7 @@ The default config for Chat UI is stored in the `.env` file. You will need to ov
41
  Start by creating a `.env.local` file in the root of the repository. The bare minimum config you need to get Chat UI to run locally is the following:
42
 
43
  ```env
44
- MONGODB_URL=<the URL to your mongoDB instance>
45
  HF_ACCESS_TOKEN=<your access token>
46
  ```
47
 
@@ -61,7 +61,7 @@ Alternatively, you can use a [free MongoDB Atlas](https://www.mongodb.com/pricin
61
 
62
  ### Hugging Face Access Token
63
 
64
- You will need a Hugging Face access token to run Chat UI locally, if you use a remote inference endpoint. You can get one from [your Hugging Face profile](https://huggingface.co/settings/tokens).
65
 
66
  ## Launch
67
 
@@ -79,8 +79,8 @@ Chat UI features a powerful Web Search feature. It works by:
79
  1. Generating an appropriate search query from the user prompt.
80
  2. Performing web search and extracting content from webpages.
81
  3. Creating embeddings from texts using [transformers.js](https://huggingface.co/docs/transformers.js). Specifically, using [Xenova/gte-small](https://huggingface.co/Xenova/gte-small) model.
82
- 4. From these embeddings, find the ones that are closest to the user query using vector similarity search. Specifically, we use `inner product` distance.
83
- 5. Get the corresponding texts to those closest embeddings and perform [Retrieval-Augmented Generation](https://huggingface.co/papers/2005.11401) (i.e. expand user prompt by adding those texts so that a LLM can use this information).
84
 
85
  ## Extra parameters
86
 
@@ -139,14 +139,14 @@ MODELS=`[
139
  "assistantMessageToken": "<|assistant|>", # This does not need to be a token, can be any string
140
  "userMessageEndToken": "<|endoftext|>", # Applies only to user messages. Can be any string.
141
  "assistantMessageEndToken": "<|endoftext|>", # Applies only to assistant messages. Can be any string.
142
- "preprompt": "Below are a series of dialogues between various people and an AI assistant. The AI tries to be helpful, polite, honest, sophisticated, emotionally aware, and humble-but-knowledgeable. The assistant is happy to help with almost anything, and will do its best to understand exactly what is needed. It also tries to avoid giving false or misleading information, and it caveats when it isn't entirely sure about the right answer. That said, the assistant is practical and really does its best, and doesn't let caution get too much in the way of being useful.\n-----\n",
143
  "promptExamples": [
144
  {
145
  "title": "Write an email from bullet list",
146
  "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
147
  }, {
148
  "title": "Code a snake game",
149
- "prompt": "Code a basic snake game in python, give explanations for each step."
150
  }, {
151
  "title": "Assist in a task",
152
  "prompt": "How do I make a delicious lemon cheesecake?"
@@ -170,7 +170,7 @@ You can change things like the parameters, or customize the preprompt to better
170
 
171
  #### Custom prompt templates
172
 
173
- By default the prompt is constructed using `userMessageToken`, `assistantMessageToken`, `userMessageEndToken`, `assistantMessageEndToken`, `preprompt` parameters and a series of default templates.
174
 
175
  However, these templates can be modified by setting the `chatPromptTemplate` and `webSearchQueryPromptTemplate` parameters. Note that if WebSearch is not enabled, only `chatPromptTemplate` needs to be set. The template language is <https://handlebarsjs.com>. The templates have access to the model's prompt parameters (`preprompt`, etc.). However, if the templates are specified it is recommended to inline the prompt parameters, as using the references (`{{preprompt}}`) is deprecated.
176
 
@@ -187,7 +187,7 @@ For example:
187
 
188
  ##### chatPromptTemplate
189
 
190
- When quering the model for a chat response, the `chatPromptTemplate` template is used. `messages` is an array of chat messages, it has the format `[{ content: string }, ...]`. To idenify if a message is a user message or an assistant message the `ifUser` and `ifAssistant` block helpers can be used.
191
 
192
  The following is the default `chatPromptTemplate`, although newlines and indentiation have been added for readability. You can find the prompts used in production for HuggingChat [here](https://github.com/huggingface/chat-ui/blob/main/PROMPTS.md).
193
 
@@ -202,14 +202,16 @@ The following is the default `chatPromptTemplate`, although newlines and indenti
202
 
203
  ##### webSearchQueryPromptTemplate
204
 
205
- When performing a websearch, the search query is constructed using the `webSearchQueryPromptTemplate` template. It is recommended that that the prompt instructs the chat model to only return a few keywords.
206
 
207
  The following is the default `webSearchQueryPromptTemplate`.
208
 
209
  ```prompt
210
  {{userMessageToken}}
211
  My question is: {{message.content}}.
212
- Based on the conversation history (my previous questions are: {{previousMessages}}), give me an appropriate query to answer my question for web search. You should not say more than query. You should not say any words except the query. For the context, today is {{currentDate}}
 
 
213
  {{userMessageEndToken}}
214
  {{assistantMessageToken}}
215
  ```
@@ -229,7 +231,7 @@ To do this, you can add your own endpoints to the `MODELS` variable in `.env.loc
229
  }
230
  ```
231
 
232
- If `endpoints` is left unspecified, ChatUI will look for the model on the hosted Hugging Face inference API using the model name.
233
 
234
  ### Custom endpoint authorization
235
 
 
41
  Start by creating a `.env.local` file in the root of the repository. The bare minimum config you need to get Chat UI to run locally is the following:
42
 
43
  ```env
44
+ MONGODB_URL=<the URL to your MongoDB instance>
45
  HF_ACCESS_TOKEN=<your access token>
46
  ```
47
 
 
61
 
62
  ### Hugging Face Access Token
63
 
64
+ If you use a remote inference endpoint, you will need a Hugging Face access token to run Chat UI locally. You can get one from [your Hugging Face profile](https://huggingface.co/settings/tokens).
65
 
66
  ## Launch
67
 
 
79
  1. Generating an appropriate search query from the user prompt.
80
  2. Performing web search and extracting content from webpages.
81
  3. Creating embeddings from texts using [transformers.js](https://huggingface.co/docs/transformers.js). Specifically, using [Xenova/gte-small](https://huggingface.co/Xenova/gte-small) model.
82
+ 4. From these embeddings, find the ones that are closest to the user query using a vector similarity search. Specifically, we use `inner product` distance.
83
+ 5. Get the corresponding texts to those closest embeddings and perform [Retrieval-Augmented Generation](https://huggingface.co/papers/2005.11401) (i.e. expand user prompt by adding those texts so that an LLM can use this information).
84
 
85
  ## Extra parameters
86
 
 
139
  "assistantMessageToken": "<|assistant|>", # This does not need to be a token, can be any string
140
  "userMessageEndToken": "<|endoftext|>", # Applies only to user messages. Can be any string.
141
  "assistantMessageEndToken": "<|endoftext|>", # Applies only to assistant messages. Can be any string.
142
+ "preprompt": "Below are a series of dialogues between various people and an AI assistant. The AI tries to be helpful, polite, honest, sophisticated, emotionally aware, and humble but knowledgeable. The assistant is happy to help with almost anything and will do its best to understand exactly what is needed. It also tries to avoid giving false or misleading information, and it caveats when it isn't entirely sure about the right answer. That said, the assistant is practical and really does its best, and doesn't let caution get too much in the way of being useful.\n-----\n",
143
  "promptExamples": [
144
  {
145
  "title": "Write an email from bullet list",
146
  "prompt": "As a restaurant owner, write a professional email to the supplier to get these products every week: \n\n- Wine (x10)\n- Eggs (x24)\n- Bread (x12)"
147
  }, {
148
  "title": "Code a snake game",
149
+ "prompt": "Code a basic snake game in python and give explanations for each step."
150
  }, {
151
  "title": "Assist in a task",
152
  "prompt": "How do I make a delicious lemon cheesecake?"
 
170
 
171
  #### Custom prompt templates
172
 
173
+ By default, the prompt is constructed using `userMessageToken`, `assistantMessageToken`, `userMessageEndToken`, `assistantMessageEndToken`, `preprompt` parameters and a series of default templates.
174
 
175
  However, these templates can be modified by setting the `chatPromptTemplate` and `webSearchQueryPromptTemplate` parameters. Note that if WebSearch is not enabled, only `chatPromptTemplate` needs to be set. The template language is <https://handlebarsjs.com>. The templates have access to the model's prompt parameters (`preprompt`, etc.). However, if the templates are specified it is recommended to inline the prompt parameters, as using the references (`{{preprompt}}`) is deprecated.
176
 
 
187
 
188
  ##### chatPromptTemplate
189
 
190
+ When querying the model for a chat response, the `chatPromptTemplate` template is used. `messages` is an array of chat messages, it has the format `[{ content: string }, ...]`. To identify if a message is a user message or an assistant message the `ifUser` and `ifAssistant` block helpers can be used.
191
 
192
  The following is the default `chatPromptTemplate`, although newlines and indentiation have been added for readability. You can find the prompts used in production for HuggingChat [here](https://github.com/huggingface/chat-ui/blob/main/PROMPTS.md).
193
 
 
202
 
203
  ##### webSearchQueryPromptTemplate
204
 
205
+ When performing a websearch, the search query is constructed using the `webSearchQueryPromptTemplate` template. It is recommended that the prompt instructs the chat model to only return a few keywords.
206
 
207
  The following is the default `webSearchQueryPromptTemplate`.
208
 
209
  ```prompt
210
  {{userMessageToken}}
211
  My question is: {{message.content}}.
212
+
213
+ Based on the conversation history (my previous questions are: {{previousMessages}}), give me an appropriate query to answer my question for web search. You should not say more than query. You should not say any words except the query. For the context, today is {{currentDate}}
214
+
215
  {{userMessageEndToken}}
216
  {{assistantMessageToken}}
217
  ```
 
231
  }
232
  ```
233
 
234
+ If `endpoints` are left unspecified, ChatUI will look for the model on the hosted Hugging Face inference API using the model name.
235
 
236
  ### Custom endpoint authorization
237
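
Note on the Web Search hunk: steps 4 and 5 of the README list (rank the embeddings against the query by inner product, then add the matching texts to the prompt) can be illustrated with the minimal sketch below. It is not taken from the Chat UI source. The names `innerProduct` and `findClosest` are hypothetical, and the toy 3-dimensional vectors stand in for real output from an embedding model such as the one named in step 3. The sketch treats a higher inner product as a closer match.

```typescript
// Hypothetical helpers for illustration only; not the Chat UI implementation.
type Embedding = number[];

// Inner product of two equal-length vectors.
function innerProduct(a: Embedding, b: Embedding): number {
  if (a.length !== b.length) {
    throw new Error("Embeddings must have the same dimension");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Return the indices of the k candidates with the highest inner product
// against the query (higher score is treated as "closer").
function findClosest(query: Embedding, candidates: Embedding[], k: number): number[] {
  return candidates
    .map((embedding, index) => ({ index, score: innerProduct(query, embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.index);
}

// Toy data: made-up vectors standing in for real embeddings.
const texts = ["about cats", "about dogs", "about cars"];
const textEmbeddings: Embedding[] = [
  [0.9, 0.1, 0.0],
  [0.7, 0.3, 0.1],
  [0.0, 0.2, 0.9],
];
const queryEmbedding: Embedding = [1.0, 0.0, 0.0];

// Step 4: pick the two closest texts.
const topIndices = findClosest(queryEmbedding, textEmbeddings, 2);

// Step 5: join the retrieved texts so they can be added to the user prompt.
const context = topIndices.map((i) => texts[i]).join("\n");
```

Sorting every candidate is adequate for the handful of passages extracted from a few web pages; a larger corpus would usually call for a dedicated vector index instead.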