qingxu98 committed
Commit 9f9848c
1 Parent(s): e874a16
.github/workflows/build-with-latex.yml ADDED
@@ -0,0 +1,44 @@
+# https://docs.github.com/en/actions/publishing-packages/publishing-docker-images#publishing-images-to-github-packages
+name: Create and publish a Docker image for Latex support
+
+on:
+  push:
+    branches:
+      - 'master'
+
+env:
+  REGISTRY: ghcr.io
+  IMAGE_NAME: ${{ github.repository }}_with_latex
+
+jobs:
+  build-and-push-image:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v3
+
+      - name: Log in to the Container registry
+        uses: docker/login-action@v2
+        with:
+          registry: ${{ env.REGISTRY }}
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Extract metadata (tags, labels) for Docker
+        id: meta
+        uses: docker/metadata-action@v4
+        with:
+          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
+
+      - name: Build and push Docker image
+        uses: docker/build-push-action@v4
+        with:
+          context: .
+          push: true
+          file: docs/GithubAction+NoLocal+Latex
+          tags: ${{ steps.meta.outputs.tags }}
+          labels: ${{ steps.meta.outputs.labels }}
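Once this workflow runs on `master`, the image should be pullable from GHCR. With `docker/metadata-action` defaults, a branch push is tagged with the branch name, so a pull would look roughly like this (`<owner>/<repo>` is a placeholder, not a value from this commit):

``` sh
# Hypothetical pull of the published Latex-enabled image
docker pull ghcr.io/<owner>/<repo>_with_latex:master
```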
Dockerfile CHANGED
@@ -10,12 +10,16 @@ RUN echo '[global]' > /etc/pip.conf && \

 WORKDIR /gpt

-# load the project files
-COPY . .
+
+

 # install dependencies
+COPY requirements.txt ./
+COPY ./docs/gradio-3.32.2-py3-none-any.whl ./docs/gradio-3.32.2-py3-none-any.whl
+RUN pip3 install -r requirements.txt
+# load the project files
+COPY . .
 RUN pip3 install -r requirements.txt
-

 # optional step, used to warm up the modules
 RUN python3 -c 'from check_proxy import warm_up_modules; warm_up_modules()'
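The reordering above copies `requirements.txt` (and the bundled gradio wheel) before the full source tree, so Docker's layer cache can reuse the expensive `pip3 install` layer when only project files change. Roughly:

``` sh
docker build -t gpt-academic .   # first build: the pip install layer is created
touch main.py                    # edit only project files
docker build -t gpt-academic .   # rebuild: the pip install layer comes from cache
```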
README.md CHANGED
@@ -12,9 +12,9 @@ pinned: false
 # ChatGPT 学术优化
 > **Note**
 >
-> On May 27 the gradio dependency received major fixes and adjustments: we forked official Gradio and resolved a series of its bugs. If you updated on May 27 itself, the code may error out (missing dependencies, getting stuck on the loading screen, etc.); simply update to the **latest code** and reinstall the pip dependencies. We apologize for any inconvenience. When installing dependencies, strictly use the versions **pinned** in requirements.txt:
+> 2023.5.27: the Gradio dependency was adjusted; we forked official Gradio and fixed several of its bugs. Please **update the code** promptly and reinstall the pip dependencies. When installing dependencies, strictly use the versions **pinned** in `requirements.txt`:
 >
-> `pip install -r requirements.txt -i https://pypi.org/simple`
+> `pip install -r requirements.txt`
 >

 # <img src="docs/logo.png" width="40" > GPT 学术优化 (GPT Academic)
@@ -28,7 +28,7 @@ To translate this project to arbitrary language with GPT, read and run [`multi_la
 >
 > 1. Note that only function plugins (buttons) marked in **red** support reading files, and some plugins are located in the **drop-down menu** of the plugin area. In addition, we welcome and handle PRs for any new plugin with the **highest priority**!
 >
-> 2. The function of every file in this project is documented in the self-analysis report [`self_analysis.md`](https://github.com/binary-husky/chatgpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A). As versions iterate, you can also click the relevant function plugin at any time to call GPT to regenerate the project's self-analysis report. FAQs are collected in the [`wiki`](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98). [Installation](#installation).
+> 2. The function of every file in this project is documented in the self-analysis report [`self_analysis.md`](https://github.com/binary-husky/gpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A). As versions iterate, you can also click the relevant function plugin at any time to call GPT to regenerate the project's self-analysis report. FAQs are collected in the [`wiki`](https://github.com/binary-husky/gpt_academic/wiki/%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98). [Installation](#installation).
 >
 > 3. This project is compatible with, and encourages trying, domestic large language models such as chatglm, RWKV, PanGu, etc. Multiple api-keys can coexist; in the config file write e.g. `API_KEY="openai-key1,openai-key2,api2d-key3"`. To temporarily switch `API_KEY`, enter the temporary `API_KEY` in the input area and press Enter to submit it; it takes effect immediately.

@@ -43,22 +43,23 @@ To translate this project to arbitrary language with GPT, read and run [`multi_la
 One-click Chinese-English translation | One-click Chinese-English translation
 One-click code explanation | Display, explain, generate, and comment code
 [Custom shortcut keys](https://www.bilibili.com/video/BV14s4y1E7jN) | Custom shortcut keys supported
-Modular design | Supports powerful custom [function plugins](https://github.com/binary-husky/chatgpt_academic/tree/master/crazy_functions); plugins support [hot reloading](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97)
-[Program self-analysis](https://www.bilibili.com/video/BV1cj411A7VW) | [Function plugin] [One-click walkthrough](https://github.com/binary-husky/chatgpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A) of this project's source code
+Modular design | Supports powerful custom [function plugins](https://github.com/binary-husky/gpt_academic/tree/master/crazy_functions); plugins support [hot reloading](https://github.com/binary-husky/gpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97)
+[Program self-analysis](https://www.bilibili.com/video/BV1cj411A7VW) | [Function plugin] [One-click walkthrough](https://github.com/binary-husky/gpt_academic/wiki/chatgpt-academic%E9%A1%B9%E7%9B%AE%E8%87%AA%E8%AF%91%E8%A7%A3%E6%8A%A5%E5%91%8A) of this project's source code
 [Program analysis](https://www.bilibili.com/video/BV1cj411A7VW) | [Function plugin] One-click analysis of other Python/C/C++/Java/Lua/... project trees
 Paper reading, paper [translation](https://www.bilibili.com/video/BV1KT411x7Wn) | [Function plugin] One-click interpretation of a full latex/pdf paper with abstract generation
 Latex full-text [translation](https://www.bilibili.com/video/BV1nk4y1Y7Js/) and [polishing](https://www.bilibili.com/video/BV1FT411H7c5/) | [Function plugin] One-click translation or polishing of latex papers
 Batch comment generation | [Function plugin] One-click batch generation of function comments
-Markdown [Chinese-English translation](https://www.bilibili.com/video/BV1yo4y157jV/) | [Function plugin] Did you notice the [README](https://github.com/binary-husky/chatgpt_academic/blob/master/docs/README_EN.md) in the 5 languages above?
+Markdown [Chinese-English translation](https://www.bilibili.com/video/BV1yo4y157jV/) | [Function plugin] Did you notice the [README](https://github.com/binary-husky/gpt_academic/blob/master/docs/README_EN.md) in the 5 languages above?
 Chat analysis report generation | [Function plugin] Automatically generates a summary report after a run
 [Full PDF paper translation](https://www.bilibili.com/video/BV1KT411x7Wn) | [Function plugin] Extracts a PDF paper's title & abstract and translates the full text (multithreaded)
 [Arxiv assistant](https://www.bilibili.com/video/BV1LM4y1279X) | [Function plugin] Enter an arxiv article url to translate the abstract and download the PDF in one click
 [Google Scholar integration assistant](https://www.bilibili.com/video/BV19L411U7ia) | [Function plugin] Given any Google Scholar search page URL, let gpt [write your related works](https://www.bilibili.com/video/BV1GP411U7Az/)
 Internet information aggregation + GPT | [Function plugin] One click to [let GPT fetch information from the internet first](https://www.bilibili.com/video/BV1om4y127ck) and then answer, so information never goes stale
+⭐Fine-grained Arxiv paper translation | [Function plugin] One-click [ultra-high-quality arxiv paper translation](https://www.bilibili.com/video/BV1dz4y1v77A/), the best paper translation tool to date⭐
 Formula/image/table display | Can show both the [tex form and the rendered form](https://user-images.githubusercontent.com/96192199/230598842-1d7fcddd-815d-40ee-af60-baf488a199df.png) of formulas; supports formula and code highlighting
 Multithreaded function plugin support | Supports multithreaded calls to chatgpt; one-click processing of [huge volumes of text](https://www.bilibili.com/video/BV1FT411H7c5/) or programs
-Dark gradio [theme](https://github.com/binary-husky/chatgpt_academic/issues/173) on startup | Append ```/?__theme=dark``` to the browser url to switch to the dark theme
-[Multiple LLM models](https://www.bilibili.com/video/BV1wT411p7yf) supported, [API2D](https://api2d.com/) interface supported | Being served by GPT3.5, GPT4, [Tsinghua ChatGLM](https://github.com/THUDM/ChatGLM-6B) and [Fudan MOSS](https://github.com/OpenLMLab/MOSS) at the same time must feel great, right?
+Dark gradio [theme](https://github.com/binary-husky/gpt_academic/issues/173) on startup | Append ```/?__theme=dark``` to the browser url to switch to the dark theme
+[Multiple LLM models](https://www.bilibili.com/video/BV1wT411p7yf) supported | Being served by GPT3.5, GPT4, [Tsinghua ChatGLM](https://github.com/THUDM/ChatGLM-6B) and [Fudan MOSS](https://github.com/OpenLMLab/MOSS) at the same time must feel great, right?
 More LLM model integrations, [huggingface deployment](https://huggingface.co/spaces/qingxu98/gpt-academic) supported | Added the Newbing interface (New Bing); introduced Tsinghua [Jittorllms](https://github.com/Jittor/JittorLLMs) with support for [LLaMA](https://github.com/facebookresearch/llama), [RWKV](https://github.com/BlinkDL/ChatRWKV) and [PanGu-α](https://openi.org.cn/pangu/)
 More new feature demos (image generation etc.) …… | See the end of this document ……

@@ -102,13 +103,13 @@ Chat analysis report generation | [Function plugin] Automatically generates a summary report after a run

 1. Download the project
 ```sh
-git clone https://github.com/binary-husky/chatgpt_academic.git
-cd chatgpt_academic
+git clone https://github.com/binary-husky/gpt_academic.git
+cd gpt_academic
 ```

 2. Configure API_KEY

-In `config.py`, configure the API KEY and other settings ([special network environment settings](https://github.com/binary-husky/gpt_academic/issues/1)).
+In `config.py`, configure the API KEY and other settings ([click to see how to handle special network environments](https://github.com/binary-husky/gpt_academic/issues/1)).

 (P.S. When the program runs, it first checks whether a private configuration file named `config_private.py` exists, and uses its values to override the same-named settings in `config.py`. If you understand this reading logic, we strongly recommend creating a new configuration file named `config_private.py` next to `config.py` and moving (copying) the settings from `config.py` into it. `config_private.py` is not tracked by git, which keeps your private information safer. P.S. The project also supports configuring most options via `environment variables`; see the `docker-compose` file for the environment-variable format. Reading priority: `environment variables` > `config_private.py` > `config.py`)

@@ -124,6 +125,7 @@ conda activate gptac_venv                 # activate the anaconda environment
 python -m pip install -r requirements.txt # the same step as the pip installation
 ```

+
 <details><summary>Click here to expand if you need Tsinghua ChatGLM / Fudan MOSS backend support</summary>
 <p>

@@ -150,19 +152,13 @@ AVAIL_LLM_MODELS = ["gpt-3.5-turbo", "api2d-gpt-3.5-turbo", "gpt-4", "api2d-gpt-
 python main.py
 ```

-5. Test the function plugins
-```
-- Test the function-plugin template (asks gpt what happened in history on this day); you can use it as a template to implement more complex features
-    Click "[函数插件模板Demo] 历史上的今天"
-```
-
 ## Installation - Method 2: Use Docker

-1. ChatGPT only (recommended for most people)
+1. ChatGPT only (recommended for most people; equivalent to docker-compose scheme 1)

 ``` sh
-git clone https://github.com/binary-husky/chatgpt_academic.git # download the project
-cd chatgpt_academic                                            # enter the directory
+git clone https://github.com/binary-husky/gpt_academic.git # download the project
+cd gpt_academic                                            # enter the directory
 nano config.py   # edit config.py with any text editor: configure "Proxy", "API_KEY", "WEB_PORT" (e.g. 50923), etc.
 docker build -t gpt-academic . # install

@@ -171,37 +167,45 @@ docker run --rm -it --net=host gpt-academic
 #(final step - option 2) On macOS/windows, you can only use the -p option to expose a container port (e.g. 50923) to a host port
 docker run --rm -it -e WEB_PORT=50923 -p 50923:50923 gpt-academic
 ```
+P.S. If you need the Latex-dependent plugin features, see the Wiki. Alternatively, you can get the Latex features directly via docker-compose (edit docker-compose.yml: keep scheme 4 and delete the other schemes).

 2. ChatGPT + ChatGLM + MOSS (requires familiarity with Docker)

 ``` sh
-# Edit docker-compose.yml: delete schemes 1 and 3, keep scheme 2. Adjust scheme 2's configuration in docker-compose.yml following the comments there
+# Edit docker-compose.yml: keep scheme 2 and delete the other schemes. Adjust scheme 2's configuration in docker-compose.yml following the comments there
 docker-compose up
 ```

 3. ChatGPT + LLAMA + PanGu + RWKV (requires familiarity with Docker)
 ``` sh
-# Edit docker-compose.yml: delete schemes 1 and 2, keep scheme 3. Adjust scheme 3's configuration in docker-compose.yml following the comments there
+# Edit docker-compose.yml: keep scheme 3 and delete the other schemes. Adjust scheme 3's configuration in docker-compose.yml following the comments there
 docker-compose up
 ```


 ## Installation - Method 3: Other deployment options
+1. One-click run script.
+Windows users entirely unfamiliar with the python environment can download the one-click run script published in [Release](https://github.com/binary-husky/gpt_academic/releases) to install the version without local models.
+The script is contributed from [oobabooga](https://github.com/oobabooga/one-click-installers).
+
+2. Run with docker-compose.
+Read docker-compose.yml and follow the instructions in it.

-1. How to use a reverse-proxy URL / Microsoft Azure API
+3. How to use a reverse-proxy URL
 Configure API_URL_REDIRECT as described in `config.py`.

-2. Remote cloud-server deployment (requires cloud-server knowledge and experience)
-Please visit [deployment wiki-1](https://github.com/binary-husky/chatgpt_academic/wiki/%E4%BA%91%E6%9C%8D%E5%8A%A1%E5%99%A8%E8%BF%9C%E7%A8%8B%E9%83%A8%E7%BD%B2%E6%8C%87%E5%8D%97)
+4. Microsoft Azure API
+Configure it as described in `config.py` (the four settings AZURE_ENDPOINT etc.)

-3. Use WSL2 (Windows Subsystem for Linux)
-Please visit [deployment wiki-2](https://github.com/binary-husky/chatgpt_academic/wiki/%E4%BD%BF%E7%94%A8WSL2%EF%BC%88Windows-Subsystem-for-Linux-%E5%AD%90%E7%B3%BB%E7%BB%9F%EF%BC%89%E9%83%A8%E7%BD%B2)
+5. Remote cloud-server deployment (requires cloud-server knowledge and experience).
+Please visit [deployment wiki-1](https://github.com/binary-husky/gpt_academic/wiki/%E4%BA%91%E6%9C%8D%E5%8A%A1%E5%99%A8%E8%BF%9C%E7%A8%8B%E9%83%A8%E7%BD%B2%E6%8C%87%E5%8D%97)

-4. How to run under a sub-path (e.g. `http://localhost/subpath`)
+6. Use WSL2 (Windows Subsystem for Linux).
+Please visit [deployment wiki-2](https://github.com/binary-husky/gpt_academic/wiki/%E4%BD%BF%E7%94%A8WSL2%EF%BC%88Windows-Subsystem-for-Linux-%E5%AD%90%E7%B3%BB%E7%BB%9F%EF%BC%89%E9%83%A8%E7%BD%B2)
+
+7. How to run under a sub-path (e.g. `http://localhost/subpath`).
 Please visit [FastAPI running instructions](docs/WithFastapi.md)

-5. Run with docker-compose
-Read docker-compose.yml and follow the instructions in it.
 ---
 # Advanced Usage
 ## Custom convenience buttons / custom function plugins
@@ -226,7 +230,7 @@ docker-compose up

 Write powerful function plugins to perform any task you can and cannot imagine.
 Writing and debugging plugins in this project is easy: as long as you have basic python knowledge, you can implement your own plugin features by following the template we provide.
-For details, see the [function plugin guide](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97).
+For details, see the [function plugin guide](https://github.com/binary-husky/gpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97).

 ---
 # Latest Update
@@ -234,38 +238,33 @@ docker-compose up

 1. Conversation saving. In the function plugin area, call `保存当前的对话` to save the current conversation as a readable + restorable html file;
 in addition, call `载入对话历史存档` in the function plugin area (drop-down menu) to restore a previous session.
-Tip: clicking `载入对话历史存档` without specifying a file lets you view the historical html archive cache; clicking `删除所有本地对话历史记录` deletes all html archive caches.
+Tip: clicking `载入对话历史存档` without specifying a file lets you view the historical html archive cache.
 <div align="center">
 <img src="https://user-images.githubusercontent.com/96192199/235222390-24a9acc0-680f-49f5-bc81-2f3161f1e049.png" width="500" >
 </div>

-
-
-2. Report generation. Most plugins generate a work report after they finish
+2. ⭐Latex/Arxiv paper translation⭐
 <div align="center">
-<img src="https://user-images.githubusercontent.com/96192199/227503770-fe29ce2c-53fd-47b0-b0ff-93805f0c2ff4.png" height="300" >
-<img src="https://user-images.githubusercontent.com/96192199/227504617-7a497bb3-0a2a-4b50-9a8a-95ae60ea7afd.png" height="300" >
-<img src="https://user-images.githubusercontent.com/96192199/227504005-efeaefe0-b687-49d0-bf95-2d7b7e66c348.png" height="300" >
+<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/002a1a75-ace0-4e6a-94e2-ec1406a746f1" height="250" > ===>
+<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/9fdcc391-f823-464f-9322-f8719677043b" height="250" >
 </div>

-3. Modular function design: simple interfaces that nevertheless support powerful features
+3. Report generation. Most plugins generate a work report after they finish
 <div align="center">
-<img src="https://user-images.githubusercontent.com/96192199/229288270-093643c1-0018-487a-81e6-1d7809b6e90f.png" height="400" >
-<img src="https://user-images.githubusercontent.com/96192199/227504931-19955f78-45cd-4d1c-adac-e71e50957915.png" height="400" >
+<img src="https://user-images.githubusercontent.com/96192199/227503770-fe29ce2c-53fd-47b0-b0ff-93805f0c2ff4.png" height="250" >
+<img src="https://user-images.githubusercontent.com/96192199/227504617-7a497bb3-0a2a-4b50-9a8a-95ae60ea7afd.png" height="250" >
 </div>

-4. This is an open-source project that can "translate and explain itself"
+4. Modular function design: simple interfaces that nevertheless support powerful features
 <div align="center">
-<img src="https://user-images.githubusercontent.com/96192199/226936850-c77d7183-0749-4c1c-9875-fd4891842d0c.png" width="500" >
-</div>
-
-5. Translating and explaining other open-source projects is no problem either
-<div align="center">
-<img src="https://user-images.githubusercontent.com/96192199/226935232-6b6a73ce-8900-4aee-93f9-733c7e6fef53.png" width="500" >
+<img src="https://user-images.githubusercontent.com/96192199/229288270-093643c1-0018-487a-81e6-1d7809b6e90f.png" height="400" >
+<img src="https://user-images.githubusercontent.com/96192199/227504931-19955f78-45cd-4d1c-adac-e71e50957915.png" height="400" >
 </div>

+5. Translating and explaining other open-source projects
 <div align="center">
-<img src="https://user-images.githubusercontent.com/96192199/226969067-968a27c1-1b9c-486b-8b81-ab2de8d3f88a.png" width="500" >
+<img src="https://user-images.githubusercontent.com/96192199/226935232-6b6a73ce-8900-4aee-93f9-733c7e6fef53.png" height="250" >
+<img src="https://user-images.githubusercontent.com/96192199/226969067-968a27c1-1b9c-486b-8b81-ab2de8d3f88a.png" height="250" >
 </div>

 6. A small feature that decorates [live2d](https://github.com/fghrsh/live2d_demo) (off by default; requires editing `config.py`)
@@ -290,13 +289,15 @@ Tip: clicking `载入对话历史存档` without specifying a file lets you view

 10. Latex full-text proofreading and correction
 <div align="center">
-<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/651ccd98-02c9-4464-91e1-77a6b7d1b033" width="500" >
+<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/651ccd98-02c9-4464-91e1-77a6b7d1b033" height="200" > ===>
+<img src="https://github.com/binary-husky/gpt_academic/assets/96192199/476f66d9-7716-4537-b5c1-735372c25adb" height="200">
 </div>


+
 ## Version:
 - version 3.5 (Todo): call all of this project's function plugins with natural language (high priority)
-- version 3.4 (Todo): improve multithreading support for the chatglm local large model
+- version 3.4: + arxiv paper translation and latex paper proofreading features
 - version 3.3: + internet information aggregation feature
 - version 3.2: function plugins support more parameter interfaces (conversation saving, interpreting code in any language + querying any combination of LLMs at the same time)
 - version 3.1: support querying multiple gpt models at once! support api2d, support load balancing across multiple apikeys
@@ -314,29 +315,32 @@ gpt_academic developer QQ group 2: 610599535

 - Known issues
     - Some browser translation plugins interfere with this software's frontend
-    - Official Gradio currently has many compatibility bugs; be sure to install Gradio via requirement.txt
+    - Official Gradio currently has many compatibility bugs; be sure to install Gradio via `requirement.txt`

 ## References and learning

 ```
-The code references the designs of many other excellent projects, mainly including:
+The code references the designs of many other excellent projects, in no particular order:

-# Project 1: Tsinghua ChatGLM-6B:
+# Tsinghua ChatGLM-6B:
 https://github.com/THUDM/ChatGLM-6B

-# Project 2: Tsinghua JittorLLMs:
+# Tsinghua JittorLLMs:
 https://github.com/Jittor/JittorLLMs

-# Project 3: Edge-GPT:
+# ChatPaper:
+https://github.com/kaixindelele/ChatPaper
+
+# Edge-GPT:
 https://github.com/acheong08/EdgeGPT

-# Project 4: ChuanhuChatGPT:
+# ChuanhuChatGPT:
 https://github.com/GaiZhenbiao/ChuanhuChatGPT

-# Project 5: ChatPaper:
-https://github.com/kaixindelele/ChatPaper
+# Oobabooga one-click installer:
+https://github.com/oobabooga/one-click-installers

-# More:
+# More:
 https://github.com/gradio-app/gradio
 https://github.com/fghrsh/live2d_demo
 ```
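The feature table above mentions that multiple api-keys can coexist (`API_KEY="openai-key1,openai-key2,api2d-key3"`) and that version 3.1 load-balances across them. A minimal sketch of how such a comma-separated key string could be split and balanced; this is an illustration only, not the project's actual implementation:

```python
import random

API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey1"  # placeholder keys

def select_api_key(api_key_string):
    # split on English commas and pick one key per request
    candidates = [k.strip() for k in api_key_string.split(',') if k.strip()]
    return random.choice(candidates)

print(select_api_key(API_KEY))
```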
app.py CHANGED
@@ -2,7 +2,7 @@ import os; os.environ['no_proxy'] = '*' # avoid unexpected pollution from proxy networks

 def main():
     import subprocess, sys
-    subprocess.check_call([sys.executable, '-m', 'pip', 'install', '-r', 'requirements.txt'])
+    subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'gradio-stable-fork'])
     import gradio as gr
     if gr.__version__ not in ['3.28.3','3.32.3']: assert False, "请用 pip install -r requirements.txt 安装依赖"
     from request_llm.bridge_all import predict
@@ -158,7 +158,7 @@ def main():
        for k in crazy_fns:
            if not crazy_fns[k].get("AsButton", True): continue
            click_handle = crazy_fns[k]["Button"].click(ArgsGeneralWrapper(crazy_fns[k]["Function"]), [*input_combo, gr.State(PORT)], output_combo)
-            click_handle.then(on_report_generated, [file_upload, chatbot], [file_upload, chatbot])
+            click_handle.then(on_report_generated, [cookies, file_upload, chatbot], [cookies, file_upload, chatbot])
            cancel_handles.append(click_handle)
        # function plugins: interaction between the drop-down menu and the adaptive button
        def on_dropdown_changed(k):
@@ -178,7 +178,7 @@ def main():
            if k in [r"打开插件列表", r"请先从插件列表中选择"]: return
            yield from ArgsGeneralWrapper(crazy_fns[k]["Function"])(*args, **kwargs)
        click_handle = switchy_bt.click(route,[switchy_bt, *input_combo, gr.State(PORT)], output_combo)
-        click_handle.then(on_report_generated, [file_upload, chatbot], [file_upload, chatbot])
+        click_handle.then(on_report_generated, [cookies, file_upload, chatbot], [cookies, file_upload, chatbot])
        cancel_handles.append(click_handle)
        # register the stop button's callback
        stopBtn.click(fn=None, inputs=None, outputs=None, cancels=cancel_handles)
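The `.then(...)` call chains `on_report_generated` to run after each plugin callback completes; this commit additionally threads `cookies` through that follow-up. A minimal sketch of the same chaining pattern in plain Gradio 3.x, independent of this project's UI:

```python
import gradio as gr

def work(x):
    return f"processed: {x}"

def collect_report(y):
    # stand-in for on_report_generated: post-process after the main callback
    return y + " (report collected)"

with gr.Blocks() as demo:
    inp = gr.Textbox(label="input")
    out = gr.Textbox(label="output")
    btn = gr.Button("run")
    handle = btn.click(work, [inp], [out])
    handle.then(collect_report, [out], [out])  # runs only after work() finishes

# demo.launch()  # uncomment to serve
```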
colorful.py CHANGED
@@ -34,58 +34,28 @@ def print亮紫(*kw,**kargs):
 def print亮靛(*kw,**kargs):
     print("\033[1;36m",*kw,"\033[0m",**kargs)

-
-
-def print亮红(*kw,**kargs):
-    print("\033[1;31m",*kw,"\033[0m",**kargs)
-def print亮绿(*kw,**kargs):
-    print("\033[1;32m",*kw,"\033[0m",**kargs)
-def print亮黄(*kw,**kargs):
-    print("\033[1;33m",*kw,"\033[0m",**kargs)
-def print亮蓝(*kw,**kargs):
-    print("\033[1;34m",*kw,"\033[0m",**kargs)
-def print亮紫(*kw,**kargs):
-    print("\033[1;35m",*kw,"\033[0m",**kargs)
-def print亮靛(*kw,**kargs):
-    print("\033[1;36m",*kw,"\033[0m",**kargs)
-
-print_red = print红
-print_green = print绿
-print_yellow = print黄
-print_blue = print蓝
-print_purple = print紫
-print_indigo = print靛
-
-print_bold_red = print亮红
-print_bold_green = print亮绿
-print_bold_yellow = print亮黄
-print_bold_blue = print亮蓝
-print_bold_purple = print亮紫
-print_bold_indigo = print亮靛
-
-if not stdout.isatty():
-    # redirection, avoid a fucked up log file
-    print红 = print
-    print绿 = print
-    print黄 = print
-    print蓝 = print
-    print紫 = print
-    print靛 = print
-    print亮红 = print
-    print亮绿 = print
-    print亮黄 = print
-    print亮蓝 = print
-    print亮紫 = print
-    print亮靛 = print
-    print_red = print
-    print_green = print
-    print_yellow = print
-    print_blue = print
-    print_purple = print
-    print_indigo = print
-    print_bold_red = print
-    print_bold_green = print
-    print_bold_yellow = print
-    print_bold_blue = print
-    print_bold_purple = print
-    print_bold_indigo = print
+# Do you like the elegance of Chinese characters?
+def sprint红(*kw):
+    return "\033[0;31m"+' '.join(kw)+"\033[0m"
+def sprint绿(*kw):
+    return "\033[0;32m"+' '.join(kw)+"\033[0m"
+def sprint黄(*kw):
+    return "\033[0;33m"+' '.join(kw)+"\033[0m"
+def sprint蓝(*kw):
+    return "\033[0;34m"+' '.join(kw)+"\033[0m"
+def sprint紫(*kw):
+    return "\033[0;35m"+' '.join(kw)+"\033[0m"
+def sprint靛(*kw):
+    return "\033[0;36m"+' '.join(kw)+"\033[0m"
+def sprint亮红(*kw):
+    return "\033[1;31m"+' '.join(kw)+"\033[0m"
+def sprint亮绿(*kw):
+    return "\033[1;32m"+' '.join(kw)+"\033[0m"
+def sprint亮黄(*kw):
+    return "\033[1;33m"+' '.join(kw)+"\033[0m"
+def sprint亮蓝(*kw):
+    return "\033[1;34m"+' '.join(kw)+"\033[0m"
+def sprint亮紫(*kw):
+    return "\033[1;35m"+' '.join(kw)+"\033[0m"
+def sprint亮靛(*kw):
+    return "\033[1;36m"+' '.join(kw)+"\033[0m"
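Unlike the `print…` helpers, the new `sprint…` variants return the ANSI-colored string instead of printing it, so fragments can be composed into one message. A small usage sketch, assuming the repository root is on the import path:

```python
from colorful import sprint红, sprint亮绿

# compose colored fragments into a single message, then print once
msg = "status: " + sprint亮绿("ok") + " | errors: " + sprint红("none")
print(msg)
```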
config.py CHANGED
@@ -1,6 +1,7 @@
 # [step 1]>> e.g.: API_KEY = "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" (this key is invalid)
 API_KEY = "sk-此处填API密钥"    # multiple API-KEYs may be given, separated by English commas, e.g. API_KEY = "sk-openaikey1,sk-openaikey2,fkxxxx-api2dkey1,fkxxxx-api2dkey2"

+
 # [step 2]>> set to True to use a proxy; if deploying directly on an overseas server, do not change this
 USE_PROXY = False
 if USE_PROXY:
@@ -80,3 +81,10 @@ your bing cookies here
 # To use Slack Claude, see request_llm/README.md for a tutorial
 SLACK_CLAUDE_BOT_ID = ''
 SLACK_CLAUDE_USER_TOKEN = ''
+
+
+# To use AZURE, see the extra document docs\use_azure.md for details
+AZURE_ENDPOINT = "https://你的api名称.openai.azure.com/"
+AZURE_API_KEY = "填入azure openai api的密钥"
+AZURE_API_VERSION = "填入api版本"
+AZURE_ENGINE = "填入ENGINE"
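The README describes the configuration read priority as `environment variables` > `config_private.py` > `config.py`. A minimal sketch of that override order; `read_single_conf` is a hypothetical helper name (the project's real reader lives in toolbox.py and differs in detail):

```python
import os
import importlib

def read_single_conf(arg_name, default=None):
    # priority 1: environment variable
    if arg_name in os.environ:
        return os.environ[arg_name]
    # priority 2: config_private.py (not tracked by git), if present
    try:
        cfg_private = importlib.import_module('config_private')
        if hasattr(cfg_private, arg_name):
            return getattr(cfg_private, arg_name)
    except ModuleNotFoundError:
        pass
    # priority 3: config.py
    cfg = importlib.import_module('config')
    return getattr(cfg, arg_name, default)

print(read_single_conf('API_KEY'))
```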
crazy_functional.py CHANGED
@@ -112,11 +112,11 @@ def get_crazy_functions():
            "AsButton": False, # add to the drop-down menu
            "Function": HotReload(解析项目本身)
        },
-        "[老旧的Demo] 把本项目源代码切换成全英文": {
-            # HotReload means hot reload: after editing a plugin's code, it takes effect without restarting the program
-            "AsButton": False, # add to the drop-down menu
-            "Function": HotReload(全项目切换英文)
-        },
+        # "[老旧的Demo] 把本项目源代码切换成全英文": {
+        #     # HotReload means hot reload: after editing a plugin's code, it takes effect without restarting the program
+        #     "AsButton": False, # add to the drop-down menu
+        #     "Function": HotReload(全项目切换英文)
+        # },
        "[插件demo] 历史上的今天": {
            # HotReload means hot reload: after editing a plugin's code, it takes effect without restarting the program
            "Function": HotReload(高阶功能模板函数)
@@ -126,7 +126,7 @@ def get_crazy_functions():
    ###################### Group 2 plugins ###########################
    # [Group 2 plugins]: thoroughly tested
    from crazy_functions.批量总结PDF文档 import 批量总结PDF文档
-    from crazy_functions.批量总结PDF文档pdfminer import 批量总结PDF文档pdfminer
+    # from crazy_functions.批量总结PDF文档pdfminer import 批量总结PDF文档pdfminer
    from crazy_functions.批量翻译PDF文档_多线程 import 批量翻译PDF文档
    from crazy_functions.谷歌检索小助手 import 谷歌检索小助手
    from crazy_functions.理解PDF文档内容 import 理解PDF文档内容标准文件输入
@@ -152,17 +152,16 @@ def get_crazy_functions():
            # HotReload means hot reload: after editing a plugin's code, it takes effect without restarting the program
            "Function": HotReload(批量总结PDF文档)
        },
-        "[测试功能] 批量总结PDF文档pdfminer": {
-            "Color": "stop",
-            "AsButton": False, # add to the drop-down menu
-            "Function": HotReload(批量总结PDF文档pdfminer)
-        },
+        # "[测试功能] 批量总结PDF文档pdfminer": {
+        #     "Color": "stop",
+        #     "AsButton": False, # add to the drop-down menu
+        #     "Function": HotReload(批量总结PDF文档pdfminer)
+        # },
        "谷歌学术检索助手(输入谷歌学术搜索页url)": {
            "Color": "stop",
            "AsButton": False, # add to the drop-down menu
            "Function": HotReload(谷歌检索小助手)
        },
-
        "理解PDF文档内容 (模仿ChatPDF)": {
            # HotReload means hot reload: after editing a plugin's code, it takes effect without restarting the program
            "Color": "stop",
@@ -181,7 +180,7 @@ def get_crazy_functions():
            "AsButton": False, # add to the drop-down menu
            "Function": HotReload(Latex英文纠错)
        },
-        "[测试功能] 中文Latex项目全文润色(输入路径或上传压缩包)": {
+        "中文Latex项目全文润色(输入路径或上传压缩包)": {
            # HotReload means hot reload: after editing a plugin's code, it takes effect without restarting the program
            "Color": "stop",
            "AsButton": False, # add to the drop-down menu
@@ -210,65 +209,96 @@ def get_crazy_functions():
    })

    ###################### Group 3 plugins ###########################
-    # [Group 3 plugins]: function plugins that have not been thoroughly tested, placed here
-    from crazy_functions.下载arxiv论文翻译摘要 import 下载arxiv论文并翻译摘要
-    function_plugins.update({
-        "一键下载arxiv论文并翻译摘要(先在input输入编号,如1812.10695)": {
-            "Color": "stop",
-            "AsButton": False, # add to the drop-down menu
-            "Function": HotReload(下载arxiv论文并翻译摘要)
-        }
-    })
+    # [Group 3 plugins]: function plugins that have not been thoroughly tested

-    from crazy_functions.联网的ChatGPT import 连接网络回答问题
-    function_plugins.update({
-        "连接网络回答问题(先输入问题,再点击按钮,需要访问谷歌)": {
-            "Color": "stop",
-            "AsButton": False, # add to the drop-down menu
-            "Function": HotReload(连接网络回答问题)
-        }
-    })
+    try:
+        from crazy_functions.下载arxiv论文翻译摘要 import 下载arxiv论文并翻译摘要
+        function_plugins.update({
+            "一键下载arxiv论文并翻译摘要(先在input输入编号,如1812.10695)": {
+                "Color": "stop",
+                "AsButton": False, # add to the drop-down menu
+                "Function": HotReload(下载arxiv论文并翻译摘要)
+            }
+        })
+    except:
+        print('Load function plugin failed')

-    from crazy_functions.解析项目源代码 import 解析任意code项目
-    function_plugins.update({
-        "解析项目源代码(手动指定和筛选源代码文件类型)": {
-            "Color": "stop",
-            "AsButton": False,
-            "AdvancedArgs": True, # when invoked, bring up the advanced-argument input area (default False)
-            "ArgsReminder": "输入时用逗号隔开, *代表通配符, 加了^代表不匹配; 不输入代表全部匹配。例如: \"*.c, ^*.cpp, config.toml, ^*.toml\"", # hint shown in the advanced-argument input area
-            "Function": HotReload(解析任意code项目)
-        },
-    })
-    from crazy_functions.询问多个大语言模型 import 同时问询_指定模型
-    function_plugins.update({
-        "询问多个GPT模型(手动指定询问哪些模型)": {
-            "Color": "stop",
-            "AsButton": False,
-            "AdvancedArgs": True, # when invoked, bring up the advanced-argument input area (default False)
-            "ArgsReminder": "支持任意数量的llm接口,用&符号分隔。例如chatglm&gpt-3.5-turbo&api2d-gpt-4", # hint shown in the advanced-argument input area
-            "Function": HotReload(同时问询_指定模型)
-        },
-    })
-    from crazy_functions.图片生成 import 图片生成
-    function_plugins.update({
-        "图片生成(先切换模型到openai或api2d)": {
-            "Color": "stop",
-            "AsButton": False,
-            "AdvancedArgs": True, # when invoked, bring up the advanced-argument input area (default False)
-            "ArgsReminder": "在这里输入分辨率, 如256x256(默认)", # hint shown in the advanced-argument input area
-            "Function": HotReload(图片生成)
-        },
-    })
-    from crazy_functions.总结音视频 import 总结音视频
-    function_plugins.update({
-        "批量总结音视频(输入路径或上传压缩包)": {
-            "Color": "stop",
-            "AsButton": False,
-            "AdvancedArgs": True,
-            "ArgsReminder": "调用openai api 使用whisper-1模型, 目前支持的格式:mp4, m4a, wav, mpga, mpeg, mp3。此处可以输入解析提示,例如:解析为简体中文(默认)。",
-            "Function": HotReload(总结音视频)
-        }
-    })
+    try:
+        from crazy_functions.联网的ChatGPT import 连接网络回答问题
+        function_plugins.update({
+            "连接网络回答问题(输入问题后点击该插件,需要访问谷歌)": {
+                "Color": "stop",
+                "AsButton": False, # add to the drop-down menu
+                "Function": HotReload(连接网络回答问题)
+            }
+        })
+        from crazy_functions.联网的ChatGPT_bing版 import 连接bing搜索回答问题
+        function_plugins.update({
+            "连接网络回答问题(中文Bing版,输入问题后点击该插件)": {
+                "Color": "stop",
+                "AsButton": False, # add to the drop-down menu
+                "Function": HotReload(连接bing搜索回答问题)
+            }
+        })
+    except:
+        print('Load function plugin failed')
+
+    try:
+        from crazy_functions.解析项目源代码 import 解析任意code项目
+        function_plugins.update({
+            "解析项目源代码(手动指定和筛选源代码文件类型)": {
+                "Color": "stop",
+                "AsButton": False,
+                "AdvancedArgs": True, # when invoked, bring up the advanced-argument input area (default False)
+                "ArgsReminder": "输入时用逗号隔开, *代表通配符, 加了^代表不匹配; 不输入代表全部匹配。例如: \"*.c, ^*.cpp, config.toml, ^*.toml\"", # hint shown in the advanced-argument input area
+                "Function": HotReload(解析任意code项目)
+            },
+        })
+    except:
+        print('Load function plugin failed')
+
+    try:
+        from crazy_functions.询问多个大语言模型 import 同时问询_指定模型
+        function_plugins.update({
+            "询问多个GPT模型(手动指定询问哪些模型)": {
+                "Color": "stop",
+                "AsButton": False,
+                "AdvancedArgs": True, # when invoked, bring up the advanced-argument input area (default False)
+                "ArgsReminder": "支持任意数量的llm接口,用&符号分隔。例如chatglm&gpt-3.5-turbo&api2d-gpt-4", # hint shown in the advanced-argument input area
+                "Function": HotReload(同时问询_指定模型)
+            },
+        })
+    except:
+        print('Load function plugin failed')
+
+    try:
+        from crazy_functions.图片生成 import 图片生成
+        function_plugins.update({
+            "图片生成(先切换模型到openai或api2d)": {
+                "Color": "stop",
+                "AsButton": False,
+                "AdvancedArgs": True, # when invoked, bring up the advanced-argument input area (default False)
+                "ArgsReminder": "在这里输入分辨率, 如256x256(默认)", # hint shown in the advanced-argument input area
+                "Function": HotReload(图片生成)
+            },
+        })
+    except:
+        print('Load function plugin failed')
+
+    try:
+        from crazy_functions.总结音视频 import 总结音视频
+        function_plugins.update({
+            "批量总结音视频(输入路径或上传压缩包)": {
+                "Color": "stop",
+                "AsButton": False,
+                "AdvancedArgs": True,
+                "ArgsReminder": "调用openai api 使用whisper-1模型, 目前支持的格式:mp4, m4a, wav, mpga, mpeg, mp3。此处可以输入解析提示,例如:解析为简体中文(默认)。",
+                "Function": HotReload(总结音视频)
+            }
+        })
+    except:
+        print('Load function plugin failed')
    try:
        from crazy_functions.数学动画生成manim import 动画生成
        function_plugins.update({
@@ -295,5 +325,83 @@ def get_crazy_functions():
    except:
        print('Load function plugin failed')

-    ###################### Group n plugins ###########################
+    try:
+        from crazy_functions.Langchain知识库 import 知识库问答
+        function_plugins.update({
+            "[功能尚不稳定] 构建知识库(请先上传文件素材)": {
+                "Color": "stop",
+                "AsButton": False,
+                "AdvancedArgs": True,
+                "ArgsReminder": "待注入的知识库名称id, 默认为default",
+                "Function": HotReload(知识库问答)
+            }
+        })
+    except:
+        print('Load function plugin failed')
+
+    try:
+        from crazy_functions.Langchain知识库 import 读取知识库作答
+        function_plugins.update({
+            "[功能尚不稳定] 知识库问答": {
+                "Color": "stop",
+                "AsButton": False,
+                "AdvancedArgs": True,
+                "ArgsReminder": "待提取的知识库名称id, 默认为default, 您需要首先调用构建知识库",
+                "Function": HotReload(读取知识库作答)
+            }
+        })
+    except:
+        print('Load function plugin failed')
+
+    try:
+        from crazy_functions.Latex输出PDF结果 import Latex英文纠错加PDF对比
+        function_plugins.update({
+            "Latex英文纠错+高亮修正位置 [需Latex]": {
+                "Color": "stop",
+                "AsButton": False,
+                "AdvancedArgs": True,
+                "ArgsReminder": "如果有必要, 请在此处追加更细致的矫错指令(使用英文)。",
+                "Function": HotReload(Latex英文纠错加PDF对比)
+            }
+        })
+        from crazy_functions.Latex输出PDF结果 import Latex翻译中文并重新编译PDF
+        function_plugins.update({
+            "Arxiv翻译(输入arxivID)[需Latex]": {
+                "Color": "stop",
+                "AsButton": False,
+                "AdvancedArgs": True,
+                "ArgsReminder":
+                    "如果有必要, 请在此处给出自定义翻译命令, 解决部分词汇翻译不准确的问题。 "+
+                    "例如当单词'agent'翻译不准确时, 请尝试把以下指令复制到高级参数区: " + 'If the term "agent" is used in this section, it should be translated to "智能体". ',
+                "Function": HotReload(Latex翻译中文并重新编译PDF)
+            }
+        })
+        function_plugins.update({
+            "本地论文翻译(上传Latex压缩包)[需Latex]": {
+                "Color": "stop",
+                "AsButton": False,
+                "AdvancedArgs": True,
+                "ArgsReminder":
+                    "如果有必要, 请在此处给出自定义翻译命令, 解决部分词汇翻译不准确的问题。 "+
+                    "例如当单词'agent'翻译不准确时, 请尝试把以下指令复制到高级参数区: " + 'If the term "agent" is used in this section, it should be translated to "智能体". ',
+                "Function": HotReload(Latex翻译中文并重新编译PDF)
+            }
+        })
+    except:
+        print('Load function plugin failed')
+
+    # try:
+    #     from crazy_functions.虚空终端 import 终端
+    #     function_plugins.update({
+    #         "超级终端": {
+    #             "Color": "stop",
+    #             "AsButton": False,
+    #             # "AdvancedArgs": True,
+    #             # "ArgsReminder": "",
+    #             "Function": HotReload(终端)
+    #         }
+    #     })
+    # except:
+    #     print('Load function plugin failed')
+
    return function_plugins
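As the pattern above shows, registering a plugin amounts to one `function_plugins.update({...})` entry guarded by try/except. A minimal sketch of adding one more entry; the module name `my_plugin` and function `示例插件` are hypothetical, not part of the repository:

```python
# Hypothetical example: register one more plugin following the pattern above.
try:
    from crazy_functions.my_plugin import 示例插件
    function_plugins.update({
        "示例插件(演示注册方式)": {
            "Color": "stop",          # button color
            "AsButton": False,        # show in the drop-down menu instead of as a button
            "AdvancedArgs": True,     # bring up the advanced-argument input area when invoked
            "ArgsReminder": "在这里输入高级参数",  # hint shown in the advanced-argument area
            "Function": HotReload(示例插件)  # HotReload re-imports the module on each call
        }
    })
except:
    print('Load function plugin failed')
```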
crazy_functions/Langchain知识库.py ADDED
@@ -0,0 +1,107 @@
+from toolbox import CatchException, update_ui, ProxyNetworkActivate
+from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive, get_files_from_everything
+
+
+
+@CatchException
+def 知识库问答(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
+    """
+    txt             text entered by the user in the input field, e.g. a passage to translate, or a path containing files to process
+    llm_kwargs      gpt model parameters such as temperature and top_p, usually passed through as-is
+    plugin_kwargs   plugin parameters, currently unused
+    chatbot         handle of the chat display box, used to show output to the user
+    history         chat history, the context so far
+    system_prompt   silent reminder for gpt
+    web_port        the port the software is currently running on
+    """
+    history = []    # clear the history to avoid input overflow
+    chatbot.append(("这是什么功能?", "[Local Message] 从一批文件(txt, md, tex)中读取数据构建知识库, 然后进行问答。"))
+    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
+
+    # resolve deps
+    try:
+        from zh_langchain import construct_vector_store
+        from langchain.embeddings.huggingface import HuggingFaceEmbeddings
+        from .crazy_utils import knowledge_archive_interface
+    except Exception as e:
+        chatbot.append(
+            ["依赖不足",
+             "导入依赖失败。正在尝试自动安装,请查看终端的输出或耐心等待..."]
+        )
+        yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
+        from .crazy_utils import try_install_deps
+        try_install_deps(['zh_langchain==0.2.1'])
+
+    # < --------------------read arguments--------------- >
+    if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
+    kai_id = plugin_kwargs.get("advanced_arg", 'default')
+
+    # < --------------------read files--------------- >
+    file_manifest = []
+    spl = ["txt", "doc", "docx", "email", "epub", "html", "json", "md", "msg", "pdf", "ppt", "pptx", "rtf"]
+    for sp in spl:
+        _, file_manifest_tmp, _ = get_files_from_everything(txt, type=f'.{sp}')
+        file_manifest += file_manifest_tmp
+
+    if len(file_manifest) == 0:
+        chatbot.append(["没有找到任何可读取文件", "当前支持的格式包括: txt, md, docx, pptx, pdf, json等"])
+        yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
+        return
+
+    # < -------------------warm up the text-vectorization module--------------- >
+    chatbot.append(['<br/>'.join(file_manifest), "正在预热文本向量化模组, 如果是第一次运行, 将消耗较长时间下载中文向量化模型..."])
+    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
+    print('Checking Text2vec ...')
+    from langchain.embeddings.huggingface import HuggingFaceEmbeddings
+    with ProxyNetworkActivate():    # temporarily activate the proxy network
+        HuggingFaceEmbeddings(model_name="GanymedeNil/text2vec-large-chinese")
+
+    # < -------------------build the knowledge base--------------- >
+    chatbot.append(['<br/>'.join(file_manifest), "正在构建知识库..."])
+    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
+    print('Establishing knowledge archive ...')
+    with ProxyNetworkActivate():    # temporarily activate the proxy network
+        kai = knowledge_archive_interface()
+        kai.feed_archive(file_manifest=file_manifest, id=kai_id)
+    kai_files = kai.get_loaded_file()
+    kai_files = '<br/>'.join(kai_files)
+    # chatbot.append(['知识库构建成功', "正在将知识库存储至cookie中"])
+    # yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
+    # chatbot._cookies['langchain_plugin_embedding'] = kai.get_current_archive_id()
+    # chatbot._cookies['lock_plugin'] = 'crazy_functions.Langchain知识库->读取知识库作答'
+    # chatbot.append(['完成', "“根据知识库作答”函数插件已经接管问答系统, 提问吧! 但注意, 您接下来不能再使用其他插件了,刷新页面即可以退出知识库问答模式。"])
+    chatbot.append(['构建完成', f"当前知识库内的有效文件:\n\n---\n\n{kai_files}\n\n---\n\n请切换至“知识库问答”插件进行知识库访问, 或者使用此插件继续上传更多文件。"])
+    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI  # requesting gpt takes a while, so refresh the UI promptly first
+
+@CatchException
+def 读取知识库作答(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port=-1):
+    # resolve deps
+    try:
+        from zh_langchain import construct_vector_store
+        from langchain.embeddings.huggingface import HuggingFaceEmbeddings
+        from .crazy_utils import knowledge_archive_interface
+    except Exception as e:
+        chatbot.append(["依赖不足", "导入依赖失败。正在尝试自动安装,请查看终端的输出或耐心等待..."])
+        yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
+        from .crazy_utils import try_install_deps
+        try_install_deps(['zh_langchain==0.2.1'])
+
+    # < ------------------- --------------- >
+    kai = knowledge_archive_interface()
+
+    if 'langchain_plugin_embedding' in chatbot._cookies:
+        resp, prompt = kai.answer_with_archive_by_id(txt, chatbot._cookies['langchain_plugin_embedding'])
+    else:
+        if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
+        kai_id = plugin_kwargs.get("advanced_arg", 'default')
+        resp, prompt = kai.answer_with_archive_by_id(txt, kai_id)
+
+    chatbot.append((txt, '[Local Message] ' + prompt))
+    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI  # requesting gpt takes a while, so refresh the UI promptly first
+    gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
+        inputs=prompt, inputs_show_user=txt,
+        llm_kwargs=llm_kwargs, chatbot=chatbot, history=[],
+        sys_prompt=system_prompt
+    )
+    history.extend((prompt, gpt_say))
+    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI  # requesting gpt takes a while, so refresh the UI promptly first
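Both entry points are generator functions: every `yield from update_ui(...)` hands control back to the Gradio event loop so the chat view repaints. A standalone sketch of driving such a generator outside the UI (all names here are made up for illustration, not part of the repository):

```python
# Hypothetical stand-ins for update_ui and a plugin, for offline testing only.
def fake_update_ui(chatbot=None, history=None):
    if chatbot: print(chatbot[-1])   # pretend to repaint the chat view
    yield                            # hand control back to the caller, like update_ui does

def demo_plugin(txt, chatbot, history):
    chatbot.append((txt, "working..."))
    yield from fake_update_ui(chatbot=chatbot, history=history)
    chatbot.append((None, "done"))
    yield from fake_update_ui(chatbot=chatbot, history=history)

chatbot, history = [], []
for _ in demo_plugin("hello", chatbot, history):
    pass  # the real Gradio event loop consumes these yields
```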
crazy_functions/Latex全文润色.py CHANGED
@@ -238,3 +238,6 @@ def Latex英文纠错(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_p
         yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
         return
     yield from 多文件润色(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, language='en', mode='proofread')
+
+
+
crazy_functions/Latex输出PDF结果.py ADDED
@@ -0,0 +1,300 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ from toolbox import update_ui, trimmed_format_exc, get_conf, objdump, objload, promote_file_to_downloadzone
2
+ from toolbox import CatchException, report_execption, update_ui_lastest_msg, zip_result, gen_time_str
3
+ from functools import partial
4
+ import glob, os, requests, time
5
+ pj = os.path.join
6
+ ARXIV_CACHE_DIR = os.path.expanduser(f"~/arxiv_cache/")
7
+
8
+ # =================================== 工具函数 ===============================================
9
+ 专业词汇声明 = 'If the term "agent" is used in this section, it should be translated to "智能体". '
10
+ def switch_prompt(pfg, mode, more_requirement):
11
+ """
12
+ Generate prompts and system prompts based on the mode for proofreading or translating.
13
+ Args:
14
+ - pfg: Proofreader or Translator instance.
15
+ - mode: A string specifying the mode, either 'proofread' or 'translate_zh'.
16
+
17
+ Returns:
18
+ - inputs_array: A list of strings containing prompts for users to respond to.
19
+ - sys_prompt_array: A list of strings containing prompts for system prompts.
20
+ """
21
+ n_split = len(pfg.sp_file_contents)
22
+ if mode == 'proofread_en':
23
+ inputs_array = [r"Below is a section from an academic paper, proofread this section." +
24
+ r"Do not modify any latex command such as \section, \cite, \begin, \item and equations. " + more_requirement +
25
+ r"Answer me only with the revised text:" +
26
+ f"\n\n{frag}" for frag in pfg.sp_file_contents]
27
+ sys_prompt_array = ["You are a professional academic paper writer." for _ in range(n_split)]
28
+ elif mode == 'translate_zh':
29
+ inputs_array = [r"Below is a section from an English academic paper, translate it into Chinese. " + more_requirement +
30
+ r"Do not modify any latex command such as \section, \cite, \begin, \item and equations. " +
31
+ r"Answer me only with the translated text:" +
32
+ f"\n\n{frag}" for frag in pfg.sp_file_contents]
33
+ sys_prompt_array = ["You are a professional translator." for _ in range(n_split)]
34
+ else:
35
+ assert False, "未知指令"
36
+ return inputs_array, sys_prompt_array
37
+
38
+ def desend_to_extracted_folder_if_exist(project_folder):
39
+ """
40
+ Descend into the extracted folder if it exists, otherwise return the original folder.
41
+
42
+ Args:
43
+ - project_folder: A string specifying the folder path.
44
+
45
+ Returns:
46
+ - A string specifying the path to the extracted folder, or the original folder if there is no extracted folder.
47
+ """
48
+ maybe_dir = [f for f in glob.glob(f'{project_folder}/*') if os.path.isdir(f)]
49
+ if len(maybe_dir) == 0: return project_folder
50
+ if maybe_dir[0].endswith('.extract'): return maybe_dir[0]
51
+ return project_folder
52
+
53
+ def move_project(project_folder, arxiv_id=None):
54
+ """
55
+ Create a new work folder and copy the project folder to it.
56
+
57
+ Args:
58
+ - project_folder: A string specifying the folder path of the project.
59
+
60
+ Returns:
61
+ - A string specifying the path to the new work folder.
62
+ """
63
+ import shutil, time
64
+ time.sleep(2) # avoid time string conflict
65
+ if arxiv_id is not None:
66
+ new_workfolder = pj(ARXIV_CACHE_DIR, arxiv_id, 'workfolder')
67
+ else:
68
+ new_workfolder = f'gpt_log/{gen_time_str()}'
69
+ try:
70
+ shutil.rmtree(new_workfolder)
71
+ except:
72
+ pass
73
+
74
+ # align subfolder if there is a folder wrapper
75
+ items = glob.glob(pj(project_folder,'*'))
76
+ if len(glob.glob(pj(project_folder,'*.tex'))) == 0 and len(items) == 1:
77
+ if os.path.isdir(items[0]): project_folder = items[0]
78
+
79
+ shutil.copytree(src=project_folder, dst=new_workfolder)
80
+ return new_workfolder
81
+
82
+ def arxiv_download(chatbot, history, txt):
83
+ def check_cached_translation_pdf(arxiv_id):
84
+ translation_dir = pj(ARXIV_CACHE_DIR, arxiv_id, 'translation')
85
+ if not os.path.exists(translation_dir):
86
+ os.makedirs(translation_dir)
87
+ target_file = pj(translation_dir, 'translate_zh.pdf')
88
+ if os.path.exists(target_file):
89
+ promote_file_to_downloadzone(target_file, rename_file=None, chatbot=chatbot)
90
+ return target_file
91
+ return False
92
+ def is_float(s):
93
+ try:
94
+ float(s)
95
+ return True
96
+ except ValueError:
97
+ return False
98
+ if ('.' in txt) and ('/' not in txt) and is_float(txt): # is arxiv ID
99
+ txt = 'https://arxiv.org/abs/' + txt.strip()
100
+ if ('.' in txt) and ('/' not in txt) and is_float(txt[:10]): # is arxiv ID
101
+ txt = 'https://arxiv.org/abs/' + txt[:10]
102
+ if not txt.startswith('https://arxiv.org'):
103
+ return txt, None
104
+
105
+ # <-------------- inspect format ------------->
106
+ chatbot.append([f"检测到arxiv文档连接", '尝试下载 ...'])
107
+ yield from update_ui(chatbot=chatbot, history=history)
108
+ time.sleep(1) # 刷新界面
109
+
110
+ url_ = txt # https://arxiv.org/abs/1707.06690
111
+ if not txt.startswith('https://arxiv.org/abs/'):
112
+ msg = f"解析arxiv网址失败, 期望格式例如: https://arxiv.org/abs/1707.06690。实际得到格式: {url_}"
113
+ yield from update_ui_lastest_msg(msg, chatbot=chatbot, history=history) # 刷新界面
114
+ return msg, None
115
+ # <-------------- set format ------------->
116
+ arxiv_id = url_.split('/abs/')[-1]
117
+ if 'v' in arxiv_id: arxiv_id = arxiv_id[:10]
118
+ cached_translation_pdf = check_cached_translation_pdf(arxiv_id)
119
+ if cached_translation_pdf: return cached_translation_pdf, arxiv_id
120
+
121
+ url_tar = url_.replace('/abs/', '/e-print/')
122
+ translation_dir = pj(ARXIV_CACHE_DIR, arxiv_id, 'e-print')
123
+ extract_dst = pj(ARXIV_CACHE_DIR, arxiv_id, 'extract')
124
+ os.makedirs(translation_dir, exist_ok=True)
125
+
126
+ # <-------------- download arxiv source file ------------->
127
+ dst = pj(translation_dir, arxiv_id+'.tar')
128
+ if os.path.exists(dst):
129
+ yield from update_ui_lastest_msg("调用缓存", chatbot=chatbot, history=history) # 刷新界面
130
+ else:
131
+ yield from update_ui_lastest_msg("开始下载", chatbot=chatbot, history=history) # 刷新界面
132
+ proxies, = get_conf('proxies')
133
+ r = requests.get(url_tar, proxies=proxies)
134
+ with open(dst, 'wb+') as f:
135
+ f.write(r.content)
136
+ # <-------------- extract file ------------->
137
+ yield from update_ui_lastest_msg("下载完成", chatbot=chatbot, history=history) # 刷新界面
138
+ from toolbox import extract_archive
139
+ extract_archive(file_path=dst, dest_dir=extract_dst)
140
+ return extract_dst, arxiv_id
141
+ # ========================================= 插件主程序1 =====================================================
142
+
143
+
144
+ @CatchException
145
+ def Latex英文纠错加PDF对比(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
146
+ # <-------------- information about this plugin ------------->
147
+ chatbot.append([ "函数插件功能?",
148
+ "对整个Latex项目进行纠错, 用latex编译为PDF对修正处做高亮。函数插件贡献者: Binary-Husky。注意事项: 目前仅支持GPT3.5/GPT4,其他模型转化效果未知。目前对机器学习类文献转化效果最好,其他类型文献转化效果未知。仅在Windows系统进行了测试,其他操作系统表现未知。"])
149
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
150
+
151
+ # <-------------- more requirements ------------->
152
+ if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
153
+ more_req = plugin_kwargs.get("advanced_arg", "")
154
+ _switch_prompt_ = partial(switch_prompt, more_requirement=more_req)
155
+
156
+ # <-------------- check deps ------------->
157
+ try:
158
+ import glob, os, time, subprocess
159
+ subprocess.Popen(['pdflatex', '-version'])
160
+ from .latex_utils import Latex精细分解与转化, 编译Latex
161
+ except Exception as e:
162
+ chatbot.append([ f"解析项目: {txt}",
163
+ f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。安装方法https://tug.org/texlive/。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
164
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
165
+ return
166
+
167
+
168
+ # <-------------- clear history and read input ------------->
169
+ history = []
170
+ if os.path.exists(txt):
171
+ project_folder = txt
172
+ else:
173
+ if txt == "": txt = '空空如也的输入栏'
174
+ report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"找不到本地项目或无权访问: {txt}")
175
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
176
+ return
177
+ file_manifest = [f for f in glob.glob(f'{project_folder}/**/*.tex', recursive=True)]
178
+ if len(file_manifest) == 0:
179
+ report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"找不到任何.tex文件: {txt}")
180
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
181
+ return
182
+
183
+
184
+ # <-------------- if is a zip/tar file ------------->
185
+ project_folder = desend_to_extracted_folder_if_exist(project_folder)
186
+
187
+
188
+ # <-------------- move latex project away from temp folder ------------->
189
+ project_folder = move_project(project_folder, arxiv_id=None)
190
+
191
+
192
+ # <-------------- if merge_translate_zh is already generated, skip gpt req ------------->
193
+ if not os.path.exists(project_folder + '/merge_proofread_en.tex'):
194
+ yield from Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs,
195
+ chatbot, history, system_prompt, mode='proofread_en', switch_prompt=_switch_prompt_)
196
+
197
+
198
+ # <-------------- compile PDF ------------->
199
+ success = yield from 编译Latex(chatbot, history, main_file_original='merge', main_file_modified='merge_proofread_en',
200
+ work_folder_original=project_folder, work_folder_modified=project_folder, work_folder=project_folder)
201
+
202
+
203
+ # <-------------- zip PDF ------------->
204
+ zip_res = zip_result(project_folder)
205
+ if success:
206
+ chatbot.append((f"成功啦", '请查收结果(压缩包)...'))
207
+ yield from update_ui(chatbot=chatbot, history=history); time.sleep(1) # 刷新界面
208
+ promote_file_to_downloadzone(file=zip_res, chatbot=chatbot)
209
+ else:
210
+ chatbot.append((f"失败了", '虽然PDF生成失败了, 但请查收结果(压缩包), 内含已经翻译的Tex文档, 也是可读的, 您可以到Github Issue区, 用该压缩包+对话历史存档进行反馈 ...'))
211
+ yield from update_ui(chatbot=chatbot, history=history); time.sleep(1) # 刷新界面
212
+ promote_file_to_downloadzone(file=zip_res, chatbot=chatbot)
213
+
214
+ # <-------------- we are done ------------->
215
+ return success
216
+
217
+
218
+ # ========================================= 插件主程序2 =====================================================
219
+
220
+ @CatchException
221
+ def Latex翻译中文并重新编译PDF(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
222
+ # <-------------- information about this plugin ------------->
223
+ chatbot.append([
224
+ "函数插件功能?",
225
+ "对整个Latex项目进行翻译, 生成中文PDF。函数插件贡献者: Binary-Husky。注意事项: 此插件Windows支持最佳,Linux下必须使用Docker安装,详见项目主README.md。目前仅支持GPT3.5/GPT4,其他模型转化效果未知。目前对机器学习类文献转化效果最好,其他类型文献转化效果未知。"])
226
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
227
+
228
+ # <-------------- more requirements ------------->
229
+ if ("advanced_arg" in plugin_kwargs) and (plugin_kwargs["advanced_arg"] == ""): plugin_kwargs.pop("advanced_arg")
230
+ more_req = plugin_kwargs.get("advanced_arg", "")
231
+ _switch_prompt_ = partial(switch_prompt, more_requirement=more_req)
232
+
233
+ # <-------------- check deps ------------->
234
+ try:
235
+ import glob, os, time, subprocess
236
+ subprocess.Popen(['pdflatex', '-version'])
237
+ from .latex_utils import Latex精细分解与转化, 编译Latex
238
+ except Exception as e:
239
+ chatbot.append([ f"解析项目: {txt}",
240
+ f"尝试执行Latex指令失败。Latex没有安装, 或者不在环境变量PATH中。安装方法https://tug.org/texlive/。报错信息\n\n```\n\n{trimmed_format_exc()}\n\n```\n\n"])
241
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
242
+ return
243
+
244
+
245
+ # <-------------- clear history and read input ------------->
246
+ history = []
247
+ txt, arxiv_id = yield from arxiv_download(chatbot, history, txt)
248
+ if txt.endswith('.pdf'):
249
+ report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"发现已经存在翻译好的PDF文档")
250
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
251
+ return
252
+
253
+
254
+ if os.path.exists(txt):
255
+ project_folder = txt
256
+ else:
257
+ if txt == "": txt = '空空如也的输入栏'
258
+ report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"找不到本地项目或无权访问: {txt}")
259
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
260
+ return
261
+
262
+ file_manifest = [f for f in glob.glob(f'{project_folder}/**/*.tex', recursive=True)]
263
+ if len(file_manifest) == 0:
264
+ report_execption(chatbot, history, a = f"解析项目: {txt}", b = f"找不到任何.tex文件: {txt}")
265
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
266
+ return
267
+
268
+
269
+ # <-------------- if is a zip/tar file ------------->
270
+ project_folder = desend_to_extracted_folder_if_exist(project_folder)
271
+
272
+
273
+ # <-------------- move latex project away from temp folder ------------->
274
+ project_folder = move_project(project_folder, arxiv_id)
275
+
276
+
277
+ # <-------------- if merge_translate_zh is already generated, skip gpt req ------------->
278
+ if not os.path.exists(project_folder + '/merge_translate_zh.tex'):
279
+ yield from Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs,
280
+ chatbot, history, system_prompt, mode='translate_zh', switch_prompt=_switch_prompt_)
281
+
282
+
283
+ # <-------------- compile PDF ------------->
284
+ success = yield from 编译Latex(chatbot, history, main_file_original='merge', main_file_modified='merge_translate_zh', mode='translate_zh',
285
+ work_folder_original=project_folder, work_folder_modified=project_folder, work_folder=project_folder)
286
+
287
+ # <-------------- zip PDF ------------->
288
+ zip_res = zip_result(project_folder)
289
+ if success:
290
+ chatbot.append((f"成功啦", '请查收结果(压缩包)...'))
291
+ yield from update_ui(chatbot=chatbot, history=history); time.sleep(1) # 刷新界面
292
+ promote_file_to_downloadzone(file=zip_res, chatbot=chatbot)
293
+ else:
294
+ chatbot.append((f"失败了", '虽然PDF生成失败了, 但请查收结果(压缩包), 内含已经翻译的Tex文档, 也是可读的, 您可以到Github Issue区, 用该压缩包+对话历史存档进行反馈 ...'))
295
+ yield from update_ui(chatbot=chatbot, history=history); time.sleep(1) # 刷新界面
296
+ promote_file_to_downloadzone(file=zip_res, chatbot=chatbot)
297
+
298
+
299
+ # <-------------- we are done ------------->
300
+ return success
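Both plugin entry points above bind the optional `advanced_arg` into the prompt-switching callback with `functools.partial`, so downstream code can invoke the callback without knowing about the extra requirement. A minimal sketch of that pattern (the `switch_prompt` body here is a simplified stand-in, not the project's actual implementation):

```python
from functools import partial

# Simplified stand-in for the project's switch_prompt callback.
def switch_prompt(segments, mode, more_requirement=""):
    # Build one instruction per segment; the bound extra requirement is
    # appended to every prompt.
    inputs_array = [f"{mode}: {seg} {more_requirement}".strip() for seg in segments]
    sys_prompt_array = ["You are a professional academic paper writer."] * len(segments)
    return inputs_array, sys_prompt_array

more_req = "Translate into authentic Simplified Chinese."
_switch_prompt_ = partial(switch_prompt, more_requirement=more_req)

# Downstream callers only ever pass the two positional arguments.
inputs, sys_prompts = _switch_prompt_(["segment one", "segment two"], mode="translate_zh")
print(inputs[0])  # -> "translate_zh: segment one Translate into authentic Simplified Chinese."
```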
crazy_functions/crazy_functions_test.py CHANGED
@@ -3,6 +3,8 @@
3
  这个文件用于函数插件的单元测试
4
  运行方法 python crazy_functions/crazy_functions_test.py
5
  """
6
 
7
  def validate_path():
8
  import os, sys
@@ -10,10 +12,16 @@ def validate_path():
10
  root_dir_assume = os.path.abspath(os.path.dirname(__file__) + '/..')
11
  os.chdir(root_dir_assume)
12
  sys.path.append(root_dir_assume)
13
-
14
  validate_path() # validate path so you can run from base directory
15
  from colorful import *
16
  from toolbox import get_conf, ChatBotWithCookies
17
  proxies, WEB_PORT, LLM_MODEL, CONCURRENT_COUNT, AUTHENTICATION, CHATBOT_HEIGHT, LAYOUT, API_KEY = \
18
  get_conf('proxies', 'WEB_PORT', 'LLM_MODEL', 'CONCURRENT_COUNT', 'AUTHENTICATION', 'CHATBOT_HEIGHT', 'LAYOUT', 'API_KEY')
19
 
@@ -30,7 +38,43 @@ history = []
30
  system_prompt = "Serve me as a writing and programming assistant."
31
  web_port = 1024
32
 
33
-
34
  def test_解析一个Python项目():
35
  from crazy_functions.解析项目源代码 import 解析一个Python项目
36
  txt = "crazy_functions/test_project/python/dqn"
@@ -116,6 +160,56 @@ def test_Markdown多语言():
116
  for cookies, cb, hist, msg in Markdown翻译指定语言(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
117
  print(cb)
118
 
119
 
120
 
121
  # test_解析一个Python项目()
@@ -129,7 +223,9 @@ def test_Markdown多语言():
129
  # test_联网回答问题()
130
  # test_解析ipynb文件()
131
  # test_数学动画生成manim()
132
- test_Markdown多语言()
133
-
134
- input("程序完成,回车退出。")
135
- print("退出。")
3
  这个文件用于函数插件的单元测试
4
  运行方法 python crazy_functions/crazy_functions_test.py
5
  """
6
+
7
+ # ==============================================================================================================================
8
 
9
  def validate_path():
10
  import os, sys
 
12
  root_dir_assume = os.path.abspath(os.path.dirname(__file__) + '/..')
13
  os.chdir(root_dir_assume)
14
  sys.path.append(root_dir_assume)
 
15
  validate_path() # validate path so you can run from base directory
16
+
17
+ # ==============================================================================================================================
18
+
19
  from colorful import *
20
  from toolbox import get_conf, ChatBotWithCookies
21
+ import contextlib
22
+ import os
23
+ import sys
24
+ from functools import wraps
25
  proxies, WEB_PORT, LLM_MODEL, CONCURRENT_COUNT, AUTHENTICATION, CHATBOT_HEIGHT, LAYOUT, API_KEY = \
26
  get_conf('proxies', 'WEB_PORT', 'LLM_MODEL', 'CONCURRENT_COUNT', 'AUTHENTICATION', 'CHATBOT_HEIGHT', 'LAYOUT', 'API_KEY')
27
 
 
38
  system_prompt = "Serve me as a writing and programming assistant."
39
  web_port = 1024
40
 
41
+ # ==============================================================================================================================
42
+
43
+ def silence_stdout(func):
44
+ @wraps(func)
45
+ def wrapper(*args, **kwargs):
46
+ _original_stdout = sys.stdout
47
+ sys.stdout = open(os.devnull, 'w')
48
+ for q in func(*args, **kwargs):
49
+ sys.stdout = _original_stdout
50
+ yield q
51
+ sys.stdout = open(os.devnull, 'w')
52
+ sys.stdout.close()
53
+ sys.stdout = _original_stdout
54
+ return wrapper
55
+
56
+ class CLI_Printer():
57
+ def __init__(self) -> None:
58
+ self.pre_buf = ""
59
+
60
+ def print(self, buf):
61
+ bufp = ""
62
+ for index, chat in enumerate(buf):
63
+ a, b = chat
64
+ bufp += sprint亮靛('[Me]:' + a) + '\n'
65
+ bufp += '[GPT]:' + b
66
+ if index < len(buf)-1:
67
+ bufp += '\n'
68
+
69
+ if self.pre_buf!="" and bufp.startswith(self.pre_buf):
70
+ print(bufp[len(self.pre_buf):], end='')
71
+ else:
72
+ print('\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n'+bufp, end='')
73
+ self.pre_buf = bufp
74
+ return
75
+
76
+ cli_printer = CLI_Printer()
77
+ # ==============================================================================================================================
78
  def test_解析一个Python项目():
79
  from crazy_functions.解析项目源代码 import 解析一个Python项目
80
  txt = "crazy_functions/test_project/python/dqn"
 
160
  for cookies, cb, hist, msg in Markdown翻译指定语言(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
161
  print(cb)
162
 
163
+ def test_Langchain知识库():
164
+ from crazy_functions.Langchain知识库 import 知识库问答
165
+ txt = "./"
166
+ chatbot = ChatBotWithCookies(llm_kwargs)
167
+ for cookies, cb, hist, msg in silence_stdout(知识库问答)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
168
+ cli_printer.print(cb) # print(cb)
169
+
170
+ chatbot = ChatBotWithCookies(cookies)
171
+ from crazy_functions.Langchain知识库 import 读取知识库作答
172
+ txt = "What is the installation method?"
173
+ for cookies, cb, hist, msg in silence_stdout(读取知识库作答)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
174
+ cli_printer.print(cb) # print(cb)
175
+
176
+ def test_Langchain知识库读取():
177
+ from crazy_functions.Langchain知识库 import 读取知识库作答
178
+ txt = "远程云服务器部署?"
179
+ for cookies, cb, hist, msg in silence_stdout(读取知识库作答)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
180
+ cli_printer.print(cb) # print(cb)
181
+
182
+ def test_Latex():
183
+ from crazy_functions.Latex输出PDF结果 import Latex英文纠错加PDF对比, Latex翻译中文并重新编译PDF
184
+
185
+ # txt = r"https://arxiv.org/abs/1706.03762"
186
+ # txt = r"https://arxiv.org/abs/1902.03185"
187
+ # txt = r"https://arxiv.org/abs/2305.18290"
188
+ # txt = r"https://arxiv.org/abs/2305.17608"
189
+ # txt = r"https://arxiv.org/abs/2211.16068" # ACE
190
+ # txt = r"C:\Users\x\arxiv_cache\2211.16068\workfolder" # ACE
191
+ # txt = r"https://arxiv.org/abs/2002.09253"
192
+ # txt = r"https://arxiv.org/abs/2306.07831"
193
+ # txt = r"https://arxiv.org/abs/2212.10156"
194
+ # txt = r"https://arxiv.org/abs/2211.11559"
195
+ # txt = r"https://arxiv.org/abs/2303.08774"
196
+ txt = r"https://arxiv.org/abs/2303.12712"
197
+ # txt = r"C:\Users\fuqingxu\arxiv_cache\2303.12712\workfolder"
198
+
199
+
200
+ for cookies, cb, hist, msg in (Latex翻译中文并重新编译PDF)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
201
+ cli_printer.print(cb) # print(cb)
202
+
203
+
204
+
205
+ # txt = "2302.02948.tar"
206
+ # print(txt)
207
+ # main_tex, work_folder = Latex预处理(txt)
208
+ # print('main tex:', main_tex)
209
+ # res = 编译Latex(main_tex, work_folder)
210
+ # # for cookies, cb, hist, msg in silence_stdout(编译Latex)(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
211
+ # cli_printer.print(cb) # print(cb)
212
+
213
 
214
 
215
  # test_解析一个Python项目()
 
223
  # test_联网回答问题()
224
  # test_解析ipynb文件()
225
  # test_数学动画生成manim()
226
+ # test_Langchain知识库()
227
+ # test_Langchain知识库读取()
228
+ if __name__ == "__main__":
229
+ test_Latex()
230
+ input("程序完成,回车退出。")
231
+ print("退出。")
crazy_functions/crazy_utils.py CHANGED
@@ -1,4 +1,5 @@
1
  from toolbox import update_ui, get_conf, trimmed_format_exc
 
2
 
3
  def input_clipping(inputs, history, max_token_limit):
4
  import numpy as np
@@ -606,3 +607,142 @@ def get_files_from_everything(txt, type): # type='.md'
606
  success = False
607
 
608
  return success, file_manifest, project_folder
1
  from toolbox import update_ui, get_conf, trimmed_format_exc
2
+ import threading
3
 
4
  def input_clipping(inputs, history, max_token_limit):
5
  import numpy as np
 
607
  success = False
608
 
609
  return success, file_manifest, project_folder
610
+
611
+
612
+
613
+
614
+ def Singleton(cls):
615
+ _instance = {}
616
+
617
+ def _singleton(*args, **kargs):
618
+ if cls not in _instance:
619
+ _instance[cls] = cls(*args, **kargs)
620
+ return _instance[cls]
621
+
622
+ return _singleton
623
+
624
+
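The `Singleton` decorator above caches one instance per decorated class, so repeated constructor calls return the same object (constructor arguments are only honored on the first call). A quick usage sketch reusing the decorator, with a hypothetical class:

```python
@Singleton
class ConfigStore():
    def __init__(self):
        self.values = {}

a = ConfigStore()
b = ConfigStore()
assert a is b           # both names refer to the single cached instance
a.values['key'] = 1
print(b.values['key'])  # -> 1
```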
625
+ @Singleton
626
+ class knowledge_archive_interface():
627
+ def __init__(self) -> None:
628
+ self.threadLock = threading.Lock()
629
+ self.current_id = ""
630
+ self.kai_path = None
631
+ self.qa_handle = None
632
+ self.text2vec_large_chinese = None
633
+
634
+ def get_chinese_text2vec(self):
635
+ if self.text2vec_large_chinese is None:
636
+ # < -------------------预热文本向量化模组--------------- >
637
+ from toolbox import ProxyNetworkActivate
638
+ print('Checking Text2vec ...')
639
+ from langchain.embeddings.huggingface import HuggingFaceEmbeddings
640
+ with ProxyNetworkActivate(): # 临时地激活代理网络
641
+ self.text2vec_large_chinese = HuggingFaceEmbeddings(model_name="GanymedeNil/text2vec-large-chinese")
642
+
643
+ return self.text2vec_large_chinese
644
+
645
+
646
+ def feed_archive(self, file_manifest, id="default"):
647
+ self.threadLock.acquire()
648
+ # import uuid
649
+ self.current_id = id
650
+ from zh_langchain import construct_vector_store
651
+ self.qa_handle, self.kai_path = construct_vector_store(
652
+ vs_id=self.current_id,
653
+ files=file_manifest,
654
+ sentence_size=100,
655
+ history=[],
656
+ one_conent="",
657
+ one_content_segmentation="",
658
+ text2vec = self.get_chinese_text2vec(),
659
+ )
660
+ self.threadLock.release()
661
+
662
+ def get_current_archive_id(self):
663
+ return self.current_id
664
+
665
+ def get_loaded_file(self):
666
+ return self.qa_handle.get_loaded_file()
667
+
668
+ def answer_with_archive_by_id(self, txt, id):
669
+ self.threadLock.acquire()
670
+ if not self.current_id == id:
671
+ self.current_id = id
672
+ from zh_langchain import construct_vector_store
673
+ self.qa_handle, self.kai_path = construct_vector_store(
674
+ vs_id=self.current_id,
675
+ files=[],
676
+ sentence_size=100,
677
+ history=[],
678
+ one_conent="",
679
+ one_content_segmentation="",
680
+ text2vec = self.get_chinese_text2vec(),
681
+ )
682
+ VECTOR_SEARCH_SCORE_THRESHOLD = 0
683
+ VECTOR_SEARCH_TOP_K = 4
684
+ CHUNK_SIZE = 512
685
+ resp, prompt = self.qa_handle.get_knowledge_based_conent_test(
686
+ query = txt,
687
+ vs_path = self.kai_path,
688
+ score_threshold=VECTOR_SEARCH_SCORE_THRESHOLD,
689
+ vector_search_top_k=VECTOR_SEARCH_TOP_K,
690
+ chunk_conent=True,
691
+ chunk_size=CHUNK_SIZE,
692
+ text2vec = self.get_chinese_text2vec(),
693
+ )
694
+ self.threadLock.release()
695
+ return resp, prompt
696
+
697
+ def try_install_deps(deps):
698
+ for dep in deps:
699
+ import subprocess, sys
700
+ subprocess.check_call([sys.executable, '-m', 'pip', 'install', '--user', dep])
701
+
702
+
703
+ class construct_html():
704
+ def __init__(self) -> None:
705
+ self.css = """
706
+ .row {
707
+ display: flex;
708
+ flex-wrap: wrap;
709
+ }
710
+
711
+ .column {
712
+ flex: 1;
713
+ padding: 10px;
714
+ }
715
+
716
+ .table-header {
717
+ font-weight: bold;
718
+ border-bottom: 1px solid black;
719
+ }
720
+
721
+ .table-row {
722
+ border-bottom: 1px solid lightgray;
723
+ }
724
+
725
+ .table-cell {
726
+ padding: 5px;
727
+ }
728
+ """
729
+ self.html_string = f'<!DOCTYPE html><head><meta charset="utf-8"><title>翻译结果</title><style>{self.css}</style></head>'
730
+
731
+
732
+ def add_row(self, a, b):
733
+ tmp = """
734
+ <div class="row table-row">
735
+ <div class="column table-cell">REPLACE_A</div>
736
+ <div class="column table-cell">REPLACE_B</div>
737
+ </div>
738
+ """
739
+ from toolbox import markdown_convertion
740
+ tmp = tmp.replace('REPLACE_A', markdown_convertion(a))
741
+ tmp = tmp.replace('REPLACE_B', markdown_convertion(b))
742
+ self.html_string += tmp
743
+
744
+
745
+ def save_file(self, file_name):
746
+ with open(f'./gpt_log/{file_name}', 'w', encoding='utf8') as f:
747
+ f.write(self.html_string.encode('utf-8', 'ignore').decode())
748
+
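`construct_html` accumulates two-column rows (each cell is passed through `toolbox.markdown_convertion`, so Markdown in either column is rendered) and writes a standalone report into `./gpt_log/`. Roughly, it is used like this (the row contents are made up):

```python
ch = construct_html()
ch.add_row(a="original paragraph", b="translated paragraph")
ch.add_row(a="second paragraph", b="second translation")
ch.save_file("example.trans.html")  # written to ./gpt_log/example.trans.html
```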
crazy_functions/latex_utils.py ADDED
@@ -0,0 +1,773 @@
1
+ from toolbox import update_ui, update_ui_lastest_msg # 刷新Gradio前端界面
2
+ from toolbox import zip_folder, objdump, objload, promote_file_to_downloadzone
3
+ import os, shutil
4
+ import re
5
+ import numpy as np
6
+ pj = os.path.join
7
+
8
+ """
9
+ ========================================================================
10
+ Part One
11
+ Latex segmentation with a binary mask (PRESERVE=0, TRANSFORM=1)
12
+ ========================================================================
13
+ """
14
+ PRESERVE = 0
15
+ TRANSFORM = 1
16
+
17
+ def set_forbidden_text(text, mask, pattern, flags=0):
18
+ """
19
+ Add a preserve text area in this paper
20
+ e.g. with pattern = r"\\begin\{algorithm\}(.*?)\\end\{algorithm\}"
21
+ you can mask out (mask = PRESERVE, so that the text becomes untouchable for GPT)
22
+ everything between "\begin{algorithm}" and "\end{algorithm}"
23
+ """
24
+ if isinstance(pattern, list): pattern = '|'.join(pattern)
25
+ pattern_compile = re.compile(pattern, flags)
26
+ for res in pattern_compile.finditer(text):
27
+ mask[res.span()[0]:res.span()[1]] = PRESERVE
28
+ return text, mask
29
+
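`set_forbidden_text` works on a byte-per-character mask that parallels the text; regions matched by the pattern are flipped to PRESERVE so later stages skip them. A tiny standalone sketch of the mechanism (PRESERVE/TRANSFORM redefined locally):

```python
import re
import numpy as np

PRESERVE, TRANSFORM = 0, 1
text = r"intro \begin{equation} x=1 \end{equation} outro"
mask = np.zeros(len(text), dtype=np.uint8) + TRANSFORM  # everything editable by default

# Mark the equation block untouchable, exactly as set_forbidden_text does.
for m in re.finditer(r"\\begin\{equation\}(.*?)\\end\{equation\}", text, re.DOTALL):
    mask[m.span()[0]:m.span()[1]] = PRESERVE

print(''.join(c for c, f in zip(text, mask) if f == TRANSFORM))  # -> "intro  outro"
```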
30
+ def set_forbidden_text_careful_brace(text, mask, pattern, flags=0):
31
+ """
32
+ Add a preserve text area in this paper (the text becomes untouchable for GPT).
33
+ Count the number of braces so as to catch the complete text area.
34
+ e.g.
35
+ \caption{blablablablabla\textbf{blablabla}blablabla.}
36
+ """
37
+ pattern_compile = re.compile(pattern, flags)
38
+ for res in pattern_compile.finditer(text):
39
+ brace_level = -1
40
+ p = begin = end = res.regs[0][0]
41
+ for _ in range(1024*16):
42
+ if text[p] == '}' and brace_level == 0: break
43
+ elif text[p] == '}': brace_level -= 1
44
+ elif text[p] == '{': brace_level += 1
45
+ p += 1
46
+ end = p+1
47
+ mask[begin:end] = PRESERVE
48
+ return text, mask
49
+
50
+ def reverse_forbidden_text_careful_brace(text, mask, pattern, flags=0, forbid_wrapper=True):
51
+ """
52
+ Move an area out of the preserve area (make the text editable for GPT).
53
+ Count the number of braces so as to catch the complete text area.
54
+ e.g.
55
+ \caption{blablablablabla\textbf{blablabla}blablabla.}
56
+ """
57
+ pattern_compile = re.compile(pattern, flags)
58
+ for res in pattern_compile.finditer(text):
59
+ brace_level = 0
60
+ p = begin = end = res.regs[1][0]
61
+ for _ in range(1024*16):
62
+ if text[p] == '}' and brace_level == 0: break
63
+ elif text[p] == '}': brace_level -= 1
64
+ elif text[p] == '{': brace_level += 1
65
+ p += 1
66
+ end = p
67
+ mask[begin:end] = TRANSFORM
68
+ if forbid_wrapper:
69
+ mask[res.regs[0][0]:begin] = PRESERVE
70
+ mask[end:res.regs[0][1]] = PRESERVE
71
+ return text, mask
72
+
73
+ def set_forbidden_text_begin_end(text, mask, pattern, flags=0, limit_n_lines=42):
74
+ """
75
+ Find all \begin{} ... \end{} text blocks with fewer than limit_n_lines lines,
76
+ and add them to the preserve area
77
+ """
78
+ pattern_compile = re.compile(pattern, flags)
79
+ def search_with_line_limit(text, mask):
80
+ for res in pattern_compile.finditer(text):
81
+ cmd = res.group(1) # begin{what}
82
+ this = res.group(2) # content between begin and end
83
+ this_mask = mask[res.regs[2][0]:res.regs[2][1]]
84
+ white_list = ['document', 'abstract', 'lemma', 'definition', 'sproof',
85
+ 'em', 'emph', 'textit', 'textbf', 'itemize', 'enumerate']
86
+ if (cmd in white_list) or this.count('\n') >= limit_n_lines: # use a magical number 42
87
+ this, this_mask = search_with_line_limit(this, this_mask)
88
+ mask[res.regs[2][0]:res.regs[2][1]] = this_mask
89
+ else:
90
+ mask[res.regs[0][0]:res.regs[0][1]] = PRESERVE
91
+ return text, mask
92
+ return search_with_line_limit(text, mask)
93
+
94
+ class LinkedListNode():
95
+ """
96
+ Linked List Node
97
+ """
98
+ def __init__(self, string, preserve=True) -> None:
99
+ self.string = string
100
+ self.preserve = preserve
101
+ self.next = None
102
+ # self.begin_line = 0
103
+ # self.begin_char = 0
104
+
105
+ def convert_to_linklist(text, mask):
106
+ root = LinkedListNode("", preserve=True)
107
+ current_node = root
108
+ for c, m, i in zip(text, mask, range(len(text))):
109
+ if (m==PRESERVE and current_node.preserve) \
110
+ or (m==TRANSFORM and not current_node.preserve):
111
+ # add
112
+ current_node.string += c
113
+ else:
114
+ current_node.next = LinkedListNode(c, preserve=(m==PRESERVE))
115
+ current_node = current_node.next
116
+ return root
117
+ """
118
+ ========================================================================
119
+ Latex Merge File
120
+ ========================================================================
121
+ """
122
+
123
+ def 寻找Latex主文件(file_manifest, mode):
124
+ """
125
+ 在多Tex文档中,寻找主文件,必须包含documentclass,返回找到的第一个。
126
+ P.S. 但愿没人把latex模板放在里面传进来 (6.25 加入判定latex模板的代码)
127
+ """
128
+ canidates = []
129
+ for texf in file_manifest:
130
+ if os.path.basename(texf).startswith('merge'):
131
+ continue
132
+ with open(texf, 'r', encoding='utf8') as f:
133
+ file_content = f.read()
134
+ if r'\documentclass' in file_content:
135
+ canidates.append(texf)
136
+ else:
137
+ continue
138
+
139
+ if len(canidates) == 0:
140
+ raise RuntimeError('无法找到一个主Tex文件(包含documentclass关键字)')
141
+ elif len(canidates) == 1:
142
+ return canidates[0]
143
+ else: # if len(canidates) >= 2 通过一些Latex模板中常见(但通常不会出现在正文)的单词,对不同latex源文件扣分,取评分最高者返回
144
+ canidates_score = []
145
+ # 给出一些判定模板文档的词作为扣分项
146
+ unexpected_words = [r'\LaTeX', 'manuscript', 'Guidelines', 'font', 'citations', 'rejected', 'blind review', 'reviewers']
147
+ expected_words = [r'\input', r'\ref', r'\cite'] # 使用原始字符串, 避免 '\r' 被解释为回车符
148
+ for texf in canidates:
149
+ canidates_score.append(0)
150
+ with open(texf, 'r', encoding='utf8') as f:
151
+ file_content = f.read()
152
+ for uw in unexpected_words:
153
+ if uw in file_content:
154
+ canidates_score[-1] -= 1
155
+ for uw in expected_words:
156
+ if uw in file_content:
157
+ canidates_score[-1] += 1
158
+ select = np.argmax(canidates_score) # 取评分最高者返回
159
+ return canidates[select]
160
+
161
+ def rm_comments(main_file):
162
+ new_file_remove_comment_lines = []
163
+ for l in main_file.splitlines():
164
+ # 删除整行的空注释
165
+ if l.lstrip().startswith("%"):
166
+ pass
167
+ else:
168
+ new_file_remove_comment_lines.append(l)
169
+ main_file = '\n'.join(new_file_remove_comment_lines)
170
+ # main_file = re.sub(r"\\include{(.*?)}", r"\\input{\1}", main_file) # 将 \include 命令转换为 \input 命令
171
+ main_file = re.sub(r'(?<!\\)%.*', '', main_file) # 使用正则表达式查找半行注释, 并替换为空字符串
172
+ return main_file
173
+
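The half-line comment regex in `rm_comments` uses a negative lookbehind so escaped percent signs (`\%`) survive while trailing comments are dropped; a minimal check (not from the repo):

```python
import re

line = r"50\% of cases % this trailing comment is dropped"
print(re.sub(r'(?<!\\)%.*', '', line))  # -> "50\% of cases "
```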
174
+ def merge_tex_files_(project_foler, main_file, mode):
175
+ """
176
+ Merge Tex project recursively
177
+ """
178
+ main_file = rm_comments(main_file)
179
+ for s in reversed([q for q in re.finditer(r"\\input\{(.*?)\}", main_file, re.M)]):
180
+ f = s.group(1)
181
+ fp = os.path.join(project_foler, f)
182
+ if os.path.exists(fp):
183
+ # e.g., \input{srcs/07_appendix.tex}
184
+ with open(fp, 'r', encoding='utf-8', errors='replace') as fx:
185
+ c = fx.read()
186
+ else:
187
+ # e.g., \input{srcs/07_appendix}
188
+ with open(fp+'.tex', 'r', encoding='utf-8', errors='replace') as fx:
189
+ c = fx.read()
190
+ c = merge_tex_files_(project_foler, c, mode)
191
+ main_file = main_file[:s.span()[0]] + c + main_file[s.span()[1]:]
192
+ return main_file
193
+
194
+ def merge_tex_files(project_foler, main_file, mode):
195
+ """
196
+ Merge Tex project recursively
197
+ P.S. 顺便把CTEX塞进去以支持中文
198
+ P.S. 顺便把Latex的注释去除
199
+ """
200
+ main_file = merge_tex_files_(project_foler, main_file, mode)
201
+ main_file = rm_comments(main_file)
202
+
203
+ if mode == 'translate_zh':
204
+ # find paper documentclass
205
+ pattern = re.compile(r'\\documentclass.*\n')
206
+ match = pattern.search(main_file)
207
+ assert match is not None, "Cannot find documentclass statement!"
208
+ position = match.end()
209
+ add_ctex = '\\usepackage{ctex}\n'
210
+ add_url = '\\usepackage{url}\n' if '{url}' not in main_file else ''
211
+ main_file = main_file[:position] + add_ctex + add_url + main_file[position:]
212
+ # fontset=windows
213
+ import platform
214
+ main_file = re.sub(r"\\documentclass\[(.*?)\]{(.*?)}", r"\\documentclass[\1,fontset=windows,UTF8]{\2}",main_file)
215
+ main_file = re.sub(r"\\documentclass{(.*?)}", r"\\documentclass[fontset=windows,UTF8]{\1}",main_file)
216
+ # find paper abstract
217
+ pattern_opt1 = re.compile(r'\\begin\{abstract\}.*\n')
218
+ pattern_opt2 = re.compile(r"\\abstract\{(.*?)\}", flags=re.DOTALL)
219
+ match_opt1 = pattern_opt1.search(main_file)
220
+ match_opt2 = pattern_opt2.search(main_file)
221
+ assert (match_opt1 is not None) or (match_opt2 is not None), "Cannot find paper abstract section!"
222
+ return main_file
223
+
224
+
225
+
226
+ """
227
+ ========================================================================
228
+ Post process
229
+ ========================================================================
230
+ """
231
+ def mod_inbraket(match):
232
+ """
233
+ 为啥chatgpt会把cite里面的逗号换成中文逗号呀
234
+ """
235
+ # get the matched string
236
+ cmd = match.group(1)
237
+ str_to_modify = match.group(2)
238
+ # modify the matched string
239
+ str_to_modify = str_to_modify.replace(':', ':') # 前面是中文冒号,后面是英文冒号
240
+ str_to_modify = str_to_modify.replace(',', ',') # 前面是中文逗号,后面是英文逗号
241
+ # str_to_modify = 'BOOM'
242
+ return "\\" + cmd + "{" + str_to_modify + "}"
243
+
244
+ def fix_content(final_tex, node_string):
245
+ """
246
+ Fix common GPT errors to increase success rate
247
+ """
248
+ final_tex = re.sub(r"(?<!\\)%", "\\%", final_tex)
249
+ final_tex = re.sub(r"\\([a-z]{2,10})\ \{", r"\\\1{", string=final_tex)
250
+ final_tex = re.sub(r"\\\ ([a-z]{2,10})\{", r"\\\1{", string=final_tex)
251
+ final_tex = re.sub(r"\\([a-z]{2,10})\{([^\}]*?)\}", mod_inbraket, string=final_tex)
252
+
253
+ if "Traceback" in final_tex and "[Local Message]" in final_tex:
254
+ final_tex = node_string # 出问题了,还原原文
255
+ if node_string.count('\\begin') != final_tex.count('\\begin'):
256
+ final_tex = node_string # 出问题了,还原原文
257
+ if node_string.count('\_') > 0 and node_string.count('\_') > final_tex.count('\_'):
258
+ # walk and replace any _ without \
259
+ final_tex = re.sub(r"(?<!\\)_", "\\_", final_tex)
260
+
261
+ def compute_brace_level(string):
262
+ # this function count the number of { and }
263
+ brace_level = 0
264
+ for c in string:
265
+ if c == "{": brace_level += 1
266
+ elif c == "}": brace_level -= 1
267
+ return brace_level
268
+ def join_most(tex_t, tex_o):
269
+ # this function join translated string and original string when something goes wrong
270
+ p_t = 0
271
+ p_o = 0
272
+ def find_next(string, chars, begin):
273
+ p = begin
274
+ while p < len(string):
275
+ if string[p] in chars: return p, string[p]
276
+ p += 1
277
+ return None, None
278
+ while True:
279
+ res1, char = find_next(tex_o, ['{','}'], p_o)
280
+ if res1 is None: break
281
+ res2, char = find_next(tex_t, [char], p_t)
282
+ if res2 is None: break
283
+ p_o = res1 + 1
284
+ p_t = res2 + 1
285
+ return tex_t[:p_t] + tex_o[p_o:]
286
+
287
+ if compute_brace_level(final_tex) != compute_brace_level(node_string):
288
+ # 出问题了,还原部分原文,保证括号正确
289
+ final_tex = join_most(final_tex, node_string)
290
+ return final_tex
291
+
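Among the repairs in `fix_content`, the underscore rule re-escapes any `_` whose backslash GPT dropped; in isolation it behaves like this (illustrative input):

```python
import re

gpt_output = "the f1_score metric"  # GPT dropped the backslash before _
print(re.sub(r"(?<!\\)_", "\\_", gpt_output))  # -> "the f1\_score metric"
```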
292
+ def split_subprocess(txt, project_folder, return_dict, opts):
293
+ """
294
+ Break down the latex file into a linked list;
295
+ each node uses a preserve flag to indicate whether it should
296
+ be processed by GPT.
297
+ """
298
+ text = txt
299
+ mask = np.zeros(len(txt), dtype=np.uint8) + TRANSFORM
300
+
301
+ # 吸收title与作者以上的部分
302
+ text, mask = set_forbidden_text(text, mask, r"(.*?)\\maketitle", re.DOTALL)
303
+ # 吸收iffalse注释
304
+ text, mask = set_forbidden_text(text, mask, r"\\iffalse(.*?)\\fi", re.DOTALL)
305
+ # 吸收在42行以内的begin-end组合
306
+ text, mask = set_forbidden_text_begin_end(text, mask, r"\\begin\{([a-z\*]*)\}(.*?)\\end\{\1\}", re.DOTALL, limit_n_lines=42)
307
+ # 吸收匿名公式
308
+ text, mask = set_forbidden_text(text, mask, [ r"\$\$(.*?)\$\$", r"\\\[.*?\\\]" ], re.DOTALL)
309
+ # 吸收其他杂项
310
+ text, mask = set_forbidden_text(text, mask, [ r"\\section\{(.*?)\}", r"\\section\*\{(.*?)\}", r"\\subsection\{(.*?)\}", r"\\subsubsection\{(.*?)\}" ])
311
+ text, mask = set_forbidden_text(text, mask, [ r"\\bibliography\{(.*?)\}", r"\\bibliographystyle\{(.*?)\}" ])
312
+ text, mask = set_forbidden_text(text, mask, r"\\begin\{thebibliography\}.*?\\end\{thebibliography\}", re.DOTALL)
313
+ text, mask = set_forbidden_text(text, mask, r"\\begin\{lstlisting\}(.*?)\\end\{lstlisting\}", re.DOTALL)
314
+ text, mask = set_forbidden_text(text, mask, r"\\begin\{wraptable\}(.*?)\\end\{wraptable\}", re.DOTALL)
315
+ text, mask = set_forbidden_text(text, mask, r"\\begin\{algorithm\}(.*?)\\end\{algorithm\}", re.DOTALL)
316
+ text, mask = set_forbidden_text(text, mask, [r"\\begin\{wrapfigure\}(.*?)\\end\{wrapfigure\}", r"\\begin\{wrapfigure\*\}(.*?)\\end\{wrapfigure\*\}"], re.DOTALL)
317
+ text, mask = set_forbidden_text(text, mask, [r"\\begin\{figure\}(.*?)\\end\{figure\}", r"\\begin\{figure\*\}(.*?)\\end\{figure\*\}"], re.DOTALL)
318
+ text, mask = set_forbidden_text(text, mask, [r"\\begin\{multline\}(.*?)\\end\{multline\}", r"\\begin\{multline\*\}(.*?)\\end\{multline\*\}"], re.DOTALL)
319
+ text, mask = set_forbidden_text(text, mask, [r"\\begin\{table\}(.*?)\\end\{table\}", r"\\begin\{table\*\}(.*?)\\end\{table\*\}"], re.DOTALL)
320
+ text, mask = set_forbidden_text(text, mask, [r"\\begin\{minipage\}(.*?)\\end\{minipage\}", r"\\begin\{minipage\*\}(.*?)\\end\{minipage\*\}"], re.DOTALL)
321
+ text, mask = set_forbidden_text(text, mask, [r"\\begin\{align\*\}(.*?)\\end\{align\*\}", r"\\begin\{align\}(.*?)\\end\{align\}"], re.DOTALL)
322
+ text, mask = set_forbidden_text(text, mask, [r"\\begin\{equation\}(.*?)\\end\{equation\}", r"\\begin\{equation\*\}(.*?)\\end\{equation\*\}"], re.DOTALL)
323
+ text, mask = set_forbidden_text(text, mask, [r"\\includepdf\[(.*?)\]\{(.*?)\}", r"\\clearpage", r"\\newpage", r"\\appendix", r"\\tableofcontents", r"\\include\{(.*?)\}"])
324
+ text, mask = set_forbidden_text(text, mask, [r"\\vspace\{(.*?)\}", r"\\hspace\{(.*?)\}", r"\\label\{(.*?)\}", r"\\begin\{(.*?)\}", r"\\end\{(.*?)\}", r"\\item "])
325
+ text, mask = set_forbidden_text_careful_brace(text, mask, r"\\hl\{(.*?)\}", re.DOTALL)
326
+ # reverse 操作必须放在最后
327
+ text, mask = reverse_forbidden_text_careful_brace(text, mask, r"\\caption\{(.*?)\}", re.DOTALL, forbid_wrapper=True)
328
+ text, mask = reverse_forbidden_text_careful_brace(text, mask, r"\\abstract\{(.*?)\}", re.DOTALL, forbid_wrapper=True)
329
+ root = convert_to_linklist(text, mask)
330
+
331
+ # 修复括号
332
+ node = root
333
+ while True:
334
+ string = node.string
335
+ if node.preserve:
336
+ node = node.next
337
+ if node is None: break
338
+ continue
339
+ def break_check(string):
340
+ str_stack = [""] # (lv, index)
341
+ for i, c in enumerate(string):
342
+ if c == '{':
343
+ str_stack.append('{')
344
+ elif c == '}':
345
+ if len(str_stack) == 1:
346
+ print('stack fix')
347
+ return i
348
+ str_stack.pop(-1)
349
+ else:
350
+ str_stack[-1] += c
351
+ return -1
352
+ bp = break_check(string)
353
+
354
+ if bp == -1:
355
+ pass
356
+ elif bp == 0:
357
+ node.string = string[:1]
358
+ q = LinkedListNode(string[1:], False)
359
+ q.next = node.next
360
+ node.next = q
361
+ else:
362
+ node.string = string[:bp]
363
+ q = LinkedListNode(string[bp:], False)
364
+ q.next = node.next
365
+ node.next = q
366
+
367
+ node = node.next
368
+ if node is None: break
369
+
370
+ # 屏蔽空行和太短的句子
371
+ node = root
372
+ while True:
373
+ if len(node.string.strip('\n').strip(''))==0: node.preserve = True
374
+ if len(node.string.strip('\n').strip(''))<42: node.preserve = True
375
+ node = node.next
376
+ if node is None: break
377
+ node = root
378
+ while True:
379
+ if node.next and node.preserve and node.next.preserve:
380
+ node.string += node.next.string
381
+ node.next = node.next.next
382
+ node = node.next
383
+ if node is None: break
384
+
385
+ # 将前后断行符脱离
386
+ node = root
387
+ prev_node = None
388
+ while True:
389
+ if not node.preserve:
390
+ lstriped_ = node.string.lstrip().lstrip('\n')
391
+ if (prev_node is not None) and (prev_node.preserve) and (len(lstriped_)!=len(node.string)):
392
+ prev_node.string += node.string[:-len(lstriped_)]
393
+ node.string = lstriped_
394
+ rstriped_ = node.string.rstrip().rstrip('\n')
395
+ if (node.next is not None) and (node.next.preserve) and (len(rstriped_)!=len(node.string)):
396
+ node.next.string = node.string[len(rstriped_):] + node.next.string
397
+ node.string = rstriped_
398
+ # =====
399
+ prev_node = node
400
+ node = node.next
401
+ if node is None: break
402
+ # 输出html调试文件,用红色标注处保留区(PRESERVE),用黑色标注转换区(TRANSFORM)
403
+ with open(pj(project_folder, 'debug_log.html'), 'w', encoding='utf8') as f:
404
+ segment_parts_for_gpt = []
405
+ nodes = []
406
+ node = root
407
+ while True:
408
+ nodes.append(node)
409
+ show_html = node.string.replace('\n','<br/>')
410
+ if not node.preserve:
411
+ segment_parts_for_gpt.append(node.string)
412
+ f.write(f'<p style="color:black;">#{show_html}#</p>')
413
+ else:
414
+ f.write(f'<p style="color:red;">{show_html}</p>')
415
+ node = node.next
416
+ if node is None: break
417
+
418
+ for n in nodes: n.next = None # break
419
+ return_dict['nodes'] = nodes
420
+ return_dict['segment_parts_for_gpt'] = segment_parts_for_gpt
421
+ return return_dict
422
+
423
+
424
+
425
+ class LatexPaperSplit():
426
+ """
427
+ Break down the latex file into a linked list;
428
+ each node uses a preserve flag to indicate whether it should
429
+ be processed by GPT.
430
+ """
431
+ def __init__(self) -> None:
432
+ self.nodes = None
433
+ self.msg = "*{\\scriptsize\\textbf{警告:该PDF由GPT-Academic开源项目调用大语言模型+Latex翻译插件一键生成," + \
434
+ "版权归原文作者所有。翻译内容可靠性无保障,请仔细鉴别并以原文为准。" + \
435
+ "项目Github地址 \\url{https://github.com/binary-husky/gpt_academic/}。"
436
+ # 请您不要删除或修改这行警告,除非您是论文的原作者(如果您是论文原作者,欢迎加README中的QQ联系开发者)
437
+ self.msg_declare = "为了防止大语言模型的意外谬误产生扩散影响,禁止移除或修改此警告。}}\\\\"
438
+
439
+ def merge_result(self, arr, mode, msg):
440
+ """
441
+ Merge the result after the GPT process has completed
442
+ """
443
+ result_string = ""
444
+ p = 0
445
+ for node in self.nodes:
446
+ if node.preserve:
447
+ result_string += node.string
448
+ else:
449
+ result_string += fix_content(arr[p], node.string)
450
+ p += 1
451
+ if mode == 'translate_zh':
452
+ pattern = re.compile(r'\\begin\{abstract\}.*\n')
453
+ match = pattern.search(result_string)
454
+ if not match:
455
+ # match \abstract{xxxx}
456
+ pattern_compile = re.compile(r"\\abstract\{(.*?)\}", flags=re.DOTALL)
457
+ match = pattern_compile.search(result_string)
458
+ position = match.regs[1][0]
459
+ else:
460
+ # match \begin{abstract}xxxx\end{abstract}
461
+ position = match.end()
462
+ result_string = result_string[:position] + self.msg + msg + self.msg_declare + result_string[position:]
463
+ return result_string
464
+
465
+ def split(self, txt, project_folder, opts):
466
+ """
467
+ break down latex file to a linked list,
468
+ each node use a preserve flag to indicate whether it should
469
+ be proccessed by GPT.
470
+ P.S. use multiprocessing to avoid timeout error
471
+ """
472
+ import multiprocessing
473
+ manager = multiprocessing.Manager()
474
+ return_dict = manager.dict()
475
+ p = multiprocessing.Process(
476
+ target=split_subprocess,
477
+ args=(txt, project_folder, return_dict, opts))
478
+ p.start()
479
+ p.join()
480
+ p.close()
481
+ self.nodes = return_dict['nodes']
482
+ self.sp = return_dict['segment_parts_for_gpt']
483
+ return self.sp
484
+
485
+
486
+
487
+ class LatexPaperFileGroup():
488
+ """
489
+ use tokenizer to break down text according to max_token_limit
490
+ """
491
+ def __init__(self):
492
+ self.file_paths = []
493
+ self.file_contents = []
494
+ self.sp_file_contents = []
495
+ self.sp_file_index = []
496
+ self.sp_file_tag = []
497
+
498
+ # count_token
499
+ from request_llm.bridge_all import model_info
500
+ enc = model_info["gpt-3.5-turbo"]['tokenizer']
501
+ def get_token_num(txt): return len(enc.encode(txt, disallowed_special=()))
502
+ self.get_token_num = get_token_num
503
+
504
+ def run_file_split(self, max_token_limit=1900):
505
+ """
506
+ use tokenizer to break down text according to max_token_limit
507
+ """
508
+ for index, file_content in enumerate(self.file_contents):
509
+ if self.get_token_num(file_content) < max_token_limit:
510
+ self.sp_file_contents.append(file_content)
511
+ self.sp_file_index.append(index)
512
+ self.sp_file_tag.append(self.file_paths[index])
513
+ else:
514
+ from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
515
+ segments = breakdown_txt_to_satisfy_token_limit_for_pdf(file_content, self.get_token_num, max_token_limit)
516
+ for j, segment in enumerate(segments):
517
+ self.sp_file_contents.append(segment)
518
+ self.sp_file_index.append(index)
519
+ self.sp_file_tag.append(self.file_paths[index] + f".part-{j}.tex")
520
+ print('Segmentation: done')
521
+
522
+ def merge_result(self):
523
+ self.file_result = ["" for _ in range(len(self.file_paths))]
524
+ for r, k in zip(self.sp_file_result, self.sp_file_index):
525
+ self.file_result[k] += r
526
+
527
+ def write_result(self):
528
+ manifest = []
529
+ for path, res in zip(self.file_paths, self.file_result):
530
+ with open(path + '.polish.tex', 'w', encoding='utf8') as f:
531
+ manifest.append(path + '.polish.tex')
532
+ f.write(res)
533
+ return manifest
534
+
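`LatexPaperFileGroup` reaches the gpt-3.5-turbo tokenizer through the project's `model_info` table; the same budget check can be sketched with `tiktoken` directly (assuming `tiktoken` is installed; the 1900-token default mirrors `run_file_split`):

```python
import tiktoken  # assumed available; the repo reaches the same encoder via its model_info table

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def needs_split(fragment: str, max_token_limit: int = 1900) -> bool:
    # Mirrors run_file_split: over-budget fragments are handed to
    # breakdown_txt_to_satisfy_token_limit_for_pdf for further splitting.
    return len(enc.encode(fragment, disallowed_special=())) >= max_token_limit

print(needs_split("short latex fragment"))  # -> False
```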
535
+ def write_html(sp_file_contents, sp_file_result, chatbot, project_folder):
536
+
537
+ # write html
538
+ try:
539
+ import shutil
540
+ from .crazy_utils import construct_html
541
+ from toolbox import gen_time_str
542
+ ch = construct_html()
543
+ orig = ""
544
+ trans = ""
545
+ final = []
546
+ for c,r in zip(sp_file_contents, sp_file_result):
547
+ final.append(c)
548
+ final.append(r)
549
+ for i, k in enumerate(final):
550
+ if i%2==0:
551
+ orig = k
552
+ if i%2==1:
553
+ trans = k
554
+ ch.add_row(a=orig, b=trans)
555
+ create_report_file_name = f"{gen_time_str()}.trans.html"
556
+ ch.save_file(create_report_file_name)
557
+ shutil.copyfile(pj('./gpt_log/', create_report_file_name), pj(project_folder, create_report_file_name))
558
+ promote_file_to_downloadzone(file=f'./gpt_log/{create_report_file_name}', chatbot=chatbot)
559
+ except:
560
+ from toolbox import trimmed_format_exc
561
+ print('writing html result failed:', trimmed_format_exc())
562
+
563
+ def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, mode='proofread', switch_prompt=None, opts=[]):
564
+ import time, os, re
565
+ from .crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
566
+ from .latex_utils import LatexPaperFileGroup, merge_tex_files, LatexPaperSplit, 寻找Latex主文件
567
+
568
+ # <-------- 寻找主tex文件 ---------->
569
+ maintex = 寻找Latex主文件(file_manifest, mode)
570
+ chatbot.append((f"定位主Latex文件", f'[Local Message] 分析结果:该项目的Latex主文件是{maintex}, 如果分析错误, 请立即终止程序, 删除或修改歧义文件, 然后重试。主程序即将开始, 请稍候。'))
571
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
572
+ time.sleep(3)
573
+
574
+ # <-------- 读取Latex文件, 将多文件tex工程融合为一个巨型tex ---------->
575
+ main_tex_basename = os.path.basename(maintex)
576
+ assert main_tex_basename.endswith('.tex')
577
+ main_tex_basename_bare = main_tex_basename[:-4]
578
+ may_exist_bbl = pj(project_folder, f'{main_tex_basename_bare}.bbl')
579
+ if os.path.exists(may_exist_bbl):
580
+ shutil.copyfile(may_exist_bbl, pj(project_folder, f'merge.bbl'))
581
+ shutil.copyfile(may_exist_bbl, pj(project_folder, f'merge_{mode}.bbl'))
582
+ shutil.copyfile(may_exist_bbl, pj(project_folder, f'merge_diff.bbl'))
583
+
584
+ with open(maintex, 'r', encoding='utf-8', errors='replace') as f:
585
+ content = f.read()
586
+ merged_content = merge_tex_files(project_folder, content, mode)
587
+
588
+ with open(project_folder + '/merge.tex', 'w', encoding='utf-8', errors='replace') as f:
589
+ f.write(merged_content)
590
+
591
+ # <-------- 精细切分latex文件 ---------->
592
+ chatbot.append((f"Latex文件融合完成", f'[Local Message] 正在精细切分latex文件,这需要一段时间计算,文档越长耗时越长,请耐心等待。'))
593
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
594
+ lps = LatexPaperSplit()
595
+ res = lps.split(merged_content, project_folder, opts) # 消耗时间的函数
596
+
597
+ # <-------- 拆分过长的latex片段 ---------->
598
+ pfg = LatexPaperFileGroup()
599
+ for index, r in enumerate(res):
600
+ pfg.file_paths.append('segment-' + str(index))
601
+ pfg.file_contents.append(r)
602
+
603
+ pfg.run_file_split(max_token_limit=1024)
604
+ n_split = len(pfg.sp_file_contents)
605
+
606
+ # <-------- 根据需要切换prompt ---------->
607
+ inputs_array, sys_prompt_array = switch_prompt(pfg, mode)
608
+ inputs_show_user_array = [f"{mode} {f}" for f in pfg.sp_file_tag]
609
+
610
+ if os.path.exists(pj(project_folder,'temp.pkl')):
611
+
612
+ # <-------- 【仅调试】如果存在调试缓存文件,则跳过GPT请求环节 ---------->
613
+ pfg = objload(file=pj(project_folder,'temp.pkl'))
614
+
615
+ else:
616
+ # <-------- gpt 多线程请求 ---------->
617
+ gpt_response_collection = yield from request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency(
618
+ inputs_array=inputs_array,
619
+ inputs_show_user_array=inputs_show_user_array,
620
+ llm_kwargs=llm_kwargs,
621
+ chatbot=chatbot,
622
+ history_array=[[""] for _ in range(n_split)],
623
+ sys_prompt_array=sys_prompt_array,
624
+ # max_workers=5, # 并行任务数量限制, 最多同时执行5个, 其他的排队等待
625
+ scroller_max_len = 40
626
+ )
627
+
628
+ # <-------- 文本碎片重组为完整的tex片段 ---------->
629
+ pfg.sp_file_result = []
630
+ for i_say, gpt_say, orig_content in zip(gpt_response_collection[0::2], gpt_response_collection[1::2], pfg.sp_file_contents):
631
+ pfg.sp_file_result.append(gpt_say)
632
+ pfg.merge_result()
633
+
634
+ # <-------- 临时存储用于调试 ---------->
635
+ pfg.get_token_num = None
636
+ objdump(pfg, file=pj(project_folder,'temp.pkl'))
637
+
638
+ write_html(pfg.sp_file_contents, pfg.sp_file_result, chatbot=chatbot, project_folder=project_folder)
639
+
640
+ # <-------- 写出文件 ---------->
641
+ msg = f"当前大语言模型: {llm_kwargs['llm_model']},当前语言模型温度设定: {llm_kwargs['temperature']}。"
642
+ final_tex = lps.merge_result(pfg.file_result, mode, msg)
643
+ with open(project_folder + f'/merge_{mode}.tex', 'w', encoding='utf-8', errors='replace') as f:
644
+ if mode != 'translate_zh' or "binary" in final_tex: f.write(final_tex)
645
+
646
+
647
+ # <-------- 整理结果, 退出 ---------->
648
+ chatbot.append((f"完成了吗?", 'GPT结果已输出, 正在编译PDF'))
649
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
650
+
651
+ # <-------- 返回 ---------->
652
+ return project_folder + f'/merge_{mode}.tex'
653
+
654
+
655
+
656
+ def remove_buggy_lines(file_path, log_path, tex_name, tex_name_pure, n_fix, work_folder_modified):
657
+ try:
658
+ with open(log_path, 'r', encoding='utf-8', errors='replace') as f:
659
+ log = f.read()
660
+ with open(file_path, 'r', encoding='utf-8', errors='replace') as f:
661
+ file_lines = f.readlines()
662
+ import re
663
+ buggy_lines = re.findall(tex_name+':([0-9]{1,5}):', log)
664
+ buggy_lines = [int(l) for l in buggy_lines]
665
+ buggy_lines = sorted(buggy_lines)
666
+ print("removing lines that has errors", buggy_lines)
667
+ file_lines.pop(buggy_lines[0]-1)
668
+ with open(pj(work_folder_modified, f"{tex_name_pure}_fix_{n_fix}.tex"), 'w', encoding='utf-8', errors='replace') as f:
669
+ f.writelines(file_lines)
670
+ return True, f"{tex_name_pure}_fix_{n_fix}", buggy_lines
671
+ except:
672
+ print("Fatal error occurred, but we cannot identify error, please download zip, read latex log, and compile manually.")
673
+ return False, -1, [-1]
674
+
675
+
676
+ def compile_latex_with_timeout(command, timeout=60):
677
+ import subprocess
678
+ process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
679
+ try:
680
+ stdout, stderr = process.communicate(timeout=timeout)
681
+ except subprocess.TimeoutExpired:
682
+ process.kill()
683
+ stdout, stderr = process.communicate()
684
+ print("Process timed out!")
685
+ return False
686
+ return True
687
+
688
+ def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_folder_original, work_folder_modified, work_folder, mode='default'):
689
+ import os, time
690
+ current_dir = os.getcwd()
691
+ n_fix = 1
692
+ max_try = 32
693
+ chatbot.append([f"正在编译PDF文档", f'编译已经开始。当前工作路径为{work_folder},如果程序停顿5分钟以上,请直接去该路径下取回翻译结果,或者重启之后再度尝试 ...']); yield from update_ui(chatbot=chatbot, history=history)
694
+ chatbot.append([f"正在编译PDF文档", '...']); yield from update_ui(chatbot=chatbot, history=history); time.sleep(1); chatbot[-1] = list(chatbot[-1]) # 刷新界面
695
+ yield from update_ui_lastest_msg('编译已经开始...', chatbot, history) # 刷新Gradio前端界面
696
+
697
+ while True:
698
+ import os
699
+
700
+ # https://stackoverflow.com/questions/738755/dont-make-me-manually-abort-a-latex-compile-when-theres-an-error
701
+ yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 编译原始PDF ...', chatbot, history) # 刷新Gradio前端界面
702
+ os.chdir(work_folder_original); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_original}.tex'); os.chdir(current_dir)
703
+
704
+ yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 编译转化后的PDF ...', chatbot, history) # 刷新Gradio前端界面
705
+ os.chdir(work_folder_modified); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_modified}.tex'); os.chdir(current_dir)
706
+
707
+ if ok and os.path.exists(pj(work_folder_modified, f'{main_file_modified}.pdf')):
708
+ # 只有第二步成功,才能继续下面的步骤
709
+ yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 编译BibTex ...', chatbot, history) # 刷新Gradio前端界面
710
+ if not os.path.exists(pj(work_folder_original, f'{main_file_original}.bbl')):
711
+ os.chdir(work_folder_original); ok = compile_latex_with_timeout(f'bibtex {main_file_original}.aux'); os.chdir(current_dir)
712
+ if not os.path.exists(pj(work_folder_modified, f'{main_file_modified}.bbl')):
713
+ os.chdir(work_folder_modified); ok = compile_latex_with_timeout(f'bibtex {main_file_modified}.aux'); os.chdir(current_dir)
714
+
715
+ yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 编译文献交叉引用 ...', chatbot, history) # 刷新Gradio前端界面
716
+ os.chdir(work_folder_original); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_original}.tex'); os.chdir(current_dir)
717
+ os.chdir(work_folder_modified); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_modified}.tex'); os.chdir(current_dir)
718
+ os.chdir(work_folder_original); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_original}.tex'); os.chdir(current_dir)
719
+ os.chdir(work_folder_modified); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_modified}.tex'); os.chdir(current_dir)
720
+
721
+ if mode!='translate_zh':
722
+ yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 使用latexdiff生成论文转化前后对比 ...', chatbot, history) # 刷新Gradio前端界面
723
+ print( f'latexdiff --encoding=utf8 --append-safecmd=subfile {work_folder_original}/{main_file_original}.tex {work_folder_modified}/{main_file_modified}.tex --flatten > {work_folder}/merge_diff.tex')
724
+ ok = compile_latex_with_timeout(f'latexdiff --encoding=utf8 --append-safecmd=subfile {work_folder_original}/{main_file_original}.tex {work_folder_modified}/{main_file_modified}.tex --flatten > {work_folder}/merge_diff.tex')
725
+
726
+ yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 正在编译对比PDF ...', chatbot, history) # 刷新Gradio前端界面
727
+ os.chdir(work_folder); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error merge_diff.tex'); os.chdir(current_dir)
728
+ os.chdir(work_folder); ok = compile_latex_with_timeout(f'bibtex merge_diff.aux'); os.chdir(current_dir)
729
+ os.chdir(work_folder); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error merge_diff.tex'); os.chdir(current_dir)
730
+ os.chdir(work_folder); ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error merge_diff.tex'); os.chdir(current_dir)
731
+
732
+ # <--------------------->
733
+ os.chdir(current_dir)
734
+
735
+ # <---------- 检查结果 ----------->
736
+ results_ = ""
737
+ original_pdf_success = os.path.exists(pj(work_folder_original, f'{main_file_original}.pdf'))
738
+ modified_pdf_success = os.path.exists(pj(work_folder_modified, f'{main_file_modified}.pdf'))
739
+ diff_pdf_success = os.path.exists(pj(work_folder, f'merge_diff.pdf'))
740
+ results_ += f"原始PDF编译是否成功: {original_pdf_success};"
741
+ results_ += f"转化PDF编译是否成功: {modified_pdf_success};"
742
+ results_ += f"对比PDF编译是否成功: {diff_pdf_success};"
743
+ yield from update_ui_lastest_msg(f'第{n_fix}次编译结束:<br/>{results_}...', chatbot, history) # 刷新Gradio前端界面
744
+
745
+ if diff_pdf_success:
746
+ result_pdf = pj(work_folder, f'merge_diff.pdf') # get pdf path (merge_diff.tex is compiled in work_folder)
747
+ promote_file_to_downloadzone(result_pdf, rename_file=None, chatbot=chatbot) # promote file to web UI
748
+ if modified_pdf_success:
749
+ yield from update_ui_lastest_msg(f'转化PDF编译已经成功, 即将退出 ...', chatbot, history) # 刷新Gradio前端界面
750
+ result_pdf = pj(work_folder_modified, f'{main_file_modified}.pdf') # get pdf path
751
+ if os.path.exists(pj(work_folder, '..', 'translation')):
752
+ shutil.copyfile(result_pdf, pj(work_folder, '..', 'translation', 'translate_zh.pdf'))
753
+ promote_file_to_downloadzone(result_pdf, rename_file=None, chatbot=chatbot) # promote file to web UI
754
+ return True # 成功啦
755
+ else:
756
+ if n_fix>=max_try: break
757
+ n_fix += 1
758
+ can_retry, main_file_modified, buggy_lines = remove_buggy_lines(
759
+ file_path=pj(work_folder_modified, f'{main_file_modified}.tex'),
760
+ log_path=pj(work_folder_modified, f'{main_file_modified}.log'),
761
+ tex_name=f'{main_file_modified}.tex',
762
+ tex_name_pure=f'{main_file_modified}',
763
+ n_fix=n_fix,
764
+ work_folder_modified=work_folder_modified,
765
+ )
766
+ yield from update_ui_lastest_msg(f'由于最为关键的转化PDF编译失败, 将根据报错信息修正tex源文件并重试, 当前报错的latex代码处于第{buggy_lines}行 ...', chatbot, history) # 刷新Gradio前端界面
767
+ if not can_retry: break
768
+
769
+ os.chdir(current_dir)
770
+ return False # 失败啦
771
+
772
+
773
+
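Stripped of UI updates and retries, each pass of the loop above is the classic pdflatex → bibtex → pdflatex × 2 sequence, run once per working tree; a rough sketch reusing the `compile_latex_with_timeout` helper defined above (paths are placeholders):

```python
import os

def compile_once(work_folder, main_file):
    # One compile pass, mirroring the per-folder steps of 编译Latex.
    cwd = os.getcwd()
    os.chdir(work_folder)
    try:
        compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file}.tex')
        compile_latex_with_timeout(f'bibtex {main_file}.aux')
        compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file}.tex')
        compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file}.tex')
    finally:
        os.chdir(cwd)
    return os.path.exists(os.path.join(work_folder, f'{main_file}.pdf'))
```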
crazy_functions/对话历史存档.py CHANGED
@@ -1,4 +1,4 @@
1
- from toolbox import CatchException, update_ui
2
  from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
3
  import re
4
 
@@ -29,9 +29,8 @@ def write_chat_to_file(chatbot, history=None, file_name=None):
29
  for h in history:
30
  f.write("\n>>>" + h)
31
  f.write('</code>')
32
- res = '对话历史写入:' + os.path.abspath(f'./gpt_log/{file_name}')
33
- print(res)
34
- return res
35
 
36
  def gen_file_preview(file_name):
37
  try:
 
1
+ from toolbox import CatchException, update_ui, promote_file_to_downloadzone
2
  from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
3
  import re
4
 
 
29
  for h in history:
30
  f.write("\n>>>" + h)
31
  f.write('</code>')
32
+ promote_file_to_downloadzone(f'./gpt_log/{file_name}', rename_file=file_name, chatbot=chatbot)
33
+ return '对话历史写入:' + os.path.abspath(f'./gpt_log/{file_name}')
 
34
 
35
  def gen_file_preview(file_name):
36
  try:
crazy_functions/数学动画生成manim.py CHANGED
@@ -8,7 +8,7 @@ def inspect_dependency(chatbot, history):
8
  import manim
9
  return True
10
  except:
11
- chatbot.append(["导入依赖失败", "使用该模块需要额外依赖,安装方法:```pip install manimgl```"])
12
  yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
13
  return False
14
 
 
8
  import manim
9
  return True
10
  except:
11
+ chatbot.append(["导入依赖失败", "使用该模块需要额外依赖,安装方法:```pip install manim manimgl```"])
12
  yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
13
  return False
14
 
crazy_functions/理解PDF文档内容.py CHANGED
@@ -13,7 +13,9 @@ def 解析PDF(file_name, llm_kwargs, plugin_kwargs, chatbot, history, system_pro
13
  # 递归地切割PDF文件,每一块(尽量是完整的一个section,比如introduction,experiment等,必要时再进行切割)
14
  # 的长度必须小于 2500 个 Token
15
  file_content, page_one = read_and_clean_pdf_text(file_name) # (尝试)按照章节切割PDF
16
-
17
  TOKEN_LIMIT_PER_FRAGMENT = 2500
18
 
19
  from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
 
13
  # 递归地切割PDF文件,每一块(尽量是完整的一个section,比如introduction,experiment等,必要时再进行切割)
14
  # 的长度必须小于 2500 个 Token
15
  file_content, page_one = read_and_clean_pdf_text(file_name) # (尝试)按照章节切割PDF
16
+ file_content = file_content.encode('utf-8', 'ignore').decode() # avoid reading non-utf8 chars
17
+ page_one = str(page_one).encode('utf-8', 'ignore').decode() # avoid reading non-utf8 chars
18
+
19
  TOKEN_LIMIT_PER_FRAGMENT = 2500
20
 
21
  from .crazy_utils import breakdown_txt_to_satisfy_token_limit_for_pdf
crazy_functions/联网的ChatGPT_bing版.py ADDED
@@ -0,0 +1,102 @@
1
+ from toolbox import CatchException, update_ui
2
+ from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive, input_clipping
3
+ import requests
4
+ from bs4 import BeautifulSoup
5
+ from request_llm.bridge_all import model_info
6
+
7
+
8
+ def bing_search(query, proxies=None):
9
+ query = query
10
+ url = f"https://cn.bing.com/search?q={query}"
11
+ headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36'}
12
+ response = requests.get(url, headers=headers, proxies=proxies)
13
+ soup = BeautifulSoup(response.content, 'html.parser')
14
+ results = []
15
+ for g in soup.find_all('li', class_='b_algo'):
16
+ anchors = g.find_all('a')
17
+ if anchors:
18
+ link = anchors[0]['href']
19
+ if not link.startswith('http'):
20
+ continue
21
+ title = g.find('h2').text
22
+ item = {'title': title, 'link': link}
23
+ results.append(item)
24
+
25
+ for r in results:
26
+ print(r['link'])
27
+ return results
28
+
29
+
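`bing_search` scrapes the cn.bing.com result page with BeautifulSoup, so it depends on Bing's current `b_algo` markup and may need a working proxy; a typical call (network access required, results will vary):

```python
results = bing_search("GPT Academic 开源项目")
for r in results[:3]:
    print(r['title'], '->', r['link'])
```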
30
+ def scrape_text(url, proxies) -> str:
31
+ """Scrape text from a webpage
32
+
33
+ Args:
34
+ url (str): The URL to scrape text from
35
+
36
+ Returns:
37
+ str: The scraped text
38
+ """
39
+ headers = {
40
+ 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.61 Safari/537.36',
41
+ 'Content-Type': 'text/plain',
42
+ }
43
+ try:
44
+ response = requests.get(url, headers=headers, proxies=proxies, timeout=8)
45
+ if response.encoding == "ISO-8859-1": response.encoding = response.apparent_encoding
46
+ except:
47
+ return "无法连接到该网页"
48
+ soup = BeautifulSoup(response.text, "html.parser")
49
+ for script in soup(["script", "style"]):
50
+ script.extract()
51
+ text = soup.get_text()
52
+ lines = (line.strip() for line in text.splitlines())
53
+ chunks = (phrase.strip() for line in lines for phrase in line.split(" "))
54
+ text = "\n".join(chunk for chunk in chunks if chunk)
55
+ return text
56
+
57
+ @CatchException
58
+ def 连接bing搜索回答问题(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
59
+ """
60
+ txt 输入栏用户输入的文本,例如需要翻译的一段话,再例如一个包含了待处理文件的路径
61
+ llm_kwargs gpt模型参数,如温度和top_p等,一般原样传递下去就行
62
+ plugin_kwargs 插件模型的参数,暂时没有用武之地
63
+ chatbot 聊天显示框的句柄,用于显示给用户
64
+ history 聊天历史,前情提要
65
+ system_prompt 给gpt的静默提醒
66
+ web_port 当前软件运行的端口号
67
+ """
68
+ history = [] # 清空历史,以免输入溢出
69
+ chatbot.append((f"请结合互联网信息回答以下问题:{txt}",
70
+ "[Local Message] 请注意,您正在调用一个[函数插件]的模板,该模板可以实现ChatGPT联网信息综合。该函数面向希望实现更多有趣功能的开发者,它可以作为创建新功能函数的模板。您若希望分享新的功能模组,请不吝PR!"))
71
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面 # 由于请求gpt需要一段时间,我们先及时地做一次界面更新
72
+
73
+ # ------------- < 第1步:爬取搜索引擎的结果 > -------------
74
+ from toolbox import get_conf
75
+ proxies, = get_conf('proxies')
76
+ urls = bing_search(txt, proxies)
77
+ history = []
78
+
79
+ # ------------- < Step 2: visit each web page in turn > -------------
80
+ max_search_result = 8 # maximum number of web-page results to include
81
+ for index, url in enumerate(urls[:max_search_result]):
82
+ res = scrape_text(url['link'], proxies)
83
+ history.extend([f"第{index}份搜索结果:", res])
84
+ chatbot.append([f"第{index}份搜索结果:", res[:500]+"......"])
85
+ yield from update_ui(chatbot=chatbot, history=history) # refresh the UI # requesting GPT takes a while, so refresh the UI promptly first
86
+
87
+ # ------------- < Step 3: let ChatGPT synthesize an answer > -------------
88
+ i_say = f"从以上搜索结果中抽取信息,然后回答问题:{txt}"
89
+ i_say, history = input_clipping( # clip the input, trimming the longest entries first, to avoid exceeding the token limit
90
+ inputs=i_say,
91
+ history=history,
92
+ max_token_limit=model_info[llm_kwargs['llm_model']]['max_token']*3//4
93
+ )
94
+ gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
95
+ inputs=i_say, inputs_show_user=i_say,
96
+ llm_kwargs=llm_kwargs, chatbot=chatbot, history=history,
97
+ sys_prompt="请从给定的若干条搜索结果中抽取信息,对最相关的两个搜索结果进行总结,然后回答问题。"
98
+ )
99
+ chatbot[-1] = (i_say, gpt_say)
100
+ history.append(i_say);history.append(gpt_say)
101
+ yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
102
+
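
For readers who want to poke at the two helpers above outside the plugin framework, a minimal driver might look like the sketch below (assuming the repository root is on `sys.path` and that direct network access is available; pass a real `proxies` dict otherwise — results may also come back empty if Bing serves a consent page):

```python
# Hypothetical standalone driver for the helpers defined in this file.
from crazy_functions.联网的ChatGPT_bing版 import bing_search, scrape_text

results = bing_search("transformer positional encoding")  # list of {'title': ..., 'link': ...} dicts
for r in results[:3]:
    text = scrape_text(r['link'], None)                    # plain text, or an error string on failure
    print(r['title'], '->', text[:80])
```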
crazy_functions/虚空终端.py ADDED
@@ -0,0 +1,131 @@
1
+ from toolbox import CatchException, update_ui, gen_time_str
2
+ from .crazy_utils import request_gpt_model_in_new_thread_with_ui_alive
3
+ from .crazy_utils import input_clipping
4
+
5
+
6
+ prompt = """
7
+ I have to achieve some functionalities by calling one of the functions below.
8
+ Your job is to find the correct function to use to satisfy my requirement,
9
+ and then write python code to call this function with correct parameters.
10
+
11
+ These are functions you are allowed to choose from:
12
+ 1.
13
+ 功能描述: 总结音视频内容
14
+ 调用函数: ConcludeAudioContent(txt, llm_kwargs)
15
+ 参数说明:
16
+ txt: 音频文件的路径
17
+ llm_kwargs: 模型参数, 永远给定None
18
+ 2.
19
+ 功能描述: 将每次对话记录写入Markdown格式的文件中
20
+ 调用函数: WriteMarkdown()
21
+ 3.
22
+ 功能描述: 将指定目录下的PDF文件从英文翻译成中文
23
+ 调用函数: BatchTranslatePDFDocuments_MultiThreaded(txt, llm_kwargs)
24
+ 参数说明:
25
+ txt: PDF文件所在的路径
26
+ llm_kwargs: 模型参数, 永远给定None
27
+ 4.
28
+ 功能描述: 根据文本使用GPT模型生成相应的图像
29
+ 调用函数: ImageGeneration(txt, llm_kwargs)
30
+ 参数说明:
31
+ txt: 图像生成所用到的提示文本
32
+ llm_kwargs: 模型参数, 永远给定None
33
+ 5.
34
+ 功能描述: 对输入的word文档进行摘要生成
35
+ 调用函数: SummarizingWordDocuments(input_path, output_path)
36
+ 参数说明:
37
+ input_path: 待处理的word文档路径
38
+ output_path: 摘要生成后的文档路径
39
+
40
+
41
+ You should always answer with the following format:
42
+ ----------------
43
+ Code:
44
+ ```
45
+ class AutoAcademic(object):
46
+ def __init__(self):
47
+ self.selected_function = "FILL_CORRECT_FUNCTION_HERE" # e.g., "ImageGeneration"
48
+ self.txt = "FILL_MAIN_PARAMETER_HERE" # e.g., "荷叶上的蜻蜓"
49
+ self.llm_kwargs = None
50
+ ```
51
+ Explanation:
52
+ 只有ImageGeneration和生成图像相关, 因此选择ImageGeneration函数。
53
+ ----------------
54
+
55
+ Now, this is my requirement:
56
+
57
+ """
58
+ def get_fn_lib():
59
+ return {
60
+ "BatchTranslatePDFDocuments_MultiThreaded": ("crazy_functions.批量翻译PDF文档_多线程", "批量翻译PDF文档"),
61
+ "SummarizingWordDocuments": ("crazy_functions.总结word文档", "总结word文档"),
62
+ "ImageGeneration": ("crazy_functions.图片生成", "图片生成"),
63
+ "TranslateMarkdownFromEnglishToChinese": ("crazy_functions.批量Markdown翻译", "Markdown中译英"),
64
+ "SummaryAudioVideo": ("crazy_functions.总结音视频", "总结音视频"),
65
+ }
66
+
67
+ def inspect_dependency(chatbot, history):
68
+ return True
69
+
70
+ def eval_code(code, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
71
+ import importlib
72
+
73
+ with open('gpt_log/void_terminal_runtime.py', 'w', encoding='utf8') as f:
74
+ f.write(code)
75
+
76
+ try:
77
+ AutoAcademic = getattr(importlib.import_module('gpt_log.void_terminal_runtime'), 'AutoAcademic') # import_module's second argument is a package, not an attribute, so it is dropped here
78
+ # importlib.reload(AutoAcademic)
79
+ auto_dict = AutoAcademic()
80
+ selected_function = auto_dict.selected_function
81
+ txt = auto_dict.txt
82
+ fp, fn = get_fn_lib()[selected_function]
83
+ fn_plugin = getattr(importlib.import_module(fp, fn), fn)
84
+ yield from fn_plugin(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port)
85
+ except:
86
+ from toolbox import trimmed_format_exc
87
+ chatbot.append(["执行错误", f"\n```\n{trimmed_format_exc()}\n```\n"])
88
+ yield from update_ui(chatbot=chatbot, history=history) # 刷新界面
89
+
90
+ def get_code_block(reply):
91
+ import re
92
+ pattern = r"```([\s\S]*?)```" # regex pattern to match code blocks
93
+ matches = re.findall(pattern, reply) # find all code blocks in text
94
+ if len(matches) != 1:
95
+ raise RuntimeError("GPT is not generating proper code.")
96
+ code = matches[0]
+ if code.startswith('python'): code = code[len('python'):] # drop the language tag; str.strip('python') would also eat code characters
+ return code # code block
97
+
98
+ @CatchException
99
+ def 终端(txt, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port):
100
+ """
101
+ txt 输入栏用户输入的文本, 例如需要翻译的一段话, 再例如一个包含了待处理文件的路径
102
+ llm_kwargs gpt模型参数, 如温度和top_p等, 一般原样传递下去就行
103
+ plugin_kwargs 插件模型的参数, 暂时没有用武之地
104
+ chatbot 聊天显示框的句柄, 用于显示给用户
105
+ history 聊天历史, 前情提要
106
+ system_prompt 给gpt的静默提醒
107
+ web_port 当前软件运行的端口号
108
+ """
109
+ # Clear the history to avoid input overflow
110
+ history = []
111
+
112
+ # Basic info: feature description and contributor
113
+ chatbot.append(["函数插件功能?", "根据自然语言执行插件命令, 作者: binary-husky, 插件初始化中 ..."])
114
+ yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
115
+
116
+ # # Try to import dependencies; if any are missing, suggest how to install them
117
+ # dep_ok = yield from inspect_dependency(chatbot=chatbot, history=history) # refresh the UI
118
+ # if not dep_ok: return
119
+
120
+ # Input
121
+ i_say = prompt + txt
122
+ # Start
123
+ gpt_say = yield from request_gpt_model_in_new_thread_with_ui_alive(
124
+ inputs=i_say, inputs_show_user=txt,
125
+ llm_kwargs=llm_kwargs, chatbot=chatbot, history=[],
126
+ sys_prompt=""
127
+ )
128
+
129
+ # Extract the generated code and execute it
130
+ code = get_code_block(gpt_say)
131
+ yield from eval_code(code, llm_kwargs, plugin_kwargs, chatbot, history, system_prompt, web_port)
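
The interesting part of this plugin is the dispatch chain in `eval_code`: the model fills in `selected_function`, which must be a key of `get_fn_lib()`, and `importlib` resolves that key to a plugin module and function at run time. A stripped-down sketch of the same pattern (the registry below is illustrative, not the project's full one):

```python
import importlib

# Illustrative registry in the style of get_fn_lib(): key -> (module path, function name)
FN_LIB = {"ImageGeneration": ("crazy_functions.图片生成", "图片生成")}

def dispatch(selected_function, *args, **kwargs):
    module_path, fn_name = FN_LIB[selected_function]
    fn = getattr(importlib.import_module(module_path), fn_name)  # resolve the plugin at call time
    yield from fn(*args, **kwargs)                               # plugins are generators, so delegate with yield from
```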
docker-compose.yml CHANGED
@@ -103,3 +103,30 @@ services:
103
  echo '[jittorllms] 正在从github拉取最新代码...' &&
104
  git --git-dir=request_llm/jittorllms/.git --work-tree=request_llm/jittorllms pull --force &&
105
  python3 -u main.py"
103
  echo '[jittorllms] 正在从github拉取最新代码...' &&
104
  git --git-dir=request_llm/jittorllms/.git --work-tree=request_llm/jittorllms pull --force &&
105
  python3 -u main.py"
106
+
107
+
108
+ ## ===================================================
109
+ ## [Scheme 4] chatgpt + Latex
110
+ ## ===================================================
111
+ version: '3'
112
+ services:
113
+ gpt_academic_with_latex:
114
+ image: ghcr.io/binary-husky/gpt_academic_with_latex:master
115
+ environment:
116
+ # See `config.py` for all configuration options
117
+ API_KEY: ' sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx '
118
+ USE_PROXY: ' True '
119
+ proxies: ' { "http": "socks5h://localhost:10880", "https": "socks5h://localhost:10880", } '
120
+ LLM_MODEL: ' gpt-3.5-turbo '
121
+ AVAIL_LLM_MODELS: ' ["gpt-3.5-turbo", "gpt-4"] '
122
+ LOCAL_MODEL_DEVICE: ' cuda '
123
+ DEFAULT_WORKER_NUM: ' 10 '
124
+ WEB_PORT: ' 12303 '
125
+
126
+ # Share the host's network
127
+ network_mode: "host"
128
+
129
+ # Do not use a proxy network when pulling the latest code
130
+ command: >
131
+ bash -c "python3 -u main.py"
132
+
docs/Dockerfile+NoLocal+Latex ADDED
@@ -0,0 +1,27 @@
1
+ # This Dockerfile builds an environment without local models. To use local models such as chatglm, see docs/Dockerfile+ChatGLM
2
+ # - 1 Edit `config.py`
3
+ # - 2 Build: docker build -t gpt-academic-nolocal-latex -f docs/Dockerfile+NoLocal+Latex .
4
+ # - 3 Run: docker run -v /home/fuqingxu/arxiv_cache:/root/arxiv_cache --rm -it --net=host gpt-academic-nolocal-latex
5
+
6
+ FROM fuqingxu/python311_texlive_ctex:latest
7
+
8
+ # Set the working directory
9
+ WORKDIR /gpt
10
+
11
+ ARG useProxyNetwork=''
12
+
13
+ RUN $useProxyNetwork pip3 install gradio openai numpy arxiv rich -i https://pypi.douban.com/simple/
14
+ RUN $useProxyNetwork pip3 install colorama Markdown pygments pymupdf -i https://pypi.douban.com/simple/
15
+
16
+ # Copy the project files
17
+ COPY . .
18
+
19
+
20
+ # Install dependencies
21
+ RUN $useProxyNetwork pip3 install -r requirements.txt -i https://pypi.douban.com/simple/
22
+
23
+ # Optional step: warm up the modules
24
+ RUN python3 -c 'from check_proxy import warm_up_modules; warm_up_modules()'
25
+
26
+ # Launch
27
+ CMD ["python3", "-u", "main.py"]
docs/GithubAction+NoLocal+Latex ADDED
@@ -0,0 +1,25 @@
1
+ # This Dockerfile builds an environment without local models. To use local models such as chatglm, see docs/Dockerfile+ChatGLM
2
+ # - 1 Edit `config.py`
3
+ # - 2 Build: docker build -t gpt-academic-nolocal-latex -f docs/Dockerfile+NoLocal+Latex .
4
+ # - 3 Run: docker run -v /home/fuqingxu/arxiv_cache:/root/arxiv_cache --rm -it --net=host gpt-academic-nolocal-latex
5
+
6
+ FROM fuqingxu/python311_texlive_ctex:latest
7
+
8
+ # Set the working directory
9
+ WORKDIR /gpt
10
+
11
+ RUN pip3 install gradio openai numpy arxiv rich
12
+ RUN pip3 install colorama Markdown pygments pymupdf
13
+
14
+ # Copy the project files
15
+ COPY . .
16
+
17
+
18
+ # Install dependencies
19
+ RUN pip3 install -r requirements.txt
20
+
21
+ # Optional step: warm up the modules
22
+ RUN python3 -c 'from check_proxy import warm_up_modules; warm_up_modules()'
23
+
24
+ # Launch
25
+ CMD ["python3", "-u", "main.py"]
docs/README.md.Italian.md CHANGED
@@ -2,11 +2,11 @@
2
  >
3
  > Durante l'installazione delle dipendenze, selezionare rigorosamente le **versioni specificate** nel file requirements.txt.
4
  >
5
- > ` pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/`
6
 
7
- # <img src="docs/logo.png" width="40" > GPT Ottimizzazione Accademica (GPT Academic)
8
 
9
- **Se ti piace questo progetto, ti preghiamo di dargli una stella. Se hai sviluppato scorciatoie accademiche o plugin funzionali più utili, non esitare ad aprire una issue o pull request. Abbiamo anche una README in [Inglese|](docs/README_EN.md)[Giapponese|](docs/README_JP.md)[Coreano|](https://github.com/mldljyh/ko_gpt_academic)[Russo|](docs/README_RS.md)[Francese](docs/README_FR.md) tradotta da questo stesso progetto.
10
  Per tradurre questo progetto in qualsiasi lingua con GPT, leggere e eseguire [`multi_language.py`](multi_language.py) (sperimentale).
11
 
12
  > **Nota**
@@ -17,7 +17,9 @@ Per tradurre questo progetto in qualsiasi lingua con GPT, leggere e eseguire [`m
17
  >
18
  > 3. Questo progetto è compatibile e incoraggia l'utilizzo di grandi modelli di linguaggio di produzione nazionale come chatglm, RWKV, Pangu ecc. Supporta la coesistenza di più api-key e può essere compilato nel file di configurazione come `API_KEY="openai-key1,openai-key2,api2d-key3"`. Per sostituire temporaneamente `API_KEY`, inserire `API_KEY` temporaneo nell'area di input e premere Invio per renderlo effettivo.
19
 
20
- <div align="center">Funzione | Descrizione
 
 
21
  --- | ---
22
  Correzione immediata | Supporta correzione immediata e ricerca degli errori di grammatica del documento con un solo clic
23
  Traduzione cinese-inglese immediata | Traduzione cinese-inglese immediata con un solo clic
@@ -41,6 +43,8 @@ Avvia il tema di gradio [scuro](https://github.com/binary-husky/chatgpt_academic
41
  Supporto per maggiori modelli LLM, supporto API2D | Sentirsi serviti simultaneamente da GPT3.5, GPT4, [Tsinghua ChatGLM](https://github.com/THUDM/ChatGLM-6B), [Fudan MOSS](https://github.com/OpenLMLab/MOSS) deve essere una grande sensazione, giusto?
42
  Ulteriori modelli LLM supportati, supporto per l'implementazione di Huggingface | Aggiunta di un'interfaccia Newbing (Nuovo Bing), introdotta la compatibilità con Tsinghua [Jittorllms](https://github.com/Jittor/JittorLLMs), [LLaMA](https://github.com/facebookresearch/llama), [RWKV](https://github.com/BlinkDL/ChatRWKV) e [PanGu-α](https://openi.org.cn/pangu/)
43
  Ulteriori dimostrazioni di nuove funzionalità (generazione di immagini, ecc.)... | Vedere la fine di questo documento...
 
 
44
 
45
  - Nuova interfaccia (modificare l'opzione LAYOUT in `config.py` per passare dal layout a sinistra e a destra al layout superiore e inferiore)
46
  <div align="center">
@@ -202,11 +206,13 @@ ad esempio
202
  2. Plugin di funzione personalizzati
203
 
204
  Scrivi plugin di funzione personalizzati e esegui tutte le attività che desideri o non hai mai pensato di fare.
205
- La difficoltà di scrittura e debug dei plugin del nostro progetto è molto bassa. Se si dispone di una certa conoscenza di base di Python, è possibile realizzare la propria funzione del plugin seguendo il nostro modello. Per maggiori dettagli, consultare la [guida al plugin per funzioni] (https://github.com/binary-husky/chatgpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97).
206
 
207
  ---
208
  # Ultimo aggiornamento
209
- ## Nuove funzionalità dinamiche1. Funzionalità di salvataggio della conversazione. Nell'area dei plugin della funzione, fare clic su "Salva la conversazione corrente" per salvare la conversazione corrente come file html leggibile e ripristinabile, inoltre, nell'area dei plugin della funzione (menu a discesa), fare clic su "Carica la cronologia della conversazione archiviata" per ripristinare la conversazione precedente. Suggerimento: fare clic su "Carica la cronologia della conversazione archiviata" senza specificare il file consente di visualizzare la cache degli archivi html di cronologia, fare clic su "Elimina tutti i record di cronologia delle conversazioni locali" per eliminare tutte le cache degli archivi html.
 
 
210
  <div align="center">
211
  <img src="https://user-images.githubusercontent.com/96192199/235222390-24a9acc0-680f-49f5-bc81-2f3161f1e049.png" width="500" >
212
  </div>
@@ -307,4 +313,4 @@ https://github.com/kaixindelele/ChatPaper
307
  # Altro:
308
  https://github.com/gradio-app/gradio
309
  https://github.com/fghrsh/live2d_demo
310
- ```
 
2
  >
3
  > Durante l'installazione delle dipendenze, selezionare rigorosamente le **versioni specificate** nel file requirements.txt.
4
  >
5
+ > ` pip install -r requirements.txt`
6
 
7
+ # <img src="logo.png" width="40" > GPT Ottimizzazione Accademica (GPT Academic)
8
 
9
+ **Se ti piace questo progetto, ti preghiamo di dargli una stella. Se hai sviluppato scorciatoie accademiche o plugin funzionali più utili, non esitare ad aprire una issue o pull request. Abbiamo anche una README in [Inglese|](README_EN.md)[Giapponese|](README_JP.md)[Coreano|](https://github.com/mldljyh/ko_gpt_academic)[Russo|](README_RS.md)[Francese](README_FR.md) tradotta da questo stesso progetto.
10
  Per tradurre questo progetto in qualsiasi lingua con GPT, leggere e eseguire [`multi_language.py`](multi_language.py) (sperimentale).
11
 
12
  > **Nota**
 
17
  >
18
  > 3. Questo progetto è compatibile e incoraggia l'utilizzo di grandi modelli di linguaggio di produzione nazionale come chatglm, RWKV, Pangu ecc. Supporta la coesistenza di più api-key e può essere compilato nel file di configurazione come `API_KEY="openai-key1,openai-key2,api2d-key3"`. Per sostituire temporaneamente `API_KEY`, inserire `API_KEY` temporaneo nell'area di input e premere Invio per renderlo effettivo.
19
 
20
+ <div align="center">
21
+
22
+ Funzione | Descrizione
23
  --- | ---
24
  Correzione immediata | Supporta correzione immediata e ricerca degli errori di grammatica del documento con un solo clic
25
  Traduzione cinese-inglese immediata | Traduzione cinese-inglese immediata con un solo clic
 
43
  Supporto per maggiori modelli LLM, supporto API2D | Sentirsi serviti simultaneamente da GPT3.5, GPT4, [Tsinghua ChatGLM](https://github.com/THUDM/ChatGLM-6B), [Fudan MOSS](https://github.com/OpenLMLab/MOSS) deve essere una grande sensazione, giusto?
44
  Ulteriori modelli LLM supportati, supporto per l'implementazione di Huggingface | Aggiunta di un'interfaccia Newbing (Nuovo Bing), introdotta la compatibilità con Tsinghua [Jittorllms](https://github.com/Jittor/JittorLLMs), [LLaMA](https://github.com/facebookresearch/llama), [RWKV](https://github.com/BlinkDL/ChatRWKV) e [PanGu-α](https://openi.org.cn/pangu/)
45
  Ulteriori dimostrazioni di nuove funzionalità (generazione di immagini, ecc.)... | Vedere la fine di questo documento...
46
+ </div>
47
+
48
 
49
  - Nuova interfaccia (modificare l'opzione LAYOUT in `config.py` per passare dal layout a sinistra e a destra al layout superiore e inferiore)
50
  <div align="center">
 
206
  2. Plugin di funzione personalizzati
207
 
208
  Scrivi plugin di funzione personalizzati e esegui tutte le attività che desideri o non hai mai pensato di fare.
209
+ La difficoltà di scrittura e debug dei plugin del nostro progetto è molto bassa. Se si dispone di una certa conoscenza di base di Python, è possibile realizzare la propria funzione del plugin seguendo il nostro modello. Per maggiori dettagli, consultare la [guida al plugin per funzioni](https://github.com/binary-husky/chatgpt_academic/wiki/%E5%87%BD%E6%95%B0%E6%8F%92%E4%BB%B6%E6%8C%87%E5%8D%97).
210
 
211
  ---
212
  # Ultimo aggiornamento
213
+ ## Nuove funzionalità dinamiche
214
+
215
+ 1. Funzionalità di salvataggio della conversazione. Nell'area dei plugin della funzione, fare clic su "Salva la conversazione corrente" per salvare la conversazione corrente come file html leggibile e ripristinabile, inoltre, nell'area dei plugin della funzione (menu a discesa), fare clic su "Carica la cronologia della conversazione archiviata" per ripristinare la conversazione precedente. Suggerimento: fare clic su "Carica la cronologia della conversazione archiviata" senza specificare il file consente di visualizzare la cache degli archivi html di cronologia, fare clic su "Elimina tutti i record di cronologia delle conversazioni locali" per eliminare tutte le cache degli archivi html.
216
  <div align="center">
217
  <img src="https://user-images.githubusercontent.com/96192199/235222390-24a9acc0-680f-49f5-bc81-2f3161f1e049.png" width="500" >
218
  </div>
 
313
  # Altro:
314
  https://github.com/gradio-app/gradio
315
  https://github.com/fghrsh/live2d_demo
316
+ ```
docs/README.md.Korean.md CHANGED
@@ -17,7 +17,9 @@ GPT를 이용하여 프로젝트를 임의의 언어로 번역하려면 [`multi_
17
  >
18
  > 3. 이 프로젝트는 국내 언어 모델 chatglm과 RWKV, 판고 등의 시도와 호환 가능합니다. 여러 개의 api-key를 지원하며 설정 파일에 "API_KEY="openai-key1,openai-key2,api2d-key3""와 같이 작성할 수 있습니다. `API_KEY`를 임시로 변경해야하는 경우 입력 영역에 임시 `API_KEY`를 입력 한 후 엔터 키를 누르면 즉시 적용됩니다.
19
 
20
- <div align="center">기능 | 설명
 
 
21
  --- | ---
22
  원 키워드 | 원 키워드 및 논문 문법 오류를 찾는 기능 지원
23
  한-영 키워드 | 한-영 키워드 지원
@@ -265,4 +267,4 @@ https://github.com/kaixindelele/ChatPaper
265
  # 더 많은 :
266
  https://github.com/gradio-app/gradio
267
  https://github.com/fghrsh/live2d_demo
268
- ```
 
17
  >
18
  > 3. 이 프로젝트는 국내 언어 모델 chatglm과 RWKV, 판고 등의 시도와 호환 가능합니다. 여러 개의 api-key를 지원하며 설정 파일에 "API_KEY="openai-key1,openai-key2,api2d-key3""와 같이 작성할 수 있습니다. `API_KEY`를 임시로 변경해야하는 경우 입력 영역에 임시 `API_KEY`를 입력 한 후 엔터 키를 누르면 즉시 적용됩니다.
19
 
20
+ <div align="center">
21
+
22
+ 기능 | 설명
23
  --- | ---
24
  원 키워드 | 원 키워드 및 논문 문법 오류를 찾는 기능 지원
25
  한-영 키워드 | 한-영 키워드 지원
 
267
  # 더 많은 :
268
  https://github.com/gradio-app/gradio
269
  https://github.com/fghrsh/live2d_demo
270
+ ```
docs/README.md.Portuguese.md CHANGED
@@ -2,7 +2,7 @@
2
  >
3
  > Ao instalar as dependências, por favor, selecione rigorosamente as versões **especificadas** no arquivo requirements.txt.
4
  >
5
- > `pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/`
6
  >
7
 
8
  # <img src="logo.png" width="40" > Otimização acadêmica GPT (GPT Academic)
@@ -18,7 +18,9 @@ Para traduzir este projeto para qualquer idioma com o GPT, leia e execute [`mult
18
  >
19
  > 3. Este projeto é compatível com e incentiva o uso de modelos de linguagem nacionais, como chatglm e RWKV, Pangolin, etc. Suporta a coexistência de várias chaves de API e pode ser preenchido no arquivo de configuração como `API_KEY="openai-key1,openai-key2,api2d-key3"`. Quando precisar alterar temporariamente o `API_KEY`, basta digitar o `API_KEY` temporário na área de entrada e pressionar Enter para que ele entre em vigor.
20
 
21
- <div align="center">Funcionalidade | Descrição
 
 
22
  --- | ---
23
  Um clique de polimento | Suporte a um clique polimento, um clique encontrar erros de gramática no artigo
24
  Tradução chinês-inglês de um clique | Tradução chinês-inglês de um clique
@@ -216,7 +218,9 @@ Para mais detalhes, consulte o [Guia do plug-in de função.](https://github.com
216
 
217
  ---
218
  # Última atualização
219
- ## Novas funções dinâmicas.1. Função de salvamento de diálogo. Ao chamar o plug-in de função "Salvar diálogo atual", é possível salvar o diálogo atual em um arquivo html legível e reversível. Além disso, ao chamar o plug-in de função "Carregar arquivo de histórico de diálogo" no menu suspenso da área de plug-in, é possível restaurar uma conversa anterior. Dica: clicar em "Carregar arquivo de histórico de diálogo" sem especificar um arquivo permite visualizar o cache do arquivo html de histórico. Clicar em "Excluir todo o registro de histórico de diálogo local" permite excluir todo o cache de arquivo html.
 
 
220
  <div align="center">
221
  <img src="https://user-images.githubusercontent.com/96192199/235222390-24a9acc0-680f-49f5-bc81-2f3161f1e049.png" width="500" >
222
  </div>
@@ -317,4 +321,4 @@ https://github.com/kaixindelele/ChatPaper
317
  # Mais:
318
  https://github.com/gradio-app/gradio
319
  https://github.com/fghrsh/live2d_demo
320
- ```
 
2
  >
3
  > Ao instalar as dependências, por favor, selecione rigorosamente as versões **especificadas** no arquivo requirements.txt.
4
  >
5
+ > `pip install -r requirements.txt`
6
  >
7
 
8
  # <img src="logo.png" width="40" > Otimização acadêmica GPT (GPT Academic)
 
18
  >
19
  > 3. Este projeto é compatível com e incentiva o uso de modelos de linguagem nacionais, como chatglm e RWKV, Pangolin, etc. Suporta a coexistência de várias chaves de API e pode ser preenchido no arquivo de configuração como `API_KEY="openai-key1,openai-key2,api2d-key3"`. Quando precisar alterar temporariamente o `API_KEY`, basta digitar o `API_KEY` temporário na área de entrada e pressionar Enter para que ele entre em vigor.
20
 
21
+ <div align="center">
22
+
23
+ Funcionalidade | Descrição
24
  --- | ---
25
  Um clique de polimento | Suporte a um clique polimento, um clique encontrar erros de gramática no artigo
26
  Tradução chinês-inglês de um clique | Tradução chinês-inglês de um clique
 
218
 
219
  ---
220
  # Última atualização
221
+ ## Novas funções dinâmicas.
222
+
223
+ 1. Função de salvamento de diálogo. Ao chamar o plug-in de função "Salvar diálogo atual", é possível salvar o diálogo atual em um arquivo html legível e reversível. Além disso, ao chamar o plug-in de função "Carregar arquivo de histórico de diálogo" no menu suspenso da área de plug-in, é possível restaurar uma conversa anterior. Dica: clicar em "Carregar arquivo de histórico de diálogo" sem especificar um arquivo permite visualizar o cache do arquivo html de histórico. Clicar em "Excluir todo o registro de histórico de diálogo local" permite excluir todo o cache de arquivo html.
224
  <div align="center">
225
  <img src="https://user-images.githubusercontent.com/96192199/235222390-24a9acc0-680f-49f5-bc81-2f3161f1e049.png" width="500" >
226
  </div>
 
321
  # Mais:
322
  https://github.com/gradio-app/gradio
323
  https://github.com/fghrsh/live2d_demo
324
+ ```
docs/translate_english.json CHANGED
@@ -58,6 +58,8 @@
58
  "连接网络回答问题": "ConnectToNetworkToAnswerQuestions",
59
  "联网的ChatGPT": "ChatGPTConnectedToNetwork",
60
  "解析任意code项目": "ParseAnyCodeProject",
 
 
61
  "同时问询_指定模型": "InquireSimultaneously_SpecifiedModel",
62
  "图片生成": "ImageGeneration",
63
  "test_解析ipynb文件": "Test_ParseIpynbFile",
 
58
  "连接网络回答问题": "ConnectToNetworkToAnswerQuestions",
59
  "联网的ChatGPT": "ChatGPTConnectedToNetwork",
60
  "解析任意code项目": "ParseAnyCodeProject",
61
+ "读取知识库作答": "ReadKnowledgeArchiveAnswerQuestions",
62
+ "知识库问答": "UpdateKnowledgeArchive",
63
  "同时问询_指定模型": "InquireSimultaneously_SpecifiedModel",
64
  "图片生成": "ImageGeneration",
65
  "test_解析ipynb文件": "Test_ParseIpynbFile",
docs/use_azure.md ADDED
@@ -0,0 +1,152 @@
+ # Requesting an OpenAI API via Microsoft's Azure cloud service
2
+
3
+ Thanks to the relationship between OpenAI and Microsoft, the OpenAI API can now be accessed directly through Microsoft's Azure cloud computing service, which sidesteps the registration and network-access problems.
4
+
5
+ The official quick-start documentation is here: [Quickstart - Get started using ChatGPT and GPT-4 with Azure OpenAI Service | Microsoft Learn](https://learn.microsoft.com/zh-cn/azure/cognitive-services/openai/chatgpt-quickstart?pivots=programming-language-python)
6
+
7
+ # Requesting the API
8
+
9
+ According to the "Prerequisites" section of the documentation, besides a programming environment you also need the following three things:
10
+
11
+ 1.  An Azure account with a subscription created under it
12
+
13
+ 2.  The Azure OpenAI service added to that subscription
14
+
15
+ 3.  A deployed model
16
+
17
+ ## Azure account and subscription
18
+
19
+ ### Azure account
20
+
21
+ When creating an Azure account it is best to already have a Microsoft account, which seems to make it easier to get the free credit (200 US dollars for the first month; in an actual test, logging in to Azure with a freshly registered Microsoft account did not receive this one-month free credit).
22
+
23
+ The page for creating an Azure account is: [Create your Azure free account | Microsoft Azure](https://azure.microsoft.com/zh-cn/free/)
24
+
25
+ ![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_944786_iH6AECuZ_tY0EaBd_1685327219?w=1327\&h=695\&type=image/png)
26
+
27
+ After opening the page, click "Start free"; it jumps to a login or registration page. If you already have a Microsoft account, just log in; otherwise register one on Microsoft's site first.
28
+
29
+ Note that Azure's pages and policies change from time to time; whatever is actually shown at the moment is what counts.
30
+
31
+ ### Creating a subscription
32
+
33
+ Once Azure is registered you can enter the home page:
34
+
35
+ ![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_444847_tk-9S-pxOYuaLs_K_1685327675?w=1865\&h=969\&type=image/png)
36
+
37
+ First you need to add a subscription; click through to enter the subscription page:
38
+
39
+ ![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_612820_z_1AlaEgnJR-rUl0_1685327892?w=1865\&h=969\&type=image/png)
40
+
41
+ It should be empty on first visit; click Add to create a new subscription (either a "free" or a "pay-as-you-go" one). The subscription ID is what you will need later when applying for Azure OpenAI.
42
+
43
+ ## Adding the Azure OpenAI service to the subscription
44
+
45
+ Then go back to the home page and click Azure OpenAI to enter the OpenAI service page (if it is not shown, search for "openai" in the search bar at the top of the home page).
46
+
47
+ ![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_269759_nExkGcPC0EuAR5cp_1685328130?w=1865\&h=969\&type=image/png)
48
+
49
+ The service cannot be used right away, though. Before using it you also have to apply at this address:
50
+
51
+ [Request Access to Azure OpenAI Service (microsoft.com)](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUOFA5Qk1UWDRBMjg0WFhPMkIzTzhKQ1dWNyQlQCN0PWcu)
52
+
53
+ There are about twenty questions; fill them in according to the requirements and your actual situation.
54
+
55
+ Points to note:
56
+
57
+ 1.  Make absolutely sure the "subscription ID" is filled in correctly
58
+
59
+ 2.  A company email address (it need not be the one used for registration) and a company website are required
60
+
61
+ Afterwards, go back to the page above and click Create, which leads to the creation page:
62
+
63
+ ![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_72708_9d9JYhylPVz3dFWL_1685328372?w=824\&h=590\&type=image/png)
64
+
65
+ Fill in the "resource group" and "name" as you see fit.
66
+
67
+ Once done, the newly created resource appears under "Resources" on the home page; click into it to perform the final deployment.
68
+
69
+ ![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_871541_CGCnbgtV9Uk1Jccy_1685329861?w=1217\&h=628\&type=image/png)
70
+
71
+ ## Deploying a model
72
+
73
+ After entering the resource page, before deploying a model you can first click "Develop" and write down the key and the endpoint.
74
+
75
+ ![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_852567_dxCZOrkMlWDSLH0d_1685330736?w=856\&h=568\&type=image/png)
76
+
77
+ Then you can deploy a model: click "Deploy", which jumps to Azure OpenAI Studio for the following steps:
78
+
79
+ ![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_169225_uWs1gMhpNbnwW4h2_1685329901?w=1865\&h=969\&type=image/png)
80
+
81
+ In Azure OpenAI Studio, click "New deployment" and the following dialog pops up:
82
+
83
+ ![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_391255_iXUSZAzoud5qlxjJ_1685330224?w=656\&h=641\&type=image/png)
84
+
85
+ Select gpt-35-turbo (or whichever model you need) and enter a "deployment name" of your choice to complete the deployment.
86
+
87
+ ![](https://wdcdn.qpic.cn/MTY4ODg1Mjk4NzI5NTU1NQ_724099_vBaHcUilsm1EtPgK_1685330396?w=1869\&h=482\&type=image/png)
88
+
89
+ Write this deployment name down.
90
+
91
+ The application process is now complete. The things you need to have written down are:
92
+
93
+ ● the key (either 1 or 2 works)
94
+
95
+ ● the endpoint
96
+
97
+ ● the deployment name (not the model name)
98
+
99
+ # Editing config.py
100
+
101
+ ```
102
+ AZURE_ENDPOINT = "fill in the endpoint"
103
+ AZURE_API_KEY = "fill in the azure openai api key"
104
+ AZURE_API_VERSION = "2023-05-15" # the default 2023-05-15 version; no need to change
105
+ AZURE_ENGINE = "fill in the deployment name"
106
+
107
+ ```
108
+ # Using the API
109
+
110
+ Next comes actually using the API; again the official documentation can be consulted: [Quickstart - Get started using ChatGPT and GPT-4 with Azure OpenAI Service | Microsoft Learn](https://learn.microsoft.com/zh-cn/azure/cognitive-services/openai/chatgpt-quickstart?pivots=programming-language-python)
111
+
112
+ It is quite similar to calling OpenAI's own API: the openai library must be installed; what differs is how the call is made
113
+
114
+ ```
115
+ import os, openai # os is needed for the os.getenv calls below
116
+ openai.api_type = "azure" # fixed value, no need to change
117
+ openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT") # fill in the "endpoint" here
118
+ openai.api_version = "2023-05-15" # fixed value, no need to change
119
+ openai.api_key = os.getenv("AZURE_OPENAI_KEY") # fill in "key 1" or "key 2" here
120
+
121
+ response = openai.ChatCompletion.create(
122
+ engine="gpt-35-turbo", #这里填入的不是模型名,是部署名
123
+ messages=[
124
+ {"role": "system", "content": "You are a helpful assistant."},
125
+ {"role": "user", "content": "Does Azure OpenAI support customer managed keys?"},
126
+ {"role": "assistant", "content": "Yes, customer managed keys are supported by Azure OpenAI."},
127
+ {"role": "user", "content": "Do other Azure Cognitive Services support this too?"}
128
+ ]
129
+ )
130
+
131
+ print(response)
132
+ print(response['choices'][0]['message']['content'])
133
+
134
+ ```
135
+
136
+ Things to note:
137
+
138
+ 1.  What goes into `engine` is the deployment name, not the model name
139
+
140
+ 2.  The response obtained through the openai library differs from one obtained by hitting the URL with the requests library: it needs no decoding, it is already parsed JSON, so read it directly by key.
141
+
142
+ For more detailed usage, see the official API documentation.
143
+
144
+ # About cost
145
+
146
+ The Azure OpenAI API does cost money (the free subscription is only valid for one month). The pricing is as follows:
147
+
148
+ ![image.png](https://note.youdao.com/yws/res/18095/WEBRESOURCEeba0ab6d3127b79e143ef2d5627c0e44)
149
+
150
+ Details are at this address: [Azure OpenAI Service - Pricing | Microsoft Azure](https://azure.microsoft.com/zh-cn/pricing/details/cognitive-services/openai-service/?cdn=disable)
151
+
152
+ It is not the "one free year" that some online posts claim, but both the registration process and the network situation are somewhat simpler than using OpenAI's API directly.
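
The snippet above makes a blocking, one-shot call; the bridge added in this commit (`request_llm/bridge_azure_test.py`) relies on the streaming variant instead. A minimal streaming sketch under the same configuration (endpoint, key, and deployment name are placeholders):

```python
import os
import openai

openai.api_type = "azure"
openai.api_base = os.getenv("AZURE_OPENAI_ENDPOINT")   # the endpoint noted down earlier
openai.api_version = "2023-05-15"
openai.api_key = os.getenv("AZURE_OPENAI_KEY")         # key 1 or key 2

# With stream=True the SDK yields parsed chunk objects; each delta carries a fragment of the reply.
response = openai.ChatCompletion.create(
    engine="gpt-35-turbo",                             # the deployment name, not the model name
    messages=[{"role": "user", "content": "Say hello."}],
    stream=True,
)
for chunk in response:
    delta = chunk["choices"][0]["delta"] if chunk["choices"] else {}
    print(delta.get("content", ""), end="")
```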
request_llm/bridge_all.py CHANGED
@@ -16,6 +16,9 @@ from toolbox import get_conf, trimmed_format_exc
16
  from .bridge_chatgpt import predict_no_ui_long_connection as chatgpt_noui
17
  from .bridge_chatgpt import predict as chatgpt_ui
18
 
 
 
 
19
  from .bridge_chatglm import predict_no_ui_long_connection as chatglm_noui
20
  from .bridge_chatglm import predict as chatglm_ui
21
 
@@ -83,6 +86,33 @@ model_info = {
83
  "tokenizer": tokenizer_gpt35,
84
  "token_cnt": get_token_num_gpt35,
85
  },
86
 
87
  "gpt-4": {
88
  "fn_with_ui": chatgpt_ui,
@@ -93,6 +123,16 @@ model_info = {
93
  "token_cnt": get_token_num_gpt4,
94
  },
95
 
96
  # api_2d
97
  "api2d-gpt-3.5-turbo": {
98
  "fn_with_ui": chatgpt_ui,
 
16
  from .bridge_chatgpt import predict_no_ui_long_connection as chatgpt_noui
17
  from .bridge_chatgpt import predict as chatgpt_ui
18
 
19
+ from .bridge_azure_test import predict_no_ui_long_connection as azure_noui
20
+ from .bridge_azure_test import predict as azure_ui
21
+
22
  from .bridge_chatglm import predict_no_ui_long_connection as chatglm_noui
23
  from .bridge_chatglm import predict as chatglm_ui
24
 
 
86
  "tokenizer": tokenizer_gpt35,
87
  "token_cnt": get_token_num_gpt35,
88
  },
89
+
90
+ "gpt-3.5-turbo-16k": {
91
+ "fn_with_ui": chatgpt_ui,
92
+ "fn_without_ui": chatgpt_noui,
93
+ "endpoint": openai_endpoint,
94
+ "max_token": 1024*16,
95
+ "tokenizer": tokenizer_gpt35,
96
+ "token_cnt": get_token_num_gpt35,
97
+ },
98
+
99
+ "gpt-3.5-turbo-0613": {
100
+ "fn_with_ui": chatgpt_ui,
101
+ "fn_without_ui": chatgpt_noui,
102
+ "endpoint": openai_endpoint,
103
+ "max_token": 4096,
104
+ "tokenizer": tokenizer_gpt35,
105
+ "token_cnt": get_token_num_gpt35,
106
+ },
107
+
108
+ "gpt-3.5-turbo-16k-0613": {
109
+ "fn_with_ui": chatgpt_ui,
110
+ "fn_without_ui": chatgpt_noui,
111
+ "endpoint": openai_endpoint,
112
+ "max_token": 1024 * 16,
113
+ "tokenizer": tokenizer_gpt35,
114
+ "token_cnt": get_token_num_gpt35,
115
+ },
116
 
117
  "gpt-4": {
118
  "fn_with_ui": chatgpt_ui,
 
123
  "token_cnt": get_token_num_gpt4,
124
  },
125
 
126
+ # azure openai
127
+ "azure-gpt35":{
128
+ "fn_with_ui": azure_ui,
129
+ "fn_without_ui": azure_noui,
130
+ "endpoint": get_conf("AZURE_ENDPOINT"),
131
+ "max_token": 4096,
132
+ "tokenizer": tokenizer_gpt35,
133
+ "token_cnt": get_token_num_gpt35,
134
+ },
135
+
136
  # api_2d
137
  "api2d-gpt-3.5-turbo": {
138
  "fn_with_ui": chatgpt_ui,
request_llm/bridge_azure_test.py ADDED
@@ -0,0 +1,241 @@
1
+ """
2
+ 该文件中主要包含三个函数
3
+
4
+ 不具备多线程能力的函数:
5
+ 1. predict: 正常对话时使用,具备完备的交互功能,不可多线程
6
+
7
+ 具备多线程调用能力的函数
8
+ 2. predict_no_ui:高级实验性功能模块调用,不会实时显示在界面上,参数简单,可以多线程并行,方便实现复杂的功能逻辑
9
+ 3. predict_no_ui_long_connection:在实验过程中发现调用predict_no_ui处理长文档时,和openai的连接容易断掉,这个函数用stream的方式解决这个问题,同样支持多线程
10
+ """
11
+
12
+ import logging
13
+ import traceback
14
+ import importlib
15
+ import openai
16
+ import time
17
+
18
+
19
+ # Read the AZURE OPENAI API settings from config.py
20
+ from toolbox import get_conf, update_ui, clip_history, trimmed_format_exc
21
+ TIMEOUT_SECONDS, MAX_RETRY, AZURE_ENGINE, AZURE_ENDPOINT, AZURE_API_VERSION, AZURE_API_KEY = \
22
+ get_conf('TIMEOUT_SECONDS', 'MAX_RETRY',"AZURE_ENGINE","AZURE_ENDPOINT", "AZURE_API_VERSION", "AZURE_API_KEY")
23
+
24
+
25
+ def get_full_error(chunk, stream_response):
26
+ """
27
+ 获取完整的从Openai返回的报错
28
+ """
29
+ while True:
30
+ try:
31
+ chunk += next(stream_response)
32
+ except:
33
+ break
34
+ return chunk
35
+
36
+ def predict(inputs, llm_kwargs, plugin_kwargs, chatbot, history=[], system_prompt='', stream = True, additional_fn=None):
37
+ """
38
+ Send to the azure openai api and fetch the output as a stream.
40
+ Used for the basic conversation feature.
41
+ inputs is the input of this query
42
+ top_p, temperature are chatGPT's internal tuning parameters
43
+ history is the list of previous conversation turns (note: if either inputs or history is too long, a token-overflow error is triggered)
44
+ chatbot is the conversation list shown in the WebUI; modify it and yield it out to update the chat interface directly
45
+ additional_fn indicates which button was clicked; buttons are defined in functional.py
45
+ """
46
+ print(llm_kwargs["llm_model"])
47
+
48
+ if additional_fn is not None:
49
+ import core_functional
50
+ importlib.reload(core_functional)    # hot-reload the prompt
51
+ core_functional = core_functional.get_core_functions()
52
+ if "PreProcess" in core_functional[additional_fn]: inputs = core_functional[additional_fn]["PreProcess"](inputs) # 获取预处理函数(如果有的话)
53
+ inputs = core_functional[additional_fn]["Prefix"] + inputs + core_functional[additional_fn]["Suffix"]
54
+
55
+ raw_input = inputs
56
+ logging.info(f'[raw_input] {raw_input}')
57
+ chatbot.append((inputs, ""))
58
+ yield from update_ui(chatbot=chatbot, history=history, msg="等待响应") # refresh the UI
59
+
60
+
61
+ payload = generate_azure_payload(inputs, llm_kwargs, history, system_prompt, stream)
62
+
63
+ history.append(inputs); history.append("")
64
+
65
+ retry = 0
66
+ while True:
67
+ try:
68
+
69
+ openai.api_type = "azure"
70
+ openai.api_version = AZURE_API_VERSION
71
+ openai.api_base = AZURE_ENDPOINT
72
+ openai.api_key = AZURE_API_KEY
73
+ response = openai.ChatCompletion.create(timeout=TIMEOUT_SECONDS, **payload);break
74
+
75
+ except:
76
+ retry += 1
77
+ chatbot[-1] = ((chatbot[-1][0], "获取response失败,重试中。。。"))
78
+ retry_msg = f",正在重试 ({retry}/{MAX_RETRY}) ……" if MAX_RETRY > 0 else ""
79
+ yield from update_ui(chatbot=chatbot, history=history, msg="请求超时"+retry_msg) # refresh the UI
80
+ if retry > MAX_RETRY: raise TimeoutError
81
+
82
+ gpt_replying_buffer = ""
83
+ is_head_of_the_stream = True
84
+ if stream:
85
+
86
+ stream_response = response
+ chunk = "" # ensure chunk is defined even if the stream raises before yielding its first item
87
+
88
+ while True:
89
+ try:
90
+ chunk = next(stream_response)
91
+
92
+ except StopIteration:
93
+ from toolbox import regular_txt_to_markdown; tb_str = '```\n' + trimmed_format_exc() + '```'
94
+ chatbot[-1] = (chatbot[-1][0], f"[Local Message] 远程返回错误: \n\n{tb_str} \n\n{regular_txt_to_markdown(chunk)}")
95
+ yield from update_ui(chatbot=chatbot, history=history, msg="远程返回错误:" + chunk) # 刷新界面
96
+ return
97
+
98
+ if is_head_of_the_stream and (r'"object":"error"' not in chunk):
99
+ # the first frame of the data stream carries no content
100
+ is_head_of_the_stream = False; continue
101
+
102
+ if chunk:
103
+ #print(chunk)
104
+ try:
105
+ if "delta" in chunk["choices"][0]:
106
+ if chunk["choices"][0]["finish_reason"] == "stop":
107
+ logging.info(f'[response] {gpt_replying_buffer}')
108
+ break
109
+ status_text = f"finish_reason: {chunk['choices'][0]['finish_reason']}"
110
+ gpt_replying_buffer = gpt_replying_buffer + chunk["choices"][0]["delta"]["content"]
111
+
112
+ history[-1] = gpt_replying_buffer
113
+ chatbot[-1] = (history[-2], history[-1])
114
+ yield from update_ui(chatbot=chatbot, history=history, msg=status_text) # refresh the UI
115
+
116
+ except Exception as e:
117
+ traceback.print_exc()
118
+ yield from update_ui(chatbot=chatbot, history=history, msg="Json解析不合常规") # refresh the UI
119
+ chunk = get_full_error(chunk, stream_response)
120
+
121
+ error_msg = chunk
122
+ yield from update_ui(chatbot=chatbot, history=history, msg="Json异常" + error_msg) # refresh the UI
123
+ return
124
+
125
+
126
+ def predict_no_ui_long_connection(inputs, llm_kwargs, history=[], sys_prompt="", observe_window=None, console_slience=False):
127
+ """
128
+ Send to the AZURE OPENAI API and wait for the reply, completing in one go without showing intermediate progress. Internally it streams, to avoid the connection being cut midway.
129
+ inputs:
130
+ The input of this query
131
+ sys_prompt:
132
+ The silent system prompt
133
+ llm_kwargs:
134
+ chatGPT's internal tuning parameters
135
+ history:
136
+ The list of previous conversation turns
137
+ observe_window = None:
138
+ Used to pass the already-produced output across threads; most of the time only for a fancy visual effect, and can be left empty. observe_window[0]: observation window. observe_window[1]: watchdog
139
+ """
140
+ watch_dog_patience = 5 # watchdog patience; 5 seconds is plenty
141
+ payload = generate_azure_payload(inputs, llm_kwargs, history, system_prompt=sys_prompt, stream=True)
142
+ retry = 0
143
+ while True:
144
+
145
+ try:
146
+ openai.api_type = "azure"
147
+ openai.api_version = AZURE_API_VERSION
148
+ openai.api_base = AZURE_ENDPOINT
149
+ openai.api_key = AZURE_API_KEY
150
+ response = openai.ChatCompletion.create(timeout=TIMEOUT_SECONDS, **payload);break
151
+
152
+ except:
153
+ retry += 1
154
+ traceback.print_exc()
155
+ if retry > MAX_RETRY: raise TimeoutError
156
+ if MAX_RETRY!=0: print(f'请求超时,正在重试 ({retry}/{MAX_RETRY}) ……')
157
+
158
+
159
+ stream_response = response
160
+ result = ''
161
+ while True:
162
+ try: chunk = next(stream_response)
163
+ except StopIteration:
164
+ break
165
+ except:
166
+ chunk = next(stream_response) # the request failed; retry once, there is nothing more to be done if it fails again
167
+
168
+ # the openai SDK yields parsed chunk objects here (not raw "data:" strings as in the HTTP bridge),
169
+ # so index into choices/delta directly; API errors surface as exceptions rather than in-stream text
170
+ if len(chunk["choices"]) == 0: continue
171
+ choice = chunk["choices"][0]
172
+ delta = choice["delta"]
173
+ if len(delta) == 0:
174
+ # an empty delta marks the end of the stream
175
+ if choice["finish_reason"] == "length":
176
+ raise ConnectionAbortedError("正常结束,但显示Token不足,导致输出不完整,请削减单次输入的文本量。")
177
+ break
178
+
179
+ if "role" in delta: continue
180
+ if "content" in delta:
181
+ result += delta["content"]
182
+ if not console_slience: print(delta["content"], end='')
183
+ if observe_window is not None:
184
+ # observation window: pass the text received so far to the caller
185
+ if len(observe_window) >= 1: observe_window[0] += delta["content"]
186
+ # watchdog: terminate if it has not been fed within the deadline
187
+ if len(observe_window) >= 2:
188
+ if (time.time()-observe_window[1]) > watch_dog_patience:
189
+ raise RuntimeError("用户取消了程序。")
190
+ else:
191
+ raise RuntimeError("意外Json结构:" + str(delta))
192
+
193
+ return result
194
+
195
+
196
+ def generate_azure_payload(inputs, llm_kwargs, history, system_prompt, stream):
197
+ """
198
+ Assemble all information, select the LLM model, and build the azure openai api request, preparing it for sending
199
+ """
200
+
201
+ conversation_cnt = len(history) // 2
202
+
203
+ messages = [{"role": "system", "content": system_prompt}]
204
+ if conversation_cnt:
205
+ for index in range(0, 2*conversation_cnt, 2):
206
+ what_i_have_asked = {}
207
+ what_i_have_asked["role"] = "user"
208
+ what_i_have_asked["content"] = history[index]
209
+ what_gpt_answer = {}
210
+ what_gpt_answer["role"] = "assistant"
211
+ what_gpt_answer["content"] = history[index+1]
212
+ if what_i_have_asked["content"] != "":
213
+ if what_gpt_answer["content"] == "": continue
214
+ messages.append(what_i_have_asked)
215
+ messages.append(what_gpt_answer)
216
+ else:
217
+ messages[-1]['content'] = what_gpt_answer['content']
218
+
219
+ what_i_ask_now = {}
220
+ what_i_ask_now["role"] = "user"
221
+ what_i_ask_now["content"] = inputs
222
+ messages.append(what_i_ask_now)
223
+
224
+ payload = {
225
+ "model": llm_kwargs['llm_model'],
226
+ "messages": messages,
227
+ "temperature": llm_kwargs['temperature'], # 1.0,
228
+ "top_p": llm_kwargs['top_p'], # 1.0,
229
+ "n": 1,
230
+ "stream": stream,
231
+ "presence_penalty": 0,
232
+ "frequency_penalty": 0,
233
+ "engine": AZURE_ENGINE
234
+ }
235
+ try:
236
+ print(f" {llm_kwargs['llm_model']} : {conversation_cnt} : {inputs[:100]} ..........")
237
+ except:
238
+ print('输入中可能存在乱码。')
239
+ return payload
240
+
241
+
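
A hedged usage sketch for the thread-safe entry point defined above, following its docstring: `observe_window[0]` accumulates the streamed text, and `observe_window[1]` is a timestamp that a long-running caller must keep refreshing, otherwise the 5-second watchdog raises. The `llm_kwargs` values shown are placeholders:

```python
import time
from request_llm.bridge_azure_test import predict_no_ui_long_connection

observe_window = ["", time.time()]   # [accumulated text, last watchdog feed]
llm_kwargs = {"llm_model": "azure-gpt35", "temperature": 1.0, "top_p": 1.0}  # placeholder values

reply = predict_no_ui_long_connection(
    inputs="用一句话介绍Azure OpenAI",
    llm_kwargs=llm_kwargs,
    history=[],
    sys_prompt="You are a helpful assistant.",
    observe_window=observe_window,   # for replies longer than ~5s, refresh observe_window[1] from another thread
)
print(reply)
```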
toolbox.py CHANGED
@@ -1,11 +1,12 @@
1
  import markdown
2
  import importlib
3
- import traceback
4
  import inspect
5
  import re
6
  import os
7
  from latex2mathml.converter import convert as tex2mathml
8
  from functools import wraps, lru_cache
 
9
 
10
  """
11
  ========================================================================
@@ -70,6 +71,17 @@ def update_ui(chatbot, history, msg='正常', **kwargs): # 刷新界面
70
  assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时,可用clear将其清空,然后用for+append循环重新赋值。"
71
  yield chatbot.get_cookies(), chatbot, history, msg
72
 
73
  def trimmed_format_exc():
74
  import os, traceback
75
  str = traceback.format_exc()
@@ -83,7 +95,7 @@ def CatchException(f):
83
  """
84
 
85
  @wraps(f)
86
- def decorated(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT):
87
  try:
88
  yield from f(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT)
89
  except Exception as e:
@@ -210,16 +222,21 @@ def text_divide_paragraph(text):
210
  """
211
  将文本按照段落分隔符分割开,生成带有段落标签的HTML代码。
212
  """
 
 
 
 
 
213
  if '```' in text:
214
  # careful input
215
- return text
216
  else:
217
  # wtf input
218
  lines = text.split("\n")
219
  for i, line in enumerate(lines):
220
  lines[i] = lines[i].replace(" ", "&nbsp;")
221
  text = "</br>".join(lines)
222
- return text
223
 
224
  @lru_cache(maxsize=128) # 使用 lru缓存 加快转换速度
225
  def markdown_convertion(txt):
@@ -331,8 +348,11 @@ def format_io(self, y):
331
  if y is None or y == []:
332
  return []
333
  i_ask, gpt_reply = y[-1]
334
- i_ask = text_divide_paragraph(i_ask) # 输入部分太自由,预处理一波
335
- gpt_reply = close_up_code_segment_during_stream(gpt_reply) # 当代码输出半截的时候,试着补上后个```
 
 
 
336
  y[-1] = (
337
  None if i_ask is None else markdown.markdown(i_ask, extensions=['fenced_code', 'tables']),
338
  None if gpt_reply is None else markdown_convertion(gpt_reply)
@@ -380,7 +400,7 @@ def extract_archive(file_path, dest_dir):
380
  print("Successfully extracted rar archive to {}".format(dest_dir))
381
  except:
382
  print("Rar format requires additional dependencies to install")
383
- return '\n\n需要安装pip install rarfile来解压rar文件'
384
 
385
  # 第三方库,需要预先pip install py7zr
386
  elif file_extension == '.7z':
@@ -391,7 +411,7 @@ def extract_archive(file_path, dest_dir):
391
  print("Successfully extracted 7z archive to {}".format(dest_dir))
392
  except:
393
  print("7z format requires additional dependencies to install")
394
- return '\n\n需要安装pip install py7zr来解压7z文件'
395
  else:
396
  return ''
397
  return ''
@@ -420,6 +440,17 @@ def find_recent_files(directory):
420
 
421
  return recent_files
422
 
423
 
424
  def on_file_uploaded(files, chatbot, txt, txt2, checkboxes):
425
  """
@@ -459,14 +490,20 @@ def on_file_uploaded(files, chatbot, txt, txt2, checkboxes):
459
  return chatbot, txt, txt2
460
 
461
 
462
- def on_report_generated(files, chatbot):
463
  from toolbox import find_recent_files
464
- report_files = find_recent_files('gpt_log')
 
 
 
 
465
  if len(report_files) == 0:
466
  return None, chatbot
467
  # files.extend(report_files)
468
- chatbot.append(['报告如何远程获取?', '报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。'])
469
- return report_files, chatbot
 
 
470
 
471
  def is_openai_api_key(key):
472
  API_MATCH_ORIGINAL = re.match(r"sk-[a-zA-Z0-9]{48}$", key)
@@ -728,6 +765,8 @@ def clip_history(inputs, history, tokenizer, max_token_limit):
728
  其他小工具:
729
  - zip_folder: 把某个路径下所有文件压缩,然后转移到指定的另一个路径中(gpt写的)
730
  - gen_time_str: 生成时间戳
 
 
731
  ========================================================================
732
  """
733
 
@@ -762,11 +801,16 @@ def zip_folder(source_folder, dest_folder, zip_name):
762
 
763
  print(f"Zip file created at {zip_file}")
764
 
 
 
 
 
 
 
765
  def gen_time_str():
766
  import time
767
  return time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
768
 
769
-
770
  class ProxyNetworkActivate():
771
  """
772
  这段代码定义了一个名为TempProxy的空上下文管理器, 用于给一小段代码上代理
@@ -775,12 +819,27 @@ class ProxyNetworkActivate():
775
  from toolbox import get_conf
776
  proxies, = get_conf('proxies')
777
  if 'no_proxy' in os.environ: os.environ.pop('no_proxy')
778
- os.environ['HTTP_PROXY'] = proxies['http']
779
- os.environ['HTTPS_PROXY'] = proxies['https']
 
780
  return self
781
 
782
  def __exit__(self, exc_type, exc_value, traceback):
783
  os.environ['no_proxy'] = '*'
784
  if 'HTTP_PROXY' in os.environ: os.environ.pop('HTTP_PROXY')
785
  if 'HTTPS_PROXY' in os.environ: os.environ.pop('HTTPS_PROXY')
786
- return
 
1
  import markdown
2
  import importlib
3
+ import time
4
  import inspect
5
  import re
6
  import os
7
  from latex2mathml.converter import convert as tex2mathml
8
  from functools import wraps, lru_cache
9
+ pj = os.path.join
10
 
11
  """
12
  ========================================================================
 
71
  assert isinstance(chatbot, ChatBotWithCookies), "在传递chatbot的过程中不要将其丢弃。必要时,可用clear将其清空,然后用for+append循环重新赋值。"
72
  yield chatbot.get_cookies(), chatbot, history, msg
73
 
74
+ def update_ui_lastest_msg(lastmsg, chatbot, history, delay=1): # refresh the UI
75
+ """
76
+ Refresh the user interface
77
+ """
78
+ if len(chatbot) == 0: chatbot.append(["update_ui_last_msg", lastmsg])
79
+ chatbot[-1] = list(chatbot[-1])
80
+ chatbot[-1][-1] = lastmsg
81
+ yield from update_ui(chatbot=chatbot, history=history)
82
+ time.sleep(delay)
83
+
84
+
85
  def trimmed_format_exc():
86
  import os, traceback
87
  str = traceback.format_exc()
 
95
  """
96
 
97
  @wraps(f)
98
+ def decorated(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT=-1):
99
  try:
100
  yield from f(txt, top_p, temperature, chatbot, history, systemPromptTxt, WEB_PORT)
101
  except Exception as e:
 
222
  """
223
  将文本按照段落分隔符分割开,生成带有段落标签的HTML代码。
224
  """
225
+ pre = '<div class="markdown-body">'
226
+ suf = '</div>'
227
+ if text.startswith(pre) and text.endswith(suf):
228
+ return text
229
+
230
  if '```' in text:
231
  # careful input
232
+ return pre + text + suf
233
  else:
234
  # wtf input
235
  lines = text.split("\n")
236
  for i, line in enumerate(lines):
237
  lines[i] = lines[i].replace(" ", "&nbsp;")
238
  text = "</br>".join(lines)
239
+ return pre + text + suf
240
 
241
  @lru_cache(maxsize=128) # 使用 lru缓存 加快转换速度
242
  def markdown_convertion(txt):
 
348
  if y is None or y == []:
349
  return []
350
  i_ask, gpt_reply = y[-1]
351
+ # the input side is free-form, so pre-process it first
352
+ if i_ask is not None: i_ask = text_divide_paragraph(i_ask)
353
+ # when a code block is cut off mid-stream, try to append the missing closing ```
354
+ if gpt_reply is not None: gpt_reply = close_up_code_segment_during_stream(gpt_reply)
355
+ # process
356
  y[-1] = (
357
  None if i_ask is None else markdown.markdown(i_ask, extensions=['fenced_code', 'tables']),
358
  None if gpt_reply is None else markdown_convertion(gpt_reply)
 
400
  print("Successfully extracted rar archive to {}".format(dest_dir))
401
  except:
402
  print("Rar format requires additional dependencies to install")
403
+ return '\n\n解压失败! 需要安装pip install rarfile来解压rar文件'
404
 
405
  # 第三方库,需要预先pip install py7zr
406
  elif file_extension == '.7z':
 
411
  print("Successfully extracted 7z archive to {}".format(dest_dir))
412
  except:
413
  print("7z format requires additional dependencies to install")
414
+ return '\n\n解压失败! 需要安装pip install py7zr来解压7z文件'
415
  else:
416
  return ''
417
  return ''
 
440
 
441
  return recent_files
442
 
443
+ def promote_file_to_downloadzone(file, rename_file=None, chatbot=None):
444
+ # copy the file into the download area
445
+ import shutil
446
+ if rename_file is None: rename_file = f'{gen_time_str()}-{os.path.basename(file)}'
447
+ new_path = os.path.join(f'./gpt_log/', rename_file)
448
+ if os.path.exists(new_path) and not os.path.samefile(new_path, file): os.remove(new_path)
449
+ if not os.path.exists(new_path): shutil.copyfile(file, new_path)
450
+ if chatbot:
451
+ if 'file_to_promote' in chatbot._cookies: current = chatbot._cookies['file_to_promote']
452
+ else: current = []
453
+ chatbot._cookies.update({'file_to_promote': [new_path] + current})
454
 
455
  def on_file_uploaded(files, chatbot, txt, txt2, checkboxes):
456
  """
 
490
  return chatbot, txt, txt2
491
 
492
 
493
+ def on_report_generated(cookies, files, chatbot):
494
  from toolbox import find_recent_files
495
+ if 'file_to_promote' in cookies:
496
+ report_files = cookies['file_to_promote']
497
+ cookies.pop('file_to_promote')
498
+ else:
499
+ report_files = find_recent_files('gpt_log')
500
  if len(report_files) == 0:
501
  return None, chatbot
502
  # files.extend(report_files)
503
+ file_links = ''
504
+ for f in report_files: file_links += f'<br/><a href="file={os.path.abspath(f)}" target="_blank">{f}</a>'
505
+ chatbot.append(['报告如何远程获取?', f'报告已经添加到右侧“文件上传区”(可能处于折叠状态),请查收。{file_links}'])
506
+ return cookies, report_files, chatbot
507
 
508
  def is_openai_api_key(key):
509
  API_MATCH_ORIGINAL = re.match(r"sk-[a-zA-Z0-9]{48}$", key)
 
765
  其他小工具:
766
  - zip_folder: 把某个路径下所有文件压缩,然后转移到指定的另一个路径中(gpt写的)
767
  - gen_time_str: 生成时间戳
768
+ - ProxyNetworkActivate: temporarily enable the proxy network (if one is configured)
769
+ - objdump/objload: quick debugging helpers
770
  ========================================================================
771
  """
772
 
 
801
 
802
  print(f"Zip file created at {zip_file}")
803
 
804
+ def zip_result(folder):
805
+ import time
806
+ t = time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
807
+ zip_folder(folder, './gpt_log/', f'{t}-result.zip')
808
+ return pj('./gpt_log/', f'{t}-result.zip')
809
+
810
  def gen_time_str():
811
  import time
812
  return time.strftime("%Y-%m-%d-%H-%M-%S", time.localtime())
813
 
 
814
  class ProxyNetworkActivate():
815
  """
816
  这段代码定义了一个名为TempProxy的空上下文管理器, 用于给一小段代码上代理
 
819
  from toolbox import get_conf
820
  proxies, = get_conf('proxies')
821
  if 'no_proxy' in os.environ: os.environ.pop('no_proxy')
822
+ if proxies is not None:
823
+ if 'http' in proxies: os.environ['HTTP_PROXY'] = proxies['http']
824
+ if 'https' in proxies: os.environ['HTTPS_PROXY'] = proxies['https']
825
  return self
826
 
827
  def __exit__(self, exc_type, exc_value, traceback):
828
  os.environ['no_proxy'] = '*'
829
  if 'HTTP_PROXY' in os.environ: os.environ.pop('HTTP_PROXY')
830
  if 'HTTPS_PROXY' in os.environ: os.environ.pop('HTTPS_PROXY')
831
+ return
832
+
833
+ def objdump(obj, file='objdump.tmp'):
834
+ import pickle
835
+ with open(file, 'wb+') as f:
836
+ pickle.dump(obj, f)
837
+ return
838
+
839
+ def objload(file='objdump.tmp'):
840
+ import pickle, os
841
+ if not os.path.exists(file):
842
+ return
843
+ with open(file, 'rb') as f:
844
+ return pickle.load(f)
845
+
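
The new download-zone helpers are designed to compose: a plugin zips its working folder with `zip_result`, registers the archive with `promote_file_to_downloadzone`, and `on_report_generated` later surfaces it through the `file_to_promote` cookie. A minimal sketch (the folder path is illustrative, and the `chatbot` handle is assumed to come from the plugin context):

```python
from toolbox import zip_result, promote_file_to_downloadzone

# Inside a plugin, after results have been written to ./gpt_log/my_task/ (hypothetical path):
archive = zip_result('./gpt_log/my_task')                 # -> ./gpt_log/<timestamp>-result.zip
promote_file_to_downloadzone(archive, chatbot=chatbot)    # records it in the 'file_to_promote' cookie
```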
version CHANGED
@@ -1,5 +1,5 @@
1
  {
2
- "version": 3.37,
3
  "show_feature": true,
4
- "new_feature": "修复gradio复制按钮BUG <-> 修复PDF翻译的BUG, 新增HTML中英双栏对照 <-> 添加了OpenAI图片生成插件 <-> 添加了OpenAI音频转文本总结插件 <-> 通过Slack添加对Claude的支持 <-> 提供复旦MOSS模型适配(启用需额外依赖) <-> 提供docker-compose方案兼容LLAMA盘古RWKV等模型的后端 <-> 新增Live2D装饰 <-> 完善对话历史的保存/载入/删除 <-> 保存对话功能"
5
  }
 
1
  {
2
+ "version": 3.42,
3
  "show_feature": true,
4
+ "new_feature": "完善本地Latex矫错和翻译功能 <-> 增加gpt-3.5-16k的支持 <-> 新增最强Arxiv论文翻译插件 <-> 修复gradio复制按钮BUG <-> 修复PDF翻译的BUG, 新增HTML中英双栏对照 <-> 添加了OpenAI图片生成插件 <-> 添加了OpenAI音频转文本总结插件 <-> 通过Slack添加对Claude的支持"
5
  }