Tonic committed on
Commit
59c3706
β€’
1 Parent(s): dd4ffcb

Your commit message

.github/workflows/update_readme.yml ADDED
@@ -0,0 +1,34 @@
1
+ name: Update README
2
+
3
+ on:
4
+ push:
5
+ paths:
6
+ - 'src/documentation/PROJECT.md'
+ - 'src/documentation/INSTALL.md'
7
+ - 'src/documentation/CODE_OF_CONDUCT.md'
8
+ - 'src/documentation/CONTRIBUTING.md'
9
+
10
+ jobs:
11
+ update-readme:
12
+ runs-on: ubuntu-latest
13
+
14
+ steps:
15
+ - name: Checkout repository
16
+ uses: actions/checkout@v2
17
+
18
+ - name: Combine markdown files
19
+ run: |
20
+ cat src/documentation/PROJECT.md > README.md
21
+ echo -e "\n\n" >> README.md
22
+ cat src/documentation/INSTALL.md >> README.md
23
+ echo -e "\n\n" >> README.md
24
+ cat src/documentation/CODE_OF_CONDUCT.md >> README.md
25
+ echo -e "\n\n" >> README.md
26
+ cat src/documentation/CONTRIBUTING.md >> README.md
27
+
28
+ - name: Commit and push if changed
29
+ run: |
30
+ git config --global user.email "[email protected]"
31
+ git config --global user.name "GitHub Action"
32
+ git add README.md
33
+ git commit -m "Update README" || exit 0
34
+ git push
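To sanity-check the "Combine markdown files" step before relying on the Action, the same commands can be run by hand (a sketch, assuming the markdown sources live under `src/documentation/`):

```bash
# reproduce the workflow's concatenation step locally
cat src/documentation/PROJECT.md > README.md
echo -e "\n\n" >> README.md
cat src/documentation/INSTALL.md >> README.md
echo -e "\n\n" >> README.md
cat src/documentation/CODE_OF_CONDUCT.md >> README.md
echo -e "\n\n" >> README.md
cat src/documentation/CONTRIBUTING.md >> README.md

# inspect the assembled file before pushing
head -n 40 README.md
```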
README.md CHANGED
@@ -1,10 +1,154 @@
1
  ---
2
- license: mit
3
- title: SciTonic
4
- sdk: gradio
5
- emoji: πŸ†
6
- colorFrom: green
7
- colorTo: indigo
8
- app_file: main.py
9
- pinned: true
10
- ---
 
1
+ # Introducing πŸ§ͺπŸ‘©πŸ»β€πŸ”¬Sci-Tonic - Your Ultimate Technical Research Assistant πŸš€
2
+
3
+ ### Welcome to the Future of Technical Research: Sci-Tonic 🌐
4
+
5
+ In an era where data is king πŸ‘‘, the ability to efficiently gather, analyze, and present information is crucial for success across various fields. Today, we are thrilled to introduce Sci-Tonic πŸ€–, a state-of-the-art technical research assistant that revolutionizes how professionals, researchers, and enthusiasts interact with data. Whether it's financial figures πŸ’Ή, scientific articles 🧬, or complex texts πŸ“š, Sci-Tonic is your go-to solution for turning data into insights.
6
+
7
+ ## Features of Sci-Tonic 🌈
8
+
9
+ ### 1. Data Retrieval: A Gateway to Information πŸšͺπŸ“Š
10
+ - **Broad Spectrum Access**: From financial reports to scientific papers, Sci-Tonic accesses a wide array of data sources.
11
+ - **Efficiency and Precision**: Quickly fetches relevant data, saving you time and effort β°πŸ’Ό.
12
+
13
+ ### 2. Advanced Analysis: Deep Insights from Cutting-Edge AI πŸ§ πŸ’‘
14
+ - **Intelligent Interpretation**: Utilizes advanced AI algorithms to analyze and interpret complex data sets.
15
+ - **Customizable Analysis**: Tailored to meet specific research needs, providing targeted insights πŸ”.
16
+
17
+ ### 3. Multimedia Output: Diverse and Dynamic Presentation πŸ“πŸŽ₯πŸ“Š
18
+ - **Versatile Formats**: Outputs range from text and infographics to video summaries.
19
+ - **Engaging and Informative**: Enhances understanding and retention of information 🌟.
20
+
21
+ ### 4. User-Friendly Interface: Accessible to All πŸ‘©β€πŸ’»πŸ‘¨β€πŸ’»
22
+ - **Intuitive Design**: Easy to navigate for both tech experts and novices.
23
+ - **Seamless Experience**: Makes research not just productive but also enjoyable πŸŽ‰.
24
+
25
+ ### 5. Adaptive Technical Operator πŸ€–
26
+ - **High Performance**: Capable of handling complex analyses with ease.
27
+ - **On-the-Fly Adaptability**: Quickly adjusts to new data and user requests πŸŒͺ️.
28
+
29
+ ## Applications of Sci-Tonic πŸ› οΈ
30
+ - **Academic Research**: Streamlines the process of gathering and analyzing scientific data πŸŽ“πŸ”¬.
31
+ - **Financial Analysis**: Provides comprehensive insights into market trends and financial reports πŸ’Ή.
32
+ - **Business Intelligence**: Assists in making data-driven decisions for business strategies πŸ“ˆ.
33
+ - **Personal Use**: Aids enthusiasts in exploring data in their fields of interest 🌍.
34
+
35
+ ## Why Choose Sci-Tonic? πŸ€”
36
+ - **Efficiency**: Saves time and effort in data collection and analysis ⏳.
37
+ - **Accuracy**: Provides reliable and precise insights πŸ”Ž.
38
+ - **Customization**: Adapts to specific user needs and preferences πŸ› οΈ.
39
+ - **Innovation**: Employs the latest AI technology for data analysis πŸš€.
40
+
41
+
42
+ ### Installation πŸ“₯
43
+ ```bash
44
+ # Clone the repository
45
+ git clone https://github.com/Tonic-AI/scitonic.git
46
+
47
+ # Navigate to the repository
48
+ cd scitonic
49
+
50
+ # Install dependencies
51
+ pip install -r requirements.txt
52
+
53
+ # Run the application
54
+ python main.py
55
+ ```
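+
+ Optionally, create a virtual environment first so the project's dependencies stay isolated (a small sketch; any environment manager works):
+
+ ```bash
+ # create and activate an isolated environment before installing requirements
+ python -m venv .venv
+ source .venv/bin/activate   # on Windows: .venv\Scripts\activate
+ ```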
56
+
57
+ ## Usage 🚦
58
+
59
+ 1. **Installation**: Before you begin, ensure you have Sci-Tonic installed. If not, refer to our installation guide. πŸ“₯
60
+
61
+ 2. **Open the Application**: Launch Sci-Tonic to start your journey into data exploration. 🌐
62
+
63
+ ## Setting Up Your Environment πŸ› οΈ
64
+
65
+ 1. **Enter OpenAI API Key**:
66
+ - Locate the `OpenAI API Key` textbox.
67
+ - Enter your API key securely. This key powers the AI models in Sci-Tonic. πŸ”‘
68
+
69
+ 2. **Enter Clarifai PAT**:
70
+ - Find the `Clarifai PAT` textbox.
71
+ - Input your Clarifai Personal Access Token. This is crucial for image and audio processing functionalities. πŸ–ΌοΈπŸŽ™οΈ
72
+
73
+ ## Describing Your Problem πŸ“
74
+
75
+ 1. **Text Input**:
76
+ - Use the `Describe your problem in detail:` textbox to type in your query or problem statement.
77
+ - Be as detailed as possible for the best results. πŸ“ƒ
78
+
79
+ 2. **Audio Input** (Optional):
80
+ - Click on `Or speak your problem here:` to record or upload an audio clip.
81
+ - Sci-Tonic will transcribe and process your spoken words. 🎀
82
+
83
+ 3. **Image Input** (Optional):
84
+ - Use `Or upload an image related to your problem:` to add an image.
85
+ - This can provide visual context to your query. πŸ–ΌοΈ
86
+
87
+ ## Submitting Your Query πŸš€
88
+
89
+ - Click the `Submit` button after entering your information and query.
90
+ - Sci-Tonic will process your inputs and start generating insights. ✨
91
+
92
+ ## Receiving Output πŸ“Š
93
+
94
+ - The `Output` textbox will display the results, insights, or answers generated by Sci-Tonic.
95
+ - **Sci-Tonic also produces files**, so check the `scitonic` folder for generated output.
96
+ - Review the output to gain valuable information related to your query. 🧐
97
+
98
+ ## Tips for Optimal Use 🌈
99
+
100
+ - **Clear Descriptions**: The more specific your query, the better the output. 🎯
101
+ - **Utilize Multimedia Inputs**: Leverage audio and image inputs for a more comprehensive analysis. πŸ“ΈπŸ”Š
102
+ - **Regular Updates**: Keep your API keys and tokens updated for uninterrupted service. πŸ”
103
+
104
+ # CONTRIBUTING GUIDE
105
+
106
+ ## Introduction
107
+ Welcome to the `scitonic` repository! This guide is designed to provide a streamlined process for contributing to our project. We value your input and are excited to collaborate with you.
108
+
109
+ ## Prerequisites
110
+ Before contributing, make sure you have a GitHub account. You should also join our Tonic-AI Discord to communicate with other contributors and the core team.
111
+
112
+ ## How to Contribute
113
+
114
+ ### Reporting Issues
115
+ - **Create an Issue**: If you find a bug or have a feature request, please create an issue to report it. Use clear and descriptive titles and provide as much information as possible.
116
+ - **Use the Issue Template**: Follow the issue template provided to ensure all relevant information is included.
117
+ - **Discuss in Discord**: For immediate feedback or discussion, bring up your issue in the `#scitonic-discussion` channel on Discord.
118
+
119
+ ### Making Changes
120
+ - **Fork the Repository**: Start by forking the repository to your own GitHub account.
121
+ - **Create a Branch**: Create a branch in your forked repository for your proposed changes. Name the branch something relevant to the changes you're making (e.g., `feature-add-login` or `bugfix-header-alignment`).
122
+ ```bash
123
+ git checkout -b your-branch-name
124
+ ```
125
+ - **Make Your Changes**: Perform the necessary changes to the codebase or documentation.
126
+ - **Commit Your Changes**: Use meaningful commit messages that describe what you've done.
127
+
128
+ ```bash
129
+ git commit -m "Your detailed commit message"
130
+ ```
131
+
132
+ - **Push to Your Fork**: Push your changes to your forked repository on GitHub.
133
+
134
+ ```bash
135
+ git push origin your-branch-name
136
+ ```
137
+
138
+ ### Submitting a Pull Request
139
+ - **Pull Request (PR)**: Go to the original `scitonic` repository and click on "Pull Request" to start the process, or open one from the command line (see the sketch after this list).
140
+ - **PR Template**: Fill in the PR template with all the necessary details, linking the issue you're addressing.
141
+ - **Code Review**: Wait for the core team or community to review your PR. Be responsive to feedback.
142
+ - **Merge**: Once your PR has been approved and passes all checks, it will be merged into the main codebase.
143
+
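+ For example, with the GitHub CLI installed, a PR can be opened straight from your branch (a sketch; adjust the title and linked issue to your change):
+
+ ```bash
+ # from your feature branch in the fork, open a PR against the upstream repository
+ gh pr create --title "Short summary of the change" --body "Closes #<issue-number>"
+ ```
+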
144
+ ## Code of Conduct
145
+ Please adhere to the Code of Conduct laid out in the `CODE_OF_CONDUCT.md` [file](src/documentation/CODE_OF_CONDUCT.md). Respectful collaboration is key to a healthy open-source environment.
146
+
147
+ ## Questions or Additional Help
148
+ If you need further assistance or have any questions, please don't hesitate to ask in our Discord community or directly in GitHub issues.
149
+
150
+ Thank you for contributing to `scitonic`!
151
+
152
  ---
153
+
154
+ 🌟 Thank you for considering Sci-Tonic as your ultimate technical research assistant. Together, let's turn data into discoveries! πŸš€πŸŒŸπŸ”πŸ§¬πŸ“ˆπŸ“ŠπŸ“šπŸ€–πŸ‘©β€πŸ”¬πŸ‘¨β€πŸ’Ό
eval/generate_bookmarks.ipynb ADDED
@@ -0,0 +1,175 @@
1
+ {
2
+ "nbformat": 4,
3
+ "nbformat_minor": 0,
4
+ "metadata": {
5
+ "colab": {
6
+ "provenance": []
7
+ },
8
+ "kernelspec": {
9
+ "name": "python3",
10
+ "display_name": "Python 3"
11
+ },
12
+ "language_info": {
13
+ "name": "python"
14
+ }
15
+ },
16
+ "cells": [
17
+ {
18
+ "cell_type": "code",
19
+ "execution_count": 19,
20
+ "metadata": {
21
+ "colab": {
22
+ "base_uri": "https://localhost:8080/"
23
+ },
24
+ "id": "9CrIbR0AK3d1",
25
+ "outputId": "8624a380-d370-43b0-969c-1b21e275b322"
26
+ },
27
+ "outputs": [
28
+ {
29
+ "output_type": "stream",
30
+ "name": "stdout",
31
+ "text": [
32
+ "Requirement already satisfied: typing_extensions in /usr/local/lib/python3.10/dist-packages (4.9.0)\n"
33
+ ]
34
+ }
35
+ ],
36
+ "source": [
37
+ "# !pip install openai sentence-transformers\n",
38
+ "# !pip install langchain\n",
39
+ "!pip install typing_extensions\n"
40
+ ]
41
+ },
42
+ {
43
+ "cell_type": "code",
44
+ "source": [
45
+ "import os\n",
46
+ "import openai\n",
47
+ "from langchain_community.document_loaders import TextLoader, PyPDFLoader, CSVLoader, DirectoryLoader\n",
48
+ "from transformers import AutoModel\n",
49
+ "from langchain_community.embeddings.sentence_transformer import (\n",
50
+ " SentenceTransformerEmbeddings,\n",
51
+ ")\n",
52
+ "from langchain_community.vectorstores import Chroma\n",
53
+ "import torch\n",
54
+ "import json"
55
+ ],
56
+ "metadata": {
57
+ "id": "xOFM83MoLQ-B"
58
+ },
59
+ "execution_count": 20,
60
+ "outputs": []
61
+ },
62
+ {
63
+ "cell_type": "code",
64
+ "source": [
65
+ "from google.colab import drive\n",
66
+ "drive.mount('new_articles')"
67
+ ],
68
+ "metadata": {
69
+ "colab": {
70
+ "base_uri": "https://localhost:8080/"
71
+ },
72
+ "id": "WMvNDl83M7Xb",
73
+ "outputId": "d59ab804-42ce-4b10-fee6-f01f19d60b38"
74
+ },
75
+ "execution_count": 53,
76
+ "outputs": [
77
+ {
78
+ "output_type": "stream",
79
+ "name": "stdout",
80
+ "text": [
81
+ "Drive already mounted at new_articles; to attempt to forcibly remount, call drive.mount(\"new_articles\", force_remount=True).\n"
82
+ ]
83
+ }
84
+ ]
85
+ },
86
+ {
87
+ "cell_type": "code",
88
+ "source": [
89
+ "def document_loader(directory):\n",
90
+ " documents = {}\n",
91
+ " for filename in os.listdir(directory):\n",
92
+ " file_path = os.path.join(directory, filename)\n",
93
+ " if filename.endswith(\".csv\"):\n",
94
+ " loader = CSVLoader(file_path)\n",
95
+ " elif filename.endswith(\".pdf\"):\n",
96
+ " loader = PyPDFLoader(file_path)\n",
97
+ " elif filename.endswith(\".txt\"):\n",
98
+ " loader = TextLoader(file_path)\n",
99
+ " else:\n",
100
+ " break\n",
101
+ "\n",
102
+ " document = loader.load()\n",
103
+ " documents[filename] = document\n",
104
+ " return (documents)\n"
105
+ ],
106
+ "metadata": {
107
+ "id": "QxVY8IyNL3Zp"
108
+ },
109
+ "execution_count": 54,
110
+ "outputs": []
111
+ },
112
+ {
113
+ "cell_type": "code",
114
+ "source": [
115
+ "openai.api_key = \"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\"\n",
116
+ "JSON_DATA = []\n",
117
+ "directory = \"/content/new_articles/MyDrive/new_articles\"\n",
118
+ "documents = document_loader(directory)\n",
119
+ "for filename, document in documents.items():\n",
120
+ " doc = document[0].page_content\n",
121
+ " # print(filename)\n",
122
+ " # print(document)\n",
123
+ " response = openai.chat.completions.create(\n",
124
+ " model=\"gpt-3.5-turbo\",\n",
125
+ " messages = [\n",
126
+ " {\"role\": \"system\", \"content\": f\"Generate one Question, Answer,Reference_Article:(use {filename}), Reference_Text from(use block of text which you've used to generate answer {doc})\"},\n",
127
+ " ], temperature = 0.3\n",
128
+ " )\n",
129
+ " #print(response)\n",
130
+ " result = response.choices[0].message.content.split(\"\\n\")\n",
131
+ " # print(result)\n",
132
+ " json_data = {\n",
133
+ " \"Question\": result[0].split(\"Question: \")[1].strip() if len(result) > 0 and \"Question:\" in result[0] else \"Not provided\",\n",
134
+ " \"Answer\": result[2].split(\"Answer: \")[1].strip() if len(result) > 2 and \"Answer:\" in result[2] else \"Not provided\",\n",
135
+ " \"Reference_article\": result[4].split(\"Reference_article: \")[1].strip() if len(result) > 4 and \"Reference_article:\" in result[4] else \"Not provided\",\n",
136
+ " \"Reference_text\": result[6].split(\"Reference_text: \")[1].strip() if len(result) > 6 and \"Reference_text:\" in result[6] else \"Not provided\",\n",
137
+ " }\n",
138
+ "\n",
139
+ " # print(json_data)\n",
140
+ "\n",
141
+ " JSON_DATA.append(json_data)\n",
142
+ "\n",
143
+ "with open('question_and_answer_list.json', 'w') as json_file:\n",
144
+ " json.dump(JSON_DATA, json_file, indent=2)\n",
145
+ "\n",
146
+ "print(\"JSON data saved to question_and_answer_list.json\")\n",
147
+ "\n",
148
+ "print(JSON_DATA)\n"
149
+ ],
150
+ "metadata": {
151
+ "id": "LO9imR5SMA1u"
152
+ },
153
+ "execution_count": null,
154
+ "outputs": []
155
+ },
156
+ {
157
+ "cell_type": "code",
158
+ "source": [],
159
+ "metadata": {
160
+ "id": "eOAr3cy6iA9J"
161
+ },
162
+ "execution_count": 46,
163
+ "outputs": []
164
+ },
165
+ {
166
+ "cell_type": "code",
167
+ "source": [],
168
+ "metadata": {
169
+ "id": "E86P5xBqizsG"
170
+ },
171
+ "execution_count": null,
172
+ "outputs": []
173
+ }
174
+ ]
175
+ }
eval/question_and_answer_list.json CHANGED
@@ -1,63 +1,68 @@
1
  [
2
  {
3
- "question": "What does Pando plan to use the $30 million raised in its recent Series B round for?",
4
- "answer": "Pando intends to use the funds for expanding its global sales, marketing, and delivery capabilities.",
5
- "reference_article": "Pando Raises $30M in Series B Funding for Fulfillment Management Technologies",
6
- "reference_text": "Signaling that investments in the supply chain sector remain robust, Pando, a startup developing fulfillment management technologies, today announced that it raised $30 million in a Series B round, bringing its total raised to $45 million. Iron Pillar and Uncorrelated Ventures led the round, with participation from existing investors Nexus Venture Partners, Chiratae Ventures and Next47. CEO and founder Nitin Jayakrishnan says that the new capital will be put toward expanding Pando’s global sales, marketing and delivery capabilities."
7
-
8
- },
9
- {
10
- "question": "What is ChatGPT, and how has it been used in various applications?",
11
- "answer": "ChatGPT is a text-generating AI chatbot developed by OpenAI. It has been widely used for writing essays, code, and more based on short text prompts, enhancing productivity. Major brands have experimented with it for generating ad and marketing copy. OpenAI continually invests in ChatGPT, upgrading it to GPT-4, a more advanced language-writing model. The chatbot has been integrated into various applications, including search engines, customer service, and even an iPhone customization app called SuperChat.",
12
- "reference_article": "ChatGPT: Everything you need to know about the AI-powered chatbot",
13
- "reference_text": "ChatGPT, OpenAI’s text-generating AI chatbot, has taken the world by storm. It’s able to write essays, code and more given short text prompts, hyper-charging productivity. But it also has a more…nefarious side... (full article)"
14
- },
15
- {
16
- "question": "What is Checks, and how is it transitioning within Google?",
17
- "answer": "Checks is an AI-powered tool developed at Google's in-house incubator Area 120 to check mobile apps for compliance with privacy rules and regulations. Originally part of Area 120, Checks is now officially moving into Google as a privacy product for mobile developers. Co-founders Fergus Hurley and Nia Castelly will hold the titles of GM and Legal Lead, respectively, for Checks under Google. The tool utilizes artificial intelligence and machine learning to scan apps and their code, identifying potential privacy and data protection rule violations. It provides suggestions for remediation, making it easier for developers to ensure compliance.",
18
- "reference_article": "Google integrates AI tool Checks into its privacy-focused products",
19
- "reference_text": "After Google cut all but three of the projects at its in-house incubator Area 120 and shifted it to work on AI projects across Google, one of the legacy efforts β€” coincidentally also an AI project β€” is now officially exiting to Google. Checks, an AI-powered tool to check mobile apps for compliance with various privacy rules and regulations, is moving into Google proper as a privacy product aimed at mobile developers..."
20
- },
21
- {
22
- "question": "What acquisition has Databricks recently announced, and what is the focus of the acquired company's technology?",
23
- "answer": "Databricks has recently acquired Okera, a data governance platform with a focus on AI. Okera's technology uses an AI-powered system to automatically discover, classify, and apply rules to personally identifiable information, with a particular emphasis on metadata. Additionally, Okera's isolation technology enforces governance control on arbitrary workloads without significant overhead. Databricks plans to integrate Okera's technology into its Unity Catalog, enhancing its existing governance solution for data and AI assets.",
24
- "reference_article": "Databricks acquires AI-focused data governance platform Okera",
25
- "reference_text": "Databricks today announced that it has acquired Okera, a data governance platform with a focus on AI. The two companies did not disclose the purchase price. According to Crunchbase, Okera previously raised just under $30 million. Investors include Felicis, Bessemer Venture Partners, Cyber Mentor Fund, ClearSky and Emergent Ventures."
26
- },
27
- {
28
- "question": "What is the latest evolution in Slack's platform, particularly concerning AI, as announced at the Salesforce World Tour event?",
29
- "answer": "Slack has advanced from a pure communications platform to one facilitating direct integration with enterprise applications. At the Salesforce World Tour event in NYC, the company unveiled plans to place AI at the forefront of the user experience, aiming to enhance information retrieval and workflow creation. Notably, these features are still in development. The incorporation of AI into Slack involves various integrations, including SlackGPT, the company's generative AI built on the Slack platform. SlackGPT leverages the wealth of institutional knowledge within Slack's messages, files, and shared content to enable users and developers to build AI-driven experiences. The goal is to bring AI natively into the user experience with features like AI-powered conversation summaries and writing assistance directly available in Slack. Additionally, developers can integrate AI into workflows, tapping into external apps and large language models. EinsteinGPT, Salesforce's generative AI, will also be integrated into Slack, allowing employees to ask questions directly related to Salesforce content, enhancing teams' understanding of customer data. While these capabilities are still in development, Slack aims to provide users with flexibility and choice in incorporating AI into their work. SlackGPT and EinsteinGPT integration are in the development phase, but developers can already build custom integrations with various large language models (LLMs). Workflow Builder with SlackGPT AI connectors will be available this summer, allowing customers to connect ChatGPT or Claude to workflows or build custom connectors for their own LLMs.",
30
- "reference_article": "Slack integrates AI into its platform, unveiling plans for AI-driven experiences",
31
- "reference_text": "Slack has evolved from a pure communications platform to one that enables companies to link directly to enterprise applications without having to resort to dreaded task switching. Today, at the Salesforce World Tour event in NYC, the company announced the next step in its platform’s evolution where it will be putting AI at the forefront of the user experience, making it easier to get information and build workflows...\n"
32
- },
33
- {
34
- "question": "What are the two new products announced by Nova, the startup building generative AI tools to protect brand integrity?",
35
- "answer": "Nova has announced two new products: BrandGuard and BrandGPT. BrandGuard ingests a company's brand guidelines and style guide, using a series of models to check content against those rules for compliance, quality, adherence to style, and alignment with campaign goals. BrandGPT serves as an interface for asking questions about a brand's content rules in a ChatGPT-style interaction. These tools are designed to help brands safeguard their brand integrity when incorporating generative AI into their creative workflows.",
36
- "reference_article": "Nova introduces BrandGuard and BrandGPT to protect brand integrity in AI-generated content",
37
- "reference_text": "Nova is an early-stage startup building a suite of generative AI tools designed to protect brand integrity, and today, the company is announcing two new products to help brands police AI-generated content: BrandGuard and BrandGPT."
38
- },
39
- {
40
- "question": "What is the startup Spawning AI doing to address the legal issues between artists and companies training AI on their artwork?",
41
- "answer": "Spawning AI, co-founded by Jordan Meyer and Mathew Dryhurst, has created HaveIBeenTrained, a website that allows creators to opt out of the training dataset for one art-generating AI model called Stable Diffusion v3. Spawning raised $3 million in a seed round led by True Ventures to further develop IP standards for the AI era, establish more robust opt-out and opt-in standards, and build the consent layer for AI. The company aims to make it easier for AI model trainers to honor opt-out requests, offer more services to organizations protecting artists' work, and grow to address different domains in the AI economy.",
42
- "reference_article": "Spawning raises $3M to help artists opt out of AI training data",
43
- "reference_text": "In an effort to grant artists more control over how β€” and where β€” their art’s used, Jordan Meyer and Mathew Dryhurst co-founded the startup Spawning AI. Spawning created HaveIBeenTrained, a website that allows creators to opt out of the training dataset for one art-generating AI model, Stable Diffusion v3, due to be released in the coming months."
44
- },
45
- {
46
- "question": "What is the U.K.'s Competition and Markets Authority (CMA) reviewing regarding AI?",
47
- "answer": "The CMA is conducting an initial review of 'AI foundational models,' which include large language models (LLMs) like OpenAI's ChatGPT and Microsoft's New Bing. The review aims to explore competition and consumer protection considerations in the development and use of AI foundational models. The CMA will examine how competitive markets for these models could evolve, explore opportunities and risks for competition and consumer protection, and produce guiding principles to support competition and protect consumers as AI foundation models develop. The review is in line with the U.K. government's instructions to regulators to analyze potential enforcements related to dangerous, unfair, and unaccountable applications of AI.",
48
- "reference_article": "U.K. competition watchdog launches review of AI foundational models",
49
- "reference_text": "The U.K.’s competition watchdog has announced an initial review of β€œAI foundational models”, such as the large language models (LLMs) which underpin OpenAI’s ChatGPT and Microsoft’s New Bing. Generative AI models which power AI art platforms such as OpenAI’s DALL-E or Midjourney will also likely fall in scope."
50
- },
51
- {
52
- "question": "What is StarCoder and who developed it?",
53
- "answer": "StarCoder is a free alternative to code-generating AI systems, similar to GitHub's Copilot, developed by AI startup Hugging Face and ServiceNow Research, ServiceNow’s R&D division. It is part of Hugging Face's and ServiceNow’s BigCode project, which involves over 600 contributors. StarCoder is licensed for royalty-free use and was trained on over 80 programming languages using text from GitHub repositories. It integrates with Microsoft's Visual Studio Code code editor and claims to match or outperform the AI model from OpenAI used in the initial versions of Copilot.",
54
- "reference_article": "Hugging Face and ServiceNow Research release StarCoder, a free alternative to GitHub Copilot",
55
- "reference_text": "AI startup Hugging Face and ServiceNow Research, ServiceNow’s R&D division, have released StarCoder, a free alternative to code-generating AI systems along the lines of GitHub’s Copilot."
56
- },
57
- {
58
- "question": "What are the new features coming to Bing, and how does Microsoft plan to enhance its search experience?",
59
- "answer": "Microsoft is introducing new features to enhance Bing's search experience, focusing on AI and visual elements. Bing Chat, powered by OpenAI's GPT-4 and DALL-E 2 models, will offer more image- and graphic-centric answers. The chatbot will become more visual and personalized, allowing users to export their chat histories and integrate content from third-party plugins. Bing Chat will answer questions within the context of images. Bing will also improve transparency by providing citations for fact-based responses. The Bing Image Creator tool will understand more languages, and Bing Chat will gain the ability to create charts and graphs. Microsoft aims to make Bing more multimodal, allowing users to upload images for related searches. New chat features include chat history storage, export and share functionalities, and the addition of plugins from partners like OpenTable and Wolfram Alpha. Edge, Microsoft's browser, will also receive updates, featuring rounded corners, improved design elements, and actions that translate Bing Chat prompts into automations within the browser.",
60
- "reference_article": "Microsoft doubles down on AI with new Bing features",
61
- "reference_text": "Microsoft is embarking on the next phase of Bing’s expansion. And β€” no surprise β€” it heavily revolves around AI."
62
- }
63
- ]
 
 
 
 
 
 
1
  [
2
  {
3
+ "Question": "What are some of the new integrations and features that Slack announced to incorporate AI into its platform?",
4
+ "Answer": "Slack announced several new integrations and features to incorporate AI into its platform. These include SlackGPT, a generative AI built on top of the Slack platform that users and developers can tap into to build AI-driven experiences. Slack is also bringing AI natively into the user experience with features like AI-powered conversation summaries and writing assistance. Additionally, Slack will incorporate EinsteinGPT, Salesforce's generative AI, to provide insights from real-time customer data in Salesforce directly into Slack. These integrations are still in development, but developers can currently build custom integrations with a variety of large language models (LLMs).",
5
+ "Reference_article": "Not provided",
6
+ "Reference_text": "Not provided"
7
+ },
8
+ {
9
+ "Question": "What is Checks and how does it help mobile developers with privacy compliance?",
10
+ "Answer": "Checks is an AI-powered tool developed by Google to help mobile developers ensure compliance with privacy rules and regulations. It uses artificial intelligence and machine learning to scan apps and their code, identifying potential violations of privacy and data protection rules. Checks provides remediation suggestions on how to fix these issues, making it easier for developers to address privacy concerns. The tool is integrated with Google's language models and app understanding technologies to power its identification and suggestion capabilities. It offers a dashboard for monitoring and triaging compliance issues in areas such as compliance monitoring, data monitoring, and store disclosure support. While it currently focuses on Google Play data safety, it may expand to include Apple App Store data safety in the future.",
11
+ "Reference_article": "Not provided",
12
+ "Reference_text": "Not provided"
13
+ },
14
+ {
15
+ "Question": "What are the two new products that Nova has announced to help brands police AI-generated content?",
16
+ "Answer": "The two new products that Nova has announced to help brands police AI-generated content are BrandGuard and BrandGPT.",
17
+ "Reference_article": "Not provided",
18
+ "Reference_text": "Not provided"
19
+ },
20
+ {
21
+ "Question": "What is Pando's approach to solving supply chain challenges and how does it differentiate itself from other vendors?",
22
+ "Answer": "Pando aims to solve supply chain challenges by consolidating supply chain data from various sources and providing tools and apps for different tasks across freight procurement, trade and transport management, and more. It differentiates itself from other vendors like SAP and Oracle by offering no-code capabilities, allowing business users to customize the apps without the need for IT resources. Pando also utilizes algorithms and machine learning to make predictions and detect anomalies in the supply chain.",
23
+ "Reference_article": "Not provided",
24
+ "Reference_text": "Not provided"
25
+ },
26
+ {
27
+ "Question": "What are some controversies surrounding ChatGPT?",
28
+ "Answer": "ChatGPT has been involved in several controversies. Discord integrated OpenAI's technology into its bot named Clyde, which was tricked into providing instructions for making illegal drugs and incendiary mixtures. There have been cases of ChatGPT accusing individuals of false crimes, and it has been banned by some school systems and colleges for promoting plagiarism and misinformation. Additionally, there have been concerns about defamation and the use of AI-generated content for SEO farming.",
29
+ "Reference_article": "Not provided",
30
+ "Reference_text": "Not provided"
31
+ },
32
+ {
33
+ "Question": "What are the concerns of writers regarding the use of AI in the entertainment industry?",
34
+ "Answer": "Writers are concerned that the use of AI in the entertainment industry could undermine their working conditions and devalue their labor. They argue that AI-generated content should not be considered as writers' work and that their job involves more than just scriptwriting. They also worry that studios may use AI as a way to demand more from writers in a shorter period of time without adequately compensating them. Additionally, the legal status of AI-generated content remains unclear, which further complicates the issue.",
35
+ "Reference_article": "Not provided",
36
+ "Reference_text": "Not provided"
37
+ },
38
+ {
39
+ "Question": "What new features is Microsoft adding to Bing?",
40
+ "Answer": "Not provided",
41
+ "Reference_article": "Not provided",
42
+ "Reference_text": "Not provided"
43
+ },
44
+ {
45
+ "Question": "What is the U.K.'s competition watchdog reviewing in relation to AI?",
46
+ "Answer": "Not provided",
47
+ "Reference_article": "Not provided",
48
+ "Reference_text": "Not provided"
49
+ },
50
+ {
51
+ "Question": "What is Spawning AI's solution to give artists more control over how their art is used in generative AI models?",
52
+ "Answer": "Not provided",
53
+ "Reference_article": "Not provided",
54
+ "Reference_text": "Not provided"
55
+ },
56
+ {
57
+ "Question": "What is StarCoder and how does it compare to other code-generating AI systems?",
58
+ "Answer": "Not provided",
59
+ "Reference_article": "Not provided",
60
+ "Reference_text": "Not provided"
61
+ },
62
+ {
63
+ "Question": "What is the focus of Okera's data governance platform and how does it use AI technology?",
64
+ "Answer": "Okera's data governance platform focuses on AI and uses AI-powered systems to automatically discover and classify personally identifiable information, tag it, and apply rules to it. It utilizes a no-code interface and emphasizes metadata.",
65
+ "Reference_article": "Not provided",
66
+ "Reference_text": "Not provided"
67
+ }
68
+ ]
main.py CHANGED
@@ -1,6 +1,7 @@
1
  import os
2
  import gradio as gr
3
  import autogen
 
4
  from src.mapper.e5map import E5Mapper
5
  from src.mapper.scimap import scimap
6
  from src.mapper.parser import MapperParser
@@ -12,7 +13,7 @@ this is a highly adaptive technical operator that will listen to your query and
12
  """
13
 
14
  def update_config_file(api_key):
15
- config_path = "./config/OAI_CONFIG_LIST.json"
16
  with open(config_path, "r") as file:
17
  config = json.load(file)
18
 
@@ -67,7 +68,7 @@ def process_query(oai_key, query, max_auto_reply):
67
  update_config_file(oai_key)
68
  os.environ['OAI_KEY'] = oai_key
69
  llm_config = autogen.config_list_from_json(
70
- env_or_file="./config/OAI_CONFIG_LIST.json",
71
  filter_dict={"model": {"gpt-4", "gpt-3.5-turbo-16k", "gpt-4-1106-preview"}}
72
  )
73
 
@@ -118,10 +119,10 @@ def main():
118
  txt_pat = gr.Textbox(label="Clarifai PAT", type="password", placeholder="Enter Clarifai PAT here")
119
  txt_query = gr.Textbox(label="Describe your problem in detail:")
120
  txt_max_auto_reply = gr.Number(label="Max Auto Replies", value=50)
121
- audio_input = gr.Audio(label="Or speak your problem here:", type="numpy")
122
- image_input = gr.Image(label="Or upload an image related to your problem:", type="numpy")
123
  btn_submit = gr.Button("Submit")
124
- output = gr.Textbox(label="Output")
125
 
126
  def process_and_submit(oai_key, pat, query, max_auto_reply, audio, image):
127
  os.environ['CLARIFAI_PAT'] = pat
 
1
  import os
2
  import gradio as gr
3
  import autogen
4
+ import json
5
  from src.mapper.e5map import E5Mapper
6
  from src.mapper.scimap import scimap
7
  from src.mapper.parser import MapperParser
 
13
  """
14
 
15
  def update_config_file(api_key):
16
+ config_path = "./src/config/OAI_CONFIG_LIST.json"
17
  with open(config_path, "r") as file:
18
  config = json.load(file)
19
 
 
68
  update_config_file(oai_key)
69
  os.environ['OAI_KEY'] = oai_key
70
  llm_config = autogen.config_list_from_json(
71
+ env_or_file="./src/config/OAI_CONFIG_LIST.json",
72
  filter_dict={"model": {"gpt-4", "gpt-3.5-turbo-16k", "gpt-4-1106-preview"}}
73
  )
74
 
 
119
  txt_pat = gr.Textbox(label="Clarifai PAT", type="password", placeholder="Enter Clarifai PAT here")
120
  txt_query = gr.Textbox(label="Describe your problem in detail:")
121
  txt_max_auto_reply = gr.Number(label="Max Auto Replies", value=50)
122
+ audio_input = gr.Audio(label="Or speak your problem here:", type="numpy",)
123
+ image_input = gr.Image(label="Or upload an image related to your problem:", type="numpy", )
124
  btn_submit = gr.Button("Submit")
125
+ output = gr.Textbox(label="Output",)
126
 
127
  def process_and_submit(oai_key, pat, query, max_auto_reply, audio, image):
128
  os.environ['CLARIFAI_PAT'] = pat
requirements.txt CHANGED
@@ -1,7 +1,7 @@
1
  streamlit
2
  gradio
3
  datasets
4
- pyautogen
5
  chromadb
6
  semantic-kernel
7
  llama-index
@@ -9,5 +9,6 @@ llama-hub
9
  langchain
10
  huggingface_hub
11
  openai
12
- pypdf
13
- ipython
 
 
1
  streamlit
2
  gradio
3
  datasets
4
+ autogen
5
  chromadb
6
  semantic-kernel
7
  llama-index
 
9
  langchain
10
  huggingface_hub
11
  openai
12
+ Ipython
13
+ pyautogen
14
+ pypdf
src/agentics/agents.py CHANGED
@@ -20,13 +20,15 @@ llm_config = {
20
  "temperature": 0,
21
  }
22
 
 
 
 
23
  class AgentsFactory:
24
  def __init__(self, llm_config, db_path):
25
  self.llm_config = llm_config
26
  self.db_path = db_path
27
 
28
- def termination_msg(self, x):
29
- return isinstance(x, dict) and "TERMINATE" == str(x.get("content", ""))[-9:].upper()
30
 
31
  def tonic(self) :
32
  return autogen.UserProxyAgent(
 
20
  "temperature": 0,
21
  }
22
 
23
+ def termination_msg(self, x):
24
+ return isinstance(x, dict) and "TERMINATE" == str(x.get("content", ""))[-9:].upper()
25
+
26
  class AgentsFactory:
27
  def __init__(self, llm_config, db_path):
28
  self.llm_config = llm_config
29
  self.db_path = db_path
30
 
31
+
 
32
 
33
  def tonic(self) :
34
  return autogen.UserProxyAgent(
src/config/OAI_CONFIG_LIST.json CHANGED
@@ -1,25 +1,25 @@
1
  [
2
  {
3
  "model": "gpt-3.5-turbo-preview",
4
- "api_key": "your OpenAI Key goes here",
5
  "base_url": "https://api.openai.com/v1",
6
  "api_version": "2023-06-01-preview"
7
  },
8
  {
9
  "model": "gpt-4-preview",
10
- "api_key": "your OpenAI Key goes here",
11
  "base_url": "https://api.openai.com/v1",
12
  "api_version": "2023-06-01-preview"
13
  },
14
  {
15
  "model": "gpt-4-vision-preview",
16
- "api_key": "your OpenAI Key goes here",
17
  "base_url": "https://api.openai.com/v1",
18
  "api_version": "2023-06-01-preview"
19
  },
20
  {
21
  "model": "dall-e-3",
22
- "api_key": "your OpenAI Key goes here",
23
  "base_url": "https://api.openai.com/v1",
24
  "api_version": "2023-06-01-preview"
25
  }
 
1
  [
2
  {
3
  "model": "gpt-3.5-turbo-preview",
4
+ "api_key": "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",
5
  "base_url": "https://api.openai.com/v1",
6
  "api_version": "2023-06-01-preview"
7
  },
8
  {
9
  "model": "gpt-4-preview",
10
+ "api_key": "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",
11
  "base_url": "https://api.openai.com/v1",
12
  "api_version": "2023-06-01-preview"
13
  },
14
  {
15
  "model": "gpt-4-vision-preview",
16
+ "api_key": "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",
17
  "base_url": "https://api.openai.com/v1",
18
  "api_version": "2023-06-01-preview"
19
  },
20
  {
21
  "model": "dall-e-3",
22
+ "api_key": "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",
23
  "base_url": "https://api.openai.com/v1",
24
  "api_version": "2023-06-01-preview"
25
  }
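One way to keep real keys out of the tracked config is to commit only a placeholder template and git-ignore a keyed local copy (a sketch; the `.local.json` name is just an example, and `main.py` would then need to point `config_path` at it):

```bash
# keep the tracked OAI_CONFIG_LIST.json as a placeholder template
cp src/config/OAI_CONFIG_LIST.json src/config/OAI_CONFIG_LIST.local.json
echo "src/config/OAI_CONFIG_LIST.local.json" >> .gitignore

# put the actual API key only in the untracked .local.json copy
```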
src/datatonic/dataloader.py CHANGED
@@ -89,7 +89,10 @@ class DataLoader:
89
  if dataset_name in self.datasets:
90
  return self.datasets[dataset_name]()
91
  else:
92
- raise ValueError(f"Dataset {dataset_name} not supported.")
 
 
 
93
 
94
  def save_to_json(self, data, file_name):
95
  with open(file_name, 'w') as f:
 
89
  if dataset_name in self.datasets:
90
  return self.datasets[dataset_name]()
91
  else:
92
+ # Log or return an error message and default to "gpl-arguana"
93
+ error_message = f"Dataset '{dataset_name}' not supported. Defaulting to 'gpl-arguana'."
94
+ print(error_message) # or handle this message as needed
95
+ return self.load_gpl_arguana() # Default to the 'gpl-arguana' dataset
96
 
97
  def save_to_json(self, data, file_name):
98
  with open(file_name, 'w') as f:
src/documentation/PROJECT.md CHANGED
@@ -1,13 +1,39 @@
1
- # 🌟 Sci-Tonic: Your Ultimate Technical Research Assistant πŸš€
2
 
3
- Welcome to **Sci-Tonic** πŸŽ‰, the groundbreaking technical research assistant designed for professionals, researchers, and enthusiasts alike! If you're looking to dive deep into the world of data, ranging from financial figures πŸ“ˆ to scientific articles 🧬, and transform them into insightful, long-form multimedia outputs πŸ“ŠπŸ“š, you've just found your new best friend! πŸ€–πŸ‘©β€πŸ”¬πŸ‘¨β€πŸ’Ό
4
 
5
- ## Features 🌈
6
 
7
- Sci-Tonic is packed with amazing features:
8
 
9
- - **Data Retrieval**: Effortlessly fetch data from a vast array of sources. Financial reports, scientific papers, complex texts - you name it, Sci-Tonic retrieves it! πŸŒπŸ”
10
- - **Advanced Analysis**: Using cutting-edge AI, Sci-Tonic analyzes and interprets your data, providing you with deep insights. πŸ§ πŸ’‘
11
- - **Multimedia Output**: Get your results the way you want them. Text, infographics, video summaries - Sci-Tonic does it all! πŸ“πŸŽ₯πŸ“Š
12
- - **User-Friendly Interface**: Whether you're a tech guru or a newbie, our intuitive interface makes your research journey smooth and enjoyable. πŸ–₯️😊
13
- - **Collaboration Tools**: Teamwork makes the dream work! Collaborate seamlessly with colleagues or classmates. πŸ‘₯🀝
 
1
+ # Introducing πŸ§ͺπŸ‘©πŸ»β€πŸ”¬Sci-Tonic - Your Ultimate Technical Research Assistant πŸš€
2
 
3
+ ### Welcome to the Future of Technical Research: Sci-Tonic 🌐
4
 
5
+ In an era where data is king πŸ‘‘, the ability to efficiently gather, analyze, and present information is crucial for success across various fields. Today, we are thrilled to introduce Sci-Tonic πŸ€–, a state-of-the-art technical research assistant that revolutionizes how professionals, researchers, and enthusiasts interact with data. Whether it's financial figures πŸ’Ή, scientific articles 🧬, or complex texts πŸ“š, Sci-Tonic is your go-to solution for turning data into insights.
6
 
7
+ ## Features of Sci-Tonic 🌈
8
 
9
+ ### 1. Data Retrieval: A Gateway to Information πŸšͺπŸ“Š
10
+ - **Broad Spectrum Access**: From financial reports to scientific papers, Sci-Tonic accesses a wide array of data sources.
11
+ - **Efficiency and Precision**: Quickly fetches relevant data, saving you time and effort β°πŸ’Ό.
12
+
13
+ ### 2. Advanced Analysis: Deep Insights from Cutting-Edge AI πŸ§ πŸ’‘
14
+ - **Intelligent Interpretation**: Utilizes advanced AI algorithms to analyze and interpret complex data sets.
15
+ - **Customizable Analysis**: Tailored to meet specific research needs, providing targeted insights πŸ”.
16
+
17
+ ### 3. Multimedia Output: Diverse and Dynamic Presentation πŸ“πŸŽ₯πŸ“Š
18
+ - **Versatile Formats**: Outputs range from text and infographics to video summaries.
19
+ - **Engaging and Informative**: Enhances understanding and retention of information 🌟.
20
+
21
+ ### 4. User-Friendly Interface: Accessible to All πŸ‘©β€πŸ’»πŸ‘¨β€πŸ’»
22
+ - **Intuitive Design**: Easy to navigate for both tech experts and novices.
23
+ - **Seamless Experience**: Makes research not just productive but also enjoyable πŸŽ‰.
24
+
25
+ ### 5. Adaptive Technical Operator πŸ€–
26
+ - **High Performance**: Capable of handling complex analyses with ease.
27
+ - **On-the-Fly Adaptability**: Quickly adjusts to new data and user requests πŸŒͺ️.
28
+
29
+ ## Applications of Sci-Tonic πŸ› οΈ
30
+ - **Academic Research**: Streamlines the process of gathering and analyzing scientific data πŸŽ“πŸ”¬.
31
+ - **Financial Analysis**: Provides comprehensive insights into market trends and financial reports πŸ’Ή.
32
+ - **Business Intelligence**: Assists in making data-driven decisions for business strategies πŸ“ˆ.
33
+ - **Personal Use**: Aids enthusiasts in exploring data in their fields of interest 🌍.
34
+
35
+ ## Why Choose Sci-Tonic? πŸ€”
36
+ - **Efficiency**: Saves time and effort in data collection and analysis ⏳.
37
+ - **Accuracy**: Provides reliable and precise insights πŸ”Ž.
38
+ - **Customization**: Adapts to specific user needs and preferences πŸ› οΈ.
39
+ - **Innovation**: Employs the latest AI technology for data analysis πŸš€.