QA

#39
by kareem22 - opened

Hello all,

How can I use Dolly 2 for a question answering task?

Databricks org

See the langchain examples in the repo, and see https://python.langchain.com/en/latest/use_cases/question_answering.html for an example of applying langchain for QA. It can be used with Dolly.

What about this error?

ValidationError: 1 validation error for OpenAIEmbeddings
root
Did not find openai_api_key, please add an environment variable OPENAI_API_KEY which contains it, or pass openai_api_key as a named parameter. (type=value_error)

Databricks org

The error pretty much tells you exactly what's wrong :)
However, you are asking about OpenAI, not anything to do with this model.

Hello srowen,

I mean, what is the relation? Why do I need an OPENAI_API_KEY?

Do I need an OPENAI_API_KEY when I'm using Dolly 2?

Databricks org

No. It sounds like you are writing code that uses langchain's OpenAI integration. You want to use its Hugging Face integration to use a model on Hugging Face, like Dolly.
See https://github.com/databrickslabs/dolly/blob/master/examples/langchain.py

srowen changed discussion status to closed

Hello srowen,

I tried the example you sent me and got this error: ValueError: The following model_kwargs are not used by the model: ['return_full_text'] (note: typos in the generate arguments will also show up in this list)

When I removed return_full_text, I got this instead: TypeError: string indices must be integers

Databricks org

Which example? If you load a different pipeline, you may need model_kwargs={'return_full_text':True} instead. But I'm not sure what you're running. You must set return_full_text when working with langchain.

import torch
from transformers import pipeline

generate_text = pipeline(model="databricks/dolly-v2-12b", torch_dtype=torch.bfloat16,
                         trust_remote_code=True, device_map="auto", return_full_text=True)

from langchain import PromptTemplate, LLMChain
from langchain.llms import HuggingFacePipeline

# template for an instruction with no input
prompt = PromptTemplate(
    input_variables=["instruction"],
    template="{instruction}")

# template for an instruction with input
prompt_with_context = PromptTemplate(
    input_variables=["instruction", "context"],
    template="{instruction}\n\nInput:\n{context}")

hf_pipeline = HuggingFacePipeline(pipeline=generate_text)

llm_chain = LLMChain(llm=hf_pipeline, prompt=prompt)
llm_context_chain = LLMChain(llm=hf_pipeline, prompt=prompt_with_context)

context = """George Washington (February 22, 1732[b] – December 14, 1799) was an American military officer, statesman,
and Founding Father who served as the first president of the United States from 1789 to 1797."""

print(llm_context_chain.predict(instruction="When was George Washington president?", context=context).lstrip())
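To see what text the model actually receives, the two templates above can be reproduced with plain string formatting. This is only a sketch: `str.format` stands in for langchain's `PromptTemplate`, and `render_prompt` is a hypothetical helper name, not part of langchain or the Dolly repo.

```python
# Plain-Python stand-in for the two PromptTemplate objects above.
# render_prompt is a hypothetical helper, not a langchain API.
TEMPLATE = "{instruction}"
TEMPLATE_WITH_CONTEXT = "{instruction}\n\nInput:\n{context}"


def render_prompt(instruction, context=None):
    """Return the prompt text that would be passed to the pipeline."""
    if context is None:
        return TEMPLATE.format(instruction=instruction)
    return TEMPLATE_WITH_CONTEXT.format(instruction=instruction, context=context)


question = "When was George Washington president?"
passage = ("George Washington served as the first president "
           "of the United States from 1789 to 1797.")
prompt_text = render_prompt(question, passage)
print(prompt_text)
```

Printing the rendered prompt before calling the chain is a cheap way to confirm that the context is really being passed in.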

This example is from the model card.

Databricks org

That works as-is for me. Make sure you have the latest code.

It finally works, but it takes a lot of time. Any suggestions?

Databricks org

You're probably not running fully on a GPU. If you're not using an A100, see https://github.com/databrickslabs/dolly#training-on-other-instances
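A quick way to check the "not running fully on a GPU" theory is to ask PyTorch what hardware it can see. The sketch below assumes PyTorch may or may not be installed; `describe_device` is a hypothetical helper name. It only reports visible devices and does not tell you how `device_map="auto"` split the model across them.

```python
# Report which devices PyTorch can see.
# describe_device is a hypothetical helper, not part of transformers or the Dolly repo.
def describe_device():
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "No CUDA GPU visible: the model would run on CPU, which is very slow"
    names = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
    return "CUDA GPUs visible: " + ", ".join(names)


device_report = describe_device()
print(device_report)
```

If a GPU is listed but generation is still slow, the model may not fit in its memory, in which case part of it can end up on the CPU.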

ValueError: The following model_kwargs are not used by the model: ['load_in_8bit'] (note: typos in the generate arguments will also show up in this list)

Databricks org

Not sure, where are you putting that?

In my particular case I was loading the index before my environment variables:

This is how I had it:

from flask import Flask, request, jsonify, send_from_directory, send_file
from flask_cors import CORS
from gpt_index import GPTSimpleVectorIndex
import os
import requests
import json
import openai
from dotenv import load_dotenv

app = Flask(__name__)
CORS(app) # This will enable CORS for all routes

index = GPTSimpleVectorIndex.load_from_disk('DOCBOT.json')

load_dotenv()

# OpenAI API key
openai_api_key = os.getenv("OPENAI_API_KEY")

Here's the version that works:

from flask import Flask, request, jsonify, send_from_directory, send_file
from flask_cors import CORS
from gpt_index import GPTSimpleVectorIndex
import os
import requests
import json
import openai
from dotenv import load_dotenv

app = Flask(__name__)
CORS(app) # This will enable CORS for all routes


load_dotenv()

index = GPTSimpleVectorIndex.load_from_disk('DOCBOT.json')

# OpenAI API key
openai_api_key = os.getenv("OPENAI_API_KEY")

I hope it works for someone.
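The ordering problem above can be caught early by checking the variable explicitly before building anything that depends on it. This is a minimal sketch using only the standard library; `require_env` is a hypothetical helper name, and the `setdefault` line merely simulates what `load_dotenv()` would do, with a placeholder value rather than a real key.

```python
import os


def require_env(name):
    """Return an environment variable's value, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"{name} is not set; load your .env file before creating objects that need it"
        )
    return value


# Simulate load_dotenv(); the value is a placeholder, not a real key.
os.environ.setdefault("OPENAI_API_KEY", "placeholder-key")

# Check first, then load the index or any other object that reads the key.
api_key = require_env("OPENAI_API_KEY")
print("key loaded:", bool(api_key))
```

Calling such a check right after `load_dotenv()` and before `load_from_disk` turns a confusing validation error into an explicit message about the missing variable.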
