openai
tiktoken
langchain
gradio
pypdf
requests
unstructured
validators
pytesseract
pdf2image
tabulate
nltk
python-dotenv
faiss-cpu