openai
langchain
beautifulsoup4
chromadb
tiktoken
pypdf
gradio
PyMuPDF