beautifulsoup4
docx2txt
gradio
langchain
openai
pypdf
requests