Model Card for Mistral-Interact
- Base IN3: https://huggingface.co/datasets/hbx/IN3
- IN3-interaction: https://huggingface.co/datasets/hbx/IN3-interaction
- Paper: https://arxiv.org/abs/2402.09205
- Model: https://huggingface.co/hbx/Mistral-Interact
- Repo: https://github.com/HBX-hbx/Mistral-Interact
Using the constructed interaction data, we adapt Mistral-7B into Mistral-Interact, a powerful and robust variant of Mistral that can judge the vagueness of user instructions, actively query for missing details with suggestions, and explicitly summarize the user's detailed and clear intentions. It has the following features:
- Better understanding of user judgments: Among all the open-source models, Mistral-Interact is the best at predicting task vagueness and missing details that users regard as necessary.
- Comprehensive summarization of user intentions: Mistral-Interact is effective in making an explicit and comprehensive summary based on detailed user intentions.
- Enhanced model-user interaction experience: Mistral-Interact inquires about missing details in vague tasks in a more reasonable and friendly manner than other open-source models, thus promoting a clearer understanding of the user’s implicit intentions.
- Comparable performance with closed-source GPT-4: We show that smaller-scale expert models can approach or even exceed general-purpose large-scale models in several aspects, including vagueness judgment, comprehensiveness of summaries, and friendliness of interaction.
We use the model-center framework to perform full-parameter fine-tuning of Mistral-7B-v0.1 on the Intention-in-Interaction (IN3) dataset, using two 80GB A800 GPUs. For full details and usage instructions, please read our paper and repo.
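As a rough starting point, the sketch below loads the checkpoint with the Hugging Face `transformers` library and generates one reply. It assumes that the `hbx/Mistral-Interact` repository can be loaded through the standard `AutoTokenizer` and `AutoModelForCausalLM` classes, and it passes the task as a plain string. The helper names (`load_model`, `generate_reply`, `demo`), the example task, and the plain-text prompt are illustrative assumptions, not the prompt format used in the paper; the repo remains the reference for the actual interaction template.

```python
MODEL_ID = "hbx/Mistral-Interact"  # model repository listed at the top of this card


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and model (assumes a standard transformers checkpoint)."""
    # Imported lazily so that defining these helpers does not need the libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # bfloat16 roughly halves memory compared with float32 for a 7B model.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    model.eval()
    return tokenizer, model


def generate_reply(tokenizer, model, prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a continuation and return only the newly generated text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so that only the model's reply is decoded.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)


def demo() -> None:
    """Hypothetical usage: downloads the weights on first call."""
    tokenizer, model = load_model()
    vague_task = "Help me plan a trip."  # illustrative vague instruction
    print(generate_reply(tokenizer, model, vague_task))
```

Calling `demo()` downloads the full weights on first use, so it is best run on a machine with a suitable GPU. For a multi-turn interaction, the prompt would need to follow the conversation format described in the repo rather than the bare string used here.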
Citation
Feel free to cite our paper if you find it useful.
@article{cheng2024tell,
title={Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents},
author={Cheng Qian and Bingxiang He and Zhong Zhuang and Jia Deng and Yujia Qin and Xin Cong and Zhong Zhang and Jie Zhou and Yankai Lin and Zhiyuan Liu and Maosong Sun},
journal={arXiv preprint arXiv:2402.09205},
year={2024}
}