Multi-turn voice conversations

#3
by yumemio - opened

Hello, congratulations on releasing such a groundbreaking model!

I'm interested in multi-turn voice conversations:

  1. A user gives an initial instruction,
  2. The model responds,
  3. The user asks a follow-up question,
  4. ...and so on.

I did a quick test: I concatenated the audio clips from the previous turn with a new instruction and passed the result to the model. However, the model responded to the first question (1) instead of the most recent one (3).
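For concreteness, a test like this might look roughly like the sketch below. It is only an illustration under stated assumptions: clips are mono and held as plain lists of samples, the 16 kHz sample rate and the 0.5 s silence gap are assumed values, and `concat_clips` is a hypothetical helper, not something from the model's code.

```python
from typing import List, Sequence

SAMPLE_RATE = 16_000  # assumed sample rate in Hz; not taken from the model's code


def concat_clips(
    clips: Sequence[Sequence[float]],
    sample_rate: int = SAMPLE_RATE,
    gap_seconds: float = 0.5,
) -> List[float]:
    """Join mono clips into one waveform, with a short silence between clips."""
    gap = [0.0] * int(sample_rate * gap_seconds)
    out: List[float] = []
    for i, clip in enumerate(clips):
        if i > 0:
            out.extend(gap)  # silence separating consecutive turns
        out.extend(clip)
    return out


# Stand-in clips (silence) for: first instruction, model reply, follow-up question
turn1 = [0.0] * 16_000
reply = [0.0] * 8_000
turn2 = [0.0] * 4_000
history = concat_clips([turn1, reply, turn2])
```

The concatenated waveform is then given to the model as a single input, which is why a model trained on single questions may latch onto the first one.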

  1. Does the demo/model support such use cases?
  2. If not, what kind of modifications are necessary for the model to handle multi-turn conversations?

I appreciate your insights. Thanks!

Hi, Yuki. Currently, the model does not support multi-turn dialogue, because it was trained only on single-turn dialogue datasets.

Hi @gpt-omni , thanks for the clarification! Gotcha - I imagine training the model with a multi-turn dataset (and inserting an EOS token at the end of each turn?) will make it capable of handling follow-up questions.
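A rough sketch of that idea, purely illustrative: each turn is already tokenized, an end-of-turn token is appended after every turn, and a loss mask marks the assistant tokens so training only targets the model's replies. `Turn`, `build_sequence`, and `EOS_ID` are hypothetical names and the token ids are made up; this is not the model's actual training format.

```python
from dataclasses import dataclass
from typing import List, Tuple

EOS_ID = 0  # hypothetical end-of-turn token id


@dataclass
class Turn:
    role: str             # "user" or "assistant"
    token_ids: List[int]  # already-tokenized content (audio or text tokens)


def build_sequence(
    turns: List[Turn], eos_id: int = EOS_ID
) -> Tuple[List[int], List[int]]:
    """Flatten turns into one sequence, appending EOS after each turn.

    Returns (input_ids, loss_mask). loss_mask is 1 on assistant tokens
    (including their EOS) and 0 on user tokens.
    """
    input_ids: List[int] = []
    loss_mask: List[int] = []
    for turn in turns:
        ids = turn.token_ids + [eos_id]
        input_ids.extend(ids)
        loss_mask.extend([1 if turn.role == "assistant" else 0] * len(ids))
    return input_ids, loss_mask


dialogue = [
    Turn("user", [11, 12]),
    Turn("assistant", [21, 22, 23]),
    Turn("user", [13]),
    Turn("assistant", [24]),
]
input_ids, loss_mask = build_sequence(dialogue)
```

Masking the user turns is one common design choice; whether it suits this model would depend on how its training loop is set up.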

Closing the issue as resolved. Thanks again!

yumemio changed discussion status to closed
