Spaces: hallucinations-leaderboard/leaderboard

Accessing examples used for n-shot evals

#26 opened 19 days ago by

Certain models perhaps clogging up the leaderboard?, Check logs?

#25 opened 6 months ago by

How are Faithfulness and Factuality calculated?

#22 opened 7 months ago by

How could #parameter of a model be 0?

#20 opened 8 months ago by

Why is the score for RACE so low?

#18 opened 8 months ago by

Adding German Faithfulness Detection Task

#16 opened 9 months ago by mtc

Adding SummEdits to leaderboard?

#12 opened 9 months ago by

Adding tasks from the USB benchmark (for summarization)

#11 opened 9 months ago by

Adding the Snowball Hallucination detection datasets

#9 opened 10 months ago by

Longform QA

#8 opened 10 months ago by

Metrics for hallucination detection for summarization.

#6 opened 10 months ago by

Hello all!

#5 opened 10 months ago by