Clémentine Fourrier

clefourrier

http://clefourrier.github.io

Recent Activity

New activity 29 minutes ago

open-llm-leaderboard/open_llm_leaderboard

New activity 30 minutes ago

HuggingFaceFW/fineweb

reacted to malhajar's post with 🔥 about 2 hours ago

Articles

Introduction to the Open Leaderboard for Japanese LLMs

Judge Arena: Benchmarking LLMs as Evaluators

Introducing the Open FinLLM Leaderboard

BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks

Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages

CyberSecEval 2 - A Comprehensive Evaluation Framework for Cybersecurity Risks and Capabilities of Large Language Models

Let's talk about LLM evaluation

Introducing the Open Arabic LLM Leaderboard

Introducing the Open Leaderboard for Hebrew LLMs!

Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face

Improving Prompt Consistency with Structured Generations

Introducing the Open Chain of Thought Leaderboard

The Open Medical-LLM Leaderboard: Benchmarking Large Language Models in Healthcare

Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs

Introducing the Chatbot Guardrails Arena

Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes?

TTS Arena: Benchmarking Text-to-Speech Models in the Wild

Introducing the Red-Teaming Resistance Leaderboard

Introducing the Open Ko-LLM Leaderboard: Leading the Korean LLM Evaluation Ecosystem

NPHardEval Leaderboard: Unveiling the Reasoning Abilities of Large Language Models through Complexity Classes and Dynamic Updates

Introducing the Enterprise Scenarios Leaderboard: a Leaderboard for Real World Use Cases

The Hallucinations Leaderboard, an Open Effort to Measure Hallucinations in Large Language Models

A guide to setting up your own Hugging Face leaderboard: an end-to-end example with Vectara's hallucination leaderboard

2023, year of open LLMs

Open LLM Leaderboard: DROP deep dive

Overview of natively supported quantization schemes in 🤗 Transformers

What's going on with the Open LLM Leaderboard?

Introduction to Graph Machine Learning

clefourrier's activity

New activity in open-llm-leaderboard/open_llm_leaderboard 29 minutes ago

FLAG - `newsbang/Homer-v0.5-Qwen2.5-7B` MATH contamination

#1022 opened about 8 hours ago by

New activity in HuggingFaceFW/fineweb 30 minutes ago

Rename README.md to WILSON.md

#54 opened about 13 hours ago by

New activity in open-llm-leaderboard-old/open_llm_leaderboard 6 days ago

Interpretation of result details?

#1 opened 4 months ago by

are benchmark scores normalised to a baseline?

#2 opened 2 months ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 8 days ago

I can't replicate results.

#1016 opened 9 days ago by

New activity in gaia-benchmark/leaderboard 13 days ago

The web site is not working

#24 opened 29 days ago by

New activity in le-leadboard/gpqa-fr 13 days ago

Dataset access problem

#2 opened 13 days ago by

New activity in demo-leaderboard-backend/leaderboard 16 days ago

Apply CSS to the model name column

#14 opened 22 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 24 days ago

New collection needs to be looked at. Some numbers aren't adding up.

#995 opened 24 days ago by

New activity in open-llm-leaderboard/open_llm_leaderboard 28 days ago

merged models marked as non merged in the leaderboard

#993 opened 28 days ago by

New activity in gaia-benchmark/leaderboard 29 days ago

🚩 Report: Not working

#20 opened 4 months ago by

New activity in open-llm-leaderboard/open_llm_leaderboard about 1 month ago

Normalization for MMLU-Pro doesn't make sense

#947 opened about 2 months ago by

Incorrect ifeval benchmark

#879 opened 3 months ago by

How do I view the results of my submission?

#980 opened about 1 month ago by

Update README.md

#988 opened about 1 month ago by

Results are not showing up

#987 opened about 1 month ago by

New activity in gaia-benchmark/leaderboard about 2 months ago

Submit for Gaia testing

#23 opened about 2 months ago by

New activity in open-llm-leaderboard/open_llm_leaderboard about 2 months ago

ssmits/Qwen2.5-95B-Instruct not running

#962 opened about 2 months ago by

BoltMonkey/NeuralDaredevil-SuperNova-Lite-7B-DARETIES-abliterated

#961 opened about 2 months ago by

New activity in demo-leaderboard-backend/leaderboard about 2 months ago

os.isfile --> os.path.isfile

#12 opened about 2 months ago by meg