New model doesn't appear after Refresh
Hi!
I made this model: https://huggingface.co/Mihaiii/gte-micro
I was following the guide from here: https://github.com/embeddings-benchmark/mteb/blob/main/docs/adding_a_model.md
But when I click on the Refresh button, the model still doesn't appear in the leaderboard.
Am I doing something wrong?
I was applying filters after refresh (model size <100M). In that case, the model doesn't appear.
If I press refresh and don't apply filters, the model is there.
If I first filter and then press refresh, the model is there, but the filter is ignored.
Basically I can only see it in the view that has all the models, in which case I need to carefully scroll through the list and look for it.
Is this expected? The docs say the cache is refreshed once per week - on which day of the week?
Also, it appears not all benchmarks ran. It's not clear to me whether I interrupted the run by mistake, whether there is a script dependency issue, or whether it's something else.
Script: https://github.com/embeddings-benchmark/mteb/blob/main/scripts/run_mteb_english.py
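For reference, what the run boils down to is roughly this (a minimal sketch; the single task and output folder are just illustrative, the script iterates over its full English task lists):

from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Illustrative sketch, not the full script: evaluate one task and write the
# results JSON into a per-model output folder.
model_name = "Mihaiii/gte-micro"
model = SentenceTransformer(model_name)
evaluation = MTEB(tasks=["STSBenchmark"])
evaluation.run(model, output_folder=f"results/{model_name}")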
Hello!
Is this expected?
That is a bit odd indeed, I'm not quite sure what happened there.
The docs say the cache is refreshed once per week - on which day of the week?
Whenever someone creates an issue exactly like this one. I've restarted the leaderboard.
It indeed looks like not all benchmarks ran: Reranking is only partially complete, and Retrieval, STS and Summarization did not run at all. Can you check locally if you have results files for those benchmarks?
- Tom Aarsen
Thanks for your response.
The run just stopped at some point, so I assumed it was over. I used > /dev/null 2>&1 because there are lots of print statements, so I don't have logs.
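(In hindsight, something like this at the top of the script would have kept the log lines around; a sketch, the filename is arbitrary:)

import logging

# Sketch: write mteb's log records (the INFO/ERROR lines) to a file so they
# survive even when the console output is redirected to /dev/null.
logging.basicConfig(
    filename="mteb_run.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)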
But I looked into it: if I set TASK_LIST = (TASK_LIST_STS) in the script, then all is good. Next I tried TASK_LIST = (TASK_LIST_RERANKING) and I get the following error when running the script run_mteb_english.py.
Could you please confirm the issue?
********************** Evaluating MindSmallReranking **********************
INFO:mteb.evaluation.MTEB:Loading dataset for MindSmallReranking
Repo card metadata block was not found. Setting CardData to empty.
WARNING:huggingface_hub.repocard:Repo card metadata block was not found. Setting CardData to empty.
Failed to read file 'gzip://7a742da40ba0425a72301598ce27d63296c468da48cd98c4ae479b1d88a755a8::/root/.cache/huggingface/datasets/downloads/7a742da40ba0425a72301598ce27d63296c468da48cd98c4ae479b1d88a755a8' with error <class 'pyarrow.lib.ArrowInvalid'>: JSON parse error: Invalid value. in row 0
ERROR:datasets.packaged_modules.json.json:Failed to read file 'gzip://7a742da40ba0425a72301598ce27d63296c468da48cd98c4ae479b1d88a755a8::/root/.cache/huggingface/datasets/downloads/7a742da40ba0425a72301598ce27d63296c468da48cd98c4ae479b1d88a755a8' with error <class 'pyarrow.lib.ArrowInvalid'>: JSON parse error: Invalid value. in row 0
ERROR:mteb.evaluation.MTEB:Error while evaluating MindSmallReranking: An error occurred while generating the dataset
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/datasets/packaged_modules/json/json.py", line 145, in _generate_tables
dataset = json.load(f)
File "/usr/lib/python3.10/json/__init__.py", line 293, in load
return loads(fp.read(),
File "/usr/lib/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/usr/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.10/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 1995, in _prepare_split_single
for _, table in generator:
File "/usr/local/lib/python3.10/dist-packages/datasets/packaged_modules/json/json.py", line 148, in _generate_tables
raise e
File "/usr/local/lib/python3.10/dist-packages/datasets/packaged_modules/json/json.py", line 122, in _generate_tables
pa_table = paj.read_json(
File "pyarrow/_json.pyx", line 308, in pyarrow._json.read_json
File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: JSON parse error: Invalid value. in row 0
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/workspace/mteb/scripts/run_mteb_english.py", line 112, in <module>
evaluation.run(
File "/usr/local/lib/python3.10/dist-packages/mteb/evaluation/MTEB.py", line 324, in run
raise e
File "/usr/local/lib/python3.10/dist-packages/mteb/evaluation/MTEB.py", line 288, in run
task.load_data(eval_splits=task_eval_splits, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/mteb/abstasks/AbsTask.py", line 39, in load_data
self.dataset = datasets.load_dataset(**self.metadata_dict["dataset"])
File "/usr/local/lib/python3.10/dist-packages/datasets/load.py", line 2609, in load_dataset
builder_instance.download_and_prepare(
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 1027, in download_and_prepare
self._download_and_prepare(
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 1122, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 1882, in _prepare_split
for job_id, done, content in self._prepare_split_single(
File "/usr/local/lib/python3.10/dist-packages/datasets/builder.py", line 2038, in _prepare_split_single
raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset
Hello!
There was a short Hugging Face outage; perhaps it tried to load the dataset but couldn't as a result: https://status.huggingface.co/
Could you retry?
Also, if you're saving the results into the same directory as before, it will recognize existing results & skip those tests. In other words, you should be able to just run it with the full task set again, and it'll speedily skip over everything you've already done.
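Roughly (a sketch; the task selection and output folder are illustrative, use whatever folder you wrote the first results to):

from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Mihaiii/gte-micro")

# Tasks that already have a results file in output_folder are skipped by
# default; pass overwrite_results=True to run() only if you want to redo them.
evaluation = MTEB(task_langs=["en"])  # run_mteb_english.py uses its own explicit TASK_LIST
evaluation.run(model, output_folder="results/Mihaiii/gte-micro")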
- Tom Aarsen
I retried right before writing that comment. I ran the script for about 4 different models today and it never provided all results.
@tomaarsen
Besides, the initial run (for gte-micro) was before the outage. I installed mteb with pip install mteb. Let me know if I need to use a specific version.
I'll try and run MindSmallReranking myself
I have no issues myself.
********************** Evaluating MindSmallReranking **********************
INFO:mteb.evaluation.MTEB:Loading dataset for MindSmallReranking
C:\Users\tom\.conda\envs\mteb\lib\site-packages\huggingface_hub\repocard.py:105: UserWarning: Repo card metadata block was not found. Setting CardData to empty.
warnings.warn("Repo card metadata block was not found. Setting CardData to empty.")
Calling download_and_prepare
Called download_and_prepare
INFO:mteb.evaluation.evaluators.RerankingEvaluator:Encoding queries...
Batches: 100%|██████████| 2308/2308 [01:48<00:00, 21.25it/s]
INFO:mteb.evaluation.evaluators.RerankingEvaluator:Encoding candidates...
Batches: 100%|██████████| 2596/2596 [01:52<00:00, 22.99it/s]
INFO:mteb.evaluation.evaluators.RerankingEvaluator:Evaluating...
INFO:mteb.evaluation.MTEB:Evaluation for MindSmallReranking on test took 299.14 seconds
INFO:mteb.evaluation.MTEB:Scores: {'map': 0.30999281853124455, 'mrr': 0.31961836112643854, 'evaluation_time': 299.14}
I think that your downloaded file at /root/.cache/huggingface/datasets/downloads/7a742da40ba0425a72301598ce27d63296c468da48cd98c4ae479b1d88a755a8 may be corrupted, i.e. the JSON can't be loaded. You can have a look at the file and see if you can spot a problem. You might be best off deleting it and retrying.
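Alternatively, forcing a clean re-download of just that dataset avoids hunting for the cached file by hash (a sketch):

from datasets import load_dataset

# Sketch: ignore the (possibly corrupted) cached download and fetch it again.
dataset = load_dataset("mteb/mind_small", download_mode="force_redownload")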
- Tom Aarsen
@tomaarsen I killed that pod, but I'll retry. Thank you for running it on your side.
I rented a pod on runpod just for this.
In a clean environment, do the following:
!pip install mteb
from datasets import load_dataset
dataset = load_dataset("mteb/mind_small")
And you should get: https://gist.github.com/Mihaiii/6bf3cfb441a01daadf0cba47d7dab6dc
@tomaarsen Could you please confirm? :)
Closing.
For anyone else experiencing this:
This looks like a regression in the datasets library when reading gzip-compressed files.
Just uninstall and use an older version:
!echo 'Y' | pip uninstall datasets
!pip install datasets==2.16.0
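A quick sanity check after the downgrade (a sketch), before kicking off the full benchmark run again:

from datasets import load_dataset

# With datasets==2.16.0 this should load without the
# "pyarrow.lib.ArrowInvalid: JSON parse error" seen above.
dataset = load_dataset("mteb/mind_small")
print(dataset)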