AlhitawiMohammed22 committed
Commit e676e40
1 Parent(s): 61b1e91

update readme

Files changed (1)
  1. README.md +153 -5
README.md CHANGED
@@ -1,13 +1,161 @@
  ---
- title: Hu Evaluation Metrics
+ title: CER
- emoji: 🏃
+ emoji: 🤗🏃🤗🏃🤗🏃🤗🏃🤗
- colorFrom: gray
+ colorFrom: blue
  colorTo: red
  sdk: gradio
- sdk_version: 3.24.1
+ sdk_version: 3.19.1
  app_file: app.py
  pinned: false
+ tags:
+ - evaluate
+ - metric
  license: apache-2.0
  ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+
+ Character error rate (CER) is a common metric of the performance of an automatic speech recognition system.
+
+ CER is similar to Word Error Rate (WER), but operates on characters instead of words. Please refer to the WER documentation for further information.
+
+ Character error rate can be computed as:
+
+ CER = (S + D + I) / N = (S + D + I) / (S + D + C)
+
+ where
+
+ S is the number of substitutions,
+ D is the number of deletions,
+ I is the number of insertions,
+ C is the number of correct characters,
+ N is the number of characters in the reference (N = S + D + C).
+
+ CER's output is not always a number between 0 and 1, in particular when there is a high number of insertions. This value is often associated with the percentage of characters that were incorrectly predicted. The lower the value, the better the performance of the ASR system, with a CER of 0 being a perfect score.
+
+ # Metric Card for CER
+
+ ## Metric description
+
+ Character error rate (CER) is a common metric of the performance of an automatic speech recognition (ASR) system. CER is similar to Word Error Rate (WER), but operates on characters instead of words.
+
+ Character error rate can be computed as:
+
+ `CER = (S + D + I) / N = (S + D + I) / (S + D + C)`
+
+ where
+
+ `S` is the number of substitutions,
+
+ `D` is the number of deletions,
+
+ `I` is the number of insertions,
+
+ `C` is the number of correct characters,
+
+ `N` is the number of characters in the reference (`N = S + D + C`).
+
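+ To make the formula concrete, here is a minimal illustrative sketch that counts `S`, `D` and `I` with a character-level Levenshtein alignment. It is not this metric's implementation (the metric delegates the computation to the `jiwer` package installed in the examples below): the function name `edit_ops` is hypothetical, and ties between equally cheap alignments may be broken differently.
+
+ ```python
+ # Illustrative only -- the CER metric itself relies on the jiwer package.
+ def edit_ops(reference: str, prediction: str):
+     """Count substitutions, deletions and insertions between two strings."""
+     m, n = len(reference), len(prediction)
+     # dist[i][j] = edit distance between reference[:i] and prediction[:j]
+     dist = [[0] * (n + 1) for _ in range(m + 1)]
+     for i in range(m + 1):
+         dist[i][0] = i
+     for j in range(n + 1):
+         dist[0][j] = j
+     for i in range(1, m + 1):
+         for j in range(1, n + 1):
+             cost = 0 if reference[i - 1] == prediction[j - 1] else 1
+             dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
+                              dist[i][j - 1] + 1,         # insertion
+                              dist[i - 1][j - 1] + cost)  # substitution or match
+     # Walk back through the table, counting one operation per step.
+     S = D = I = 0
+     i, j = m, n
+     while i > 0 or j > 0:
+         if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (reference[i - 1] != prediction[j - 1]):
+             if reference[i - 1] != prediction[j - 1]:
+                 S += 1          # substitution (a match costs nothing)
+             i, j = i - 1, j - 1
+         elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
+             D += 1              # deletion
+             i -= 1
+         else:
+             I += 1              # insertion
+             j -= 1
+     return S, D, I
+
+ S, D, I = edit_ops("Helló", "Helló Világ")
+ N = len("Helló")        # N = S + D + C = 5 reference characters
+ print((S + D + I) / N)  # 6 / 5 = 1.2, matching the insertion example below
+ ```
+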
+ ## How to use
+
+ The metric takes two inputs: references (a list of references for each speech input) and predictions (a list of transcriptions to score).
+
+ ```python
+ from evaluate import load
+ cer = load("cer")
+ cer_score = cer.compute(predictions=predictions, references=references)
+ ```
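+
+ Note that, in addition to `evaluate`, this metric needs the `jiwer` package; the first example below installs both.
+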
+ ## Output values
+
+ This metric outputs a float representing the character error rate.
+
+ ```
+ print(cer_score)
+ 0.34146341463414637
+ ```
+
+ The **lower** the CER value, the **better** the performance of the ASR system, with a CER of 0 being a perfect score.
+
+ However, CER's output is not always a number between 0 and 1, in particular when there is a high number of insertions (see [Examples](#examples) below).
+
+ ### Values from popular papers
+
+ ## Examples
+
+ Perfect match between prediction and reference:
+
+ ```python
+ # pip install evaluate jiwer
+
+ from evaluate import load
+ cer = load("cer")
+ predictions = ["hello világ", "jó éjszakát hold"]
+ references = ["hello világ", "jó éjszakát hold"]
+ cer_score = cer.compute(predictions=predictions, references=references)
+ print(cer_score)
+ 0.0
+ ```
+
+ Partial match between prediction and reference:
+
+ ```python
+ from evaluate import load
+ cer = load("cer")
+ predictions = ["ez a jóslat", "van egy másik minta is"]
+ references = ["ez a hivatkozás", "van még egy"]
+ cer_score = cer.compute(predictions=predictions, references=references)
+ print(cer_score)
+ 0.9615384615384616
+ ```
+
+ No match between prediction and reference:
+
+ ```python
+ from evaluate import load
+ cer = load("cer")
+ predictions = ["üdvözlet"]
+ references = ["jó!"]
+ cer_score = cer.compute(predictions=predictions, references=references)
+ print(cer_score)
+ 1.5
+ ```
+
+ CER above 1 due to insertion errors:
+
+ ```python
+ from evaluate import load
+ cer = load("cer")
+ predictions = ["Helló Világ"]
+ references = ["Helló"]
+ cer_score = cer.compute(predictions=predictions, references=references)
+ print(cer_score)
+ 1.2
+ ```
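+
+ To check the last example by hand: the reference "Helló" has N = 5 characters, all of which are kept in the prediction (C = 5, S = D = 0), while the 6 characters of " Világ" are inserted (I = 6), so CER = (0 + 0 + 6) / 5 = 1.2.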
+
+ ## Limitations and bias
+
+ In some cases, instead of reporting the raw CER, a normalized CER is reported in which the number of mistakes is divided by the sum of the number of edit operations (`I` + `S` + `D`) and `C` (the number of correct characters); this yields CER values that always fall within the range of 0–100%.
+
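+ A minimal sketch of that normalized variant (the function `normalized_cer` is hypothetical; the counts reuse the insertion example above):
+
+ ```python
+ def normalized_cer(S: int, D: int, I: int, C: int) -> float:
+     # Mistakes divided by all edit operations plus correct characters,
+     # which bounds the score to the range [0, 1].
+     return (S + D + I) / (S + D + I + C)
+
+ # Counts for reference "Helló" vs. prediction "Helló Világ":
+ # S = 0, D = 0, I = 6, C = 5. Raw CER is 6 / 5 = 1.2, whereas:
+ print(normalized_cer(0, 0, 6, 5))  # 0.5454..., within [0, 1]
+ ```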
+
+ ## Citation
+
+ ```bibtex
+ @inproceedings{morris2004,
+   author = {Morris, Andrew and Maier, Viktoria and Green, Phil},
+   year   = {2004},
+   month  = {01},
+   title  = {From WER and RIL to MER and WIL: improved evaluation measures for connected speech recognition}
+ }
+ ```
+
+ ## References
+
+ - [Hugging Face Tasks -- Automatic Speech Recognition](https://huggingface.co/tasks/automatic-speech-recognition)
+ - [huggingface/evaluate](https://github.com/huggingface/evaluate)