5roop/wav2vecbert2-filledPause

Audio Classification

5roop committed on Oct 10

Commit 132d375 • 1 Parent(s): ffa29b0

Update README.md

Files changed (1)

README.md +32 -1

@@ -16,7 +16,38 @@ metrics:
 This model classifies individual 20ms frames of audio based on presence of filled pauses ("eee", "errm", ...).
-It was trained on human-annotated Slovenian speech corpus ROG-Artur and achieves F1 of 0.952868.
 # Example use:
 ```python

 This model classifies individual 20ms frames of audio based on presence of filled pauses ("eee", "errm", ...).
+It was trained on the human-annotated Slovenian speech corpus ROG-Artur and achieves an F1 of 0.952868 on the test split of the same dataset.
+Evaluation on 800 human-annotated instances from ParlaSpeech-HR and ParlaSpeech-RS produced the following metrics:
+```
+Performance on RS:
+Classification report for human vs model on event level:
+              precision    recall  f1-score   support
+           0       0.97      0.87      0.92       234
+           1       0.95      0.99      0.97       542
+    accuracy                           0.95       776
+   macro avg       0.96      0.93      0.94       776
+weighted avg       0.95      0.95      0.95       776
+Performance on HR:
+Classification report for human vs model on event level:
+              precision    recall  f1-score   support
+           0       0.94      0.84      0.89       242
+           1       0.93      0.98      0.95       531
+    accuracy                           0.93       773
+   macro avg       0.93      0.91      0.92       773
+weighted avg       0.93      0.93      0.93       773
+```
+The metrics reported are on event level, which means that if true and
+predicted filled pauses at least partially overlap, we count them as a
+True Positive event.
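The event-level criterion described above can be illustrated with a minimal sketch. It is not the evaluation script behind the reported numbers: the helper names `frames_to_events` and `count_event_matches` are hypothetical, and the sketch assumes frame labels are 0/1 with 1 marking a filled pause and a frame length of 20 ms.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def frames_to_events(frames: List[int], frame_s: float = 0.02) -> List[Interval]:
    """Merge runs of consecutive positive frames into (start, end) intervals."""
    events: List[Interval] = []
    start = None
    for i, label in enumerate(frames):
        if label == 1 and start is None:
            start = i  # a new run of filled-pause frames begins
        elif label != 1 and start is not None:
            events.append((start * frame_s, i * frame_s))
            start = None
    if start is not None:  # close a run that reaches the end of the audio
        events.append((start * frame_s, len(frames) * frame_s))
    return events


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share any span of time."""
    return a[0] < b[1] and b[0] < a[1]


def count_event_matches(
    true_events: List[Interval], pred_events: List[Interval]
) -> Tuple[int, int, int]:
    """Count (TP, FP, FN) events under the partial-overlap criterion."""
    # A true event is a True Positive if any predicted event overlaps it.
    tp = sum(any(overlaps(t, p) for p in pred_events) for t in true_events)
    fn = len(true_events) - tp
    # A predicted event with no overlapping true event is a False Positive.
    fp = sum(not any(overlaps(p, t) for t in true_events) for p in pred_events)
    return tp, fp, fn


if __name__ == "__main__":
    predicted = frames_to_events([0, 1, 1, 0, 0, 1])
    annotated = [(0.03, 0.05), (0.30, 0.40)]
    print(count_event_matches(annotated, predicted))
```

Under this criterion a prediction that covers only part of an annotated filled pause still counts as a hit, so event-level scores are typically more lenient than frame-level ones.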
 # Example use:
 ```python