Add new SentenceTransformer model.
README.md
CHANGED
@@ -21,6 +21,7 @@ tags:
|
|
21 |
- loss:MSELoss
|
22 |
- dataset_size:5000
|
23 |
- dataset_size:8000
|
|
|
24 |
widget:
|
25 |
- source_sentence: 'The aggressive semi-employed religion workshop of Razzak, (EFP).
|
26 |
|
@@ -112,7 +113,7 @@ model-index:
|
|
112 |
type: unknown
|
113 |
metrics:
|
114 |
- type: negative_mse
|
115 |
-
value: -0.
|
116 |
name: Negative Mse
|
117 |
- task:
|
118 |
type: translation
|
@@ -122,13 +123,13 @@ model-index:
|
|
122 |
type: unknown
|
123 |
metrics:
|
124 |
- type: src2trg_accuracy
|
125 |
-
value: 0.
|
126 |
name: Src2Trg Accuracy
|
127 |
- type: trg2src_accuracy
|
128 |
-
value: 0.
|
129 |
name: Trg2Src Accuracy
|
130 |
- type: mean_accuracy
|
131 |
-
value: 0.
|
132 |
name: Mean Accuracy
|
133 |
---
|
134 |
|
@@ -231,7 +232,7 @@ You can finetune this model on your own dataset.
|
|
231 |
|
232 |
| Metric | Value |
|
233 |
|:-----------------|:------------|
|
234 |
-
| **negative_mse** | **-0.
|
235 |
|
236 |
#### Translation
|
237 |
|
@@ -239,9 +240,9 @@ You can finetune this model on your own dataset.
|
|
239 |
|
240 |
| Metric | Value |
|
241 |
|:------------------|:-----------|
|
242 |
-
| src2trg_accuracy | 0.
|
243 |
-
| trg2src_accuracy | 0.
|
244 |
-
| **mean_accuracy** | **0.
|
245 |
|
246 |
<!--
|
247 |
## Bias, Risks and Limitations
|
@@ -262,7 +263,7 @@ You can finetune this model on your own dataset.
|
|
262 |
#### momo22/eng2nep
|
263 |
|
264 |
* Dataset: [momo22/eng2nep](https://huggingface.co/datasets/momo22/eng2nep) at [57da8d4](https://huggingface.co/datasets/momo22/eng2nep/tree/57da8d44266896e334c1d8f2528cbbf666fbd0ca)
|
265 |
-
* Size:
|
266 |
* Columns: <code>English</code>, <code>Nepali</code>, and <code>label</code>
|
267 |
* Approximate statistics based on the first 1000 samples:
|
268 |
| | English | Nepali | label |
|
@@ -282,13 +283,13 @@ You can finetune this model on your own dataset.
|
|
282 |
#### momo22/eng2nep
|
283 |
|
284 |
* Dataset: [momo22/eng2nep](https://huggingface.co/datasets/momo22/eng2nep) at [57da8d4](https://huggingface.co/datasets/momo22/eng2nep/tree/57da8d44266896e334c1d8f2528cbbf666fbd0ca)
|
285 |
-
* Size:
|
286 |
* Columns: <code>English</code>, <code>Nepali</code>, and <code>label</code>
|
287 |
* Approximate statistics based on the first 1000 samples:
|
288 |
-
| | English | Nepali
|
289 |
-
|
290 |
-
| type | string | string
|
291 |
-
| details | <ul><li>min: 4 tokens</li><li>mean: 26.
|
292 |
* Samples:
|
293 |
| English | Nepali | label |
|
294 |
|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------|
|
@@ -304,7 +305,6 @@ You can finetune this model on your own dataset.
|
|
304 |
- `per_device_train_batch_size`: 64
|
305 |
- `per_device_eval_batch_size`: 64
|
306 |
- `learning_rate`: 2e-05
|
307 |
-
- `num_train_epochs`: 1
|
308 |
- `warmup_ratio`: 0.1
|
309 |
- `bf16`: True
|
310 |
- `push_to_hub`: True
|
@@ -330,7 +330,7 @@ You can finetune this model on your own dataset.
|
|
330 |
- `adam_beta2`: 0.999
|
331 |
- `adam_epsilon`: 1e-08
|
332 |
- `max_grad_norm`: 1.0
|
333 |
-
- `num_train_epochs`:
|
334 |
- `max_steps`: -1
|
335 |
- `lr_scheduler_type`: linear
|
336 |
- `lr_scheduler_kwargs`: {}
|
@@ -427,12 +427,21 @@ You can finetune this model on your own dataset.
|
|
427 |
</details>
|
428 |
|
429 |
### Training Logs
|
430 |
-
| Epoch
|
431 |
-
|
432 |
-
| 0.4
|
433 |
-
| 0.8
|
434 |
-
| 0.4
|
435 |
-
| 0.8
|
436 |
|
437 |
|
438 |
### Framework Versions
|
|
|
21 |
- loss:MSELoss
|
22 |
- dataset_size:5000
|
23 |
- dataset_size:8000
|
24 |
+
- dataset_size:100000
|
25 |
widget:
|
26 |
- source_sentence: 'The aggressive semi-employed religion workshop of Razzak, (EFP).
|
27 |
|
|
|
113 |
type: unknown
|
114 |
metrics:
|
115 |
- type: negative_mse
|
116 |
+
value: -0.32407890539616346
|
117 |
name: Negative Mse
|
118 |
- task:
|
119 |
type: translation
|
|
|
123 |
type: unknown
|
124 |
metrics:
|
125 |
- type: src2trg_accuracy
|
126 |
+
value: 0.05445
|
127 |
name: Src2Trg Accuracy
|
128 |
- type: trg2src_accuracy
|
129 |
+
value: 0.02105
|
130 |
name: Trg2Src Accuracy
|
131 |
- type: mean_accuracy
|
132 |
+
value: 0.03775
|
133 |
name: Mean Accuracy
|
134 |
---
|
135 |
|
|
|
232 |
|
| Metric           | Value       |
|:-----------------|:------------|
| **negative_mse** | **-0.3241** |
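To make the sign and scale of `negative_mse` easier to read, here is a minimal sketch of how such a score could be computed. It assumes the metric is the mean squared error between student and teacher embeddings, multiplied by 100 and negated so that higher is better. The card does not state this definition, and the `negative_mse` helper and the toy vectors below are hypothetical, not taken from the model.

```python
def negative_mse(student, teacher, scale=100.0):
    """Negated, scaled mean squared error over all embedding elements.

    Assumption: the evaluator averages over every element of every
    embedding and multiplies by `scale` before negating.
    """
    total = 0.0
    count = 0
    for s_vec, t_vec in zip(student, teacher):
        for s, t in zip(s_vec, t_vec):
            total += (s - t) ** 2
            count += 1
    return -scale * total / count


# Toy 2-dimensional "embeddings" (hypothetical values, not model output).
student = [[0.1, 0.2], [0.0, -0.1]]
teacher = [[0.1, 0.0], [0.1, -0.1]]
score = negative_mse(student, teacher)
```

Under this reading, a score closer to zero means the student embeddings sit closer to the teacher's targets.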
|
236 |
|
237 |
#### Translation
|
238 |
|
|
|
240 |
|
| Metric            | Value      |
|:------------------|:-----------|
| src2trg_accuracy  | 0.0544     |
| trg2src_accuracy  | 0.021      |
| **mean_accuracy** | **0.0377** |
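The three translation figures are consistent with `mean_accuracy` being the plain average of the two directional accuracies. A small check, using the unrounded values from the `model-index` metadata in this diff (the averaging rule itself is inferred from the numbers, not stated in the card):

```python
# Unrounded values as they appear in the model-index metadata.
src2trg_accuracy = 0.05445
trg2src_accuracy = 0.02105

# Assumed definition: arithmetic mean of the two directions.
mean_accuracy = (src2trg_accuracy + trg2src_accuracy) / 2

print(round(mean_accuracy, 5))
```

The result agrees with the reported `mean_accuracy` of 0.03775, which the table rounds to 0.0377.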
|
246 |
|
247 |
<!--
|
248 |
## Bias, Risks and Limitations
|
|
|
263 |
#### momo22/eng2nep
|
264 |
|
265 |
* Dataset: [momo22/eng2nep](https://huggingface.co/datasets/momo22/eng2nep) at [57da8d4](https://huggingface.co/datasets/momo22/eng2nep/tree/57da8d44266896e334c1d8f2528cbbf666fbd0ca)
|
266 |
+
* Size: 100,000 training samples
|
267 |
* Columns: <code>English</code>, <code>Nepali</code>, and <code>label</code>
|
268 |
* Approximate statistics based on the first 1000 samples:
|
269 |
| | English | Nepali | label |
|
|
|
283 |
#### momo22/eng2nep
|
284 |
|
285 |
* Dataset: [momo22/eng2nep](https://huggingface.co/datasets/momo22/eng2nep) at [57da8d4](https://huggingface.co/datasets/momo22/eng2nep/tree/57da8d44266896e334c1d8f2528cbbf666fbd0ca)
|
286 |
+
* Size: 8,000 evaluation samples
|
287 |
* Columns: <code>English</code>, <code>Nepali</code>, and <code>label</code>
|
288 |
* Approximate statistics based on the first 1000 samples:
|
|         | English                                                                            | Nepali                                                                             | label                                |
|:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:-------------------------------------|
| type    | string                                                                             | string                                                                             | list                                 |
| details | <ul><li>min: 4 tokens</li><li>mean: 26.48 tokens</li><li>max: 213 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 63.73 tokens</li><li>max: 256 tokens</li></ul> | <ul><li>size: 384 elements</li></ul> |
|
293 |
* Samples:
|
294 |
| English | Nepali | label |
|
295 |
|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------|
|
|
|
305 |
- `per_device_train_batch_size`: 64
|
306 |
- `per_device_eval_batch_size`: 64
|
307 |
- `learning_rate`: 2e-05
|
|
|
308 |
- `warmup_ratio`: 0.1
|
309 |
- `bf16`: True
|
310 |
- `push_to_hub`: True
|
|
|
330 |
- `adam_beta2`: 0.999
|
331 |
- `adam_epsilon`: 1e-08
|
332 |
- `max_grad_norm`: 1.0
|
333 |
+
- `num_train_epochs`: 3
|
334 |
- `max_steps`: -1
|
335 |
- `lr_scheduler_type`: linear
|
336 |
- `lr_scheduler_kwargs`: {}
|
|
|
427 |
</details>
|
428 |
|
429 |
### Training Logs
|
| Epoch  | Step | Training Loss | loss   | mean_accuracy | negative_mse |
|:------:|:----:|:-------------:|:------:|:-------------:|:------------:|
| 0.4    | 50   | 0.0021        | 0.0019 | 0.0111        | -0.3837      |
| 0.8    | 100  | 0.002         | 0.0019 | 0.0123        | -0.3794      |
| 0.4    | 50   | 0.002         | 0.0019 | 0.0130        | -0.3773      |
| 0.8    | 100  | 0.002         | 0.0019 | 0.0135        | -0.3744      |
| 0.3199 | 500  | 0.002         | 0.0018 | 0.0166        | -0.3597      |
| 0.6398 | 1000 | 0.0019        | 0.0018 | 0.0204        | -0.3461      |
| 0.9597 | 1500 | 0.0018        | 0.0017 | 0.0241        | -0.3389      |
| 1.2796 | 2000 | 0.0018        | 0.0017 | 0.0273        | -0.3351      |
| 1.5995 | 2500 | 0.0018        | 0.0017 | 0.0312        | -0.3302      |
| 1.9194 | 3000 | 0.0018        | 0.0017 | 0.0328        | -0.3284      |
| 2.2393 | 3500 | 0.0018        | 0.0017 | 0.0353        | -0.3264      |
| 2.5592 | 4000 | 0.0018        | 0.0016 | 0.0374        | -0.3246      |
| 2.8791 | 4500 | 0.0018        | 0.0016 | 0.0377        | -0.3241      |
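The fractional epochs in the log can be reproduced from the figures in this diff: 100,000 training samples, `per_device_train_batch_size` of 64, `num_train_epochs` of 3 and `warmup_ratio` of 0.1. The sketch below assumes a single device, no gradient accumulation, and a kept final partial batch. Rounding the warmup step count up is also an assumption about the trainer, and the `steps_per_epoch` and `epoch_at` helpers are hypothetical names.

```python
import math


def steps_per_epoch(samples: int, batch_size: int) -> int:
    """Optimizer steps per epoch, keeping the last partial batch."""
    return math.ceil(samples / batch_size)


def epoch_at(step: int, samples: int, batch_size: int) -> float:
    """Fractional epoch reached after `step` steps, rounded like the log."""
    return round(step / steps_per_epoch(samples, batch_size), 4)


train_samples = 100_000
batch_size = 64
num_train_epochs = 3
warmup_ratio = 0.1

per_epoch = steps_per_epoch(train_samples, batch_size)
total_steps = per_epoch * num_train_epochs
# Assumption: warmup steps are the ceiling of ratio * total steps.
warmup_steps = math.ceil(total_steps * warmup_ratio)

# Epoch values for the steps logged in the 100,000-sample run.
logged = [epoch_at(s, train_samples, batch_size) for s in (500, 1000, 4500)]
```

With these assumptions, `epoch_at` matches the Epoch column from step 500 onward. The earlier rows (epoch 0.4 at step 50) are instead consistent with 125 steps per epoch, that is, an 8,000-sample run at batch size 64, though the diff does not say which run produced them.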
|
445 |
|
446 |
|
447 |
### Framework Versions
|