|
INFO: 2024-10-16 01:02:49,290: llmtf.base.evaluator: Starting eval on ['darumeru/multiq', 'darumeru/parus', 'darumeru/rcb', 'darumeru/ruopenbookqa', 'darumeru/ruworldtree', 'darumeru/rwsd', 'darumeru/use', 'russiannlp/rucola_custom'] |
|
INFO: 2024-10-16 01:02:49,293: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:02:49,293: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:02:49,570: llmtf.base.evaluator: Starting eval on ['darumeru/rummlu'] |
|
INFO: 2024-10-16 01:02:49,571: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:02:49,571: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:02:51,977: llmtf.base.evaluator: Starting eval on ['nlpcoreteam/rummlu'] |
|
INFO: 2024-10-16 01:02:51,979: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:02:51,979: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:02:53,943: llmtf.base.evaluator: Starting eval on ['nlpcoreteam/enmmlu'] |
|
INFO: 2024-10-16 01:02:53,943: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:02:53,943: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:02:55,483: llmtf.base.evaluator: Starting eval on ['daru/treewayabstractive'] |
|
INFO: 2024-10-16 01:02:55,483: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:02:55,483: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:02:57,009: llmtf.base.evaluator: Starting eval on ['daru/treewayextractive'] |
|
INFO: 2024-10-16 01:02:57,010: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:02:57,010: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:02:58,829: llmtf.base.evaluator: Starting eval on ['darumeru/cp_sent_ru', 'darumeru/cp_sent_en', 'darumeru/cp_para_ru', 'darumeru/cp_para_en'] |
|
INFO: 2024-10-16 01:02:58,830: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:02:58,830: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:03:03,414: llmtf.base.darumeru/cp_sent_ru: Loading Dataset: 4.58s |
|
INFO: 2024-10-16 01:03:10,009: llmtf.base.daru/treewayextractive: Loading Dataset: 13.00s |
|
INFO: 2024-10-16 01:03:12,441: llmtf.base.darumeru/MultiQ: Loading Dataset: 23.15s |
|
INFO: 2024-10-16 01:03:14,674: llmtf.base.daru/treewayabstractive: Loading Dataset: 19.19s |
|
INFO: 2024-10-16 01:04:20,981: llmtf.base.darumeru/ruMMLU: Loading Dataset: 91.41s |
|
INFO: 2024-10-16 01:06:08,702: llmtf.base.daru/treewayextractive: Processing Dataset: 178.69s |
|
INFO: 2024-10-16 01:06:08,705: llmtf.base.daru/treewayextractive: Results for daru/treewayextractive: |
|
INFO: 2024-10-16 01:06:08,931: llmtf.base.daru/treewayextractive: {'r-prec': 0.392920202020202} |
|
INFO: 2024-10-16 01:06:08,968: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-16 01:06:08,972: llmtf.base.evaluator: |
|
mean daru/treewayextractive |
|
0.393 0.393 |
|
INFO: 2024-10-16 01:06:21,426: llmtf.base.darumeru/MultiQ: Processing Dataset: 188.98s |
|
INFO: 2024-10-16 01:06:21,427: llmtf.base.darumeru/MultiQ: Results for darumeru/MultiQ: |
|
INFO: 2024-10-16 01:06:21,432: llmtf.base.darumeru/MultiQ: {'f1': 0.49389044741748994, 'em': 0.3738049713193117} |
|
INFO: 2024-10-16 01:06:21,442: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:06:21,442: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:06:24,486: llmtf.base.darumeru/PARus: Loading Dataset: 3.04s |
|
INFO: 2024-10-16 01:06:30,785: llmtf.base.darumeru/PARus: Processing Dataset: 6.30s |
|
INFO: 2024-10-16 01:06:30,787: llmtf.base.darumeru/PARus: Results for darumeru/PARus: |
|
INFO: 2024-10-16 01:06:30,815: llmtf.base.darumeru/PARus: {'acc': 0.42} |
|
INFO: 2024-10-16 01:06:30,816: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:06:30,816: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:06:34,602: llmtf.base.darumeru/RCB: Loading Dataset: 3.79s |
|
INFO: 2024-10-16 01:06:35,622: llmtf.base.nlpcoreteam/enMMLU: Loading Dataset: 221.68s |
|
INFO: 2024-10-16 01:06:45,442: llmtf.base.darumeru/RCB: Processing Dataset: 10.84s |
|
INFO: 2024-10-16 01:06:45,444: llmtf.base.darumeru/RCB: Results for darumeru/RCB: |
|
INFO: 2024-10-16 01:06:45,451: llmtf.base.darumeru/RCB: {'acc': 0.5, 'f1_macro': 0.4788148636316176} |
|
INFO: 2024-10-16 01:06:45,459: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:06:45,460: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:07:00,710: llmtf.base.darumeru/ruOpenBookQA: Loading Dataset: 15.25s |
|
INFO: 2024-10-16 01:07:31,366: llmtf.base.nlpcoreteam/ruMMLU: Loading Dataset: 279.39s |
|
INFO: 2024-10-16 01:08:12,346: llmtf.base.darumeru/ruOpenBookQA: Processing Dataset: 71.63s |
|
INFO: 2024-10-16 01:08:12,347: llmtf.base.darumeru/ruOpenBookQA: Results for darumeru/ruOpenBookQA: |
|
INFO: 2024-10-16 01:08:12,361: llmtf.base.darumeru/ruOpenBookQA: {'acc': 0.5451030927835051, 'f1_macro': 0.5357593897587198} |
|
INFO: 2024-10-16 01:08:12,377: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:08:12,378: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:08:15,157: llmtf.base.darumeru/ruWorldTree: Loading Dataset: 2.78s |
|
INFO: 2024-10-16 01:08:19,267: llmtf.base.darumeru/ruWorldTree: Processing Dataset: 4.11s |
|
INFO: 2024-10-16 01:08:19,268: llmtf.base.darumeru/ruWorldTree: Results for darumeru/ruWorldTree: |
|
INFO: 2024-10-16 01:08:19,273: llmtf.base.darumeru/ruWorldTree: {'acc': 0.7238095238095238, 'f1_macro': 0.719567254381039} |
|
INFO: 2024-10-16 01:08:19,274: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:08:19,274: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:08:23,048: llmtf.base.darumeru/RWSD: Loading Dataset: 3.77s |
|
INFO: 2024-10-16 01:08:32,809: llmtf.base.darumeru/RWSD: Processing Dataset: 9.76s |
|
INFO: 2024-10-16 01:08:32,811: llmtf.base.darumeru/RWSD: Results for darumeru/RWSD: |
|
INFO: 2024-10-16 01:08:32,815: llmtf.base.darumeru/RWSD: {'acc': 0.43137254901960786} |
|
INFO: 2024-10-16 01:08:32,817: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:08:32,817: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:08:48,749: llmtf.base.darumeru/USE: Loading Dataset: 15.93s |
|
INFO: 2024-10-16 01:09:43,090: llmtf.base.darumeru/cp_sent_ru: Processing Dataset: 399.68s |
|
INFO: 2024-10-16 01:09:43,093: llmtf.base.darumeru/cp_sent_ru: Results for darumeru/cp_sent_ru: |
|
INFO: 2024-10-16 01:09:43,097: llmtf.base.darumeru/cp_sent_ru: {'symbol_per_token': 3.900423523440511, 'len': 0.9417878040943446, 'lcs': 0.6981519507186859} |
|
INFO: 2024-10-16 01:09:43,101: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:09:43,101: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:09:46,551: llmtf.base.darumeru/cp_sent_en: Loading Dataset: 3.45s |
|
INFO: 2024-10-16 01:12:13,957: llmtf.base.darumeru/USE: Processing Dataset: 205.20s |
|
INFO: 2024-10-16 01:12:13,963: llmtf.base.darumeru/USE: Results for darumeru/USE: |
|
INFO: 2024-10-16 01:12:13,968: llmtf.base.darumeru/USE: {'grade_norm': 0.049999999999999996} |
|
INFO: 2024-10-16 01:12:13,975: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:12:13,975: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:12:35,082: llmtf.base.russiannlp/rucola_custom: Loading Dataset: 21.11s |
|
INFO: 2024-10-16 01:14:03,862: llmtf.base.russiannlp/rucola_custom: Processing Dataset: 88.78s |
|
INFO: 2024-10-16 01:14:03,866: llmtf.base.russiannlp/rucola_custom: Results for russiannlp/rucola_custom: |
|
INFO: 2024-10-16 01:14:03,877: llmtf.base.russiannlp/rucola_custom: {'acc': 0.4628632938643703, 'mcc': 0.14354674065192544} |
|
INFO: 2024-10-16 01:14:03,888: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-16 01:14:03,934: llmtf.base.evaluator: |
|
mean daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/USE darumeru/cp_sent_ru darumeru/ruOpenBookQA darumeru/ruWorldTree russiannlp/rucola_custom |
|
0.472 0.393 0.434 0.420 0.489 0.431 0.050 0.942 0.540 0.722 0.303 |
|
INFO: 2024-10-16 01:14:37,979: llmtf.base.darumeru/ruMMLU: Processing Dataset: 616.99s |
|
INFO: 2024-10-16 01:14:37,996: llmtf.base.darumeru/ruMMLU: Results for darumeru/ruMMLU: |
|
INFO: 2024-10-16 01:14:38,020: llmtf.base.darumeru/ruMMLU: {'acc': 0.37793075925371644} |
|
INFO: 2024-10-16 01:14:38,092: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-16 01:14:38,102: llmtf.base.evaluator: |
|
mean daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/USE darumeru/cp_sent_ru darumeru/ruMMLU darumeru/ruOpenBookQA darumeru/ruWorldTree russiannlp/rucola_custom |
|
0.464 0.393 0.434 0.420 0.489 0.431 0.050 0.942 0.378 0.540 0.722 0.303 |
|
INFO: 2024-10-16 01:16:29,537: llmtf.base.darumeru/cp_sent_en: Processing Dataset: 402.97s |
|
INFO: 2024-10-16 01:16:29,540: llmtf.base.darumeru/cp_sent_en: Results for darumeru/cp_sent_en: |
|
INFO: 2024-10-16 01:16:29,544: llmtf.base.darumeru/cp_sent_en: {'symbol_per_token': 4.420697143481598, 'len': 0.9613784781803254, 'lcs': 1.0} |
|
INFO: 2024-10-16 01:16:29,547: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:16:29,547: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:16:32,809: llmtf.base.darumeru/cp_para_ru: Loading Dataset: 3.26s |
|
INFO: 2024-10-16 01:17:34,153: llmtf.base.nlpcoreteam/enMMLU: Processing Dataset: 658.53s |
|
INFO: 2024-10-16 01:17:34,155: llmtf.base.nlpcoreteam/enMMLU: Results for nlpcoreteam/enMMLU: |
|
INFO: 2024-10-16 01:17:34,201: llmtf.base.nlpcoreteam/enMMLU: metric |
|
subject |
|
abstract_algebra 0.340000 |
|
anatomy 0.592593 |
|
astronomy 0.684211 |
|
business_ethics 0.690000 |
|
clinical_knowledge 0.720755 |
|
college_biology 0.722222 |
|
college_chemistry 0.420000 |
|
college_computer_science 0.560000 |
|
college_mathematics 0.460000 |
|
college_medicine 0.682081 |
|
college_physics 0.519608 |
|
computer_security 0.800000 |
|
conceptual_physics 0.625532 |
|
econometrics 0.473684 |
|
electrical_engineering 0.682759 |
|
elementary_mathematics 0.595238 |
|
formal_logic 0.365079 |
|
global_facts 0.340000 |
|
high_school_biology 0.800000 |
|
high_school_chemistry 0.571429 |
|
high_school_computer_science 0.760000 |
|
high_school_european_history 0.775758 |
|
high_school_geography 0.797980 |
|
high_school_government_and_politics 0.860104 |
|
high_school_macroeconomics 0.702564 |
|
high_school_mathematics 0.481481 |
|
high_school_microeconomics 0.810924 |
|
high_school_physics 0.430464 |
|
high_school_psychology 0.840367 |
|
high_school_statistics 0.625000 |
|
high_school_us_history 0.799020 |
|
high_school_world_history 0.827004 |
|
human_aging 0.672646 |
|
human_sexuality 0.763359 |
|
international_law 0.760331 |
|
jurisprudence 0.814815 |
|
logical_fallacies 0.760736 |
|
machine_learning 0.482143 |
|
management 0.825243 |
|
marketing 0.897436 |
|
medical_genetics 0.760000 |
|
miscellaneous 0.777778 |
|
moral_disputes 0.661850 |
|
moral_scenarios 0.293855 |
|
nutrition 0.751634 |
|
philosophy 0.697749 |
|
prehistory 0.700617 |
|
professional_accounting 0.496454 |
|
professional_law 0.453064 |
|
professional_medicine 0.636029 |
|
professional_psychology 0.660131 |
|
public_relations 0.736364 |
|
security_studies 0.734694 |
|
sociology 0.810945 |
|
us_foreign_policy 0.870000 |
|
virology 0.530120 |
|
world_religions 0.801170 |
|
INFO: 2024-10-16 01:17:34,209: llmtf.base.nlpcoreteam/enMMLU: metric |
|
subject |
|
STEM 0.586671 |
|
humanities 0.670081 |
|
other (business, health, misc.) 0.669483 |
|
social sciences 0.755093 |
|
INFO: 2024-10-16 01:17:34,217: llmtf.base.nlpcoreteam/enMMLU: {'acc': 0.6703320839095225} |
|
INFO: 2024-10-16 01:17:34,284: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-16 01:17:34,298: llmtf.base.evaluator: |
|
mean daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/USE darumeru/cp_sent_en darumeru/cp_sent_ru darumeru/ruMMLU darumeru/ruOpenBookQA darumeru/ruWorldTree nlpcoreteam/enMMLU russiannlp/rucola_custom |
|
0.518 0.393 0.434 0.420 0.489 0.431 0.050 0.961 0.942 0.378 0.540 0.722 0.670 0.303 |
|
INFO: 2024-10-16 01:20:37,498: llmtf.base.nlpcoreteam/ruMMLU: Processing Dataset: 786.13s |
|
INFO: 2024-10-16 01:20:37,501: llmtf.base.nlpcoreteam/ruMMLU: Results for nlpcoreteam/ruMMLU: |
|
INFO: 2024-10-16 01:20:37,548: llmtf.base.nlpcoreteam/ruMMLU: metric |
|
subject |
|
abstract_algebra 0.280000 |
|
anatomy 0.259259 |
|
astronomy 0.394737 |
|
business_ethics 0.520000 |
|
clinical_knowledge 0.392453 |
|
college_biology 0.375000 |
|
college_chemistry 0.350000 |
|
college_computer_science 0.460000 |
|
college_mathematics 0.390000 |
|
college_medicine 0.358382 |
|
college_physics 0.411765 |
|
computer_security 0.580000 |
|
conceptual_physics 0.446809 |
|
econometrics 0.307018 |
|
electrical_engineering 0.427586 |
|
elementary_mathematics 0.515873 |
|
formal_logic 0.357143 |
|
global_facts 0.260000 |
|
high_school_biology 0.490323 |
|
high_school_chemistry 0.325123 |
|
high_school_computer_science 0.620000 |
|
high_school_european_history 0.496970 |
|
high_school_geography 0.545455 |
|
high_school_government_and_politics 0.476684 |
|
high_school_macroeconomics 0.471795 |
|
high_school_mathematics 0.362963 |
|
high_school_microeconomics 0.487395 |
|
high_school_physics 0.344371 |
|
high_school_psychology 0.467890 |
|
high_school_statistics 0.453704 |
|
high_school_us_history 0.455882 |
|
high_school_world_history 0.556962 |
|
human_aging 0.443946 |
|
human_sexuality 0.541985 |
|
international_law 0.619835 |
|
jurisprudence 0.509259 |
|
logical_fallacies 0.447853 |
|
machine_learning 0.401786 |
|
management 0.466019 |
|
marketing 0.675214 |
|
medical_genetics 0.510000 |
|
miscellaneous 0.413793 |
|
moral_disputes 0.468208 |
|
moral_scenarios 0.237989 |
|
nutrition 0.539216 |
|
philosophy 0.501608 |
|
prehistory 0.425926 |
|
professional_accounting 0.333333 |
|
professional_law 0.344198 |
|
professional_medicine 0.345588 |
|
professional_psychology 0.444444 |
|
public_relations 0.454545 |
|
security_studies 0.534694 |
|
sociology 0.601990 |
|
us_foreign_policy 0.610000 |
|
virology 0.463855 |
|
world_religions 0.438596 |
|
INFO: 2024-10-16 01:20:37,555: llmtf.base.nlpcoreteam/ruMMLU: metric |
|
subject |
|
STEM 0.423891 |
|
humanities 0.450802 |
|
other (business, health, misc.) 0.427218 |
|
social sciences 0.495325 |
|
INFO: 2024-10-16 01:20:37,563: llmtf.base.nlpcoreteam/ruMMLU: {'acc': 0.4493090597333154} |
|
INFO: 2024-10-16 01:20:37,641: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-16 01:20:37,655: llmtf.base.evaluator: |
|
mean daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/USE darumeru/cp_sent_en darumeru/cp_sent_ru darumeru/ruMMLU darumeru/ruOpenBookQA darumeru/ruWorldTree nlpcoreteam/enMMLU nlpcoreteam/ruMMLU russiannlp/rucola_custom |
|
0.513 0.393 0.434 0.420 0.489 0.431 0.050 0.961 0.942 0.378 0.540 0.722 0.670 0.449 0.303 |
|
INFO: 2024-10-16 01:30:09,544: llmtf.base.darumeru/cp_para_ru: Processing Dataset: 816.73s |
|
INFO: 2024-10-16 01:30:09,547: llmtf.base.darumeru/cp_para_ru: Results for darumeru/cp_para_ru: |
|
INFO: 2024-10-16 01:30:09,581: llmtf.base.darumeru/cp_para_ru: {'symbol_per_token': 3.828156059562894, 'len': 0.9521096414122393, 'lcs': 0.37} |
|
INFO: 2024-10-16 01:30:09,583: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-16 01:30:09,583: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-16 01:30:12,888: llmtf.base.darumeru/cp_para_en: Loading Dataset: 3.30s |
|
INFO: 2024-10-16 01:43:00,225: llmtf.base.daru/treewayabstractive: Processing Dataset: 2385.55s |
|
INFO: 2024-10-16 01:43:00,233: llmtf.base.daru/treewayabstractive: Results for daru/treewayabstractive: |
|
INFO: 2024-10-16 01:43:00,251: llmtf.base.daru/treewayabstractive: {'rouge1': 0.2792817830105651, 'rouge2': 0.09909942829468928} |
|
INFO: 2024-10-16 01:43:00,256: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-16 01:43:00,267: llmtf.base.evaluator: |
|
mean daru/treewayabstractive daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/USE darumeru/cp_para_ru darumeru/cp_sent_en darumeru/cp_sent_ru darumeru/ruMMLU darumeru/ruOpenBookQA darumeru/ruWorldTree nlpcoreteam/enMMLU nlpcoreteam/ruMMLU russiannlp/rucola_custom |
|
0.484 0.189 0.393 0.434 0.420 0.489 0.431 0.050 0.370 0.961 0.942 0.378 0.540 0.722 0.670 0.449 0.303 |
|
INFO: 2024-10-16 01:44:44,037: llmtf.base.darumeru/cp_para_en: Processing Dataset: 871.15s |
|
INFO: 2024-10-16 01:44:44,040: llmtf.base.darumeru/cp_para_en: Results for darumeru/cp_para_en: |
|
INFO: 2024-10-16 01:44:44,044: llmtf.base.darumeru/cp_para_en: {'symbol_per_token': 4.439886096406914, 'len': 0.9905967745509706, 'lcs': 1.0} |
|
INFO: 2024-10-16 01:44:44,045: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-16 01:44:44,054: llmtf.base.evaluator: |
|
mean daru/treewayabstractive daru/treewayextractive darumeru/MultiQ darumeru/PARus darumeru/RCB darumeru/RWSD darumeru/USE darumeru/cp_para_en darumeru/cp_para_ru darumeru/cp_sent_en darumeru/cp_sent_ru darumeru/ruMMLU darumeru/ruOpenBookQA darumeru/ruWorldTree nlpcoreteam/enMMLU nlpcoreteam/ruMMLU russiannlp/rucola_custom |
|
0.514 0.189 0.393 0.434 0.420 0.489 0.431 0.050 1.000 0.370 0.961 0.942 0.378 0.540 0.722 0.670 0.449 0.303 |
|
|