|
INFO: 2024-10-15 23:27:34,633: llmtf.base.evaluator: Starting eval on ['darumeru/multiq', 'darumeru/parus', 'darumeru/rcb', 'darumeru/ruopenbookqa', 'darumeru/ruworldtree', 'darumeru/rwsd', 'darumeru/use', 'russiannlp/rucola_custom'] |
|
INFO: 2024-10-15 23:27:34,634: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:27:34,634: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:27:35,580: llmtf.base.evaluator: Starting eval on ['darumeru/rummlu'] |
|
INFO: 2024-10-15 23:27:35,581: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:27:35,581: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:27:37,434: llmtf.base.evaluator: Starting eval on ['nlpcoreteam/rummlu'] |
|
INFO: 2024-10-15 23:27:37,435: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:27:37,435: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:27:39,620: llmtf.base.evaluator: Starting eval on ['nlpcoreteam/enmmlu'] |
|
INFO: 2024-10-15 23:27:39,621: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:27:39,621: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:27:41,015: llmtf.base.evaluator: Starting eval on ['daru/treewayabstractive'] |
|
INFO: 2024-10-15 23:27:41,022: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:27:41,022: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:27:43,196: llmtf.base.evaluator: Starting eval on ['daru/treewayextractive'] |
|
INFO: 2024-10-15 23:27:43,196: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:27:43,196: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:27:45,192: llmtf.base.evaluator: Starting eval on ['darumeru/cp_sent_ru', 'darumeru/cp_sent_en', 'darumeru/cp_para_ru', 'darumeru/cp_para_en'] |
|
INFO: 2024-10-15 23:27:45,193: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:27:45,193: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:27:49,559: llmtf.base.darumeru/cp_sent_ru: Loading Dataset: 4.37s |
|
INFO: 2024-10-15 23:27:56,068: llmtf.base.daru/treewayextractive: Loading Dataset: 12.87s |
|
INFO: 2024-10-15 23:27:57,445: llmtf.base.darumeru/MultiQ: Loading Dataset: 22.81s |
|
INFO: 2024-10-15 23:27:59,762: llmtf.base.daru/treewayabstractive: Loading Dataset: 18.74s |
|
INFO: 2024-10-15 23:29:07,300: llmtf.base.darumeru/ruMMLU: Loading Dataset: 91.72s |
|
INFO: 2024-10-15 23:30:55,098: llmtf.base.daru/treewayextractive: Processing Dataset: 179.02s |
|
INFO: 2024-10-15 23:30:55,100: llmtf.base.daru/treewayextractive: Results for daru/treewayextractive: |
|
INFO: 2024-10-15 23:30:55,311: llmtf.base.daru/treewayextractive: {'r-prec': 0.35143051948051945} |
|
INFO: 2024-10-15 23:30:55,347: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-15 23:30:55,351: llmtf.base.evaluator: |
|
mean daru/treewayextractive |
|
0.351 0.351 |
|
INFO: 2024-10-15 23:31:07,598: llmtf.base.darumeru/MultiQ: Processing Dataset: 190.15s |
|
INFO: 2024-10-15 23:31:07,614: llmtf.base.darumeru/MultiQ: Results for darumeru/MultiQ: |
|
INFO: 2024-10-15 23:31:07,618: llmtf.base.darumeru/MultiQ: {'f1': 0.4149813636918578, 'em': 0.2762906309751434} |
|
INFO: 2024-10-15 23:31:07,628: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:31:07,628: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:31:10,999: llmtf.base.darumeru/PARus: Loading Dataset: 3.37s |
|
INFO: 2024-10-15 23:31:13,775: llmtf.base.nlpcoreteam/enMMLU: Loading Dataset: 214.15s |
|
INFO: 2024-10-15 23:31:17,369: llmtf.base.darumeru/PARus: Processing Dataset: 6.37s |
|
INFO: 2024-10-15 23:31:17,372: llmtf.base.darumeru/PARus: Results for darumeru/PARus: |
|
INFO: 2024-10-15 23:31:17,384: llmtf.base.darumeru/PARus: {'acc': 0.32} |
|
INFO: 2024-10-15 23:31:17,386: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:31:17,386: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:31:21,212: llmtf.base.darumeru/RCB: Loading Dataset: 3.82s |
|
INFO: 2024-10-15 23:31:32,123: llmtf.base.darumeru/RCB: Processing Dataset: 10.91s |
|
INFO: 2024-10-15 23:31:32,141: llmtf.base.darumeru/RCB: Results for darumeru/RCB: |
|
INFO: 2024-10-15 23:31:32,148: llmtf.base.darumeru/RCB: {'acc': 0.42727272727272725, 'f1_macro': 0.4213976946667694} |
|
INFO: 2024-10-15 23:31:32,150: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:31:32,151: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:31:47,498: llmtf.base.darumeru/ruOpenBookQA: Loading Dataset: 15.35s |
|
INFO: 2024-10-15 23:32:07,223: llmtf.base.nlpcoreteam/ruMMLU: Loading Dataset: 269.79s |
|
INFO: 2024-10-15 23:32:59,383: llmtf.base.darumeru/ruOpenBookQA: Processing Dataset: 71.88s |
|
INFO: 2024-10-15 23:32:59,384: llmtf.base.darumeru/ruOpenBookQA: Results for darumeru/ruOpenBookQA: |
|
INFO: 2024-10-15 23:32:59,398: llmtf.base.darumeru/ruOpenBookQA: {'acc': 0.49312714776632305, 'f1_macro': 0.48517498211245624} |
|
INFO: 2024-10-15 23:32:59,414: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:32:59,415: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:33:02,154: llmtf.base.darumeru/ruWorldTree: Loading Dataset: 2.74s |
|
INFO: 2024-10-15 23:33:06,332: llmtf.base.darumeru/ruWorldTree: Processing Dataset: 4.18s |
|
INFO: 2024-10-15 23:33:06,333: llmtf.base.darumeru/ruWorldTree: Results for darumeru/ruWorldTree: |
|
INFO: 2024-10-15 23:33:06,338: llmtf.base.darumeru/ruWorldTree: {'acc': 0.5619047619047619, 'f1_macro': 0.5387530387530388} |
|
INFO: 2024-10-15 23:33:06,340: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:33:06,340: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:33:10,149: llmtf.base.darumeru/RWSD: Loading Dataset: 3.81s |
|
INFO: 2024-10-15 23:33:19,994: llmtf.base.darumeru/RWSD: Processing Dataset: 9.84s |
|
INFO: 2024-10-15 23:33:19,996: llmtf.base.darumeru/RWSD: Results for darumeru/RWSD: |
|
INFO: 2024-10-15 23:33:20,000: llmtf.base.darumeru/RWSD: {'acc': 0.44607843137254904} |
|
INFO: 2024-10-15 23:33:20,002: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:33:20,002: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:33:36,147: llmtf.base.darumeru/USE: Loading Dataset: 16.14s |
|
INFO: 2024-10-15 23:34:28,409: llmtf.base.darumeru/cp_sent_ru: Processing Dataset: 398.85s |
|
INFO: 2024-10-15 23:34:28,413: llmtf.base.darumeru/cp_sent_ru: Results for darumeru/cp_sent_ru: |
|
INFO: 2024-10-15 23:34:28,417: llmtf.base.darumeru/cp_sent_ru: {'symbol_per_token': 3.552729392451623, 'len': 0.912449166502956, 'lcs': 0.4414784394250513} |
|
INFO: 2024-10-15 23:34:28,421: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:34:28,421: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:34:31,828: llmtf.base.darumeru/cp_sent_en: Loading Dataset: 3.41s |
|
INFO: 2024-10-15 23:36:47,199: llmtf.base.darumeru/USE: Processing Dataset: 191.05s |
|
INFO: 2024-10-15 23:36:47,203: llmtf.base.darumeru/USE: Results for darumeru/USE: |
|
INFO: 2024-10-15 23:36:47,225: llmtf.base.darumeru/USE: {'grade_norm': 0.04607843137254901} |
|
INFO: 2024-10-15 23:36:47,231: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:36:47,232: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:37:08,589: llmtf.base.russiannlp/rucola_custom: Loading Dataset: 21.36s |
|
INFO: 2024-10-15 23:38:37,411: llmtf.base.russiannlp/rucola_custom: Processing Dataset: 88.82s |
|
INFO: 2024-10-15 23:38:37,415: llmtf.base.russiannlp/rucola_custom: Results for russiannlp/rucola_custom: |
|
INFO: 2024-10-15 23:38:37,428: llmtf.base.russiannlp/rucola_custom: {'acc': 0.5601004664513815, 'mcc': 0.09897569327848366} |
|
INFO: 2024-10-15 23:38:37,439: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-15 23:38:37,451: llmtf.base.evaluator: |
|
mean 0.422 |
|
daru/treewayextractive 0.351 |
|
darumeru/MultiQ 0.346 |
|
darumeru/PARus 0.320 |
|
darumeru/RCB 0.424 |
|
darumeru/RWSD 0.446 |
|
darumeru/USE 0.046 |
|
darumeru/cp_sent_ru 0.912 |
|
darumeru/ruOpenBookQA 0.489 |
|
darumeru/ruWorldTree 0.550 |
|
russiannlp/rucola_custom 0.330 |
|
INFO: 2024-10-15 23:39:22,591: llmtf.base.darumeru/ruMMLU: Processing Dataset: 615.29s |
|
INFO: 2024-10-15 23:39:22,593: llmtf.base.darumeru/ruMMLU: Results for darumeru/ruMMLU: |
|
INFO: 2024-10-15 23:39:22,633: llmtf.base.darumeru/ruMMLU: {'acc': 0.37663374239249725} |
|
INFO: 2024-10-15 23:39:22,704: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-15 23:39:22,714: llmtf.base.evaluator: |
|
mean 0.417 |
|
daru/treewayextractive 0.351 |
|
darumeru/MultiQ 0.346 |
|
darumeru/PARus 0.320 |
|
darumeru/RCB 0.424 |
|
darumeru/RWSD 0.446 |
|
darumeru/USE 0.046 |
|
darumeru/cp_sent_ru 0.912 |
|
darumeru/ruMMLU 0.377 |
|
darumeru/ruOpenBookQA 0.489 |
|
darumeru/ruWorldTree 0.550 |
|
russiannlp/rucola_custom 0.330 |
|
INFO: 2024-10-15 23:41:14,360: llmtf.base.darumeru/cp_sent_en: Processing Dataset: 402.53s |
|
INFO: 2024-10-15 23:41:14,363: llmtf.base.darumeru/cp_sent_en: Results for darumeru/cp_sent_en: |
|
INFO: 2024-10-15 23:41:14,368: llmtf.base.darumeru/cp_sent_en: {'symbol_per_token': 4.420697143481598, 'len': 0.9613784781803254, 'lcs': 1.0} |
|
INFO: 2024-10-15 23:41:14,371: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:41:14,371: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:41:17,993: llmtf.base.darumeru/cp_para_ru: Loading Dataset: 3.62s |
|
INFO: 2024-10-15 23:42:06,906: llmtf.base.nlpcoreteam/enMMLU: Processing Dataset: 653.13s |
|
INFO: 2024-10-15 23:42:06,908: llmtf.base.nlpcoreteam/enMMLU: Results for nlpcoreteam/enMMLU: |
|
INFO: 2024-10-15 23:42:06,954: llmtf.base.nlpcoreteam/enMMLU: metric |
|
subject |
|
abstract_algebra 0.400000 |
|
anatomy 0.651852 |
|
astronomy 0.677632 |
|
business_ethics 0.670000 |
|
clinical_knowledge 0.735849 |
|
college_biology 0.729167 |
|
college_chemistry 0.440000 |
|
college_computer_science 0.550000 |
|
college_mathematics 0.470000 |
|
college_medicine 0.670520 |
|
college_physics 0.441176 |
|
computer_security 0.790000 |
|
conceptual_physics 0.604255 |
|
econometrics 0.543860 |
|
electrical_engineering 0.648276 |
|
elementary_mathematics 0.542328 |
|
formal_logic 0.365079 |
|
global_facts 0.330000 |
|
high_school_biology 0.806452 |
|
high_school_chemistry 0.566502 |
|
high_school_computer_science 0.780000 |
|
high_school_european_history 0.787879 |
|
high_school_geography 0.797980 |
|
high_school_government_and_politics 0.849741 |
|
high_school_macroeconomics 0.720513 |
|
high_school_mathematics 0.488889 |
|
high_school_microeconomics 0.743697 |
|
high_school_physics 0.443709 |
|
high_school_psychology 0.851376 |
|
high_school_statistics 0.587963 |
|
high_school_us_history 0.808824 |
|
high_school_world_history 0.822785 |
|
human_aging 0.726457 |
|
human_sexuality 0.778626 |
|
international_law 0.752066 |
|
jurisprudence 0.824074 |
|
logical_fallacies 0.742331 |
|
machine_learning 0.508929 |
|
management 0.834951 |
|
marketing 0.888889 |
|
medical_genetics 0.740000 |
|
miscellaneous 0.802043 |
|
moral_disputes 0.676301 |
|
moral_scenarios 0.288268 |
|
nutrition 0.754902 |
|
philosophy 0.707395 |
|
prehistory 0.753086 |
|
professional_accounting 0.510638 |
|
professional_law 0.479791 |
|
professional_medicine 0.610294 |
|
professional_psychology 0.710784 |
|
public_relations 0.709091 |
|
security_studies 0.734694 |
|
sociology 0.830846 |
|
us_foreign_policy 0.840000 |
|
virology 0.518072 |
|
world_religions 0.812865 |
|
INFO: 2024-10-15 23:42:06,961: llmtf.base.nlpcoreteam/enMMLU: metric |
|
subject |
|
STEM 0.581960 |
|
humanities 0.678519 |
|
other (business, health, misc.) 0.674605 |
|
social sciences 0.759267 |
|
INFO: 2024-10-15 23:42:06,984: llmtf.base.nlpcoreteam/enMMLU: {'acc': 0.6735877381763914} |
|
INFO: 2024-10-15 23:42:07,051: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-15 23:42:07,099: llmtf.base.evaluator: |
|
mean 0.479 |
|
daru/treewayextractive 0.351 |
|
darumeru/MultiQ 0.346 |
|
darumeru/PARus 0.320 |
|
darumeru/RCB 0.424 |
|
darumeru/RWSD 0.446 |
|
darumeru/USE 0.046 |
|
darumeru/cp_sent_en 0.961 |
|
darumeru/cp_sent_ru 0.912 |
|
darumeru/ruMMLU 0.377 |
|
darumeru/ruOpenBookQA 0.489 |
|
darumeru/ruWorldTree 0.550 |
|
nlpcoreteam/enMMLU 0.674 |
|
russiannlp/rucola_custom 0.330 |
|
INFO: 2024-10-15 23:45:18,092: llmtf.base.nlpcoreteam/ruMMLU: Processing Dataset: 790.87s |
|
INFO: 2024-10-15 23:45:18,095: llmtf.base.nlpcoreteam/ruMMLU: Results for nlpcoreteam/ruMMLU: |
|
INFO: 2024-10-15 23:45:18,142: llmtf.base.nlpcoreteam/ruMMLU: metric |
|
subject |
|
abstract_algebra 0.360000 |
|
anatomy 0.333333 |
|
astronomy 0.388158 |
|
business_ethics 0.420000 |
|
clinical_knowledge 0.411321 |
|
college_biology 0.298611 |
|
college_chemistry 0.350000 |
|
college_computer_science 0.400000 |
|
college_mathematics 0.350000 |
|
college_medicine 0.398844 |
|
college_physics 0.362745 |
|
computer_security 0.560000 |
|
conceptual_physics 0.425532 |
|
econometrics 0.350877 |
|
electrical_engineering 0.475862 |
|
elementary_mathematics 0.529101 |
|
formal_logic 0.333333 |
|
global_facts 0.250000 |
|
high_school_biology 0.435484 |
|
high_school_chemistry 0.349754 |
|
high_school_computer_science 0.600000 |
|
high_school_european_history 0.448485 |
|
high_school_geography 0.449495 |
|
high_school_government_and_politics 0.404145 |
|
high_school_macroeconomics 0.412821 |
|
high_school_mathematics 0.414815 |
|
high_school_microeconomics 0.449580 |
|
high_school_physics 0.337748 |
|
high_school_psychology 0.381651 |
|
high_school_statistics 0.398148 |
|
high_school_us_history 0.441176 |
|
high_school_world_history 0.468354 |
|
human_aging 0.426009 |
|
human_sexuality 0.442748 |
|
international_law 0.628099 |
|
jurisprudence 0.481481 |
|
logical_fallacies 0.411043 |
|
machine_learning 0.428571 |
|
management 0.398058 |
|
marketing 0.628205 |
|
medical_genetics 0.530000 |
|
miscellaneous 0.381865 |
|
moral_disputes 0.485549 |
|
moral_scenarios 0.244693 |
|
nutrition 0.529412 |
|
philosophy 0.466238 |
|
prehistory 0.410494 |
|
professional_accounting 0.343972 |
|
professional_law 0.344850 |
|
professional_medicine 0.312500 |
|
professional_psychology 0.406863 |
|
public_relations 0.418182 |
|
security_studies 0.502041 |
|
sociology 0.572139 |
|
us_foreign_policy 0.620000 |
|
virology 0.403614 |
|
world_religions 0.415205 |
|
INFO: 2024-10-15 23:45:18,149: llmtf.base.nlpcoreteam/ruMMLU: metric |
|
subject |
|
STEM 0.414696 |
|
humanities 0.429154 |
|
other (business, health, misc.) 0.411938 |
|
social sciences 0.450878 |
|
INFO: 2024-10-15 23:45:18,174: llmtf.base.nlpcoreteam/ruMMLU: {'acc': 0.4266666289558874} |
|
INFO: 2024-10-15 23:45:18,249: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-15 23:45:18,263: llmtf.base.evaluator: |
|
mean 0.475 |
|
daru/treewayextractive 0.351 |
|
darumeru/MultiQ 0.346 |
|
darumeru/PARus 0.320 |
|
darumeru/RCB 0.424 |
|
darumeru/RWSD 0.446 |
|
darumeru/USE 0.046 |
|
darumeru/cp_sent_en 0.961 |
|
darumeru/cp_sent_ru 0.912 |
|
darumeru/ruMMLU 0.377 |
|
darumeru/ruOpenBookQA 0.489 |
|
darumeru/ruWorldTree 0.550 |
|
nlpcoreteam/enMMLU 0.674 |
|
nlpcoreteam/ruMMLU 0.427 |
|
russiannlp/rucola_custom 0.330 |
|
INFO: 2024-10-15 23:55:25,025: llmtf.base.darumeru/cp_para_ru: Processing Dataset: 847.01s |
|
INFO: 2024-10-15 23:55:25,029: llmtf.base.darumeru/cp_para_ru: Results for darumeru/cp_para_ru: |
|
INFO: 2024-10-15 23:55:25,033: llmtf.base.darumeru/cp_para_ru: {'symbol_per_token': 3.317211095060652, 'len': 0.7885274694294595, 'lcs': 0.11} |
|
INFO: 2024-10-15 23:55:25,034: llmtf.base.hfmodel: Updated generation_config.eos_token_id: [147075, 198, 271] |
|
INFO: 2024-10-15 23:55:25,034: llmtf.base.hfmodel: Updated generation_config.stop_strings: ['\n', '\n\n'] |
|
INFO: 2024-10-15 23:55:28,357: llmtf.base.darumeru/cp_para_en: Loading Dataset: 3.32s |
|
INFO: 2024-10-16 00:10:03,216: llmtf.base.darumeru/cp_para_en: Processing Dataset: 874.86s |
|
INFO: 2024-10-16 00:10:03,219: llmtf.base.darumeru/cp_para_en: Results for darumeru/cp_para_en: |
|
INFO: 2024-10-16 00:10:03,224: llmtf.base.darumeru/cp_para_en: {'symbol_per_token': 4.4363376335695355, 'len': 0.9918057024834991, 'lcs': 1.0} |
|
INFO: 2024-10-16 00:10:03,225: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-16 00:10:03,237: llmtf.base.evaluator: |
|
mean 0.485 |
|
daru/treewayextractive 0.351 |
|
darumeru/MultiQ 0.346 |
|
darumeru/PARus 0.320 |
|
darumeru/RCB 0.424 |
|
darumeru/RWSD 0.446 |
|
darumeru/USE 0.046 |
|
darumeru/cp_para_en 1.000 |
|
darumeru/cp_para_ru 0.110 |
|
darumeru/cp_sent_en 0.961 |
|
darumeru/cp_sent_ru 0.912 |
|
darumeru/ruMMLU 0.377 |
|
darumeru/ruOpenBookQA 0.489 |
|
darumeru/ruWorldTree 0.550 |
|
nlpcoreteam/enMMLU 0.674 |
|
nlpcoreteam/ruMMLU 0.427 |
|
russiannlp/rucola_custom 0.330 |
|
INFO: 2024-10-16 00:13:39,224: llmtf.base.daru/treewayabstractive: Processing Dataset: 2739.46s |
|
INFO: 2024-10-16 00:13:39,228: llmtf.base.daru/treewayabstractive: Results for daru/treewayabstractive: |
|
INFO: 2024-10-16 00:13:39,233: llmtf.base.daru/treewayabstractive: {'rouge1': 0.24324312209419716, 'rouge2': 0.07780431416219476} |
|
INFO: 2024-10-16 00:13:39,238: llmtf.base.evaluator: Ended eval |
|
INFO: 2024-10-16 00:13:39,261: llmtf.base.evaluator: |
|
mean 0.466 |
|
daru/treewayabstractive 0.161 |
|
daru/treewayextractive 0.351 |
|
darumeru/MultiQ 0.346 |
|
darumeru/PARus 0.320 |
|
darumeru/RCB 0.424 |
|
darumeru/RWSD 0.446 |
|
darumeru/USE 0.046 |
|
darumeru/cp_para_en 1.000 |
|
darumeru/cp_para_ru 0.110 |
|
darumeru/cp_sent_en 0.961 |
|
darumeru/cp_sent_ru 0.912 |
|
darumeru/ruMMLU 0.377 |
|
darumeru/ruOpenBookQA 0.489 |
|
darumeru/ruWorldTree 0.550 |
|
nlpcoreteam/enMMLU 0.674 |
|
nlpcoreteam/ruMMLU 0.427 |
|
russiannlp/rucola_custom 0.330 |
|
|