metadata

language:
  - en
license: mit
library_name: transformers
inference:
  parameters:
    max_new_tokens: 64
    do_sample: true
    temperature: 0.1
    repetition_penalty: 10
    no_repeat_ngram_size: 4
    eta_cutoff: 0.0006
    renormalize_logits: true
widget:
  - text: My name is El Microondas the Wise, and
    example_title: El Microondas
  - text: Kennesaw State University is a public
    example_title: Kennesaw State University
  - text: >-
      Bungie Studios is an American video game developer. They are most famous
      for developing the award winning Halo series of video games. They also
      made Destiny. The studio was founded
    example_title: Bungie
  - text: The Mona Lisa is a world-renowned painting created by
    example_title: Mona Lisa
  - text: >-
      The Harry Potter series, written by J.K. Rowling, begins with the book
      titled
    example_title: Harry Potter Series
  - text: >-
      Question: I have cities, but no houses. I have mountains, but no trees. I
      have water, but no fish. What am I?

      Answer:
    example_title: Riddle
  - text: The process of photosynthesis involves the conversion of
    example_title: Photosynthesis
  - text: >-
      Jane went to the store to buy some groceries. She picked up apples,
      oranges, and a loaf of bread. When she got home, she realized she forgot
    example_title: Story Continuation
  - text: >-
      Problem 2: If a train leaves Station A at 9:00 AM and travels at 60 mph,
      and another train leaves Station B at 10:00 AM and travels at 80 mph, when
      will they meet if the distance between the stations is 300 miles?

      To determine
    example_title: Math Problem
  - text: In the context of computer programming, an algorithm is
    example_title: Algorithm Definition
pipeline_tag: text-generation
model-index:
  - name: nano-phi-192M-v0.1
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
          args:
            num_few_shot: 25
        metrics:
          - type: acc_norm
            value: 24.15
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kenhktsui/nano-phi-115M-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HellaSwag (10-Shot)
          type: hellaswag
          split: validation
          args:
            num_few_shot: 10
        metrics:
          - type: acc_norm
            value: 29.99
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kenhktsui/nano-phi-115M-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 25.46
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kenhktsui/nano-phi-115M-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: TruthfulQA (0-shot)
          type: truthful_qa
          config: multiple_choice
          split: validation
          args:
            num_few_shot: 0
        metrics:
          - type: mc2
            value: 44.3
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kenhktsui/nano-phi-115M-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: Winogrande (5-shot)
          type: winogrande
          config: winogrande_xl
          split: validation
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 51.54
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kenhktsui/nano-phi-115M-v0.1
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 0
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=kenhktsui/nano-phi-115M-v0.1
          name: Open LLM Leaderboard
datasets:
  - kenhktsui/minipile_quality_score_v1
  - kenhktsui/simple_wikipedia_LM_quality_score_v1
  - kenhktsui/refinedweb-3m_quality_score_v1
  - kenhktsui/TM-DATA_quality_score_v1
  - kenhktsui/openwebtext_quality_score_v1
  - HuggingFaceTB/cosmopedia

Model Card for nano-phi-192M-v0.1

This model continues the work started with kenhktsui/nano-phi-115M-v0.1.
The model is not aligned.

Major differences from nano-phi-115M-v0.1:

- larger tokenizer vocabulary
- HuggingFaceTB/cosmopedia added as a training dataset
- training tokens: 19B vs 7B

How to use

To use the model, you need transformers version 4.37.2 or later:

pip install "transformers>=4.37.2"

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="kenhktsui/nano-phi-192M-v0.1")
# The pipeline returns a list with one dict per generated sequence
outputs = pipe("I am a machine learning researcher. I work on", max_new_tokens=50, repetition_penalty=10.0)
print(outputs[0]["generated_text"])
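The metadata of this card lists the sampling settings used by the hosted inference widget. The following is a minimal sketch of passing the same settings to the pipeline. The names `WIDGET_GENERATION_KWARGS` and `generate_with_widget_settings` are hypothetical and only serve this illustration; the values are copied from the `inference.parameters` block.

```python
# Sampling settings copied from the `inference.parameters` block of this card's metadata.
WIDGET_GENERATION_KWARGS = {
    "max_new_tokens": 64,
    "do_sample": True,
    "temperature": 0.1,
    "repetition_penalty": 10.0,
    "no_repeat_ngram_size": 4,
    "eta_cutoff": 0.0006,
    "renormalize_logits": True,
}


def generate_with_widget_settings(prompt, model_id="kenhktsui/nano-phi-192M-v0.1"):
    """Hypothetical helper: generate a continuation using the widget's sampling settings."""
    # Imported lazily so the settings above can be inspected without loading transformers.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=model_id)
    # Extra keyword arguments are forwarded to the model's generate() call.
    outputs = pipe(prompt, **WIDGET_GENERATION_KWARGS)
    return outputs[0]["generated_text"]
```

Calling `generate_with_widget_settings("The Mona Lisa is a world-renowned painting created by")` downloads the model on first use and returns the prompt followed by the sampled continuation. Because `do_sample` is enabled, repeated calls may return different text.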

Some metrics

model:
- hidden_size: 768
- num_key_value_heads: 8 (grouped query attention)
- num_attention_heads: 24
- num_hidden_layers: 6
- context length: 1024
- total params: 192M
training:
- global steps: 36,000
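The figures above imply a few derived quantities, worked out in the sketch below. It rests on two assumptions that this card does not state: that the per-head dimension equals hidden_size / num_attention_heads, and that the 19B training tokens mentioned earlier are spread evenly over the 36,000 global steps.

```python
# Values taken from the model and training lists in this card.
hidden_size = 768
num_attention_heads = 24
num_key_value_heads = 8
num_hidden_layers = 6
context_length = 1024
training_tokens = 19e9  # "19B" from the list of major differences
global_steps = 36_000

# Assumption: the per-head dimension is hidden_size / num_attention_heads.
head_dim = hidden_size // num_attention_heads

# Grouped query attention: several query heads share one key/value head.
queries_per_kv_head = num_attention_heads // num_key_value_heads

# Cached key and value entries per token, summed over all layers.
kv_cache_per_token_gqa = 2 * num_hidden_layers * num_key_value_heads * head_dim
kv_cache_per_token_mha = 2 * num_hidden_layers * num_attention_heads * head_dim

# Assumption: tokens are spread evenly over the global steps.
tokens_per_step = training_tokens / global_steps
sequences_per_step = tokens_per_step / context_length

print(f"head_dim: {head_dim}")
print(f"query heads per KV head: {queries_per_kv_head}")
print(f"KV cache entries per token: {kv_cache_per_token_gqa} (GQA) vs {kv_cache_per_token_mha} (full multi-head)")
print(f"approx. tokens per step: {tokens_per_step:,.0f}")
print(f"approx. full-length sequences per step: {sequences_per_step:.0f}")
```

Under these assumptions each key/value head serves 3 query heads, so the key/value cache is one third the size it would be with 24 key/value heads.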

Open LLM Leaderboard Evaluation Results

Metric	kenhktsui/nano-phi-192M-v0.1	kenhktsui/nano-phi-115M-v0.1	microsoft/phi-2 (Reproduced)
Avg.	29.24	28.68	61.53
ARC (25-shot)	24.15	21.93	61.52
HellaSwag (10-shot)	29.99	27.87	75.13
MMLU (5-shot)	25.46	25.30	58.23
TruthfulQA (0-shot)	44.30	46.01	44.46
Winogrande (5-shot)	51.54	50.99	74.51
GSM8K (5-shot)	0.0	0.0	55.34
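The Avg. row is consistent with a plain mean of the six benchmark scores in each column. The short check below recomputes it from the table values; treating the average as unweighted is an assumption that the numbers support but the card does not state.

```python
# Scores copied from the table above, in row order:
# ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K.
scores = {
    "kenhktsui/nano-phi-192M-v0.1": [24.15, 29.99, 25.46, 44.30, 51.54, 0.0],
    "kenhktsui/nano-phi-115M-v0.1": [21.93, 27.87, 25.30, 46.01, 50.99, 0.0],
    "microsoft/phi-2 (Reproduced)": [61.52, 75.13, 58.23, 44.46, 74.51, 55.34],
}

# Unweighted mean over the six benchmarks, rounded to two decimals like the table.
averages = {model: round(sum(values) / len(values), 2) for model, values in scores.items()}

for model, average in averages.items():
    print(f"{model}: {average:.2f}")
```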

Details:

hf-causal-experimental (pretrained=/content/lm-evaluation-harness/artifacts/model-9gh18vfl:v25,use_accelerate=false,trust_remote_code=True), limit: None, provide_description: False, num_fewshot: 0, batch_size: 8

Task	Version	Metric	Value		Stderr
arc_easy	0	acc	0.4596	±	0.0102
		acc_norm	0.4070	±	0.0101

Task	Version	Metric	Value		Stderr
arc_challenge	0	acc	0.1911	±	0.0115
		acc_norm	0.2415	±	0.0125
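The Stderr column can be reproduced from the accuracy and the number of evaluated questions. The sketch below uses the standard error of a mean of 0/1 scores with the sample (n - 1) variance. The split size of 1,172 questions for the ARC-Challenge test set is an assumption; it does not appear in this card.

```python
import math


def accuracy_stderr(accuracy, num_examples):
    """Standard error of a mean of 0/1 scores, using the sample (n - 1) variance."""
    return math.sqrt(accuracy * (1.0 - accuracy) / (num_examples - 1))


# Assumption: 1,172 questions in the ARC-Challenge test split (not stated in this card).
ARC_CHALLENGE_TEST_SIZE = 1172

acc_stderr = accuracy_stderr(0.1911, ARC_CHALLENGE_TEST_SIZE)
acc_norm_stderr = accuracy_stderr(0.2415, ARC_CHALLENGE_TEST_SIZE)

print(f"acc stderr: {acc_stderr:.4f}")
print(f"acc_norm stderr: {acc_norm_stderr:.4f}")
```

With that split size, the results round to 0.0115 and 0.0125, matching the arc_challenge rows above.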

Task	Version	Metric	Value		Stderr
hellaswag	0	acc	0.2833	±	0.0045
		acc_norm	0.2999	±	0.0046

Task	Version	Metric	Value		Stderr
truthfulqa_mc	1	mc1	0.2583	±	0.0153
		mc2	0.4430	±	0.0152

Task	Version	Metric	Value		Stderr
hendrycksTest-abstract_algebra	1	acc	0.2200	±	0.0416
		acc_norm	0.2200	±	0.0416
hendrycksTest-anatomy	1	acc	0.2593	±	0.0379
		acc_norm	0.2593	±	0.0379
hendrycksTest-astronomy	1	acc	0.1711	±	0.0306
		acc_norm	0.1711	±	0.0306
hendrycksTest-business_ethics	1	acc	0.2400	±	0.0429
		acc_norm	0.2400	±	0.0429
hendrycksTest-clinical_knowledge	1	acc	0.2566	±	0.0269
		acc_norm	0.2566	±	0.0269
hendrycksTest-college_biology	1	acc	0.2639	±	0.0369
		acc_norm	0.2639	±	0.0369
hendrycksTest-college_chemistry	1	acc	0.1800	±	0.0386
		acc_norm	0.1800	±	0.0386
hendrycksTest-college_computer_science	1	acc	0.3300	±	0.0473
		acc_norm	0.3300	±	0.0473
hendrycksTest-college_mathematics	1	acc	0.3000	±	0.0461
		acc_norm	0.3000	±	0.0461
hendrycksTest-college_medicine	1	acc	0.2023	±	0.0306
		acc_norm	0.2023	±	0.0306
hendrycksTest-college_physics	1	acc	0.2843	±	0.0449
		acc_norm	0.2843	±	0.0449
hendrycksTest-computer_security	1	acc	0.2200	±	0.0416
		acc_norm	0.2200	±	0.0416
hendrycksTest-conceptual_physics	1	acc	0.2511	±	0.0283
		acc_norm	0.2511	±	0.0283
hendrycksTest-econometrics	1	acc	0.2807	±	0.0423
		acc_norm	0.2807	±	0.0423
hendrycksTest-electrical_engineering	1	acc	0.2897	±	0.0378
		acc_norm	0.2897	±	0.0378
hendrycksTest-elementary_mathematics	1	acc	0.2804	±	0.0231
		acc_norm	0.2804	±	0.0231
hendrycksTest-formal_logic	1	acc	0.2143	±	0.0367
		acc_norm	0.2143	±	0.0367
hendrycksTest-global_facts	1	acc	0.1700	±	0.0378
		acc_norm	0.1700	±	0.0378
hendrycksTest-high_school_biology	1	acc	0.3226	±	0.0266
		acc_norm	0.3226	±	0.0266
hendrycksTest-high_school_chemistry	1	acc	0.2759	±	0.0314
		acc_norm	0.2759	±	0.0314
hendrycksTest-high_school_computer_science	1	acc	0.2700	±	0.0446
		acc_norm	0.2700	±	0.0446
hendrycksTest-high_school_european_history	1	acc	0.2606	±	0.0343
		acc_norm	0.2606	±	0.0343
hendrycksTest-high_school_geography	1	acc	0.3081	±	0.0329
		acc_norm	0.3081	±	0.0329
hendrycksTest-high_school_government_and_politics	1	acc	0.3627	±	0.0347
		acc_norm	0.3627	±	0.0347
hendrycksTest-high_school_macroeconomics	1	acc	0.2641	±	0.0224
		acc_norm	0.2641	±	0.0224
hendrycksTest-high_school_mathematics	1	acc	0.2630	±	0.0268
		acc_norm	0.2630	±	0.0268
hendrycksTest-high_school_microeconomics	1	acc	0.3403	±	0.0308
		acc_norm	0.3403	±	0.0308
hendrycksTest-high_school_physics	1	acc	0.3113	±	0.0378
		acc_norm	0.3113	±	0.0378
hendrycksTest-high_school_psychology	1	acc	0.2716	±	0.0191
		acc_norm	0.2716	±	0.0191
hendrycksTest-high_school_statistics	1	acc	0.4491	±	0.0339
		acc_norm	0.4491	±	0.0339
hendrycksTest-high_school_us_history	1	acc	0.2402	±	0.0300
		acc_norm	0.2402	±	0.0300
hendrycksTest-high_school_world_history	1	acc	0.2363	±	0.0277
		acc_norm	0.2363	±	0.0277
hendrycksTest-human_aging	1	acc	0.2197	±	0.0278
		acc_norm	0.2197	±	0.0278
hendrycksTest-human_sexuality	1	acc	0.2824	±	0.0395
		acc_norm	0.2824	±	0.0395
hendrycksTest-international_law	1	acc	0.2479	±	0.0394
		acc_norm	0.2479	±	0.0394
hendrycksTest-jurisprudence	1	acc	0.2037	±	0.0389
		acc_norm	0.2037	±	0.0389
hendrycksTest-logical_fallacies	1	acc	0.2393	±	0.0335
		acc_norm	0.2393	±	0.0335
hendrycksTest-machine_learning	1	acc	0.1875	±	0.0370
		acc_norm	0.1875	±	0.0370
hendrycksTest-management	1	acc	0.2039	±	0.0399
		acc_norm	0.2039	±	0.0399
hendrycksTest-marketing	1	acc	0.1795	±	0.0251
		acc_norm	0.1795	±	0.0251
hendrycksTest-medical_genetics	1	acc	0.3000	±	0.0461
		acc_norm	0.3000	±	0.0461
hendrycksTest-miscellaneous	1	acc	0.2644	±	0.0158
		acc_norm	0.2644	±	0.0158
hendrycksTest-moral_disputes	1	acc	0.2225	±	0.0224
		acc_norm	0.2225	±	0.0224
hendrycksTest-moral_scenarios	1	acc	0.2726	±	0.0149
		acc_norm	0.2726	±	0.0149
hendrycksTest-nutrition	1	acc	0.2353	±	0.0243
		acc_norm	0.2353	±	0.0243
hendrycksTest-philosophy	1	acc	0.2283	±	0.0238
		acc_norm	0.2283	±	0.0238
hendrycksTest-prehistory	1	acc	0.2099	±	0.0227
		acc_norm	0.2099	±	0.0227
hendrycksTest-professional_accounting	1	acc	0.2411	±	0.0255
		acc_norm	0.2411	±	0.0255
hendrycksTest-professional_law	1	acc	0.2458	±	0.0110
		acc_norm	0.2458	±	0.0110
hendrycksTest-professional_medicine	1	acc	0.3897	±	0.0296
		acc_norm	0.3897	±	0.0296
hendrycksTest-professional_psychology	1	acc	0.2141	±	0.0166
		acc_norm	0.2141	±	0.0166
hendrycksTest-public_relations	1	acc	0.1818	±	0.0369
		acc_norm	0.1818	±	0.0369
hendrycksTest-security_studies	1	acc	0.2490	±	0.0277
		acc_norm	0.2490	±	0.0277
hendrycksTest-sociology	1	acc	0.2537	±	0.0308
		acc_norm	0.2537	±	0.0308
hendrycksTest-us_foreign_policy	1	acc	0.2900	±	0.0456
		acc_norm	0.2900	±	0.0456
hendrycksTest-virology	1	acc	0.1807	±	0.0300
		acc_norm	0.1807	±	0.0300
hendrycksTest-world_religions	1	acc	0.1813	±	0.0295
		acc_norm	0.1813	±	0.0295
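The MMLU (5-shot) score of 25.46 in the leaderboard table is consistent with the unweighted mean of the 57 hendrycksTest accuracies above. The check below recomputes it; averaging subjects without weighting by question count is an assumption that fits the numbers but is not stated in this card.

```python
# Per-subject accuracies copied from the hendrycksTest rows above, in table order.
mmlu_accuracies = [
    0.2200, 0.2593, 0.1711, 0.2400, 0.2566, 0.2639, 0.1800, 0.3300,
    0.3000, 0.2023, 0.2843, 0.2200, 0.2511, 0.2807, 0.2897, 0.2804,
    0.2143, 0.1700, 0.3226, 0.2759, 0.2700, 0.2606, 0.3081, 0.3627,
    0.2641, 0.2630, 0.3403, 0.3113, 0.2716, 0.4491, 0.2402, 0.2363,
    0.2197, 0.2824, 0.2479, 0.2037, 0.2393, 0.1875, 0.2039, 0.1795,
    0.3000, 0.2644, 0.2225, 0.2726, 0.2353, 0.2283, 0.2099, 0.2411,
    0.2458, 0.3897, 0.2141, 0.1818, 0.2490, 0.2537, 0.2900, 0.1807,
    0.1813,
]

# Unweighted mean over subjects, expressed as a percentage.
num_subjects = len(mmlu_accuracies)
mmlu_score = round(100 * sum(mmlu_accuracies) / num_subjects, 2)

print(f"{num_subjects} subjects, mean accuracy {mmlu_score:.2f}%")
```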

Task	Version	Metric	Value		Stderr
winogrande	0	acc	0.5154	±	0.014

Task	Version	Metric	Value		Stderr
gsm8k	0	acc	0	±	0