Jacaranda committed
Commit 4bea6af
1 Parent(s): 355f59e

Update README.md

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -14,18 +14,18 @@ pipeline_tag: question-answering
 
 
  ## Model Details
- UlizaLlama7b-1 is a language model that builds upon the foundation of Jacaranda/kiswallama-pretrained7B. Jacaranda/kiswallama-pretrained is a large language model continually-pretrained with 321,530,045 swahili tokens and a customized tokenizer with a swahili vocabulary of 20,000 tokens to extend the capabilities of Meta/Llama2. It offers significant improvements in both encoding and decoding for Swahili text, surpassing the Swahili performance of Meta/Llama2. Moreover, Jacaranda/kiswallama-pretrained excels in providing accurate next-word completions in Swahili, a capability which Meta/Llama2 falls short of.
+ UlizaLlama7b-1 is a language model that builds upon the foundation of [Jacaranda/kiswallama-pretrained7B](https://huggingface.co/Jacaranda/kiswallama-pretrained). Jacaranda/kiswallama-pretrained is a large language model continually pretrained on 321,530,045 Swahili tokens with a customized tokenizer (a 20,000-token Swahili vocabulary) to extend the capabilities of [Meta/Llama2](https://huggingface.co/meta-llama/Llama-2-7b). It offers significant improvements in both encoding and decoding of Swahili text, surpassing the Swahili performance of Meta/Llama2, and it excels at accurate next-word completion in Swahili, where Meta/Llama2 falls short.
  ### Model Description
- Origin: Adaptation of the Jacaranda/kiswallama-pretrained model.
- Data: Instructional dataset in Swahili and English consisting of prompt-response pairs.
- Training: Alignment to standard methodologies, incorporation of task-centric heads, neural network weight optimization via backpropagation, and task-specific adjustments.
- Fine-tuning: Utilized the LoRA approach, refining two matrices that mirror the main matrix from Jacaranda/kiswallama-pretrained. This Low Rank Adapter (LoRa) was vital for instruction-focused fine-tuning. Post-training, the developed LoRa was extracted, and Hugging Face's merge and unload() function facilitated the amalgamation of adapter weights with the base model. This fusion enables standalone inference with the merged model
+ - Origin: Adaptation of the Jacaranda/kiswallama-pretrained model.
+ - Data: Instruction-tuning dataset of prompt-response pairs in Swahili and English.
+ - Training: Standard fine-tuning methodology, incorporating task-centric heads, neural-network weight optimization via backpropagation, and task-specific adjustments.
+ - Fine-tuning: Used the LoRA approach, training two low-rank matrices whose product approximates the update to each adapted weight matrix of Jacaranda/kiswallama-pretrained. This low-rank adapter (LoRA) carried the instruction-focused fine-tuning. After training, the adapter was extracted and merged into the base model with Hugging Face PEFT's merge_and_unload() function, so the merged model supports standalone inference (usage sketches follow the diff).
  <!-- Provide a longer summary of what this model is. -->
 
 
 
- - **Developed by:** [Jacaranda Health]
- - **Funded by [optional]:** [Google]
+ - **Developed by:** [Jacaranda Health](https://www.jacarandahealth.org/)
+ - **Funded by [optional]:** [Google AI For Social Good Grant]
  - **Model type:** [LlamaModelForCausalLm]
  - **Language(s) (NLP):** [English and Swahili]
  - **License:** [to include]
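
One concrete way to read the tokenizer claim in the Model Details paragraph is to count how many tokens the base Llama-2 tokenizer and the Swahili-extended tokenizer need for the same Swahili sentence. The sketch below assumes the extended tokenizer ships with the Jacaranda/kiswallama-pretrained repo and uses meta-llama/Llama-2-7b-hf (the Transformers-format variant of the Llama-2 repo linked above, gated behind Meta's license); neither assumption is stated in the commit itself.

```python
from transformers import AutoTokenizer

# "Maternal and child health is important for community development."
text = "Afya ya mama na mtoto ni muhimu kwa maendeleo ya jamii."

# Base Llama-2 tokenizer (gated repo; requires accepting Meta's license terms).
base_tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Swahili-extended tokenizer, assumed to be published with the pretrained base model.
swa_tok = AutoTokenizer.from_pretrained("Jacaranda/kiswallama-pretrained")

# A 20,000-token Swahili vocabulary should produce noticeably fewer tokens per
# sentence, i.e. shorter sequences and cheaper encoding/decoding for Swahili text.
print(len(base_tok.tokenize(text)), "tokens with the base Llama-2 tokenizer")
print(len(swa_tok.tokenize(text)), "tokens with the Swahili-extended tokenizer")
```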
 
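The Fine-tuning bullet describes extracting the instruction-tuned LoRA adapter and folding it into the base model with PEFT's merge_and_unload() so the merged checkpoint can serve inference on its own. A minimal sketch of that flow, assuming a hypothetical adapter repo id (Jacaranda/UlizaLlama-lora) and an illustrative prompt format, neither of which is specified by this card:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_ID = "Jacaranda/kiswallama-pretrained"  # continually pretrained base model (named in the card)
ADAPTER_ID = "Jacaranda/UlizaLlama-lora"     # hypothetical id for the instruction-tuned LoRA adapter

# Load the base model, then attach the trained LoRA adapter on top of it.
base = AutoModelForCausalLM.from_pretrained(BASE_ID)
peft_model = PeftModel.from_pretrained(base, ADAPTER_ID)

# merge_and_unload() folds the low-rank update matrices into the base weights and
# strips the adapter wrappers, leaving a plain causal LM that no longer needs PEFT.
merged = peft_model.merge_and_unload()
merged.save_pretrained("ulizallama-merged")

# Standalone inference with the merged model.
tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
# Illustrative Swahili prompt: "Question: Which foods are good for an expectant mother? Answer:"
prompt = "Swali: Je, ni vyakula gani vinavyofaa kwa mama mjamzito?\nJibu:"
inputs = tokenizer(prompt, return_tensors="pt")
output = merged.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Merging trades the flexibility of swapping adapters per task for a single artifact that loads like any other Llama checkpoint.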