---
language:
- sw
- en
metrics:
- perplexity
- bleu
library_name: peft
pipeline_tag: question-answering
---

# Model Card for UlizaLlama

## Model Details

UlizaLlama7b-1 is a language model that builds upon the foundation of Jacaranda/kiswallama-pretrained7B. Jacaranda/kiswallama-pretrained is a large language model continually pretrained on 321,530,045 Swahili tokens with a customized tokenizer holding a Swahili vocabulary of 20,000 tokens, extending the capabilities of Meta/Llama2. It offers significant improvements in both encoding and decoding of Swahili text, surpassing the Swahili performance of Meta/Llama2. Moreover, Jacaranda/kiswallama-pretrained excels at providing accurate next-word completions in Swahili, a capability in which Meta/Llama2 falls short.

### Model Description

- **Origin:** Adaptation of the Jacaranda/kiswallama-pretrained model.
- **Data:** Instructional dataset in Swahili and English consisting of prompt-response pairs.
- **Training:** Followed standard methodologies, incorporating task-centric heads, neural network weight optimization via backpropagation, and task-specific adjustments.
- **Fine-tuning:** Utilized the LoRA approach, training two low-rank matrices that together approximate updates to the main weight matrices of Jacaranda/kiswallama-pretrained. This Low-Rank Adapter (LoRA) was vital for instruction-focused fine-tuning. Post-training, the developed LoRA was extracted, and Hugging Face's `merge_and_unload()` function facilitated the amalgamation of the adapter weights with the base model. This fusion enables standalone inference with the merged model (a minimal sketch of this flow appears at the end of this card).

- **Developed by:** [Jacaranda Health]
- **Funded by [optional]:** [Google]
- **Model type:** [LlamaForCausalLM]
- **Language(s) (NLP):** [English and Swahili]
- **License:** [to include]
- **Model Developers:** [Stanslaus Mwongela, Jay Patel, Sathy Rajasekharan]
- **Finetuned from model:** [Jacaranda/kiswallama-pretrained model, which builds upon Meta/Llama2]

## Uses

UlizaLlama7b-1 is optimized for downstream tasks, notably those demanding instructional datasets in Swahili, English, or both. Organizations can further fine-tune it for their specific domains. Potential areas include:

- Question-answering within specific domains.
- Assistant-driven chat capabilities: healthcare, agriculture, legal, education, tourism and hospitality, public services, financial sectors, communication, customer assistance, commerce, etc.

Meanwhile, Jacaranda/kiswallama-pretrained offers versatility in:

- Text Summarization
- Autoregressive Text Completion
- Content Generation
- Text Rewording
- Grammar Refinement and Editing
- Further Research: the current UlizaLlama is available as a 7-billion-parameter model; further research could explore making larger variants of UlizaLlama available.

### Out-of-Scope Use

To ensure the ethical and responsible use of UlizaLlama, we have outlined a set of guidelines. These guidelines categorize activities and practices into three main areas: prohibited actions, high-risk activities, and deceptive practices. By understanding and adhering to these directives, users can contribute to a safer and more trustworthy environment.

## Bias, Risks, and Limitations

UlizaLlama7b-1 is a cutting-edge technology brimming with possibilities, yet it is not without inherent risks. The extensive testing conducted thus far has been predominantly in Swahili and English, leaving an expansive terrain of scenarios uncharted.
Consequently, like its LLM counterparts, UlizaLlama7b-1's outputs cannot be fully predicted, and it may at times generate responses that are inaccurate, biased, or otherwise objectionable when prompted by users. With this in mind, before deploying UlizaLlama7b-1 in any application, developers should carry out safety testing and fine-tuning tailored to the specific demands of their use cases.
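
The sketch below illustrates the merge-and-inference flow described in the Model Description, using the `peft` and `transformers` libraries. The repository ids (`Jacaranda/kiswallama-pretrained` as the base and `Jacaranda/UlizaLlama` as the adapter) are illustrative placeholders rather than confirmed checkpoint names; substitute the published ids when loading the model.

```python
# Minimal sketch: attach the UlizaLlama LoRA adapter to the continually-pretrained
# base model, merge the adapter weights, and run a Swahili prompt.
# NOTE: repository ids below are illustrative placeholders.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Jacaranda/kiswallama-pretrained"   # continually-pretrained base (placeholder id)
adapter_id = "Jacaranda/UlizaLlama"           # instruction-tuned LoRA adapter (placeholder id)

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Load the LoRA adapter and fold its weights into the base model so the
# merged model can be used for standalone inference.
model = PeftModel.from_pretrained(base_model, adapter_id)
model = model.merge_and_unload()

# Swahili question-answering style prompt.
prompt = "Swali: Je, ni dalili gani za upungufu wa maji mwilini? Jibu:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the merged weights are published as a single checkpoint, the `PeftModel` steps can be skipped and the merged repository loaded directly with `AutoModelForCausalLM.from_pretrained`.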