bit-dny/MindLLM-1b3-chat-zh-v2.0


bit-dny committed on Jan 8

Commit 5c6b1ad (1 parent: faea48b)

Update README.md

Files changed (1)

README.md +8 -0


@@ -44,6 +44,14 @@ To cite this model, please use
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 ### Direct Use
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->

 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Deployment resource consumption
+| Precision  | Minimum GPU memory (Inference)   | Minimum GPU memory (Full Parameter Fine-tuning)    |
+|-------|-------|-------|
+| float32    | 6.08G    | 32.65G    |
+| float16 (unquantized)    | 3.45G    | - (36.94G*)    |
+| bfloat16 (unquantized)    | 3.45G    | 20.47G (33.93G*)    |
+* \* Indicates use of mixed precision
 ### Direct Use
 <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
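
As a rough sanity check on the inference column of the table added in this commit, the sketch below estimates the memory taken by the weights alone at each precision and compares it with the reported minimum. The parameter count of about 1.3 billion is an assumption inferred from the `1b3` in the model name, not a figure stated in this diff, and `weight_memory_gib` is a hypothetical helper written for illustration. The reported values are copied from the table.

```python
# Assumption: ~1.3e9 parameters, inferred from "1b3" in the model name.
ASSUMED_PARAM_COUNT = 1.3e9

# Storage size of one parameter at each precision, in bytes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}

# Minimum GPU memory for inference, as reported in the README table.
REPORTED_INFERENCE_MIN = {"float32": 6.08, "float16": 3.45, "bfloat16": 3.45}


def weight_memory_gib(param_count: float, dtype: str) -> float:
    """Memory needed to hold the weights only, in GiB (2**30 bytes)."""
    return param_count * BYTES_PER_PARAM[dtype] / 2**30


for dtype, reported in REPORTED_INFERENCE_MIN.items():
    weights = weight_memory_gib(ASSUMED_PARAM_COUNT, dtype)
    print(f"{dtype}: weights ~{weights:.2f} GiB, reported minimum {reported}G")
```

Under this assumption the weights account for most of the reported inference footprint. The remainder would plausibly come from activations, the KV cache, and framework overhead, none of which this estimate models. The fine-tuning column is not covered here, since it also depends on gradients and optimizer state.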