chrisc36 committed
Commit 2ce8d82
1 Parent(s): bfcc419

Update README.md

Files changed (1):
  README.md: +6 -0
README.md CHANGED
@@ -94,16 +94,21 @@ print(generated_text)

To make inference more efficient, run with autocast:

+
+ ```python
with torch.autocast(device_type="cuda", enabled=True, dtype=torch.bfloat16):
  output = model.generate_from_batch(
      inputs,
      GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
      tokenizer=processor.tokenizer
  )
+ ```
+
We did most of our evaluation in this setting (autocast on, but float32 weights)

To even further reduce the memory requirements, the model can be run with bfloat16 weights:

+ ```
model.to(dtype=torch.bfloat16)
inputs["images"] = inputs["images"].to(torch.bfloat16)
output = model.generate_from_batch(
@@ -111,6 +116,7 @@ output = model.generate_from_batch(
    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer
)
+ ```
Note that we have observed that this can change the output of the model compared to running with float32 weights.

## Evaluations
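For context, here is a minimal end-to-end sketch of the inference path the two fenced snippets belong to, assuming the quickstart that precedes them in the README (the `print(generated_text)` context line above). The checkpoint name `allenai/Molmo-7B-D-0924` and the example image URL are illustrative assumptions, not part of this commit:

```python
# Sketch only: the checkpoint name and image URL below are assumptions for
# illustration; `generate_from_batch` and `processor.process` come from the
# model's trust_remote_code implementation referenced by this README.
import requests
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

repo = "allenai/Molmo-7B-D-0924"  # assumed checkpoint
processor = AutoProcessor.from_pretrained(
    repo, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)
model = AutoModelForCausalLM.from_pretrained(
    repo, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

# Build a single-example batch and move it to the model's device.
image = Image.open(requests.get("https://picsum.photos/id/237/536/354", stream=True).raw)
inputs = processor.process(images=[image], text="Describe this image.")
inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

# Autocast inference (the first snippet the commit fences): weights stay
# float32, activations are computed in bfloat16.
with torch.autocast(device_type="cuda", enabled=True, dtype=torch.bfloat16):
    output = model.generate_from_batch(
        inputs,
        GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
        tokenizer=processor.tokenizer,
    )

# Decode only the newly generated tokens.
generated_tokens = output[0, inputs["input_ids"].size(1):]
print(processor.tokenizer.decode(generated_tokens, skip_special_tokens=True))

# Lower-memory variant (the second snippet): cast weights and image tensors
# to bfloat16; per the README note, outputs can differ from the float32 run.
model.to(dtype=torch.bfloat16)
inputs["images"] = inputs["images"].to(torch.bfloat16)
output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer,
)
```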