Can this model be used on Apple Silicon?

#14
by jsmidt - opened

Is it possible to run an fp8 model on Apple Silicon or "mps"? If so, does anyone have an example script that could be used? torch_dtype=torch.float8 does not seem to work. I get this error:

 "TypeError: Trying to convert Float8_e4m3fn to the MPS backend but it does not have support for that dtype."

Me too. Please help me.

Sorry y'all, this is a Mac issue, not a Flux issue. FP8 still isn't supported on MPS after years of people complaining. If you have 16GB+ of RAM you can run it on CPU, but it will be much slower. I get 120s/it for 1024*1024.
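
For anyone scripting this rather than using ComfyUI, here is a minimal sketch of the workaround implied above: ask for fp8 only where the device accepts it, and otherwise fall back to a wider dtype. The `UNSUPPORTED` table and the `pick_dtype_name` helper are hypothetical names made up for illustration, and the `float16` fallback is an assumption (bfloat16 may be another option depending on your torch and macOS versions).

```python
# Dtype names each device is assumed NOT to accept. The e4m3fn entry for MPS
# comes from the TypeError quoted in this thread; e5m2 is an assumption.
UNSUPPORTED = {
    "mps": {"float8_e4m3fn", "float8_e5m2"},
}


def pick_dtype_name(device: str, wanted: str, fallback: str = "float16") -> str:
    """Return `wanted` unless `device` is listed as rejecting it, else `fallback`."""
    if wanted in UNSUPPORTED.get(device, set()):
        return fallback
    return wanted


if __name__ == "__main__":
    name = pick_dtype_name("mps", "float8_e4m3fn")
    print(name)  # float16
    # With torch installed, the name maps to a real dtype via getattr(torch, name).
```

Keeping the decision in plain strings means it can be checked without torch installed. The wider dtype uses more memory than fp8, so this trades memory for compatibility.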

I have it running on macOS on an M2 chip. I recommend using Conda to create an environment and installing these specific versions of torch and related packages:
torch==2.3.1
torchaudio==2.3.1
torchsde==0.2.6
torchvision==0.18.1
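
If you want to confirm that an environment really has these pins, a small sketch like the following may help. It uses only the standard library (`importlib.metadata`); the `PINNED` dict repeats the versions listed above, and the helper names are hypothetical. The comparison is an exact string match, so a nightly such as 2.5.0.dev20240806 shows up as a mismatch.

```python
from importlib import metadata

# Versions reported as working in this thread.
PINNED = {
    "torch": "2.3.1",
    "torchaudio": "2.3.1",
    "torchsde": "0.2.6",
    "torchvision": "0.18.1",
}


def installed_versions(names):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(pinned, installed):
    """Return {name: (wanted, found)} for every package that differs from its pin."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pinned.items()
        if installed.get(name) != wanted
    }


if __name__ == "__main__":
    report = mismatches(PINNED, installed_versions(PINNED))
    for name, (wanted, found) in report.items():
        print(f"{name}: wanted {wanted}, found {found}")
```

Run it with the Python interpreter from the same environment that ComfyUI uses, otherwise it reports on the wrong set of packages.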

Untitled.jpg

Thank you so much, @FiditeNemini
I was having an issue running on the latest versions.

torch==2.5.0.dev20240806
torchaudio==2.4.0.dev20240806
torchsde==0.2.6
torchvision==0.20.0.dev20240806

I've downgraded to your version, and it works on my M1 Mac. (it took 30 mins though lol)
Thanks heaps!

Thanks much, @FiditeNemini!

I had issues running the Flux Schnell version on my Mac Studio M1 with 32GB running Sonoma 14.6.1.

I downloaded the Schnell model and the ae model (not sure whether it's needed for Schnell) and was using the Schnell sample image workflow from this URL, but I was getting a bad image of color splotches:
https://comfyanonymous.github.io/ComfyUI_examples/flux/#simple-to-use-fp8-checkpoint-version

I set up a new conda environment for flux, loaded the older torch* versions you specified, ran 'pip install -r requirements.txt' in my ComfyUI folder again and it all started working!
Rendering the workflow embedded in the sample image (1024 x 1024) took 220 seconds.

Thanks a lot for the help!

@kelvinator2 I set up a new conda environment for flux, loaded the older torch* versions you specified, and ran 'pip install -r requirements.txt' in my ComfyUI folder again, but I'm still getting a bad image of color splotches (M1 Max, 64GB, Sonoma 14.6).

@zwqjoy Sorry you're still having a hassle. I'm trying to think what might differ between my setup and yours. I assume that, at least as an initial test (given you have 64GB and could load more), you downloaded the Flux.1 Schnell model rather than the larger Flux.1 models, and were running the exact workflow example I linked? Also, I think there is some difference in behavior between Sonoma 14.6 and Sonoma 14.6.1, FWIW. As noted, I'm running 14.6.1, which may be what you meant anyway. And did you put the ae model in the ComfyUI/models/vae folder? Like I say, I'm not sure if it's needed, but I did.

I can tell you the main thing I had to do, which hopefully you're already fine on, but I was not. Before I could get a number of things working correctly in ComfyUI, I discovered that my Mac Terminal program had been set to run under Rosetta (probably because I'd been upgrading macOS from earlier versions). When I entered 'arch' (to get its architecture), it responded with something like '..x86' or i386 instead of 'arm64'. It turned out that, probably because the terminal (or compiler) wasn't sure of my machine's architecture or had been built for an Intel Mac, a scattering of the binaries that got installed were Intel-based instead of arm64, which is what the M1 wants. That mongrel install ran some ComfyUI nodes but not others.

Probably not your problem, but worth checking. I had to do a complete clean reinstallation (gulp) of Sonoma 14.6.1 and reload Python, ComfyUI, etc. from scratch, making sure they were all arm64. The color splotches problem seems to be a common one. I was probably seeing about what you're seeing before I got it right. Hope you get it sorted without too much more hassle...
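
The same check can be made from inside the Python interpreter that runs ComfyUI, which is the process that matters. This is a rough sketch: `platform.machine()` reports the architecture Python itself is running as, and `sysctl -n sysctl.proc_translated` prints 1 for a process translated by Rosetta on Apple Silicon. The `rosetta_flag` and `classify` helpers are hypothetical names for illustration.

```python
import platform
import subprocess


def rosetta_flag():
    """Return the raw value of sysctl.proc_translated, or None if unavailable.

    On Apple Silicon macOS this is "1" for a process translated by Rosetta and
    "0" for a native one. The key does not exist on other systems.
    """
    try:
        out = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return out.stdout.strip()


def classify(machine, translated):
    """Turn the two raw readings into a short human-readable verdict."""
    if translated == "1":
        return "x86_64 under Rosetta"
    if machine == "arm64":
        return "native arm64"
    return f"not arm64 ({machine})"


if __name__ == "__main__":
    print(classify(platform.machine(), rosetta_flag()))
```

Anything other than "native arm64" suggests the interpreter, and likely the packages installed with it, are Intel builds.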

Hi @kelvinator2 @FiditeNemini

I downloaded the Flux.1 Schnell model.
image.png

image.png

@zwqjoy I'm not home where my Mac Studio is, so I can't double-check your Comfy settings against what I've got. If it's the sample workflow, it should be the same, I think, but it's worth a double-check.

I notice that your M1 seems to be in a MacBook Pro. It's possible that differences from a Mac Studio are causing a different result in some way. I have a MacBook Pro too, but it has an M1 Pro chip (as most M1 MacBook Pros do, I think), which is apparently a slightly less powerful version than the M1 Max. I haven't gotten Flux working on my MacBook Pro yet because it only has 16GB, so I figure I'll do more work on the 32GB Mac Studio. Anyway, when I get home tomorrow afternoon, I'll try to double-check the settings and get back to you, unless I hear you've solved it by then.

@zwqjoy Wow! I was surprised to find that you aren't using the sample workflow I linked to at all, which could explain why you're getting a different and perhaps problematic result. Oh, I see now: you're using the workflow @FiditeNemini posted. He's running on an M2, but you'd think it would work. Anyway, I haven't tried his. I got a much simpler workflow at the URL I posted:

https://comfyanonymous.github.io/ComfyUI_examples/flux/#simple-to-use-fp8-checkpoint-version

by scrolling down to the heading titled Flux Schnell and using the image of "a bottle with a beautiful rainbow galaxy inside it on top of a wooden table in the middle of a modern kitchen..." etc for the workflow. I downloaded the image and put it in ComfyUI and got this workflow:

Flux Schnell sample ComfyUI workflow1.png

In the first iteration of multiple uses it takes a long time on my machine to load the 14GB plus model, but skips that for subsequent changes in the prompt once it's loaded. Let me know if that one works for you...

As a double-check, I just used that workflow with this prompt: 'a wise thin ancient looking Hindu yogi deep in meditation with a slight glow of enlightenment around his head, dressed only in a loin cloth is sitting on the ground cross-legged outdoors, at the entrance to his meditation cave. In front of him is a small cardboard sign with the text "The Yogi is In"'. and it generated this image. I've found flux schnell very often gets the spelling or words a bit wrong, at least on my set up, but it didn't this time:

ComfyUI_00036_.png

Actually that workflow runs very nicely here too. Using a compact checkpoint and built-in Ksampler makes it much easier to manage, not so much of a departure from SDXL, etc. Nice!

Yeah, it seems to work well! I tried the workflow you posted, too, @FiditeNemini, and it also works fine on my Mac Studio M1 Pro with 32GB, with either the flux1-dev-fp8.safetensors model you show or the full flux1-schnell.safetensors (loaded from models/unet), as long as I use the fp8 clip file. I'll try the fp16 clip file as well, but the Black Forest Labs instructions recommend using it only on machines with more than 32GB. Already, flux1-dev-fp8.safetensors with the fp8 clip file looks like it's taking about 30 minutes to generate a 1024 x 1024 image. Schnell is a lot schneller...

@kelvinator2 @FiditeNemini

Using a compact checkpoint and the built-in KSampler, I still get a noisy image.

image.png

image.png

Your workflow looks like it should work, as far as I can tell, @zwqjoy. I'm not sure if it makes sense in your circumstances, but I can tell you that about 2 weeks ago, when ComfyUI and Flux weren't working for me, I was doing things like setting flags to force a recompile of all my Python libraries. Eventually, I decided other stuff on my system might be interfering in some way, so I finally bit the bullet and did a clean install. What fixed it in my case was a format and clean reinstall of Sonoma 14.6.1, then going step by step through reinstalling Python in a conda environment the right way, etc. Hopefully, you won't need to go that far. I'm glad I did it, as it got things working right (and cleaned some junk off my system that I didn't need). I still have a few more apps to reinstall that I'll get around to eventually...

@kelvinator2 I ran git pull to get the latest ComfyUI, then pip install torch==2.3.1 torchaudio==2.3.1 torchvision==0.18.1, and it is OK now.

@zwqjoy Excellent! Glad to hear it. I just installed the XLabs-AI custom node, x-flux-comfyui, and started loading the batch of checkpoint, ControlNet and LoRA models they have designed to work with Flux.1.

https://github.com/XLabs-AI/x-flux-comfyui/blob/main/Readme.md

I've got the first sample workflows I've tried from their custom node install loading okay into ComfyUI on my M1 Mac (no red boxes) after I put their models in the right places, but the workflows fail when I try to run them because they require a 'cuda' device. I especially wanted to try out their https://huggingface.co/XLabs-AI/flux-RealismLora. Like the other XLabs-AI workflows I tried, it needs CUDA, and I haven't found a workaround for Apple Silicon yet, though they may have similar functions not specifically designed for Flux. Because I want to work with video (tons of image processing), I need a major speed boost over my M1 Pro with 32GB in any case. So I had already decided to bite the bullet and add a fast Windows 11 server (AMD Ryzen 9 7950X3D) with 64GB and an RTX 4090 to my setup, and I'm planning to do that this coming week.
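
For context on the 'cuda' requirement: code that hardcodes "cuda" fails on a Mac, whereas portable code usually picks a device at runtime. The sketch below shows that common pattern only. It is not how the XLabs-AI nodes are written and it does not patch them; `pick_device` and `detect_device` are hypothetical helper names. `torch.cuda.is_available()` and `torch.backends.mps.is_available()` are real torch calls, and the import is guarded so the sketch still runs without torch.

```python
def pick_device(cuda_ok: bool, mps_ok: bool) -> str:
    """Prefer CUDA, then Apple's MPS backend, then plain CPU."""
    if cuda_ok:
        return "cuda"
    if mps_ok:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe torch for available backends; fall back to "cpu" without torch."""
    try:
        import torch
    except ImportError:
        return "cpu"
    mps = getattr(torch.backends, "mps", None)
    mps_ok = mps is not None and mps.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(detect_device())
```

Even with a device picked this way, individual operations may still be missing on MPS, so a node written only for CUDA can fail for other reasons.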

Thought this was an encouraging article on how the Flux/lora duos might take the lead over Midjourney to make open-source, ComfyUI type flows the best AI image generating software on the planet. The guy uses online services for a lot of his approach for speed, but you can see how it could be put together running on your local computer:

https://gaspardtertrais.medium.com/flux-lora-tutorial-the-duo-that-could-replace-midjourney-prompting-guide-included-a9bacf28aa3c

I'm trying to install the previous versions by running the following command:
pip install torch==2.3.1 torchaudio==2.3.1 torchvision==0.18.1

But I'm getting the following error:
ERROR: Could not find a version that satisfies the requirement torch==2.3.1 (from versions: 2.0.0, 2.0.1, 2.1.0, 2.1.1, 2.1.2, 2.2.0, 2.2.1, 2.2.2)
ERROR: No matching distribution found for torch==2.3.1

Any help on how to resolve this? The only way I've been able to get Flux working is using --cpu
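
One way to read that error: pip lists only the versions that have a wheel matching the current interpreter. The sketch below pulls the list out of the message and prints the interpreter's platform tag next to it. My guess, which nobody in this thread has confirmed, is that a list stopping at 2.2.2 points to an x86_64 (Intel or Rosetta) Python, since to my understanding that was the last torch release with macOS x86_64 wheels. `offered_versions` is a hypothetical helper name.

```python
import re
import sysconfig

# The error text exactly as posted above.
ERROR = (
    "ERROR: Could not find a version that satisfies the requirement "
    "torch==2.3.1 (from versions: 2.0.0, 2.0.1, 2.1.0, 2.1.1, 2.1.2, "
    "2.2.0, 2.2.1, 2.2.2)"
)


def offered_versions(pip_error: str):
    """Extract the version list from pip's '(from versions: ...)' message."""
    match = re.search(r"\(from versions: ([^)]*)\)", pip_error)
    if not match:
        return []
    return [v.strip() for v in match.group(1).split(",") if v.strip()]


if __name__ == "__main__":
    versions = offered_versions(ERROR)
    print("newest wheel pip can see:", versions[-1])
    # On a native Apple Silicon Python this tag is expected to end in arm64.
    print("interpreter platform tag:", sysconfig.get_platform())
```

If the tag ends in x86_64, creating the environment with an arm64 Python (for example through miniforge, as mentioned elsewhere in this thread) would be the thing to try.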

Me too, help!!

@janjanjan1 and @blendertom My experience was that you have to make sure you've got a compatible macOS version, the right Python version, the Xcode command-line tools, etc. I'm not sure what range of versions works, and it's not necessarily limited to what I've got, but FWIW, here's my setup: Sonoma 14.6.1 and Python 3.11.9 running on a Mac Studio M1 Max with 32GB. It's probably too much info, but here are two listings that might help:

  1. Some excerpts of my conda listing of what's in the Conda 'flux' environment I set up to run flux1
  2. What my buddy, Perplexity.ai, said when I put your problem to it. I've been using Perplexity.ai and ChatGPT (mostly the free tiers, sometimes Pro) all the time to hack through most of my tech problems these days, and I recommend one method: keep asking them questions with new info or variations. Sometimes they come through with great general solutions or something someone's posted, etc.

Some of my 'conda list' output for flux. I think I might have needed to install with miniforge to get arm64 versions of a couple of packages instead of x86:

# packages in environment at /Users/kelly/miniforge3/envs/flux:
#
# Name Version Build Channel

...
blend-modes 2.1.0 pypi_0 pypi
blurgenerator 1.1.0 pypi_0 pypi
bzip2 1.0.8 h99b78c6_7 conda-forge
ca-certificates 2024.7.4 hf0a4a13_0 conda-forge
certifi 2024.7.4 pypi_0 pypi
...
cycler 0.12.1 pypi_0 pypi
cython 3.0.11 pypi_0 pypi
...
ffmpeg 1.4 pypi_0 pypi
...
imageio 2.34.2 pypi_0 pypi
imageio-ffmpeg 0.5.1 pypi_0 pypi
...
kornia-rs 0.1.5 pypi_0 pypi
lazy-loader 0.4 pypi_0 pypi
libexpat 2.6.2 hebf3989_0 conda-forge
libffi 3.4.2 h3422bc3_5 conda-forge
libsqlite 3.46.0 hfb93653_0 conda-forge
libzlib 1.3.1 hfb2fe0b_1 conda-forge
llvmlite 0.43.0 pypi_0 pypi
....
python 3.11.9 h932a869_0_cpython conda-forge
...
qrcode 7.4.2 pypi_0 pypi
readline 8.2 h92ec313_1 conda-forge
referencing 0.35.1 pypi_0 pypi
...
sentencepiece 0.2.0 pypi_0 pypi
setuptools 72.1.0 pyhd8ed1ab_0 conda-forge
shapely 2.0.5 pypi_0 pypi
...
tinycss2 1.3.0 pypi_0 pypi
tk 8.6.13 h5083fa2_1 conda-forge
wheel 0.44.0 pyhd8ed1ab_0 conda-forge
wrapt 1.16.0 pypi_0 pypi
xatlas 0.0.9 pypi_0 pypi
xxhash 3.5.0 pypi_0 pypi
xz 5.2.6 h57fd34a_0 conda-forge

What Perplexity had to say about your situation, FWIW:

The issue your friends are encountering while trying to install PyTorch version 2.3.1 on their Apple Silicon Macs could be due to several factors, including Python version compatibility and package availability.

Potential Causes and Solutions

Python Version Compatibility:
Ensure that the Python version installed is compatible with PyTorch 2.3.1. Generally, PyTorch supports Python versions 3.7 and above, but it's advisable to use Python 3.10 or 3.11 for better compatibility with recent PyTorch releases on Apple Silicon [2]. Using a Python version that is too new or too old can lead to compatibility issues.

Availability of PyTorch 2.3.1:
The error message indicates that PyTorch version 2.3.1 is not available in the PyPI repository from which pip is trying to install. This could mean that the version is either not released or not available for the specific Python version or system architecture being used. It's important to verify whether PyTorch 2.3.1 is officially released and available for download for macOS on Apple Silicon.

Using the Correct Installation Index:
When installing PyTorch on Apple Silicon, it may be necessary to specify an additional index URL to ensure that the correct binaries are used. For example, using the --extra-index-url parameter can help direct the installation to the appropriate resources:

pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu

This ensures that the installation pulls from the correct repository that supports Apple Silicon [2].

System Requirements:
Ensure that the system meets all necessary requirements, such as having macOS 12.3 or later and the Xcode command-line tools installed [2]. These are prerequisites for running PyTorch efficiently on Apple Silicon.

Alternative Installation Methods:
If the desired version is not available through pip, consider using a different installation method such as Anaconda or downloading the wheel files directly from the PyTorch website if available.

By addressing these potential issues, your friends should be able to resolve the installation problem and get PyTorch running on their Apple Silicon Macs.

Hope it helps. I know how frustrating it is. It took me days to get it going...@kelvinator2

I got it running now on an M3 Pro with 36GB, using the specific versions of torch. One detail I missed was setting the weight_dtype of the diffusion model to default, because the workflow I downloaded had set it to fp8_e4m3fn.
Maybe that helps some others. I will still try to move to the latest nightly torch version with the same properties set.

But the dev model took a lot of time to complete the preset 30 steps 🙈

I've been trying to run the Flux model on my MacBook Pro (M3 Pro) with Sequoia 15.0, but I'm still getting the same error message: "TypeError: Trying to convert Float8_e4m3fn to the MPS backend but it does not have support for that dtype." I tried the specific versions of torch that @FiditeNemini gave, but I still get the error message.
torch==2.3.1
torchaudio==2.3.1
torchsde==0.2.6
torchvision==0.18.1

This is my workflow:

Capture d’écran 2024-09-22 à 14.54.26.png

Can somebody help me, please? :)

Hello guys,
I have the same issue on a MacBook Pro M2 using Pinokio. How do I use this setup?
torch==2.3.1
torchaudio==2.3.1
torchsde==0.2.6
torchvision==0.18.1

Thanks for your help, and sorry, I'm a noob.

I would recommend moving to the GGUF quantized versions of the model. It works fine on Mac for me.

https://github.com/city96/ComfyUI-GGUF

As I mentioned in my last post, I had to set the weight_dtype to default, which in effect forces your Mac to convert the fp8 model to a type it supports. In my case that made it really slow. With GGUF it was way faster on my M3 MacBook Pro, and you can use the latest torch nightly, as recommended in a lot of guides.
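
A back-of-the-envelope sketch of why the converted weights can be so much heavier. It assumes roughly 12 billion parameters for the Flux.1 transformer (a figure from the model card, not from this thread) and counts weights only, ignoring the text encoders, VAE and activations. Which dtype "default" actually selects on MPS is not stated here, so both wider options are shown.

```python
# Bytes used to store one parameter in each format.
BYTES_PER_PARAM = {"float8": 1, "float16": 2, "bfloat16": 2, "float32": 4}

# Assumption: approximate parameter count of the Flux.1 transformer.
FLUX_PARAMS = 12_000_000_000


def weight_gib(n_params: int, dtype: str) -> float:
    """Approximate size of the weights alone, in GiB."""
    return n_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    for dtype in ("float8", "float16", "float32"):
        print(dtype, round(weight_gib(FLUX_PARAMS, dtype), 1), "GiB")
```

On a machine with 36GB of unified memory, doubling or quadrupling the weight footprint leaves little headroom, which may be part of why the default setting felt slow. That is a guess rather than something measured in this thread.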
