Stable Diffusion 3.5 Large Turbo -- Flumina Server App

This repository contains an implementation of Stable Diffusion 3.5 Large Turbo inference on Fireworks AI's new Flumina Server App toolkit.

Getting Started -- Serverless deployment on Fireworks

This Server App is deployed to Fireworks as-is in a "serverless" deployment, enabling you to use the model without managing GPUs or deployments yourself.

Get an API key from Fireworks and export it as an environment variable:

export API_KEY=YOUR_API_KEY_HERE

Text-to-Image Example Call

curl -X POST 'https://api.fireworks.ai/inference/v1/workflows/accounts/fireworks/models/stable-diffusion-3p5-large-turbo/text_to_image' \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-H "Accept: image/jpeg" \
-d '{
        "prompt": "Close-up shot of a chrysanthemum with artistic bokeh stylish photo"
}' \
--output output.jpg
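For callers who prefer Python to curl, the same request can be sketched with only the standard library. This is a minimal illustration, not part of the Server App: the helper names `build_request` and `generate_image` are hypothetical, and the sketch assumes the endpoint returns raw JPEG bytes when `Accept: image/jpeg` is sent, as the curl example's `--output output.jpg` suggests.

```python
import json
import os
import urllib.request

# Endpoint copied from the curl example above.
TEXT_TO_IMAGE_URL = (
    "https://api.fireworks.ai/inference/v1/workflows/accounts/fireworks/"
    "models/stable-diffusion-3p5-large-turbo/text_to_image"
)


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request that the curl example sends (hypothetical helper)."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        TEXT_TO_IMAGE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "image/jpeg",
        },
    )


def generate_image(prompt: str, out_path: str = "output.jpg") -> None:
    """Send the request and write the response body to out_path.

    Assumes the response body is the JPEG image itself and that API_KEY
    is set in the environment; urlopen raises on HTTP error statuses.
    """
    request = build_request(prompt, os.environ["API_KEY"])
    with urllib.request.urlopen(request) as response:
        image_bytes = response.read()
    with open(out_path, "wb") as f:
        f.write(image_bytes)
```

With `API_KEY` exported, calling `generate_image("Close-up shot of a chrysanthemum with artistic bokeh stylish photo")` would be expected to write `output.jpg` to the current directory, mirroring the curl command.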

Deploying Stable Diffusion 3.5 Large Turbo to Fireworks On-Demand

Stable Diffusion 3.5 Large Turbo is available on Fireworks via on-demand deployments. It can be deployed in a few simple steps:

Prerequisite: Install the Flumina CLI

The Flumina CLI is included with the fireworks-ai Python package. It can be installed with pip like so:

pip install 'fireworks-ai[flumina]>=0.15.7'

Next, get an API key from the Fireworks site and register it with the Flumina CLI:

flumina set-api-key YOUR_API_KEY_HERE

Creating an On-Demand Deployment

Use flumina deploy to create an on-demand deployment. When invoked with the name of an existing model, it creates a new deployment of that model in your account:

flumina deploy accounts/fireworks/models/stable-diffusion-3p5-large-turbo

On success, the CLI prints example commands for calling your new deployment, such as:

curl -X POST 'https://api.fireworks.ai/inference/v1/workflows/accounts/fireworks/models/stable-diffusion-3p5-large-turbo/text_to_image?deployment=accounts/u-6jamesr6-63834f/deployments/b10d69dd' \
    -H "Authorization: Bearer $API_KEY" \
    -H "Content-Type: application/json" \
    -H "Accept: image/jpeg" \
    -d '{
        "prompt": "Beautiful west coast sunset",
        "aspect_ratio": "16:9",
        "guidance_scale": 0.0,
        "num_inference_steps": 4,
        "seed": 0
    }' --output output.jpg
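The on-demand call differs from the serverless one in two ways: the URL carries a `deployment` query parameter, and the body sets extra generation parameters. The Python sketch below shows one way to assemble both. The helper names `deployment_url` and `turbo_payload` are hypothetical, and the default values for `guidance_scale` and `num_inference_steps` are simply copied from the CLI's example output rather than taken from an API reference.

```python
import json
from urllib.parse import urlencode

# Both constants are taken from the example URL printed by the CLI.
BASE_URL = "https://api.fireworks.ai/inference/v1/workflows"
MODEL = "accounts/fireworks/models/stable-diffusion-3p5-large-turbo"


def deployment_url(deployment: str, model: str = MODEL) -> str:
    """Return the text_to_image URL routed to one on-demand deployment."""
    # safe="/" leaves the slashes in the deployment name unescaped,
    # matching the form of the URL in the CLI's example output.
    query = urlencode({"deployment": deployment}, safe="/")
    return f"{BASE_URL}/{model}/text_to_image?{query}"


def turbo_payload(prompt: str, aspect_ratio: str = "16:9", seed: int = 0) -> bytes:
    """Return a JSON request body using the values from the CLI example."""
    payload = {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "guidance_scale": 0.0,       # value from the CLI example
        "num_inference_steps": 4,    # value from the CLI example
        "seed": seed,                # a fixed seed is assumed to make runs repeatable
    }
    return json.dumps(payload).encode("utf-8")
```

The resulting URL and body can be sent with any HTTP client, using the same `Authorization`, `Content-Type`, and `Accept` headers as the curl command. Replace the deployment name with the one the CLI printed for your account.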

Your deployment can also be administered using the Flumina CLI. Useful commands include:

- flumina list deployments: show all of your deployments
- flumina get deployment: get details about a specific deployment
- flumina delete deployment: delete a deployment

What is Flumina?

Flumina is Fireworks AI's new system for hosting Server Apps. It allows users to deploy deep learning inference to production in minutes, not weeks.

What does Flumina offer for Stable Diffusion models?

Flumina offers the following benefits:

- A clear, precise definition of the server-side workload, which you can read directly in the Server App implementation (you are here)
- An extensibility interface that allows add-ons to be loaded and dispatched dynamically on the server side. For Stable Diffusion 3.5:
  - ControlNet (Union) adapters (Coming soon!)
  - LoRA adapters (Coming soon!)
- Off-the-shelf support for standing up on-demand capacity for the Server App on Fireworks
  - You can also customize the deployment's logic by modifying the Server App and deploying the modified version.
- Support for FP8 numerics, for faster and more efficient inference on intensive workloads

Deploying Custom Stable Diffusion 3.5 Large Turbo Apps to Fireworks On-demand

Coming soon!