Spaces: Running on Zero
Currently, HF is buggy.
https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/104
Hello.
HF is currently debugging several major bugs, especially ones affecting Zero GPU and Diffusers projects, which also seem to be related to stablepy.
Workarounds on the user side:
- Comment out Gradio Examples (especially any containing None values)
- Be careful with functions that use @spaces decorators
- Also add the @spaces decorator to the model-loading function (see the sketch below)
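For reference, a minimal sketch of the decorator usage (function bodies and the duration value are placeholders, not DiffuseCraft's actual code):

import spaces
import torch
from diffusers import StableDiffusionXLPipeline

@spaces.GPU(duration=60)  # placeholder duration; the decorator requests a ZeroGPU slot
def load_model(model_name: str):
    # model loading also runs under @spaces.GPU, per the workaround above
    return StableDiffusionXLPipeline.from_pretrained(model_name, torch_dtype=torch.float16).to("cuda")

@spaces.GPU
def generate(pipe, prompt: str):
    return pipe(prompt).images[0]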
Thank you for always maintaining the library.😸
I will close the Discussion.
BTW, I wonder if streamtest will quietly go away without ever being officially merged...
I've moved to dev2, but I've seen a few people using DiffuseCraft versions built on the streamtest-only implementation (yield images, seed).
Well, I'm sure it will stay in sync as long as they haven't modified it too much... just wondering.
Hello, I have added the function to the main branch.
Hi, thanks for reaching out!
I'll try to transplant it.
Successfully confirmed that toggling image_previews changes the number of return values (3 when it's on). However, since the Zero GPU space has recently been sped up, the steps often finish before the previews appear (it jumps from noise straight to the finished image). It would be helpful to have the Seed returned as well.
This would not be a problem for anyone using either the old or new implementation. Thanks.
https://huggingface.co/posts/cbensimon/747180194960645
The HF bug is about 70% fixed, but it's still there, and it's easy to have trouble with CUDA onloading and offloading, especially after starting the demo. stablepy is probably able to work around it thanks to frequent offloading, but libraries that assume only a CUDA environment are in trouble. If the tensor straddles the CPU and GPU, the program is sure to die.
By the way, this is totally unrelated.
It seems to be customary for Civitai and others to write metadata into image files.
I've added my own code for an image generation space that uses Serverless Inference, but as far as DiffuseCraft is concerned, I'd like to keep in line with the original as much as possible.
Although it would be much easier if stablepy returned it, stablepy returns images, not image files (and it should be that way). So I'm thinking of doing it just before or after the yield in DiffuseCraft's app.py, but the timing is trickier than I expected, especially considering that the yield is a stream, even if it ends with a return...
Any ideas?
The metadata looks like this, which is probably originally WebUI's own format, so there should be no strict definition.
https://huggingface.co/spaces/cagliostrolab/animagine-xl-3.1/blob/main/app.py
https://huggingface.co/spaces/cagliostrolab/animagine-xl-3.1/blob/main/utils.py
metadata = {
    "prompt": prompt,
    "negative_prompt": negative_prompt,
    "resolution": f"{width} x {height}",
    "guidance_scale": guidance_scale,
    "num_inference_steps": num_inference_steps,
    "seed": seed,
    "sampler": sampler,
    "sdxl_style": style_selector,
    "add_quality_tags": add_quality_tags,
    "quality_tags": quality_selector,
}
metadata["use_upscaler"] = None
metadata["Model"] = {
    "Model": DESCRIPTION,
    "Model hash": "e3c47aedb0",
}
# imports needed by this snippet (from the top of utils.py)
import os
import json
import uuid
from datetime import datetime
from PIL import PngImagePlugin

def save_image(image, metadata, output_dir, is_colab):
    if is_colab:
        current_time = datetime.now().strftime("%Y%m%d_%H%M%S")
        filename = f"image_{current_time}.png"
    else:
        filename = str(uuid.uuid4()) + ".png"
    os.makedirs(output_dir, exist_ok=True)
    filepath = os.path.join(output_dir, filename)
    metadata_str = json.dumps(metadata)
    info = PngImagePlugin.PngInfo()
    info.add_text("metadata", metadata_str)
    image.save(filepath, "PNG", pnginfo=info)
    return filepath
My version
https://huggingface.co/spaces/John6666/flux-lora-the-explorer/blob/main/mod.py
def save_image(image, savefile, modelname, prompt, height, width, steps, cfg, seed):
    import uuid
    import json
    from pathlib import Path
    from PIL import Image, PngImagePlugin
    try:
        if savefile is None:
            savefile = f"{modelname.split('/')[-1]}_{str(uuid.uuid4())}.png"
        metadata = {"prompt": prompt, "Model": {"Model": modelname.split("/")[-1]}}
        metadata["num_inference_steps"] = steps
        metadata["guidance_scale"] = cfg
        metadata["seed"] = seed
        metadata["resolution"] = f"{width} x {height}"
        metadata_str = json.dumps(metadata)
        info = PngImagePlugin.PngInfo()
        info.add_text("metadata", metadata_str)
        image.save(savefile, "PNG", pnginfo=info)
        return str(Path(savefile).resolve())
    except Exception as e:
        print(f"Failed to save image file: {e}")
        raise Exception("Failed to save image file") from e
I'm setting up a lighter stream to improve performance. For the metadata, I'm currently using a plain text format like in automatic1111, but I might switch to a dictionary format like you suggested.
In the yield, I use if image_path: to identify the last loop, where I return the metadata. I'm not sure if this will be helpful to you, though.
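Just to illustrate the idea (a toy sketch with made-up names, not the actual DiffuseCraft code):

def fake_pipeline(steps: int = 4):
    # stand-in for the streaming loop: previews on every step,
    # a real file path only on the final one
    for i in range(steps):
        image_path = f"final_{i}.png" if i == steps - 1 else None
        yield f"preview_{i}.png", image_path

def generate_stream():
    metadata = {"prompt": "example", "seed": 1234}  # hypothetical values
    for preview, image_path in fake_pipeline():
        if image_path:
            yield image_path, metadata  # last loop: attach the metadata
        else:
            yield preview, None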
I'm setting up a lighter stream to improve performance.
Preview is now perfect.
For the metadata, I'm currently using a plain text format like in automatic1111,
I think the A1111 format is more standard, and you've already written it anyway...
I use if image_path: to identify the last loop for returning the metadata.
It can be a great help!
P.S.
The metadata is then successfully embedded.
Since Gradio's Image and Gallery components do not destroy image metadata, I decided to insert the processing at this line (a bit crudely) so that the GUI side does not need to be modified.
This is easy enough to rewrite in the event of a DiffuseCraft update.
Thank you very much!🤗
https://huggingface.co/spaces/John6666/DiffuseCraftMod
from pathlib import Path
from PIL import Image

def save_images(images: list[Image.Image], metadatas: list[str]):
    from PIL import PngImagePlugin
    import uuid
    try:
        output_images = []
        for image, metadata in zip(images, metadatas):
            info = PngImagePlugin.PngInfo()
            info.add_text("metadata", metadata)
            savefile = f"{str(uuid.uuid4())}.png"
            image.save(savefile, "PNG", pnginfo=info)
            output_images.append(str(Path(savefile).resolve()))
        return output_images
    except Exception as e:
        print(f"Failed to save image file: {e}")
        raise Exception("Failed to save image file") from e
info_state = info_state + "<br>" + "GENERATION DATA:<br>" + "<br>-------<br>".join(metadata).replace("\n", "<br>")
img = save_images(img, metadata)
yield img, info_state
Works well
I found a way to do this with HTML, which could make things easier.
The HTML download link worked perfectly, but with the clientele in my space, sharing the images folder would have been kind of disastrous, so I had no choice but to omit it.😅
I'm reporting a bug you are likely to run into in the future.
It's a long explanation because the situation is hard to follow. A few weeks ago, I was forced to shorten the Examples in my space because including None in the Examples caused crashes at start-up.
The root cause is still unknown, although it was discussed on the HF forum.
I assumed that since it wasn't happening in your space it must be my code, but I was wrong.
It can be reproduced with just the following change. Looking at the behaviour in the logs, it's absurd, as if values meant for other components are being fed into a dropdown...
However, this is not just a common Gradio bug, since it also occurs in cases where Gradio isn't involved.
Note that no workaround is currently known other than reducing the contents of Examples.
README.md
sdk_version: 4.44.0
Thanks so much for the info
Sorry. The few workarounds appear to have failed.
As a rule of thumb, this is more likely to happen with dropdowns than with text boxes or sliders.
Although it seems fairly clear that dynamically generated elements are involved, it's hard to guess the rule from the logic, since model selection and the like are fine.
When I tried to avoid it, I basically played Russian roulette, removing Examples entries one by one from each end.
It's just a guess, but I suspect that one or more of the components is not properly visible to the mysterious process in the VM management layer (the bug's culprit), so that everything after that component gets shifted by one.
===== Application Startup at 2024-09-22 05:38:12 =====
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
0it [00:00, ?it/s]
0it [00:00, ?it/s]
You need an API key to download Civitai models.
[#dfebe7 200MiB/1.9GiB(9%) CN:16 DL:268MiB ETA:6s]
[#dfebe7 550MiB/1.9GiB(27%) CN:16 DL:316MiB ETA:4s]
[#dfebe7 910MiB/1.9GiB(44%) CN:16 DL:333MiB ETA:3s]
[#dfebe7 1.2GiB/1.9GiB(62%) CN:16 DL:340MiB ETA:2s]
[#dfebe7 1.5GiB/1.9GiB(79%) CN:16 DL:342MiB ETA:1s]
[#dfebe7 1.9GiB/1.9GiB(96%) CN:16 DL:343MiB]
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
dfebe7|OK | 340MiB/s|models/milkyWonderland_v40.safetensors
Status Legend:
(OK):download completed.
[#5dd1ca 0B/0B CN:1 DL:0B]
[#5dd1ca 0B/0B CN:1 DL:0B]
[#5dd1ca 291MiB/319MiB(91%) CN:16 DL:295MiB]
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
5dd1ca|OK | 289MiB/s|vaes/sdxl_vae-fp16fix-c-1.1-b-0.5.safetensors
Status Legend:
(OK):download completed.
[#d39e0b 232MiB/319MiB(72%) CN:16 DL:312MiB]
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
d39e0b|OK | 305MiB/s|vaes/sdxl_vae-fp16fix-blessed.safetensors
Status Legend:
(OK):download completed.
[#f9221f 159MiB/159MiB(99%) CN:1 DL:211MiB]
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
f9221f|OK | 210MiB/s|vaes/vividReal_v20.safetensors
Status Legend:
(OK):download completed.
[#11bcc6 0B/0B CN:1 DL:0B]
[#11bcc6 87MiB/159MiB(54%) CN:16 DL:275MiB]
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
11bcc6|OK | 278MiB/s|vaes/vae-ft-mse-840000-ema-pruned_fp16.safetensors
Status Legend:
(OK):download completed.
You need an API key to download Civitai models.
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
fb2215|OK | 303MiB/s|loras/Coloring_book_-_LineArt.safetensors
Status Legend:
(OK):download completed.
You need an API key to download Civitai models.
You need an API key to download Civitai models.
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
58f467|OK | 74MiB/s|loras/anime-detailer-xl.safetensors
Status Legend:
(OK):download completed.
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
09a4c5|OK | 103MiB/s|loras/style-enhancer-xl.safetensors
Status Legend:
(OK):download completed.
You need an API key to download Civitai models.
[#e06807 240MiB/256MiB(93%) CN:16 DL:300MiB]
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
e06807|OK | 293MiB/s|loras/Hyper-SD15-8steps-CFG-lora.safetensors
Status Legend:
(OK):download completed.
[#980881 220MiB/750MiB(29%) CN:16 DL:282MiB ETA:1s]
[#980881 548MiB/750MiB(73%) CN:16 DL:309MiB]
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
980881|OK | 305MiB/s|loras/Hyper-SDXL-8steps-CFG-lora.safetensors
Status Legend:
(OK):download completed.
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
7af41f|OK | 8.1MiB/s|embedings/bad_prompt_version2.pt
Status Legend:
(OK):download completed.
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
2a51bd|OK | 11MiB/s|embedings/EasyNegativeV2.safetensors
Status Legend:
(OK):download completed.
Download Results:
gid |stat|avg speed |path/URI
======+====+===========+=======================================================
30a175|OK | 1.3MiB/s|embedings/bad-hands-5.pt
Status Legend:
(OK):download completed.
FILE: embedings/bad_prompt_version2.pt
FILE: embedings/EasyNegativeV2.safetensors
FILE: embedings/bad-hands-5.pt
FILE: models/milkyWonderland_v40.safetensors
FILE: loras/Coloring_book_-_LineArt.safetensors
FILE: loras/anime-detailer-xl.safetensors
FILE: loras/style-enhancer-xl.safetensors
FILE: loras/Hyper-SD15-8steps-CFG-lora.safetensors
FILE: loras/Hyper-SDXL-8steps-CFG-lora.safetensors
FILE: vaes/sdxl_vae-fp16fix-c-1.1-b-0.5.safetensors
FILE: vaes/sdxl_vae-fp16fix-blessed.safetensors
FILE: vaes/vividReal_v20.safetensors
FILE: vaes/vae-ft-mse-840000-ema-pruned_fp16.safetensors
🏁 Download and listing of valid models completed.
Loading model...
[INFO] >> Default VAE: madebyollin/sdxl-vae-fp16-fix
[DEBUG] >> The deprecation tuple ('no variant default', '0.24.0', "You are trying to load the model files of the `variant=fp16`, but no such modeling files are available.The default model files: {'unet/diffusion_pytorch_model.safetensors', 'text_encoder/model.safetensors', 'text_encoder_2/model.safetensors', 'vae/diffusion_pytorch_model.safetensors'} will be loaded instead. Make sure to not load from `variant=fp16`if such variant modeling files are not available. Doing so will lead to an error in v0.24.0 as defaulting to non-variantmodeling files is deprecated.") should be removed since diffusers' version 0.30.2 is >= 0.24.0
[DEBUG] >> Loading model without parameter variant=fp16
Loading pipeline components...: 0%| | 0/7 [00:00<?, ?steps/s]
Loading pipeline components...: 29%|██▊ | 2/7 [00:01<00:03, 1.40steps/s]
Loading pipeline components...: 71%|███████▏ | 5/7 [00:04<00:01, 1.22steps/s]
Loading pipeline components...: 100%|██████████| 7/7 [00:04<00:00, 1.49steps/s]
[DEBUG] >> Default VAE
[DEBUG] >> Base sampler: EulerAncestralDiscreteScheduler {
"_class_name": "EulerAncestralDiscreteScheduler",
"_diffusers_version": "0.30.2",
"beta_end": 0.012,
"beta_schedule": "scaled_linear",
"beta_start": 0.00085,
"interpolation_type": "linear",
"num_train_timesteps": 1000,
"prediction_type": "epsilon",
"rescale_betas_zero_snr": false,
"sample_max_value": 1.0,
"set_alpha_to_one": false,
"skip_prk_steps": true,
"steps_offset": 1,
"timestep_spacing": "leading",
"trained_betas": null
}
/usr/local/lib/python3.10/site-packages/gradio/components/dropdown.py:188: UserWarning: The value passed into gr.Dropdown() is not in the list of choices. Please update the list of choices to include: Lineart or set allow_custom_value=True.
warnings.warn(
/usr/local/lib/python3.10/site-packages/gradio/components/dropdown.py:188: UserWarning: The value passed into gr.Dropdown() is not in the list of choices. Please update the list of choices to include: txt2img or set allow_custom_value=True.
warnings.warn(
/usr/local/lib/python3.10/site-packages/gradio/components/dropdown.py:188: UserWarning: The value passed into gr.Dropdown() is not in the list of choices. Please update the list of choices to include: 512 or set allow_custom_value=True.
warnings.warn(
/usr/local/lib/python3.10/site-packages/gradio/components/dropdown.py:188: UserWarning: The value passed into gr.Dropdown() is not in the list of choices. Please update the list of choices to include: 100 or set allow_custom_value=True.
warnings.warn(
Traceback (most recent call last):
File "/home/user/app/app.py", line 1142, in <module>
gr.Examples(
File "/usr/local/lib/python3.10/site-packages/gradio/helpers.py", line 61, in create_examples
examples_obj = Examples(
File "/usr/local/lib/python3.10/site-packages/gradio/helpers.py", line 281, in __init__
self._get_processed_example(example)
File "/usr/local/lib/python3.10/site-packages/gradio/helpers.py", line 290, in _get_processed_example
prediction_value = component.postprocess(sample)
File "/usr/local/lib/python3.10/site-packages/gradio/components/file.py", line 204, in postprocess
orig_name=Path(value).name,
File "/usr/local/lib/python3.10/pathlib.py", line 960, in __new__
self = cls._from_parts(args)
File "/usr/local/lib/python3.10/pathlib.py", line 594, in _from_parts
drv, root, parts = self._parse_args(args)
File "/usr/local/lib/python3.10/pathlib.py", line 578, in _parse_args
a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not int
The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.
0it [00:00, ?it/s]
0it [00:00, ?it/s]
You need an API key to download Civitai models.
Good evening.
I'm here to suggest that once Flux and Pony join the mix, it will soon be difficult to determine the model type by name alone.
And Pony v7 has been announced as likely being AuraFlow-based...
This is not usable in a local environment, but for a demo it would be okay.
model_type_dict = {
    "diffusers:StableDiffusionPipeline": "SD 1.5",
    "diffusers:StableDiffusionXLPipeline": "SDXL",
    "diffusers:FluxPipeline": "FLUX",
}

def get_model_type(repo_id: str):
    from huggingface_hub import HfApi
    api = HfApi()  # api = HfApi(token=HF_READ_TOKEN) # if using a private or gated model
    default = "SD 1.5"
    try:
        model = api.model_info(repo_id=repo_id, timeout=5.0)
        tags = model.tags
        for tag in tags:
            if tag in model_type_dict.keys():
                return model_type_dict.get(tag, default)
    except Exception:
        return default
    return default
model_type = get_model_type(model_name)
And while we're at it, there is now a semi-official model that doesn't require gated access: a modified version of someone else's dev finetune that multimodalart has converted to Diffusers format.
Maybe this one will be the base for training LoRA.
https://huggingface.co/multimodalart/FLUX.1-dev2pro-full
Thanks a lot, I’ve used this in the GUI
Not long ago, Gradio 5 was announced, so here is the error forecast.
Changing just the few characters below (sdk_version: 5.0.0b3) causes an error when running inference.
The error message itself doesn't matter much; in short, the phenomenon of Examples ending up in strange places now also appears after startup.
I thought that fixing the mis-specified item caught by the try/except would fix it, but Gradio doesn't seem to be that easy...
In theory, I don't understand the conditions under which it occurs, so it may not be Gradio alone but one of a series of strange errors related to the Zero GPU space. If it is a hardware issue of some sort, I can understand the situation being incomprehensible.
https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/107
---
title: 🧩 DiffuseCraft
emoji: 🧩🖼️
colorFrom: red
colorTo: pink
sdk: gradio
sdk_version: 5.0.0b3
app_file: app.py
pinned: true
license: mit
short_description: Stunning images using stable diffusion.
---
Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
[INFO] >> Default VAE: madebyollin/sdxl-vae-fp16-fix
[DEBUG] >> The deprecation tuple ('no variant default', '0.24.0', "You are trying to load the model files of the `variant=fp16`, but no such modeling files are available.The default model files: {'unet/diffusion_pytorch_model.safetensors', 'text_encoder_2/model.safetensors', 'text_encoder/model.safetensors', 'vae/diffusion_pytorch_model.safetensors'} will be loaded instead. Make sure to not load from `variant=fp16`if such variant modeling files are not available. Doing so will lead to an error in v0.24.0 as defaulting to non-variantmodeling files is deprecated.") should be removed since diffusers' version 0.31.0.dev0 is >= 0.24.0
[DEBUG] >> Loading model without parameter variant=fp16
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 703, in process_events
response = await route_utils.call_process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 323, in call_process_api
output = await app.get_blocks().process_api(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1980, in process_api
result = await self.call_function(
File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1565, in call_function
prediction = await utils.async_iteration(iterator)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 684, in async_iteration
return await anext(iterator)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 678, in __anext__
return await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2405, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 914, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 661, in run_sync_iterator_async
return next(iterator)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 822, in gen_wrapper
response = next(iterator)
File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 822, in gen_wrapper
response = next(iterator)
File "/home/user/app/app.py", line 483, in load_new_model
self.model.load_pipe(
File "/usr/local/lib/python3.10/site-packages/stablepy/diffusers_vanilla/model.py", line 718, in load_pipe
if os.path.isfile(vae_model):
File "/usr/local/lib/python3.10/genericpath.py", line 30, in isfile
st = os.stat(path)
TypeError: stat: path should be string, bytes, os.PathLike or integer, not list
It looks like it was an issue with Gradio, but it got fixed in sdk_version: 5.0.0b7
Unfortunately, Gradio 5 brought a bunch of changes, like how file storage works, so we’ll need to update a lot of things
it got fixed in sdk_version: 5.0.0b7
Thank goodness...it's hard to get around this with just users.
like how file storage works
Wow.🤢 I've encountered some behavior changes in Examples and some obsolete component arguments.
Also, it seems that the Zero GPU space is currently incompletely supported and needs a little hack.
Oh well, we are all used to Gradio's whims.
I will also see if it can be updated to Gradio 5 once it is supported by the official version
Hello.
The transition from Gradio 3 to Gradio 4 was apparently terrible, but this time much care seems to have been taken to make the move smooth.
https://huggingface.co/blog/gradio-5
https://github.com/gradio-app/gradio/issues/9463
By the way, while fixing another Zero GPU space bug, I started wondering whether part of the reason DiffuseCraft is buggy with Gradio 5 is that the output is a Gallery.
I really don't understand this specification, but it malfunctions if the Gallery component is not also passed as an input argument; using Gallery as a pure output component causes misbehaviour.
I think Gallery is more convenient than Image in terms of UI, but it's a relatively new component, so there are a lot of bugs, like broken layouts.
https://huggingface.co/spaces/multimodalart/flux-lora-lab/discussions/3/files
Come to think of it, if the stem of a LoRA filename contains '.' or spaces, the LoRA is not actually applied at runtime.
I have been renaming the file itself in advance to deal with this (a renaming sketch follows the config lines below). I don't know if this is a bug or intended behaviour, but I'm reporting it just in case.
Config model: votepurchase/animagine-xl-3.1 None ['loras/xl_Noose_Portal(a3.1).safetensors', None, None, None, None] # doesn't work
Config model: votepurchase/animagine-xl-3.1 None ['loras/xl_Noose_Portal(a3_1).safetensors', None, None, None, None] # works
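For reference, a rough sketch of the renaming workaround I use (the helper name is made up; shutil.copy could be used instead if the original file must be kept):

import re
import shutil
from pathlib import Path

def sanitize_lora_filename(path: str) -> str:
    # Replace '.' and whitespace in the stem so the LoRA is picked up at runtime.
    p = Path(path)
    safe_stem = re.sub(r"[.\s]", "_", p.stem)
    if safe_stem == p.stem:
        return str(p)
    safe_path = p.with_name(safe_stem + p.suffix)
    shutil.move(str(p), str(safe_path))
    return str(safe_path)

# e.g. 'loras/xl_Noose_Portal(a3.1).safetensors' -> 'loras/xl_Noose_Portal(a3_1).safetensors'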
https://huggingface.co/spaces/John6666/votepurchase-multiple-model/discussions/10#67100f41928c1b332046589c
FLUX issue: for a while I was successfully generating images with the Flux model in DiffuseCraft, but for some reason it now fails with a connection error, both here and in my mod.
I didn't notice it because I mainly use SDXL, but it was reported to me, and when I checked it out I confirmed it.
But still, if the error is a connection error, does that mean the whole process dies when we load Flux...
I think the only branch in DiffuseCraft that applies exclusively to Flux is the part where the model's transformer is .to("cuda:0")'d.
Did something happen inside stablepy?
Come to think of it, if the stem of a LoRA filename contains '.' or spaces, the LoRA is not actually applied at runtime. I have been renaming the file itself in advance to deal with this. I don't know if this is a bug or intended behaviour, but I'm reporting it just in case.
Config model: votepurchase/animagine-xl-3.1 None ['loras/xl_Noose_Portal(a3.1).safetensors', None, None, None, None] # doesn't work
Config model: votepurchase/animagine-xl-3.1 None ['loras/xl_Noose_Portal(a3_1).safetensors', None, None, None, None] # works
I tried to reproduce the issue, but everything worked fine...
https://huggingface.co/spaces/John6666/votepurchase-multiple-model/discussions/10#67100f41928c1b332046589c
FLUX issue: for a while I was successfully generating images with the Flux model in DiffuseCraft, but for some reason it now fails with a connection error, both here and in my mod.
I didn't notice it because I mainly use SDXL, but it was reported to me, and when I checked it out I confirmed it. But still, if the error is a connection error, does that mean the whole process dies when we load Flux... I think the only branch in DiffuseCraft that applies exclusively to Flux is the part where the model's transformer is .to("cuda:0")'d. Did something happen inside stablepy?
With Flux, I transfer the model from the CPU to CUDA or remove some components when they are no longer needed. This might happen if a component fails to be moved back to the CPU properly due to insufficient RAM. Maybe the best approach would be to use a cli.py where everything runs together, which helps free up all the resources efficiently.
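Something along these lines, as a rough sketch (not stablepy's actual code; names are illustrative):

import gc
import torch

def release_component(pipe, name: str):
    # Move a finished component back to the CPU, drop the reference,
    # then release Python and CUDA memory.
    component = getattr(pipe, name, None)
    if component is not None:
        component.to("cpu")
        setattr(pipe, name, None)
        del component
    gc.collect()
    torch.cuda.empty_cache()

# e.g. release_component(pipe, "text_encoder_2") once the prompt embeddings are computed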
I tried to reproduce the issue, but everything worked fine...
So it is very likely that I am failing at something... By the way, I get similar symptoms with Textual Inversion.
The nature of the symptom is also a mystery to me: DiffuseCraft and stablepy don't throw an error, yet the effect never actually shows up in the output image. PEFT should throw an exception if something is slightly wrong...
Well, it has been taken care of anyway, so I'm not in trouble.
With Flux, I transfer the model from CPU to the CUDA or remove some components when they are no longer needed.
Well, even 40GB of Zero GPU space is not enough for Flux... Colab Free has 16GB, I think. My PC has only 12GB...
Bucket-brigade relaying is primitive but effective.
Even if we want to quantize, Diffusers and torch do not natively support loading NF4 into memory, and even if they did, it would be hard to use with stablepy because quantization is prone to errors when combined with PEFT. (I ran into a bug with the transformer.)
We now can use torchao's 8bit, but it is an adventure to require the latest version of torch.
I am not familiar with cli.py, but I wonder if it can be used to create a separate process. It would be safer if memory management could be left to the OS or virtual OS. Especially torch's tensor-related memory is uncontrollable from Python. malloc and free were cumbersome but powerful...
PEFT should throw an exception if something is slightly wrong...
If the logger's in debug mode, it'll show the peft error.
import logging
from stablepy import logger
logger.setLevel(logging.DEBUG)
We now can use torchao's 8bit, but it is an adventure to require the latest version of torch.
It seems like they chose to make it more flexible
https://github.com/huggingface/diffusers/pull/9213#issuecomment-2424749826
I am not familiar with cli.py, but I wonder if it can be used to create a separate process. It would be safer if memory management could be left to the OS or virtual OS. Especially torch's tensor-related memory is uncontrollable from Python. malloc and free were cumbersome but powerful...
I had this issue a while back with the transformers library where the RAM just wouldn't free up. After some digging, I realized that nest-asyncio was the culprit. Once I stopped using it, everything was fine. Apparently, the spaces package includes it by default in the latest version.
logger's in debug mode
I will try it.
make it more flexible
This direction is helpful; it's almost the same approach as with the transformers.
quantization_config = DiffusersQuantoConfig(weights="float8", compute_dtype=torch.bfloat16)
FluxTransformer2DModel.from_pretrained("<either diffusers format or quanto format weights>", quantization_config=quantization_config)
I realized that nest-asyncio was the culprit. Once I stopped using it, everything was fine. Apparently, the spaces includes it by default in the latest version.
The bug that sometimes prevents the memory of packed tensors in the Zero GPU space from being freed is probably caused by this guy...
The resulting bugs are varied, but I think there are probably only one or two root causes, since they all appeared within a short period of time. The behaviour looks as if async or multiprocessing is doing something bad, such as variable scopes going wrong.
If it's a bug in the spaces library, we'd better wait for the library to be fixed and then rebuild.
https://huggingface.co/spaces/zero-gpu-explorers/README/discussions/107
Edit:
May I reproduce the following section in a bug report to the Zero GPU space community?
Based on the symptoms, it's probably the same culprit or a close relative.
I had this issue a while back with the transformers library where the RAM just wouldn't free up. After some digging, I realized that nest-asyncio was the culprit. Once I stopped using it, everything was fine. Apparently, the spaces includes it by default in the latest version.
The error log was successfully obtained. The image generation itself completes successfully.
Config model: votepurchase/animagine-xl-3.1 None ['loras/xl_Noose_Portal(a3.1).safetensors', '', '', '', '']
LoRA file not found.
LoRA file not found.
LoRA file not found.
LoRA file not found.
[DEBUG] >> _un, re and load_ lora
[DEBUG] >> Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'loras/xl_Noose_Portal(a3.1).safetensors'.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/stablepy/diffusers_vanilla/lora_loader.py", line 112, in lora_mix_load
pipe.load_lora_weights(lora_path)
File "/usr/local/lib/python3.10/site-packages/diffusers/loaders/lora_pipeline.py", line 560, in load_lora_weights
state_dict, network_alphas = self.lora_state_dict(
File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/site-packages/diffusers/loaders/lora_pipeline.py", line 673, in lora_state_dict
state_dict = cls._fetch_state_dict(
File "/usr/local/lib/python3.10/site-packages/diffusers/loaders/lora_base.py", line 265, in _fetch_state_dict
weight_name = cls._best_guess_weight_name(
File "/usr/local/lib/python3.10/site-packages/diffusers/loaders/lora_base.py", line 331, in _best_guess_weight_name
files_in_repo = model_info(pretrained_model_name_or_path_or_dict).siblings
File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 106, in _inner_fn
validate_repo_id(arg_value)
File "/usr/local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 160, in validate_repo_id
raise HFValidationError(
huggingface_hub.errors.HFValidationError: Repo id must use alphanumeric chars or '-', '_', '.', '--' and '..' are forbidden, '-' and '.' cannot start or end the name, max length is 96: 'loras/xl_Noose_Portal(a3.1).safetensors'.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/stablepy/diffusers_vanilla/model.py", line 1086, in process_lora
self.pipe = lora_mix_load(
File "/usr/local/lib/python3.10/site-packages/stablepy/diffusers_vanilla/lora_loader.py", line 121, in lora_mix_load
state_dict = safetensors.torch.load_file(lora_path, device="cpu")
File "/usr/local/lib/python3.10/site-packages/safetensors/torch.py", line 311, in load_file
with safe_open(filename, framework="pt", device=device) as f:
FileNotFoundError: No such file or directory: "loras/xl_Noose_Portal(a3.1).safetensors"
[ERROR] >> ERROR: LoRA not compatible: loras/xl_Noose_Portal(a3.1).safetensors
[DEBUG] >> No such file or directory: "loras/xl_Noose_Portal(a3.1).safetensors"
[INFO] >> loras/xl_Noose_Portal(a3_1).safetensors
[DEBUG] >> INFO PIPE: StableDiffusionXLPipeline
[DEBUG] >> text_encoder_type: torch.float16
[DEBUG] >> unet_type: torch.float16
[DEBUG] >> vae_type: torch.float16
[DEBUG] >> pipe_type: torch.float16
[DEBUG] >> scheduler_main_pipe: EulerAncestralDiscreteScheduler {
"_class_name": "EulerAncestralDiscreteScheduler",
"_diffusers_version": "0.31.0.dev0",
"beta_end": 0.012,
"beta_schedule": "scaled_linear",
"beta_start": 0.00085,
"interpolation_type": "linear",
"num_train_timesteps": 1000,
"prediction_type": "epsilon",
"rescale_betas_zero_snr": false,
"sample_max_value": 1.0,
"set_alpha_to_one": false,
"skip_prk_steps": true,
"steps_offset": 1,
"timestep_spacing": "leading",
"trained_betas": null
}
[DEBUG] >> Start stream
0%| | 0/28 [00:00<?, ?steps/s][DEBUG] >> 0
14%|█▍ | 4/28 [00:01<00:07, 3.29steps/s][DEBUG] >> 5
32%|███▏ | 9/28 [00:02<00:04, 4.02steps/s][DEBUG] >> 10
50%|█████ | 14/28 [00:03<00:03, 4.28steps/s][DEBUG] >> 15
68%|██████▊ | 19/28 [00:04<00:02, 4.41steps/s][DEBUG] >> 20
86%|████████▌ | 24/28 [00:05<00:00, 4.48steps/s][DEBUG] >> 25
100%|██████████| 28/28 [00:06<00:00, 4.35steps/s]
[DEBUG] >> finish
[INFO] >> Seeds: [1996884573]
From the error content, the program assumes that LoRA is specified by HF repo name?
From the error content, the program assumes that LoRA is specified by HF repo name?
This checks whether the LoRA file is present. If it isn't, the system treats it as an HF repo instead.
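In other words, something along these lines (a simplified sketch, not the actual stablepy code):

import os

def resolve_lora(lora: str):
    # Use the value as a local LoRA file if it exists; otherwise treat it as an HF repo id.
    if os.path.isfile(lora):
        return "file", lora
    return "repo", lora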
I will check if it also happens in this space
Config model: eienmojiki/Anything-XL None ['loras/xl_二次元通道(a3.1).safetensors', 'loras/xl_Er Ci Yuan Tong Dao (a3.1).safetensors', 'None', 'None', 'None']
[DEBUG] >> _un, re and load_ lora
[INFO] >> loras/xl_二次元通道(a3.1).safetensors
[INFO] >> loras/xl_Er Ci Yuan Tong Dao (a3.1).safetensors
I made some adjustments because I was having trouble retrieving the correct file name, but it seems to be working fine.
Thank you very much!
I didn't realize it was a problem on the DiffuseCraft side, not stablepy... I see, it's hard to reproduce.
When dealing with strange file names that are not ASCII-like, it is easier to use pathlib.Path(path).suffix or pathlib.Path(path).stem. But pathlib is pathlib, and there are features that are not supported without the os library or shutil, and well, there are many other things...
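For example, with the file name from the log above:

from pathlib import Path

p = Path("loras/xl_二次元通道(a3.1).safetensors")
print(p.stem)    # xl_二次元通道(a3.1)
print(p.suffix)  # .safetensors
print(p.name)    # xl_二次元通道(a3.1).safetensors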
BTW, I'll post some of the feedback I've received that I think should be addressed on the stablepy side. I have a suggestion to add a scheduler.
Euler AYS should probably not be done unless it is implemented on the Diffusers side, but v-prediction is already possible, so it is effectively just a matter of thinking of an alias on stablepy.
However, that alias is the tricky part, as the options would get cluttered if it were done naively.
The reason why v-prediction is required now is probably due to the influence of Kohaku, illustrious and noobai.
There are more and more attempts to change schedulers and samplers.
The Flux one is amazing. But this one should also wait for Diffusers support.
https://huggingface.co/spaces/John6666/votepurchase-multiple-model/discussions/8#670a8a63222579c05ec4ac42
https://huggingface.co/spaces/John6666/sdxl-to-diffusers-v2/discussions/9#671bfe2b7a5cf485f8340624
https://huggingface.co/Laxhar/noob_sdxl_v_pred
https://www.reddit.com/r/comfyui/comments/1g9wfbq/simple_way_to_increase_detail_in_flux_and_remove/
Euler AYS should probably not be done unless it is implemented on the Diffusers side, but v-prediction is already possible, so it is effectively just a matter of thinking of an alias on stablepy.
I've noticed that AYS has been implemented, but it's important to check which pipelines are compatible with it.
The reason why v-prediction is required now is probably due to the influence of Kohaku, illustrious and noobai.
I'll be running some tests of v-pred.
I've noticed that AYS has been implemented
Nice!😆 The reason it wasn't in the manual for the main revision when I looked is that it's a new commit.
Unlike vpred, this one only needs one frame, so it's purely a matter of deciding on the alias and you're done.
However, until a pip version of Diffusers for this comes along, I guess we'll have to wait.
I'll be running some tests of v-pred.
It would be helpful. It's an option that hasn't been used much, so there might be a bug if we try it actually.
Hey @John6666
I made some updates, but I ran into a bunch of bugs with diffusers. To get around that, I decided to limit some 'schedule types'.
v-pred works relatively well and can be adjusted in "other settings"
Wow.
I ran into a bunch of bugs with diffusers
I also encountered a few myself. Specifically, for example, the community pipeline throws an error with new Diffusers and Transformers.
I recently got a github account so I guess I can participate in debugging the library from now on, but in any case we need to debug Diffusers itself.
The good news is that v-pred works fine for now.😀
In any case, I'll have to reflect it in my mods first.
Edit:
HF is throwing far too many errors today...
I was able to get it to a working state, but I didn't have the energy left to check it out...
https://discuss.huggingface.co/t/space-runs-ok-for-several-hours-then-runtime-error/115638
Edit:
According to the person who requested v-pred and AYS, it is working properly as expected.
I also tried Euler+AYS, and the output changes quite a bit. Generally, good results are obtained.
As for v-pred, it is too peaky to know whether it is working well or not...
However, the output changes a lot, so it is definitely being applied in my environment.
I recently got a github account so I guess I can participate in debugging the library from now on, but in any case we need to debug Diffusers itself.
Great to see you on github, it’ll make debugging and contributing so much easier.
I was able to get it to a working state, but I didn't have the energy left to check it out...
Was the space starting or getting stuck with a 504 error?
I also tried Euler+AYS, and the output changes quite a bit. Generally, good results are obtained.
Even though Auto1111 uses sigmas, I found that in Diffusers only Euler works with input sigmas, so I decided to go with timesteps instead.
The variants like Euler trailing are now Euler + SGM uniform. Also, for SDXL, it's useful to keep clip skip enabled since this specifies the -2 layer with which it was trained.
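Roughly, in Diffusers terms that maps to something like this (a sketch; the exact aliases used in stablepy may differ):

import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
# "Euler trailing" / SGM Uniform: the same Euler scheduler with trailing timestep spacing
pipe.scheduler = EulerDiscreteScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)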
For the time being, I think I'll practice submitting PRs on GitHub whenever I find a bug that even I can fix. I think this is true of any program, but the author is unlikely to notice a bug in an area that isn't used very often. Unless it's a mathematical part, I should be able to fix it.
Was the space starting or getting stuck with a 504 error?
This was a rather unusual error. Yesterday, during the day, after the space build was complete, there was a phenomenon where it would freeze during the startup process (probably at the moment of the Gradio GUI startup), but this had been fixed by the evening. I think this was probably an error that only occurred yesterday. No new problems have occurred since then.
The problem was rather with something else, and while the GPU space was relatively unaffected, the CPU space of me and other people was erroring out over a wide range, and I had to spend time dealing with that.
Even though Auto1111 uses sigmas, I found that in Diffusers, only Euler works with input sigmas
Since they evolved separately, even if the name and the general algorithm are the same, the implementation is different...
That's probably why DiffuseCraft's default is Euler instead of Euler a...
I feel like the scheduler area of Diffusers hasn't been debugged much, except for the default settings of the scheduler that everyone uses frequently.
Well, practically speaking, one or two default settings are enough.
There are also cases where debugging is incredibly difficult because there are no explicit errors.
it's useful to keep clip skip enabled
I basically have it turned on, but it's not compatible unless we use the community pipeline.
It's fine for use in local or GPU space. I'm not sure which library's specification it is, but anyway, with the current implementation, the community pipeline cannot be used via the Inference API, even if it is a standard implementation. And sometimes, even outside of the Inference API, strange errors like the ones I mentioned above tend to occur.
I've never encountered any problems with this via stablepy, which probably uses lpw_ by default, but I did encounter the unworkable error I mentioned above when I was doing some manual work in my other space. If it hasn't been fixed in the current dev version, I'll try to raise an issue on github. It would be best if it didn't happen again...😅
Edit:
The above bug does not seem to be reproducible after all. It appears to have been one of Diffusers' usual transient fits.
That's probably why DiffuseCraft's default is Euler instead of Euler a...
I changed it because Euler is the one most tested with Diffusers.
I feel like the scheduler area of Diffusers hasn't been debugged much, except for the default settings of the scheduler that everyone uses frequently.
Yes, it's mostly with the schedule types Exponential and Beta.
I've never encountered any problems with this via stablepy, which probably uses lpw_ by default
Initially, I had planned to use lpw_, but as I progressed, I decided to explore two different approaches. One is based on compel and the other on a1111.
After going through the code, I noticed that clip skip is already set to layer -2 by default in diffusers, so there's no need to adjust it
https://github.com/huggingface/diffusers/blob/76b7d86a9a5c0c2186efa09c4a67b5f5666ac9e3/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py#L417
The same is the case for lpw
https://github.com/huggingface/diffusers/blob/76b7d86a9a5c0c2186efa09c4a67b5f5666ac9e3/examples/community/lpw_stable_diffusion_xl.py#L395
In stablepy, I also disabled this option and set it to -2 by default for the Classic-variants, but with Compel and Classic, I left the option selectable. Anyway, it's a good idea to leave Clip Skip active with SDXL.
I see. We had no problems with the SDXL standard pipeline except for the 77 token limit, which has a workaround, and the A1111 style prompt emphasis... I was wondering why the SDXL Pony models did not have major problems.🤔
There are many individual things that I personally could probably do and would like to do, but with the current chaos of SD3.5- and Flux-generation architectures, I'm not sure where to start. If a bug appears in front of me I'll squash it, but beyond that...
I'll go with Flux for now.
By the way, do you have any space to convert Flux models to Diffusers? 😃
Aside from the architecture, the SD3.5 model seems to be a bit of a mess, so it looks like it would be best to wait until the version is updated a little more or until the Animagine team releases the models they are training on.
space to convert Flux models to Diffusers?
These two barely work, but because the CPU space specs for HF are hopelessly lacking, even with float8, it only succeeds half the time.😅
I recommend duplicating it and loading it into Zero GPU space when you need it to run. I do this, so there is no need to change the code.
https://huggingface.co/spaces/John6666/safetensors_to_diffusers
https://huggingface.co/spaces/John6666/flux-to-diffusers-test
Thanks so much! It’s working really well.
Please be careful, as it is a Frankenstein machine that only converts the included state_dict and copies the rest.
If you don't mix dev and schnell, it should work. Probably.🙄
Edit:
I remembered that I had implemented a request to be able to directly load a single DiT file, since most people only upload DiTs.
This is possible within the Diffusers function, so there may be a way to implement this in stablepy.
https://huggingface.co/spaces/John6666/flux-lora-the-explorer/blob/main/app.py#L112
I remembered that I had implemented a request to be able to directly load a single DiT file, since most people only upload DiTs.
This is possible within the Diffusers function, so there may be a way to implement this in stablepy.
https://huggingface.co/spaces/John6666/flux-lora-the-explorer/blob/main/app.py#L112
Thanks, I’ll take a look at the code.
The code for the loading part is effectively just one line; the only difference is that, unlike from_pretrained, we need to explicitly download the file first.
There is also from_single_file, but that function only works with whole-model safetensors files, and it's not useful in the current real FLUX ecosystem; not many people can easily upload 35GB files. Ideally, the from_single_file function would include logic to fill in any missing components from another repo, but that seems unlikely to happen because sayakpaul is not interested in doing it. The same will probably apply to SD3.5 Large and other next-generation models in the near future.
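Roughly, the approach looks like this (repo and file names are placeholders; the real code is in the linked app.py):

import torch
from huggingface_hub import hf_hub_download
from diffusers import FluxPipeline, FluxTransformer2DModel

# Download only the DiT safetensors, then borrow the remaining components
# (text encoders, VAE, scheduler) from the base repo.
dit_path = hf_hub_download(repo_id="someuser/flux-finetune", filename="transformer.safetensors")
transformer = FluxTransformer2DModel.from_single_file(dit_path, torch_dtype=torch.bfloat16)
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", transformer=transformer, torch_dtype=torch.bfloat16
)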
And this is important, or rather a piece of trivia: when using it within HF Spaces, hf_hub_download works MUCH faster than aria2c. It seems the transfer happens internally.
Also, of course, the handling of HF tokens is perfect. As a side note, this is also the reason why I used to separate HF_TOKEN and HF_READ_TOKEN; at the time, aria2c was failing to download from HF private repos. If I were more experienced with HF, I would just commit to DiffuseCraft, but at the time I was still in the practice stage.😅 That's still the case, but I may have recovered enough to be of some use in fixing easy bugs.
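For reference, the basic call (the repo and file names are just examples):

import os
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="stabilityai/stable-diffusion-xl-base-1.0",
    filename="vae/diffusion_pytorch_model.safetensors",
    token=os.environ.get("HF_READ_TOKEN"),  # needed for private or gated repos
)
print(path)  # local cached path inside the Space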
It is not clear whether it is better to incorporate this into stablepy or whether it should be limited to DiffuseCraft to avoid a dependency on the huggingface_hub library. However, it seems that there is a great deal of demand from users anyway.
Also, it may be even faster if you preload_from_hub the FLUX model components other than the DiT in DiffuseCraft.
It's possible that I've just overlooked it, but I feel like the number of configurable items in README.md has increased a lot recently.
https://huggingface.co/docs/hub/spaces-config-reference
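For example, per the config reference, something like this in the Space's README.md front matter (the repo and file names below are placeholders):

preload_from_hub:
  - madebyollin/sdxl-vae-fp16-fix
  - some-org/flux-base text_encoder/model.safetensors,vae/diffusion_pytorch_model.safetensors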
And this is important, or rather a piece of trivia: when using it within HF Spaces, hf_hub_download works MUCH faster than aria2c. It seems the transfer happens internally.
Also, of course, the handling of HF tokens is perfect. As a side note, this is also the reason why I used to separate HF_TOKEN and HF_READ_TOKEN; at the time, aria2c was failing to download from HF private repos. If I were more experienced with HF, I would just commit to DiffuseCraft, but at the time I was still in the practice stage.😅 That's still the case, but I may have recovered enough to be of some use in fixing easy bugs.
It is not clear whether it is better to incorporate this into stablepy or whether it should be limited to DiffuseCraft to avoid a dependency on the huggingface_hub library. However, it seems that there is a great deal of demand from users anyway.
Switching to hf_hub_download sounds perfect if it helps, any changes to improve things are more than welcome
Also, it may be even faster if you preload_from_hub the FLUX model components other than the DiT in DiffuseCraft.
It's possible that I've just overlooked it, but I feel like the number of configurable items in README.md has increased a lot recently.
https://huggingface.co/docs/hub/spaces-config-reference
I’ve been testing it, and it works well overall. However, I did notice that when more storage is used, the active RAM increases.
Also, when I delete a model from storage at a certain point, the used RAM decreases, which could potentially help prevent crashes when RAM runs low, though I haven’t tested it thoroughly
I did notice that when more storage is used, the active RAM increases
Oh...
Switching to hf_hub_download sounds perfect if it helps
Perfect inside HF Spaces, but aria2c is probably faster from the outside. I didn't benchmark it; it's just how it feels.
Also, hf_transfer, which is probably used inside from_pretrained, is probably the fastest inside HF Spaces, but it doesn't seem to be a function for civilians. Perhaps it's meant for internal parts of the library?
For me, hf_hub_download, snapshot_download, aria2c, and my Firefox are sufficient, so I've never used it...
https://github.com/huggingface/hf_transfer
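If anyone wants to try it anyway, my understanding is that it only takes the package plus one environment variable (I haven't benchmarked this myself):

# requirements.txt: hf_transfer
import os
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"  # set before the first download call

from huggingface_hub import hf_hub_download
path = hf_hub_download("madebyollin/sdxl-vae-fp16-fix", "diffusion_pytorch_model.safetensors")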
Perfect inside HF Spaces, but aria2c is probably faster from the outside. I didn't benchmark it; it's just how it feels.
Also, hf_transfer, which is probably used inside from_pretrained, is probably the fastest inside HF Spaces, but it doesn't seem to be a function for civilians. Perhaps it's meant for internal parts of the library?
For me, hf_hub_download, snapshot_download, aria2c, and my Firefox are sufficient, so I've never used it...
https://github.com/huggingface/hf_transfer
Thanks, I'll definitely change this if it helps speed up the download time
I made a few adjustments to load those Flux model files based on your space, and it’s working now
If the programmer is willing to do the work, there's no problem with the program being fast.😀
BTW, thanks to utils.py and constants.py, maintenance has become easier. I can complete the work almost entirely with the mouse.
Maybe you already know this from github, but there was an announcement of a major update to LoRA. Well we are not bothered by PEFT.
https://discord.com/channels/879548962464493619/1014557141132132392/1308350211323592786