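Having a simple way to run inference is essential. A diffusion system usually consists of several components, such as a parameterized model, tokenizers, and schedulers, that interact in complex ways. That is why DiffusionPipeline was designed: it wraps the complexity of the whole diffusion system in an easy-to-use API while staying flexible enough for other use cases, such as loading each component individually as a building block to assemble your own diffusion system.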
1.Diffusion Pipeline
DiffusionPipeline is the simplest and most generic way to load a diffusion model.
- from diffusers import DiffusionPipeline
-
- repo_id = "runwayml/stable-diffusion-v1-5"
- pipe = DiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
You can also load the checkpoint with a specific pipeline class:
- from diffusers import StableDiffusionPipeline
-
- repo_id = "runwayml/stable-diffusion-v1-5"
- pipe = StableDiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
Community pipelines are DiffusionPipeline classes whose implementation differs from the original one described in their paper, for example StableDiffusionControlNetPipeline.
1.1 local pipeline
- from diffusers import DiffusionPipeline
-
- repo_id = "./stable-diffusion-v1-5" # local path
- stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
When from_pretrained() detects a local path, it does not download anything.
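As a rough illustration of that behavior, the sketch below decides whether an argument should be treated as a local folder or as a Hub repo id. `resolve_source` is a hypothetical helper written for this note, not part of diffusers, and the real loader's logic may well differ.

```python
import os
import tempfile


def resolve_source(name_or_path: str) -> str:
    """Hypothetical helper: classify a from_pretrained-style argument.

    Returns "local" when the argument is an existing directory,
    otherwise "hub" (meaning it would be treated as a repo id).
    """
    return "local" if os.path.isdir(name_or_path) else "hub"


# An existing directory is treated as a local checkpoint folder.
with tempfile.TemporaryDirectory() as local_dir:
    local_kind = resolve_source(local_dir)

# A repo id that is not a directory on disk falls back to the Hub.
hub_kind = resolve_source("runwayml/stable-diffusion-v1-5")
print(local_kind, hub_kind)
```

The point is only that the decision can be made from the filesystem alone, before any network request is attempted.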
1.2 swap components in a pipeline
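You can replace the default component of any pipeline with another compatible component. Customization matters because, for example, different schedulers trade denoising speed against denoising quality. The compatibles attribute of a scheduler lists the schedulers that can be swapped in: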
- from diffusers import DiffusionPipeline
-
- repo_id = "runwayml/stable-diffusion-v1-5"
- stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, use_safetensors=True)
- stable_diffusion.scheduler.compatibles
To swap the scheduler, load it from the "scheduler" subfolder and pass it to from_pretrained():
- from diffusers import DiffusionPipeline, EulerDiscreteScheduler, DPMSolverMultistepScheduler
-
- repo_id = "runwayml/stable-diffusion-v1-5"
- scheduler = EulerDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
- stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, scheduler=scheduler, use_safetensors=True)
This replaces the default PNDMScheduler with EulerDiscreteScheduler, which is then passed back into DiffusionPipeline.
1.3 safety checker
The safety checker screens generated outputs against known NSFW content. To disable it, pass safety_checker=None:
- from diffusers import DiffusionPipeline
-
- repo_id = "runwayml/stable-diffusion-v1-5"
- stable_diffusion = DiffusionPipeline.from_pretrained(repo_id, safety_checker=None, use_safetensors=True)
1.4 reuse components across pipelines
The same components can be reused across multiple pipelines to avoid loading the weights into RAM twice:
- from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline
-
- model_id = "runwayml/stable-diffusion-v1-5"
- stable_diffusion_txt2img = StableDiffusionPipeline.from_pretrained(model_id, use_safetensors=True)
-
- components = stable_diffusion_txt2img.components
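Pass the components to another pipeline without reloading the weights into RAM:
- stable_diffusion_img2img = StableDiffusionImg2ImgPipeline(**components)
The approach below is more flexible, because you choose exactly which components to reuse or disable: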
- from diffusers import StableDiffusionPipeline, StableDiffusionImg2ImgPipeline
-
- model_id = "runwayml/stable-diffusion-v1-5"
- stable_diffusion_txt2img = StableDiffusionPipeline.from_pretrained(model_id, use_safetensors=True)
- stable_diffusion_img2img = StableDiffusionImg2ImgPipeline(
- vae=stable_diffusion_txt2img.vae,
- text_encoder=stable_diffusion_txt2img.text_encoder,
- tokenizer=stable_diffusion_txt2img.tokenizer,
- unet=stable_diffusion_txt2img.unet,
- scheduler=stable_diffusion_txt2img.scheduler,
- safety_checker=None,
- feature_extractor=None,
- requires_safety_checker=False,
- )
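Why this avoids a second copy of the weights can be seen in a toy model. The classes below are hypothetical stand-ins, not diffusers classes; they only show that passing components along hands over references to the same objects rather than copies.

```python
class ToyUNet:
    """Stand-in for a large model; the list mimics heavy weights."""

    def __init__(self):
        self.weights = [0.0] * 1000


class ToyTxt2ImgPipeline:
    def __init__(self, unet, scheduler):
        self.unet = unet
        self.scheduler = scheduler

    @property
    def components(self):
        # Constructor arguments as a dict, holding references (not copies).
        return {"unet": self.unet, "scheduler": self.scheduler}


class ToyImg2ImgPipeline:
    def __init__(self, unet, scheduler):
        self.unet = unet
        self.scheduler = scheduler


txt2img = ToyTxt2ImgPipeline(unet=ToyUNet(), scheduler="pndm")
img2img = ToyImg2ImgPipeline(**txt2img.components)

# Both pipelines point at the very same object, so the weights exist once.
shares_unet = img2img.unet is txt2img.unet
print(shares_unet)
```

One consequence of sharing by reference is that a change made to a shared component through one pipeline is visible from the other.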
1.5 checkpoint variants
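A checkpoint variant is typically one of two kinds. The first stores the weights in torch.float16, which halves the memory footprint but cannot be used to continue training. The second holds non-EMA weights, which should not be used for inference; use them to continue fine-tuning a model.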
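The "half the memory" claim follows directly from the bytes used per parameter. The sketch below works through the arithmetic; the parameter count is an illustrative assumption, not a measured value for any particular checkpoint, and file-format overhead is ignored.

```python
# Bytes needed to store one parameter in each floating point type.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def checkpoint_size_bytes(num_params: int, dtype: str) -> int:
    """Raw size of the weights alone, ignoring file-format overhead."""
    return num_params * BYTES_PER_PARAM[dtype]


# Illustrative parameter count (an assumption, not a measured value).
num_params = 1_000_000_000

fp32_bytes = checkpoint_size_bytes(num_params, "float32")
fp16_bytes = checkpoint_size_bytes(num_params, "float16")
print(f"fp32: {fp32_bytes / 1e9:.1f} GB, fp16: {fp16_bytes / 1e9:.1f} GB")
```

In diffusers, a variant is normally selected with the variant argument of from_pretrained(), for example variant="fp16" together with torch_dtype=torch.float16, provided the repository actually ships that variant.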
2. models
- from diffusers import UNet2DConditionModel
-
- repo_id = "runwayml/stable-diffusion-v1-5"
- model = UNet2DConditionModel.from_pretrained(repo_id, subfolder="unet", use_safetensors=True)
When all the weights are stored in a single safetensors file, you can load the model with .from_single_file(). The safetensors format is safe and fast to load.
2.1 load different stable diffusion formats
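.ckpt files can also be loaded with from_single_file(), but it is better to convert them to the Hugging Face diffusers format. You can use the conversion Space provided by the diffusers team, https://huggingface.co/spaces/diffusers/sd-to-diffusers, or run the conversion script locally, for example: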
python ../diffusers/scripts/convert_original_stable_diffusion_to_diffusers.py --checkpoint_path temporalnetv3.ckpt --original_config_file cldm_v15.yaml --dump_path ./ --controlnet
For A1111 LoRA files, diffusers can load the LoRA weights with load_lora_weights():
- from diffusers import DiffusionPipeline, UniPCMultistepScheduler
- import torch
-
- pipeline = DiffusionPipeline.from_pretrained(
- "andite/anything-v4.0", torch_dtype=torch.float16, safety_checker=None
- ).to("cuda")
- pipeline.scheduler = UniPCMultistepScheduler.from_config(pipeline.scheduler.config)
-
- # uncomment to download the safetensor weights
- #!wget https://civitai.com/api/download/models/19998 -O howls_moving_castle.safetensors
-
- pipeline.load_lora_weights(".", weight_name="howls_moving_castle.safetensors")
-
- prompt = "masterpiece, illustration, ultra-detailed, cityscape, san francisco, golden gate bridge, california, bay area, in the snow, beautiful detailed starry sky"
- negative_prompt = "lowres, cropped, worst quality, low quality, normal quality, artifacts, signature, watermark, username, blurry, more than one bridge, bad architecture"
-
- images = pipeline(
- prompt=prompt,
- negative_prompt=negative_prompt,
- width=512,
- height=512,
- num_inference_steps=25,
- num_images_per_prompt=4,
- generator=torch.manual_seed(0),
- ).images
-
- from diffusers.utils import make_image_grid
-
- make_image_grid(images, 2, 2)

3.scheduler
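A scheduler has no parameters and is not trained; it is defined entirely by a configuration file. Loading a scheduler does not consume much memory, and the same configuration file can be used for a variety of schedulers. For example, all of the schedulers below are compatible with StableDiffusionPipeline.
A diffusion pipeline is essentially a collection of a diffusion model and a scheduler that are partly independent of each other. This means parts of the pipeline can be swapped out, and the scheduler is the best example. The diffusion model usually only defines the forward pass from a noisy sample to a less noisy one, whereas the scheduler defines the whole denoising process, including:
How many denoising steps are there? Is the process stochastic or deterministic? Which algorithm is used to find the denoised sample? Schedulers can be quite complex and often trade off denoising speed against denoising quality.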
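To make the first of those questions concrete: one decision a scheduler makes is which of the training timesteps to visit at inference time. The sketch below assumes 1000 training timesteps and simple even spacing; `leading_timesteps` is a name chosen for this note. Real schedulers offer several spacing strategies and add the actual update rule on top, so treat this as a simplification.

```python
def leading_timesteps(num_train_timesteps: int, num_inference_steps: int) -> list:
    """Evenly spaced timesteps, ordered from most noisy to least noisy."""
    if not 0 < num_inference_steps <= num_train_timesteps:
        raise ValueError("num_inference_steps must be in (0, num_train_timesteps]")
    # Distance between two consecutive inference timesteps.
    step_ratio = num_train_timesteps // num_inference_steps
    # Build the ascending grid, then reverse it so denoising starts at high noise.
    return [i * step_ratio for i in range(num_inference_steps)][::-1]


timesteps = leading_timesteps(num_train_timesteps=1000, num_inference_steps=25)
print(timesteps[:3], "...", timesteps[-1])
```

Fewer inference steps mean larger jumps between timesteps, which is where the trade-off between speed and quality comes from.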
- from diffusers import StableDiffusionPipeline
- from diffusers import (
- DDPMScheduler,
- DDIMScheduler,
- PNDMScheduler,
- LMSDiscreteScheduler,
- EulerDiscreteScheduler,
- EulerAncestralDiscreteScheduler,
- DPMSolverMultistepScheduler,
- )
-
- repo_id = "runwayml/stable-diffusion-v1-5"
-
- ddpm = DDPMScheduler.from_pretrained(repo_id, subfolder="scheduler")
- ddim = DDIMScheduler.from_pretrained(repo_id, subfolder="scheduler")
- pndm = PNDMScheduler.from_pretrained(repo_id, subfolder="scheduler")
- lms = LMSDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
- euler_anc = EulerAncestralDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
- euler = EulerDiscreteScheduler.from_pretrained(repo_id, subfolder="scheduler")
- dpm = DPMSolverMultistepScheduler.from_pretrained(repo_id, subfolder="scheduler")
-
- # replace `dpm` with any of `ddpm`, `ddim`, `pndm`, `lms`, `euler_anc`, `euler`
- pipeline = StableDiffusionPipeline.from_pretrained(repo_id, scheduler=dpm, use_safetensors=True)
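The reason one scheduler_config.json can serve several scheduler classes can be sketched with plain dataclasses. Everything below is hypothetical (the `Toy...` classes and the config values are assumptions for illustration, not the diffusers implementation): each class takes the keys it declares from the shared config and ignores the rest.

```python
from dataclasses import dataclass, fields

# Hypothetical shared config, standing in for scheduler_config.json.
shared_config = {
    "num_train_timesteps": 1000,
    "beta_start": 0.00085,
    "beta_end": 0.012,
    "skip_prk_steps": True,  # only meaningful to some schedulers
}


@dataclass
class ToySchedulerBase:
    num_train_timesteps: int = 1000
    beta_start: float = 0.0001
    beta_end: float = 0.02

    @classmethod
    def from_config(cls, config: dict):
        # Keep only the keys this class declares; ignore the rest.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in config.items() if k in known})


@dataclass
class ToyPNDM(ToySchedulerBase):
    skip_prk_steps: bool = False


@dataclass
class ToyEuler(ToySchedulerBase):
    pass


toy_pndm = ToyPNDM.from_config(shared_config)
toy_euler = ToyEuler.from_config(shared_config)
print(toy_pndm)
print(toy_euler)
```

Both toy schedulers end up with the same noise-schedule settings, while the PNDM-specific key is silently dropped for the Euler one.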

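4.DiffusionPipeline explained
As a class method, DiffusionPipeline.from_pretrained() does two things: 1. it downloads the weights required for inference and caches them, usually under the .cache directory; 2. it instantiates the components listed in the cached model_index.json.
For this checkpoint the components are: feature_extractor: CLIPImageProcessor (transformers); safety_checker: StableDiffusionSafetyChecker; scheduler: PNDMScheduler; text_encoder: CLIPTextModel (transformers); tokenizer: CLIPTokenizer (transformers); unet: UNet2DConditionModel; vae: AutoencoderKL.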
- {
- "_class_name": "StableDiffusionPipeline",
- "_diffusers_version": "0.6.0",
- "feature_extractor": [
- "transformers",
- "CLIPImageProcessor"
- ],
- "safety_checker": [
- "stable_diffusion",
- "StableDiffusionSafetyChecker"
- ],
- "scheduler": [
- "diffusers",
- "PNDMScheduler"
- ],
- "text_encoder": [
- "transformers",
- "CLIPTextModel"
- ],
- "tokenizer": [
- "transformers",
- "CLIPTokenizer"
- ],
- "unet": [
- "diffusers",
- "UNet2DConditionModel"
- ],
- "vae": [
- "diffusers",
- "AutoencoderKL"
- ]
- }
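The structure of this file is simple enough to read with the standard library. The sketch below parses a trimmed copy of the JSON above and separates the metadata keys from the component entries; it only illustrates the file layout and is not how diffusers itself instantiates the classes.

```python
import json

# Trimmed copy of the model_index.json shown above.
model_index_text = """
{
  "_class_name": "StableDiffusionPipeline",
  "_diffusers_version": "0.6.0",
  "scheduler": ["diffusers", "PNDMScheduler"],
  "text_encoder": ["transformers", "CLIPTextModel"],
  "unet": ["diffusers", "UNet2DConditionModel"],
  "vae": ["diffusers", "AutoencoderKL"]
}
"""

model_index = json.loads(model_index_text)

# Keys starting with "_" are metadata; the rest map a component name
# to a [library, class name] pair.
pipeline_class = model_index["_class_name"]
components = {
    name: tuple(spec)
    for name, spec in model_index.items()
    if not name.startswith("_")
}

for name, (library, class_name) in components.items():
    print(f"{name}: {class_name} from {library}")
```

Each component name also matches a subfolder of the checkpoint, which is where that component's own config and weights live.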

Below is the folder structure of runwayml/stable-diffusion-v1-5:
- .
- ├── feature_extractor
- │ └── preprocessor_config.json
- ├── model_index.json
- ├── safety_checker
- │ ├── config.json
- │ └── pytorch_model.bin
- ├── scheduler
- │ └── scheduler_config.json
- ├── text_encoder
- │ ├── config.json
- │ └── pytorch_model.bin
- ├── tokenizer
- │ ├── merges.txt
- │ ├── special_tokens_map.json
- │ ├── tokenizer_config.json
- │ └── vocab.json
- ├── unet
- │ ├── config.json
- │ └── diffusion_pytorch_model.bin
- └── vae
- ├── config.json
- └── diffusion_pytorch_model.bin

You can inspect the attributes and configuration of each component:
- pipeline.tokenizer
- CLIPTokenizer(
- name_or_path="/root/.cache/huggingface/hub/models--runwayml--stable-diffusion-v1-5/snapshots/39593d5650112b4cc580433f6b0435385882d819/tokenizer",
- vocab_size=49408,
- model_max_length=77,
- is_fast=False,
- padding_side="right",
- truncation_side="right",
- special_tokens={
- "bos_token": AddedToken("<|startoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
- "eos_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
- "unk_token": AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=True),
- "pad_token": "<|endoftext|>",
- },
- )
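The model_max_length=77, padding_side="right" and pad_token fields in this printout together describe how a prompt is brought to a fixed length. The sketch below is a simplified illustration of that idea, not the CLIPTokenizer implementation: `pad_or_truncate` is a hypothetical helper, the two special-token ids are assumptions, and the prompt ids are made up.

```python
MODEL_MAX_LENGTH = 77  # from the tokenizer printout
BOS_ID = 49406  # assumed id of <|startoftext|>
EOS_ID = 49407  # assumed id of <|endoftext|>, also used for padding


def pad_or_truncate(token_ids: list, max_length: int = MODEL_MAX_LENGTH) -> list:
    """Wrap ids in BOS/EOS, then cut or right-pad to exactly max_length."""
    body = token_ids[: max_length - 2]  # leave room for BOS and EOS
    ids = [BOS_ID] + body + [EOS_ID]
    # Right padding: append the pad token until the target length is reached.
    return ids + [EOS_ID] * (max_length - len(ids))


short_ids = pad_or_truncate([320, 1125, 539])  # hypothetical prompt ids
long_ids = pad_or_truncate(list(range(1000, 1200)))  # 200 ids, too long
print(len(short_ids), len(long_ids))
```

Whatever the prompt length, the text encoder therefore always receives a sequence of the same length.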