
Common VAEs in Diffusion Models and How to Train Them (the VAE in Stable Diffusion)

kl-f8-VAE

The Latent Diffusion Models repository ships several KL-regularized autoencoders (kl-f8, kl-f4, ...). These VAEs can be pretrained on your own dataset:

Loss functions used: L1 + LPIPS

Repository: GitHub - CompVis/latent-diffusion: High-Resolution Image Synthesis with Latent Diffusion Models (https://github.com/CompVis/latent-diffusion)
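For reference, below is a minimal sketch of an L1 + LPIPS reconstruction loss, assuming the standalone `lpips` pip package as the perceptual metric. This is not the repository's implementation (which lives under `ldm/modules/losses` and also adds KL regularization and an adversarial term); it only illustrates the reconstruction part named above.

```python
# Minimal sketch (not the latent-diffusion repo's code) of an
# L1 + LPIPS reconstruction loss, using the `lpips` package.
import torch
import lpips  # pip install lpips

perceptual = lpips.LPIPS(net="vgg").eval()  # VGG-based perceptual distance

def reconstruction_loss(x: torch.Tensor, x_rec: torch.Tensor,
                        lpips_weight: float = 1.0) -> torch.Tensor:
    """x, x_rec: image batches scaled to [-1, 1], shape (B, 3, H, W)."""
    l1 = torch.abs(x - x_rec).mean()
    p = perceptual(x, x_rec).mean()
    return l1 + lpips_weight * p
```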

f8-ft-EMA / f8-ft-MSE

No training code has been released for these...

How they differ from kl-f8-VAE:

kl-f8-VAE was trained on ImageNet, whereas f8-ft-EMA / f8-ft-MSE were fine-tuned to improve Stable Diffusion's reconstruction of human faces.

1). sd-vae-ft-ema

- Trained on LAION-Aesthetics + LAION-Humans: the first, ft-EMA, was resumed from the original checkpoint, trained for 313k steps, and uses EMA weights. It uses the same loss configuration as the original checkpoint (L1 + LPIPS).

stabilityai/sd-vae-ft-ema (https://huggingface.co/stabilityai/sd-vae-ft-ema)
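Following the usage shown on the model card, the fine-tuned VAE can be dropped into a Stable Diffusion pipeline with diffusers; the base model id below is only an example:

```python
# Swap the fine-tuned VAE into a Stable Diffusion pipeline (diffusers).
from diffusers import AutoencoderKL, StableDiffusionPipeline

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-ema")
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",  # example base model
    vae=vae,
)
image = pipe("a photo of an astronaut riding a horse").images[0]
```

sd-vae-ft-mse loads the same way; only the repository name changes.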

2). sd-vae-ft-mse

- Continued training on the same dataset, but in a way that makes the outputs smoother: the second, ft-MSE, was resumed from ft-EMA, also uses EMA weights, and was trained for another 280k steps with a different loss that places more emphasis on MSE reconstruction (MSE + 0.1 * LPIPS). It produces somewhat "smoother" outputs. The batch size for both versions was 192 (16 A100s, batch size 12 per GPU).

stabilityai/sd-vae-ft-mse (https://huggingface.co/stabilityai/sd-vae-ft-mse)

The model cards linked above include side-by-side comparisons of the two VAEs when used for image generation. In practice, ft-EMA tends to give sharper outputs and ft-MSE smoother ones.
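To compare them on your own images, one simple check is to encode and decode the same picture with each VAE and inspect the reconstructions; the sketch below assumes a placeholder file path and a 512x512 working resolution.

```python
# Reconstruct one image with both fine-tuned VAEs to compare
# "sharper" (ft-EMA) vs. "smoother" (ft-MSE) behaviour.
import torch
from diffusers import AutoencoderKL
from diffusers.utils import load_image
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize((512, 512)),
    transforms.ToTensor(),                      # [0, 1]
])

img = load_image("face.png")                    # placeholder path
x = preprocess(img).unsqueeze(0) * 2.0 - 1.0    # scale to [-1, 1]

for repo in ("stabilityai/sd-vae-ft-ema", "stabilityai/sd-vae-ft-mse"):
    vae = AutoencoderKL.from_pretrained(repo).eval()
    with torch.no_grad():
        latents = vae.encode(x).latent_dist.sample()
        rec = vae.decode(latents).sample        # reconstruction in [-1, 1]
    print(repo, "mean abs reconstruction error:",
          torch.abs(x - rec).mean().item())
```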
