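In the text-to-image field, Midjourney can be seen as something of a toy, whereas Stable Diffusion is known for being stable, controllable, and efficient, and has long been regarded as one of the text-to-image models closest to a practical tool.
Stability AI has announced Stable Diffusion 3, billing it as one of its most capable text-to-image models. The model uses a diffusion transformer architecture and shows markedly better performance on multi-subject prompts, image quality, and spelling. Compared with Stable Diffusion 2, Stable Diffusion 3 is noticeably stronger in semantic understanding of the prompt, color saturation, composition, resolution, range of styles, texture, and contrast, putting it on par with the closed-source Midjourney.
The models in this release range from 800 million to 8 billion parameters. This suggests that the smaller variants of Stable Diffusion 3 may be optimized for mobile devices, with lower compute cost and faster inference. If so, that would make Stable Diffusion 3 more practical and better suited to mobile applications.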
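To put the 800 million to 8 billion range in perspective, here is a back-of-the-envelope estimate of the memory needed just to hold the weights. It is a minimal sketch: the function name `weight_memory_gib` and the precisions listed are illustrative assumptions, not published Stable Diffusion 3 specifications, and the estimate ignores activations and any auxiliary components such as text encoders.

```python
# Rough weight-memory estimate: number of parameters x bytes per parameter.
# The precisions below are illustrative assumptions, not published SD3 specs.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: int, precision: str) -> float:
    """Approximate memory needed to store the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 1024**3


if __name__ == "__main__":
    # The two ends of the parameter range quoted in the article.
    for label, n in [("800M", 800_000_000), ("8B", 8_000_000_000)]:
        for precision in BYTES_PER_PARAM:
            gib = weight_memory_gib(n, precision)
            print(f"{label} @ {precision}: {gib:.2f} GiB")
```

Under these assumptions the smallest variant needs roughly 1.5 GiB for its weights at 16-bit precision, while the largest needs about ten times that, which is why only the small end of the range is a plausible fit for phones.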
Test link: https://stability.ai/stablediffusion3
The sample images on the Stability AI website give a sense of how well Stable Diffusion 3 renders text.
Prompt: Epic anime artwork of a wizard atop a mountain at night casting a cosmic spell into the dark sky that says “Stable Diffusion 3” made out of colorful energy.
Prompt: cinematic photo of a red apple on a table in a classroom, on the blackboard are the words “go big or go home” written in chalk.
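The images below, taken from the official site, show neon-style street signs and bus display boards. The text in them is sharp and free of spelling mistakes:
A prompt can include several subjects, a wide variety of objects, and even a watermark: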
Prompt: a painting of an astronaut riding a pig wearing a tutu holding a pink umbrella, on the ground next to the pig is a robin bird wearing a top hat, in the corner are the words “stable diffusion”.
Close-up photo of a chameleon:
Prompt: studio photograph closeup of a chameleon over a black background.
Prompt: Trees photographed under the Milky Way, the moon and twilight shine on the Valley. The full moon appears high in the sky and the twilight glow can still be seen.
Prompt: Photo of an 90’s desktop computer on a work desk, on the computer screen it says “welcome”. On the wall in the background we see beautiful graffiti with the text “SD3” very large on the wall.
Prompt: Night photo of a sports car with the text “SD3” on the side, the car is on a race track at high speed, a huge road sign with the text “faster”.
Prompt: A horse balancing on top of a colorful ball in a field with green grass and a mountain in the background.
Prompt: Wide photo of a shipwreck on the beach, lots of rust and moss on the ship contrasting with the beautiful blue of the ocean water and the peace that the beauty of nature conveys. The big waves are magnificent and touch the ship.
Prompt: Photo of a red sphere on top of a blue cube. Behind them is a green triangle, on the right is a dog, on the left is a cat.
Prompt: Three transparent glass bottles on a wooden table. The one on the left has red liquid and the number 1. The one in the middle has blue liquid and the number 2. The one on the right has green liquid and the number 3.
Prompt: Anime style illustration of a newsstand on top of a small grassy hill, on top of the newsstand we see the text “it’s here!”. In the background we see a big rain approaching.
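These comparisons show that Stable Diffusion 3 already follows prompt content well and is beginning to show some understanding of the physical world. In the horse image, for example, the horse is clearly standing on the ball and the ball deforms under its weight, which suggests the model can grasp and depict interactions between objects.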