Stability AI Releases SDXL 0.9 Photorealistic Text-to-Image Generative AI Model
Synthetic media startup Stability AI has released SDXL 0.9, the complete version of the Stable Diffusion XL model that began beta testing in April. The new model is designed to improve Stable Diffusion’s photorealistic rendering and image composition, with an eye toward enterprise use of the generative AI model.
SDXL 0.9 gives the Stable Diffusion model more capacity and uses it to add realistic detail to the images it generates from text prompts. That includes spelling words correctly and rendering the fine details of faces and hands, which generative AI image models often struggle to depict. The AI is also better at extrapolating from prompts without introducing unusably strange additions to the photographic composition. Stability AI credits the improvements to the model’s much larger parameter count compared with the beta. The initial test version had 3.1 billion parameters and relied on a single model instead of the two CLIP models employed by the new version.
“Despite its ability to be run on a modern consumer GPU, SDXL 0.9 presents a leap in creative use cases for generative AI imagery. The ability to generate hyper-realistic creations for films, television, music, and instructional videos, as well as offering advancements for design and industrial use, places SDXL at the forefront of real world applications for AI imagery,” Stability AI wrote in its announcement. “SDXL 0.9 has one of the largest parameter counts of any open source image model, boasting a 3.5B parameter base model and a 6.6B parameter model ensemble pipeline (the final output is created by running on two models and aggregating the results). The second stage model of the pipeline is used to add finer details to the generated output of the first stage.”
SDXL 0.9 is available on ClipDrop, the app Stability AI bought last year, with an API in the works so companies can embed the tool in their existing software. The company has been rapidly expanding its portfolio and improving its tech since raising $101 million in October. Recent releases include the cartoon-making Stable Animation SDK and the DeepFloyd IF image generator, which doesn’t use the Stable Diffusion model. Stability has also pushed into non-visual generative AI, releasing its StableLM large language model, which is capable of composing text and computer code.