segmind/SSD-1B · Hugging Face

2023-10-25 03:07:11


Demo

Try out the model at Segmind SSD-1B for ⚡ fastest inference. You can also try it on 🤗 Spaces.



Model Description

The Segmind Stable Diffusion Model (SSD-1B) is a distilled, 50% smaller version of Stable Diffusion XL (SDXL), offering a 60% speedup while maintaining high-quality text-to-image generation capabilities. It has been trained on diverse datasets, including Grit and Midjourney scrape data, to enhance its ability to create a wide range of visual content based on textual prompts.

This model employs a knowledge distillation strategy, in which it leverages the teachings of several expert models in succession, including SDXL, ZavyChromaXL, and JuggernautXL, to combine their strengths and produce impressive visual outputs.
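As a rough illustration of what output-level knowledge distillation can look like, the sketch below blends a ground-truth loss with a loss against a teacher's prediction. It is not Segmind's training code: the function names, the `alpha` weighting, the use of mean squared error, and the toy numbers are all assumptions chosen for illustration, and plain Python lists stand in for tensors.

```python
from typing import Sequence


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equally sized vectors."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student_pred, teacher_pred, target, alpha=0.5):
    """Blend the task loss with a loss that pulls the student toward the teacher.

    alpha = 1.0 ignores the teacher; alpha = 0.0 ignores the ground truth.
    The weighting scheme is hypothetical, not taken from the model card.
    """
    task_loss = mse(student_pred, target)
    teacher_loss = mse(student_pred, teacher_pred)
    return alpha * task_loss + (1.0 - alpha) * teacher_loss


# Toy predictions for a single training example (values are made up).
target = [0.0, 1.0, 0.0]
student = [0.5, 0.5, 0.5]

# "In succession": one teacher per training stage.
teachers = {
    "SDXL": [0.1, 0.9, 0.1],
    "ZavyChromaXL": [0.2, 0.8, 0.0],
    "JuggernautXL": [0.0, 0.9, 0.2],
}
for name, teacher_pred in teachers.items():
    loss = distillation_loss(student, teacher_pred, target)
    print(f"{name}: loss = {loss:.4f}")
```

In a real training loop the student would be updated between stages, so each teacher would see a different student; the loop here only shows how the same loss can be reused with a different teacher each time.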

Special thanks to the HF team 🤗, especially Sayak, Patrick and Poli, for their collaboration and guidance on this work.



Image Comparison (SDXL-1.0 vs SSD-1B)

[Figure: image comparison of SDXL-1.0 and SSD-1B outputs]



Usage:

This model can be used via the 🧨 Diffusers library.

Make sure to install diffusers from source by running

pip install git+https://github.com/huggingface/diffusers

In addition, please install transformers, safetensors and accelerate:

pip install transformers accelerate safetensors
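Before loading the model, it can help to confirm that the four packages named above are actually importable in the current environment. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is made up for this example.

```python
from importlib import metadata

# The packages named in the install instructions above.
REQUIRED_PACKAGES = ["diffusers", "transformers", "accelerate", "safetensors"]


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(REQUIRED_PACKAGES).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```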

To use the model, you can run the following:

from diffusers import StableDiffusionXLPipeline
import torch

# Load the fp16 weights and move the pipeline to the GPU.
pipe = StableDiffusionXLPipeline.from_pretrained("segmind/SSD-1B", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
pipe.to("cuda")

prompt = "An astronaut riding a green horse"
neg_prompt = "ugly, blurry, poor quality"
image = pipe(prompt=prompt, negative_prompt=neg_prompt).images[0]



Please do use negative prompting, and a CFG around 9.0 for the best quality!
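One way to apply this advice is to pass the CFG value through the pipeline's `guidance_scale` argument together with a negative prompt. The sketch below assumes a CUDA GPU and network access to download the weights; the helper names and the output file name are placeholders, not part of the model card.

```python
def build_generation_kwargs(prompt, negative_prompt, guidance_scale=9.0):
    """Collect the pipeline call arguments in one place."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "guidance_scale": guidance_scale,  # CFG scale, around 9.0 as advised
    }


def generate(output_path="astronaut.png"):
    """Load SSD-1B and save one image. Needs a CUDA GPU and the model weights."""
    # Imported here so the helper above can be used without a GPU setup.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "segmind/SSD-1B",
        torch_dtype=torch.float16,
        use_safetensors=True,
        variant="fp16",
    )
    pipe.to("cuda")

    kwargs = build_generation_kwargs(
        prompt="An astronaut riding a green horse",
        negative_prompt="ugly, blurry, poor quality",
    )
    image = pipe(**kwargs).images[0]
    image.save(output_path)  # placeholder file name
    return output_path
```

Calling `generate()` on a suitable machine writes a single PNG; nothing heavy runs until that call is made.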



Model Description



Key Features

  • Text-to-Image Generation: The model excels at generating images from text prompts, enabling a wide range of creative applications.

  • Distilled for Speed: Designed for efficiency, this model offers a 60% speedup, making it a practical choice for real-time applications and scenarios where rapid image generation is essential.

  • Diverse Training Data: Trained on diverse datasets, the model can handle a variety of textual prompts and generate corresponding images effectively.

  • Knowledge Distillation: By distilling knowledge from several expert models, the Segmind Stable Diffusion Model combines their strengths and minimizes their limitations, resulting in improved performance.



Model Architecture

The SSD-1B Model is a 1.3B parameter model which has several layers removed from the base SDXL model.
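The parameter count can be checked directly. The text does not say which component the 1.3B figure covers; the sketch below assumes it refers to the UNet and counts only that component. The helper names are made up for this example, and running `unet_parameter_count` downloads the UNet weights.

```python
def count_parameters(module):
    """Total number of elements across all parameters of a torch module."""
    return sum(p.numel() for p in module.parameters())


def unet_parameter_count(model_id="segmind/SSD-1B"):
    """Download the UNet weights for model_id and count its parameters."""
    # Imported lazily: both packages are only needed when weights are loaded.
    import torch
    from diffusers import UNet2DConditionModel

    unet = UNet2DConditionModel.from_pretrained(
        model_id,
        subfolder="unet",
        torch_dtype=torch.float16,
        use_safetensors=True,
        variant="fp16",
    )
    return count_parameters(unet)
```

Calling the same function with the base SDXL repository id (for example `stabilityai/stable-diffusion-xl-base-1.0`) gives a figure to compare against the "50% smaller" claim.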



Training info

These are the key hyperparameters used during training:

  • Steps: 251000
  • Learning rate: 1e-5
  • Batch size: 32
  • Gradient accumulation steps: 4
  • Image resolution: 1024
  • Mixed-precision: fp16
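These numbers imply an effective batch size and a rough count of training images processed. The arithmetic below rests on two assumptions the text does not confirm: that "batch size" is the number of images per forward/backward pass across all devices, and that "steps" counts optimizer updates rather than micro-batches.

```python
steps = 251_000
batch_size = 32
gradient_accumulation_steps = 4

# Gradients from several micro-batches are summed before each optimizer update,
# so one update sees batch_size * gradient_accumulation_steps images.
effective_batch_size = batch_size * gradient_accumulation_steps

# Estimate of images processed over the run, under the assumptions above.
images_seen = steps * effective_batch_size

print(f"effective batch size: {effective_batch_size}")
print(f"images processed (estimate): {images_seen:,}")
```

If the batch size were per device, or the step count included accumulation micro-steps, the totals would scale accordingly.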



Speed Comparison

We have observed that SSD-1B is up to 60% faster than the base SDXL model. Below is a comparison on an A100 80GB.

[Figure: speed comparison of SSD-1B and SDXL on an A100 80GB]

Below are the speedup metrics on an RTX 4090 GPU.

[Figure: speedup metrics for SSD-1B on an RTX 4090]
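To reproduce a comparison like this on other hardware, a small timing harness is enough. The sketch below is a generic benchmark helper, not the procedure used for the figures above. How "60% faster" is defined is not stated; `percent_faster` assumes it means throughput gain (baseline time divided by candidate time, minus one).

```python
import time


def mean_latency(fn, runs=5, warmup=1):
    """Average wall-clock seconds per call of fn, after warm-up calls."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


def percent_faster(baseline_seconds, candidate_seconds):
    """Throughput gain of the candidate over the baseline, in percent."""
    return (baseline_seconds / candidate_seconds - 1.0) * 100.0
```

Here `fn` would be a zero-argument callable that runs one full generation and returns only once the image is ready, for example a lambda wrapping a pipeline call with a fixed prompt and step count, measured once per model.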




Model Sources

For research and development purposes, the SSD-1B Model can be accessed via the Segmind AI platform. For more information and access details, please visit Segmind.



Uses



Direct Use

The Segmind Stable Diffusion Model is suitable for research and practical applications in various domains, including:

  • Art and Design: It can be used to generate artworks, designs, and other creative content, providing inspiration and enhancing the creative process.

  • Education: The model can be applied in educational tools to create visual content for teaching and learning purposes.

  • Research: Researchers can use the model to explore generative models, evaluate its performance, and push the boundaries of text-to-image generation.

  • Safe Content Generation: It offers a safe and controlled way to generate content, reducing the risk of harmful or inappropriate outputs.

  • Bias and Limitation Analysis: Researchers and developers can use the model to probe its limitations and biases, contributing to a better understanding of generative models' behavior.



Downstream Use

The Segmind Stable Diffusion Model can also be used directly with the 🧨 Diffusers library training scripts for further training, including:

# LoRA fine-tuning
export MODEL_NAME="segmind/SSD-1B"
export VAE_NAME="madebyollin/sdxl-vae-fp16-fix"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"

accelerate launch train_text_to_image_lora_sdxl.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --pretrained_vae_model_name_or_path=$VAE_NAME \
  --dataset_name=$DATASET_NAME --caption_column="text" \
  --resolution=1024 --random_flip \
  --train_batch_size=1 \
  --num_train_epochs=2 --checkpointing_steps=500 \
  --learning_rate=1e-04 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --mixed_precision="fp16" \
  --seed=42 \
  --output_dir="sd-pokemon-model-lora-sdxl" \
  --validation_prompt="cute dragon creature" --report_to="wandb" \
  --push_to_hub
# Full fine-tuning
export MODEL_NAME="segmind/SSD-1B"
export VAE_NAME="madebyollin/sdxl-vae-fp16-fix"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"

accelerate launch train_text_to_image_sdxl.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --pretrained_vae_model_name_or_path=$VAE_NAME \
  --dataset_name=$DATASET_NAME \
  --enable_xformers_memory_efficient_attention \
  --resolution=512 --center_crop --random_flip \
  --proportion_empty_prompts=0.2 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 --gradient_checkpointing \
  --max_train_steps=10000 \
  --use_8bit_adam \
  --learning_rate=1e-06 --lr_scheduler="constant" --lr_warmup_steps=0 \
  --mixed_precision="fp16" \
  --report_to="wandb" \
  --validation_prompt="a cute Sundar Pichai creature" --validation_epochs 5 \
  --checkpointing_steps=5000 \
  --output_dir="sdxl-pokemon-model" \
  --push_to_hub
# DreamBooth LoRA
export MODEL_NAME="segmind/SSD-1B"
export INSTANCE_DIR="dog"
export OUTPUT_DIR="lora-trained-xl"
export VAE_PATH="madebyollin/sdxl-vae-fp16-fix"

accelerate launch train_dreambooth_lora_sdxl.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --pretrained_vae_model_name_or_path=$VAE_PATH \
  --output_dir=$OUTPUT_DIR \
  --mixed_precision="fp16" \
  --instance_prompt="a photo of sks dog" \
  --resolution=1024 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --learning_rate=1e-5 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=25 \
  --seed="0" \
  --push_to_hub
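Once a LoRA run has finished, the resulting weights can be loaded on top of the base model for inference. The sketch below assumes the weights were saved to the `lora-trained-xl` output directory used above and that a CUDA GPU is available; the function name is made up for this example.

```python
def load_pipeline_with_lora(lora_path="lora-trained-xl", base_model="segmind/SSD-1B"):
    """Load the base pipeline and apply LoRA weights from lora_path."""
    # Imported lazily so defining this helper does not require a GPU setup.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        base_model,
        torch_dtype=torch.float16,
        use_safetensors=True,
        variant="fp16",
    )
    # lora_path may be a local directory or a Hub repository id.
    pipe.load_lora_weights(lora_path)
    pipe.to("cuda")
    return pipe
```

The returned pipeline is called the same way as the base model, typically with a prompt that contains the instance phrase used during training.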



Out-of-Scope Use

The SSD-1B Model is not suitable for creating factual or accurate representations of people, events, or real-world information. It is not intended for tasks requiring high precision and accuracy.



Limitations and Bias

The SSD-1B Model has some challenges in embodying absolute photorealism, especially in human depictions. While it grapples with incorporating clear text and maintaining the fidelity of complex compositions due to its autoencoding approach, these hurdles pave the way for future enhancements. Importantly, the model's exposure to a diverse dataset, though not a panacea for ingrained societal and digital biases, represents a foundational step towards more equitable technology. Users are encouraged to interact with this pioneering tool with an understanding of its current limitations, fostering an environment of conscious engagement and anticipation for its continued evolution.


2022 Blinking Robots.