SD 3.5 Medium — implementation by Spacelike AI
Multimodal Diffusion Transformer · 2.5 B parameters · Stability AI
SD 3.5 Medium is the mid-size flagship of the SD 3.5 family, released in October 2024. It uses the improved Multimodal Diffusion Transformer (MM-DiT) core first introduced in Stable Diffusion 3, scaled down for efficient inference on consumer hardware while retaining high-quality output at 1-megapixel native resolution.
Compared to SD 3 Medium, the 3.5 series applies QK-normalization and additional training stages to improve prompt adherence, anatomy, and fine detail. It runs on GPUs with as little as 9.9 GB of VRAM.
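As a rough sanity check on the VRAM figure, the weight footprint of the backbone can be estimated from the parameter count alone. The sketch below is back-of-the-envelope arithmetic, not a measurement: the helper name `weight_memory_gb` and the list of precisions are illustrative assumptions, and the estimate ignores activations, the text encoders, and the VAE.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Parameter count taken from the page; precisions are assumed for illustration.
BACKBONE_PARAMS = 2.5e9

estimates = {
    "fp32": weight_memory_gb(BACKBONE_PARAMS, 4),
    "fp16/bf16": weight_memory_gb(BACKBONE_PARAMS, 2),
    "8-bit": weight_memory_gb(BACKBONE_PARAMS, 1),
}

for precision, gigabytes in estimates.items():
    print(f"{precision:>9}: {gigabytes:.1f} GB for backbone weights")
```

At 16-bit precision the backbone weights alone come to about 5 GB, which leaves headroom under the quoted 9.9 GB for activations and other buffers.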
Specification
- Architecture: MM-DiT-X · improved-QK-norm · triple text encoder conditioning
- Parameters: 2.5 B (backbone)
- Training objective: Rectified flow matching
- Native resolution: 1024 × 1024 · up to ~1440 × 1440 supported
- Text encoders: CLIP-L + CLIP-G + T5-XXL (frozen)
- Sampler shown: Euler · 28 steps · cfg 4.5
- License: Stability AI Community License
- Release: October 29, 2024
- Checkpoint: sd3.5_medium.safetensors
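To make the sampler entry concrete, here is a minimal sketch of Euler integration for a rectified-flow model with classifier-free guidance, using the 28 steps and cfg 4.5 listed above. It is not the implementation described on this page: `euler_sample` and `toy_velocity` are hypothetical names, the toy velocity field stands in for the real network, and the uniform time grid is a simplifying assumption (production pipelines typically use a shifted schedule).

```python
from typing import Callable, List

Vector = List[float]
# Signature assumed for illustration: (x, t, conditional) -> velocity
VelocityFn = Callable[[Vector, float, bool], Vector]


def euler_sample(velocity: VelocityFn, noise: Vector,
                 steps: int = 28, cfg: float = 4.5) -> Vector:
    """Integrate dx/dt = v(x, t) from t=1 (noise) to t=0 (sample) with Euler steps."""
    x = list(noise)
    # Uniform grid 1.0 -> 0.0 (assumption; real schedules are often shifted).
    times = [1.0 - i / steps for i in range(steps + 1)]
    for t, t_next in zip(times[:-1], times[1:]):
        v_cond = velocity(x, t, True)
        v_uncond = velocity(x, t, False)
        # Classifier-free guidance: extrapolate from unconditional toward conditional.
        v = [u + cfg * (c - u) for c, u in zip(v_cond, v_uncond)]
        # Euler update; t_next - t is negative because time runs from 1 to 0.
        x = [xi + (t_next - t) * vi for xi, vi in zip(x, v)]
    return x


def toy_velocity(x: Vector, t: float, conditional: bool) -> Vector:
    """Hypothetical stand-in for the network: straight-line velocity toward a fixed target."""
    target = 1.0 if conditional else 0.0
    return [(xi - target) / t for xi in x]


sample = euler_sample(toy_velocity, noise=[0.3, -1.2, 2.0])
print(sample)
```

With this toy field, a guidance scale of 1.0 lands on the conditional target (1.0), while larger scales push the result past it. That is the extrapolation effect that guidance has on a real model's velocity predictions.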
Client: Tenstorrent Inc. — implementation · performance optimization.
implementation · Hugging Face · Vendor announcement
Live sample on the Spacelike AI home page.
Sample images on this page are licensed under CC BY 4.0 — reuse with attribution to Spacelike AI and a link back to spacelike.ai.