SD 3.5 Medium — implementation by Spacelike AI

Multimodal Diffusion Transformer · 2.5 B parameters · Stability AI

SD 3.5 Medium final-sample output generated by Spacelike AI.

SD 3.5 Medium is the mid-size flagship of the SD 3.5 family, released in October 2024. It uses the improved Multimodal Diffusion Transformer (MM-DiT) core first introduced in Stable Diffusion 3, scaled down for efficient inference on consumer hardware while retaining high-quality output at 1-megapixel native resolution.

Compared to SD 3 Medium, the 3.5 series applies QK-normalization and additional training stages to improve prompt adherence, anatomy, and fine detail. It runs on GPUs with as little as 9.9 GB of VRAM.
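To make the QK-normalization idea concrete, here is a minimal pure-Python sketch for a single query/key pair. It assumes RMS normalization without learned gains; the helper names `rms_norm` and `attention_logit` are hypothetical and are not taken from the SD 3.5 code base. The point is only that normalizing queries and keys before the dot product keeps attention logits bounded, whatever the magnitude of the raw activations.

```python
import math

def rms_norm(vec, eps=1e-6):
    """Scale a vector so its root-mean-square is about 1 (no learned gain; an assumption)."""
    mean_sq = sum(v * v for v in vec) / len(vec)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms for v in vec]

def attention_logit(q, k, qk_norm=True):
    """Scaled dot-product logit for one query/key pair, optionally with QK-norm."""
    if qk_norm:
        q, k = rms_norm(q), rms_norm(k)
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(len(q))

q = [0.5, -1.0, 2.0, 0.25]
k = [1.0, 0.5, -0.5, 2.0]
big_q = [1000.0 * v for v in q]  # same direction, much larger magnitude

# Without QK-norm the logit grows linearly with the activation scale.
plain_ratio = attention_logit(big_q, k, qk_norm=False) / attention_logit(q, k, qk_norm=False)

# With QK-norm the logit depends only on direction and is bounded by sqrt(dim).
normed_small = attention_logit(q, k)
normed_big = attention_logit(big_q, k)
```

In this toy, scaling the query by 1000 multiplies the unnormalized logit by 1000 but leaves the normalized logit essentially unchanged, which is the stabilizing effect usually attributed to QK-norm.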

Specification

Architecture: MM-DiT-X · improved-QK-norm · triple text encoder conditioning
Parameters: 2.5 B (backbone)
Training objective: Rectified flow matching
Native resolution: 1024 × 1024 · up to ~1440 × 1440 supported
Text encoders: CLIP-L + CLIP-G + T5-XXL (frozen)
Sampler shown: Euler · 28 steps · cfg 4.5
License: Stability AI Community License
Release: October 29, 2024
Checkpoint: sd3.5_medium.safetensors
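The sampler settings listed above (Euler, 28 steps, cfg 4.5) can be illustrated with a scalar toy rather than the real model. The sketch below assumes a linear sigma schedule from 1 to 0 and replaces the network with an exact rectified-flow velocity toward a fixed target; real pipelines use a learned velocity and may use a different (for example shifted) schedule. All function names are hypothetical.

```python
def euler_flow_sample(velocity, x_start, num_steps=28):
    """Euler-integrate dx/dsigma = velocity(x, sigma) from sigma=1 (noise) to sigma=0."""
    sigmas = [1.0 - i / num_steps for i in range(num_steps + 1)]
    x = x_start
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        x = x + (sigma_next - sigma) * velocity(x, sigma)
    return x

def make_toy_velocity(target):
    """Stand-in for the network (an assumption, not the real model).

    On the rectified-flow path x = (1 - sigma) * target + sigma * noise,
    the velocity (noise - target) equals (x - target) / sigma.
    """
    return lambda x, sigma: (x - target) / sigma

def guided_velocity(v_cond, v_uncond, cfg_scale):
    """Classifier-free guidance: extrapolate from the unconditional prediction."""
    return lambda x, sigma: v_uncond(x, sigma) + cfg_scale * (
        v_cond(x, sigma) - v_uncond(x, sigma)
    )

v_cond = make_toy_velocity(1.0)    # toy "prompted" target
v_uncond = make_toy_velocity(0.0)  # toy "empty prompt" target
noise = 0.8                        # fixed starting sample for reproducibility

no_guidance = euler_flow_sample(guided_velocity(v_cond, v_uncond, 1.0), noise)
with_cfg = euler_flow_sample(guided_velocity(v_cond, v_uncond, 4.5), noise)
```

With a guidance scale of 1.0 the toy lands on the conditional target, while 4.5 pushes the result past it along the direction from the unconditional to the conditional target. That extrapolation is why higher cfg values strengthen prompt adherence at the risk of oversaturated output.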

Client: Tenstorrent Inc. — implementation · performance optimization.


Live sample on the Spacelike AI home page.

Sample images on this page are licensed under CC BY 4.0 — reuse with attribution to Spacelike AI and a link back to spacelike.ai.

