---
title: "Mochi 1 — Spacelike AI"
description: "AsymmDiT video diffusion · 10 B parameters · Genmo AI"
canonical: https://spacelike.ai/models/mochi-1/
---

# Mochi 1 — on dedicated hardware, by Spacelike AI

_AsymmDiT video diffusion · 10 B parameters · Genmo AI_

Mochi 1 is Genmo AI's open-source video generation model, released October 2024 under the Apache 2.0 license. The 10 B parameter **AsymmDiT** (Asymmetric Diffusion Transformer) backbone — one of the largest video diffusion transformers released openly at the time — allocates roughly 4× more parameters to visual reasoning than to text, reflecting the signal imbalance between the two modalities in video generation.

The companion Mochi-VAE compresses video 8×8 spatially and 6× temporally. The model generates 5.4-second 480p clips at 30 fps from a single prompt encoded with T5-XXL.
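As a rough illustration of what those compression factors mean, the sketch below works out the latent grid for one clip at the 848 × 480 resolution listed in the specification. The helper name `latent_grid` is hypothetical, and the temporal count uses plain ceiling division. That is an assumption: the actual VAE may treat the first frame separately, which would shift the latent frame count by one.

```python
import math

def latent_grid(width, height, fps, seconds, spatial=8, temporal=6):
    """Back-of-envelope latent dimensions for one clip (hypothetical helper)."""
    frames = round(fps * seconds)  # pixel-space frame count
    return {
        "frames": frames,
        "latent_w": width // spatial,    # 8x spatial compression per axis
        "latent_h": height // spatial,
        # Assumption: simple ceiling division for the 6x temporal compression.
        "latent_t": math.ceil(frames / temporal),
    }

grid = latent_grid(width=848, height=480, fps=30, seconds=5.4)
print(grid)
```

Under these assumptions a 162-frame clip maps to a 106 × 60 latent grid with 27 temporal positions, which is the sequence the diffusion backbone operates on.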

## Specification

- **Architecture:** AsymmDiT · 4:1 vision-to-text parameter ratio
- **Parameters:** 10 B *(backbone)*
- **Training objective:** Flow matching
- **Native resolution:** 848 × 480 · 30 fps · 5.4 s
- **Text encoder:** T5-XXL (frozen)
- **Sampler shown:** Flow-match Euler · 64 steps · cfg 4.5
- **License:** Apache 2.0
- **Release:** October 22, 2024
- **Checkpoint:** genmo/mochi-1-preview
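
The sampler settings above (flow-match Euler, 64 steps, cfg 4.5) can be illustrated with a toy one-dimensional sketch. This is not the Mochi implementation: the velocity field is a hand-written stand-in for the AsymmDiT network, the function names are hypothetical, and the time convention (noise at t = 0, data at t = 1) is an assumption that real schedulers may reverse or reparameterize.

```python
import math

def guided_velocity(x, t, cond_target, uncond_target, cfg_scale):
    """Toy velocity field with classifier-free guidance (stand-in for the model)."""
    # Straight-line flow: the velocity points from x toward a target,
    # scaled so the target is reached exactly at t = 1.
    v_cond = (cond_target - x) / (1.0 - t)
    v_uncond = (uncond_target - x) / (1.0 - t)
    # Classifier-free guidance: extrapolate from unconditional toward conditional.
    return v_uncond + cfg_scale * (v_cond - v_uncond)

def euler_sample(x0, steps=64, cfg_scale=4.5, cond_target=1.0, uncond_target=0.0):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-step Euler."""
    x = x0
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # t stays below 1, so the division above is safe
        x = x + dt * guided_velocity(x, t, cond_target, uncond_target, cfg_scale)
    return x

sample = euler_sample(x0=-0.3)
print(sample)
```

In this toy setup the sample ends near 4.5 rather than at the conditional target of 1.0, because guidance extrapolates past the conditional prediction. That is the general mechanism by which the cfg value trades prompt adherence against fidelity; the numbers themselves carry no meaning for real video latents.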

## Client

[Tenstorrent Inc.](https://tenstorrent.com/) — quality improvement

## Links

- [Implementation](https://github.com/tenstorrent/tt-metal/blob/main/models/tt_dit/models/Mochi_1.md)
- [Hugging Face](https://huggingface.co/genmo/mochi-1-preview)
- [Vendor announcement](https://www.genmo.ai/blog/mochi-1-a-new-sota-in-open-text-to-video)

## Sample output

![Mochi 1 — sample output generated by Spacelike AI](https://spacelike.ai/images/mochi-1/frames/frame-001.avif)

---

Mochi 1 implementation by **Spacelike AI GmbH**, Vienna. Interactive viewer: https://spacelike.ai/models/mochi-1/

Sample images licensed [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) — attribute "Spacelike AI" and link back to https://spacelike.ai/.