Local inference benchmark · single GPU

Turbo image models on an RTX 3090

GPU: NVIDIA GeForce RTX 3090 · Resolution: 1024² · Seed: 42 · Runs/model: 9 (clean) · Date: 2026-06-23

PROMPT
A weathered fisherman repairing a fishing net on a harbor dock at golden hour, holding the net in both hands, a wooden sign reading "HARBOR 7" on the wall behind him, photorealistic, cinematic lighting

Fastest per image

Boogu Turbo

5.544 s/img · 10.82 img/min

Fastest per step

Z-Image Turbo

1.309 it/s raw throughput

Most VRAM-efficient

Krea 2 Turbo

18.75 GB mean peak of 24 GB

Real-world speed — seconds per image

Wall-clock time to generate one 1024² image. Lower is better. Boogu Turbo's 4-step schedule beats Z-Image Turbo's 8 steps, even though Boogu is the larger model.

Raw throughput — it/s

Per-step sampling speed in iterations per second, independent of each model's step count. Higher is better.
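To see why a lower step count can outweigh a faster per-step rate, the sketch below estimates time spent in the sampling loop as steps divided by it/s. It assumes the reported it/s applies to every step of a 1024² image, and it ignores text encoding and VAE decode, so it is a lower bound rather than a full timing model. The function name `sampling_seconds` is illustrative, not part of the benchmark.

```python
def sampling_seconds(steps: int, it_per_s: float) -> float:
    """Time spent in the denoising loop only (no text encoding, no VAE decode)."""
    if it_per_s <= 0:
        raise ValueError("it_per_s must be positive")
    return steps / it_per_s


# Figures reported on this page: Z-Image Turbo runs 8 steps at 1.309 it/s.
z_image_sampling = sampling_seconds(steps=8, it_per_s=1.309)

# Boogu Turbo's full seconds-per-image figure, also from this page.
boogu_total = 5.544

print(f"Z-Image Turbo, sampling loop alone: {z_image_sampling:.2f} s")
print(f"Boogu Turbo, end to end:            {boogu_total:.3f} s")
```

Under these assumptions, Z-Image Turbo's sampling loop alone already takes longer than Boogu Turbo's entire image, which is consistent with the ranking above.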

Images per minute

Sustained output rate, derived as 60 divided by seconds per image.
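The derivation is a single division. A minimal sketch, using the Boogu Turbo figure reported on this page (the helper name `images_per_minute` is illustrative):

```python
def images_per_minute(seconds_per_image: float) -> float:
    """Convert a per-image latency into a sustained images-per-minute rate."""
    if seconds_per_image <= 0:
        raise ValueError("seconds_per_image must be positive")
    return 60.0 / seconds_per_image


rate = images_per_minute(5.544)  # Boogu Turbo: 5.544 s/img
print(f"{rate:.2f} img/min")     # prints "10.82 img/min"
```

This assumes back-to-back generation with no idle time between images, so real batch jobs with model loading or disk writes will land somewhat lower.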

Mean peak VRAM — headroom under the 24 GB ceiling

Mean of each run's peak VRAM (red whiskers show run-to-run spread); the dashed line marks the 24 GB limit. Krea 2 Turbo's fp8 + 4B-encoder configuration has the leanest footprint; Z-Image Turbo, running bf16 with the full encoder, comes closest to the ceiling even though it is the smallest model.
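The summary statistics in this chart can be sketched as follows. The per-run peaks in `example_peaks` are made-up placeholders, not measurements from this benchmark, and treating the whiskers as a min-to-max range is an assumption. Only the 24 GB ceiling and Krea 2 Turbo's 18.75 GB mean peak come from this page.

```python
from statistics import mean

VRAM_CEILING_GB = 24.0  # RTX 3090 capacity, as stated on this page


def summarize_peaks(peaks_gb, ceiling_gb=VRAM_CEILING_GB):
    """Summarize per-run peak VRAM: mean, min-to-max spread, and headroom."""
    mean_peak = mean(peaks_gb)
    return {
        "mean_peak_gb": mean_peak,
        "spread_gb": max(peaks_gb) - min(peaks_gb),  # assumed whisker definition
        "headroom_gb": ceiling_gb - mean_peak,
    }


# Placeholder values for illustration only (NOT benchmark data).
example_peaks = [20.0, 20.5, 21.0]
summary = summarize_peaks(example_peaks)
print(summary)

# Headroom for the one mean peak this page does report (Krea 2 Turbo).
krea_headroom = VRAM_CEILING_GB - 18.75
print(f"Krea 2 Turbo headroom: {krea_headroom:.2f} GB")
```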

Per-run consistency

Total time for each clean run. Flat lines mean no throttling and no VRAM spill.
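One way to turn "flat" into a number is the coefficient of variation of the run times. The sketch below is an assumption about how such a check could be written, not how this benchmark classified its runs; the 2% threshold and both sample series are hypothetical.

```python
from statistics import mean, pstdev


def is_flat(run_seconds, max_cv=0.02):
    """Return (flat?, cv), where cv = population std dev / mean of the run times.

    max_cv is a hypothetical threshold; pick one that suits your own noise floor.
    """
    avg = mean(run_seconds)
    cv = pstdev(run_seconds) / avg
    return cv <= max_cv, cv


# Hypothetical series for illustration only (NOT benchmark data).
steady = [5.5, 5.6, 5.5, 5.6]    # small jitter around a stable time
drifting = [5.5, 5.6, 7.5, 9.0]  # times climbing, e.g. throttling or spill

for name, runs in (("steady", steady), ("drifting", drifting)):
    flat, cv = is_flat(runs)
    print(f"{name}: flat={flat}, cv={cv:.3f}")
```

A rising series like `drifting` is the pattern to look for when a GPU throttles or a model no longer fits in VRAM.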

Avg power draw

Sustained board power during generation.