xAI's Grok Imagine Video 1.5 is the #1 ranked image-to-video model on the Arena leaderboard, with a +52 Elo improvement over version 1.0. Animate any still image into a cinematic video with natively synchronized audio — realistic motion, physics-accurate interactions, and automatically generated sound in a single pass.
Available on the SeedDance platform.

Grok Imagine Video 1.5 is xAI's latest image-to-video generation model, officially released on May 31, 2026. It holds the #1 position on the Arena.ai Image-to-Video leaderboard with a +52 Elo point improvement over version 1.0, ahead of Seedance 2.0, HappyHorse 1.0, and Google Veo. Built on the Aurora engine, it animates still images into short videos with synchronized audio, handling visual generation and audio synthesis in a single pass.
Grok Imagine Video 1.5 Preview (720p) ranks #1 on the Arena.ai Image-to-Video leaderboard, ahead of ByteDance's Seedance 2.0, Alibaba ATH's HappyHorse 1.0, and Google Veo, with a +52 Elo point improvement over version 1.0.
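To put the +52 Elo figure in perspective, the short sketch below converts a rating gap into an expected head-to-head preference rate. It assumes the leaderboard follows the conventional Elo logistic curve with a 400-point scale; the Arena's actual rating method may differ, so treat the result as a rough reading rather than an official statistic. The function name `elo_expected_score` is illustrative.

```python
def elo_expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected score of the higher-rated side under the standard Elo model.

    Assumption: ratings use the conventional 400-point logistic scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_gap / scale))


# A +52 gap, as reported between version 1.5 and version 1.0.
win_rate = elo_expected_score(52)
print(f"Expected preference for version 1.5: {win_rate:.1%}")  # prints 57.4%
```

Under that assumption, a 52-point lead means version 1.5 would be preferred over version 1.0 in roughly 57 of every 100 blind comparisons.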
Audio is generated simultaneously with video in a single pass. Background music, sound effects, ambient audio, and even short dialogue are created in sync with on-screen action, so no separate audio editing is needed. Version 1.5 also improves audio quality over version 1.0, producing more natural and immersive sound.
Grok Imagine Video 1.5 is a dedicated image-to-video model, optimized specifically for animating still images. This focused design means every parameter and capability is tuned for the best possible image animation results, from preserving visual identity to generating contextually appropriate motion.
Blind testing shows substantial gains in face accuracy over version 1.0. Grok Imagine Video 1.5 generates more realistic faces — including celebrity likenesses — while maintaining strong character consistency throughout video sequences, making it ideal for portrait animations and character-driven content.
Grok Imagine Video 1.5 combines xAI's Aurora engine with major upgrades in audio quality, photorealism, temporal coherence, and prompt adherence, which together place it at the top of the Arena image-to-video ranking.

xAI's most advanced image-to-video model — Aurora engine physics, native synchronized audio, and the #1 ranking on the Arena leaderboard.
Upload any still image — portrait, product photo, illustration, or concept art — and Grok Imagine Video 1.5 animates it with realistic motion and contextually appropriate action. The output aspect ratio defaults to the input image's native aspect ratio when set to auto.
Audio is co-generated with video in a single pass. Background music, ambient sounds, sound effects, and dialogue are all synchronized to on-screen action. Influence audio by mentioning sound in your prompt or using the AUDIO: section for explicit audio direction.
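As a rough illustration of explicit audio direction, the sketch below appends an AUDIO: section to a visual prompt. The exact layout the model expects is not specified here, so placing the section on its own trailing line is an assumption, and `build_prompt` is a hypothetical helper, not part of any SeedDance or xAI API.

```python
from typing import Optional


def build_prompt(visual: str, audio: Optional[str] = None) -> str:
    """Join a visual description with an optional explicit AUDIO: section.

    Assumption: the AUDIO: section sits on its own line after the visual text.
    """
    visual = visual.strip()
    if not audio:
        return visual
    return f"{visual}\nAUDIO: {audio.strip()}"


prompt = build_prompt(
    "A barista pours steamed milk into a latte, slow push-in.",
    "soft cafe chatter, milk hissing, quiet jazz piano",
)
print(prompt)
```

Leaving `audio` empty returns the visual prompt unchanged, in which case any sound cues would come from the visual description itself.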
Choose between 480p for faster generation and lower cost, or 720p for HD quality. The resolution parameter lets you trade output quality against generation speed to match your project requirements.
Generate videos from 1 to 15 seconds long. Shorter clips (5–8 seconds) tend to be more stable with fewer artifacts, while longer clips of up to 15 seconds work well for narrative sequences. Choose the duration that fits your platform and creative vision.
Supports auto (matches the input image), 16:9, 9:16, 1:1, 4:3, 3:4, 3:2, and 2:3 aspect ratios. Match your output to any platform: YouTube widescreen, TikTok portrait, Instagram square, or cinematic formats.
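The resolution, duration, and aspect ratio options above can be checked before a job is submitted. The following is a minimal sketch of such a check: the allowed values come from the descriptions above, while the `VideoSettings` class, its field names, and its default values are hypothetical and do not reflect the actual SeedDance or xAI request format.

```python
from dataclasses import dataclass

# Allowed values as listed in the feature descriptions above.
RESOLUTIONS = {"480p", "720p"}
ASPECT_RATIOS = {"auto", "16:9", "9:16", "1:1", "4:3", "3:4", "3:2", "2:3"}
MIN_SECONDS, MAX_SECONDS = 1, 15


@dataclass(frozen=True)
class VideoSettings:
    """Hypothetical settings holder; field names and defaults are illustrative."""

    resolution: str = "720p"
    duration: int = 6  # seconds; inside the 5-8 second range described as more stable
    aspect_ratio: str = "auto"

    def __post_init__(self) -> None:
        if self.resolution not in RESOLUTIONS:
            raise ValueError(f"resolution must be one of {sorted(RESOLUTIONS)}")
        if not MIN_SECONDS <= self.duration <= MAX_SECONDS:
            raise ValueError(f"duration must be {MIN_SECONDS}-{MAX_SECONDS} seconds")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")


settings = VideoSettings(resolution="480p", duration=8, aspect_ratio="9:16")
print(settings)
```

Validating locally like this catches an out-of-range duration or an unsupported ratio before any generation time is spent.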
Built on xAI's proprietary Aurora engine, Grok Imagine Video 1.5 models real-world physics — gravity, momentum, collisions, fluid dynamics, and cloth behavior — for visually convincing and physically grounded animation results.
Specify camera movements directly in your prompt: pan, tilt, zoom, dolly, tracking, orbit, aerial, handheld, and slow push-in. The model understands standard cinematic camera language and interprets directorial instructions with precision.
Grok Imagine Video 1.5 handles multi-beat sequences well. List actions in order in your prompt — the athlete crouches, then explodes forward, then the crowd erupts — and the model generates coherent multi-action sequences with temporal consistency.
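One way to assemble an ordered, multi-beat prompt with a camera instruction is sketched below. The camera terms are the ones listed above; the `sequence_prompt` helper and the trailing "Camera:" label are assumptions made for illustration, since the model accepts free-form text and no particular template is documented here.

```python
from typing import List, Optional

# Camera terms named in the feature description above.
CAMERA_MOVES = {
    "pan", "tilt", "zoom", "dolly", "tracking",
    "orbit", "aerial", "handheld", "slow push-in",
}


def sequence_prompt(beats: List[str], camera: Optional[str] = None) -> str:
    """Chain action beats in order with 'then' and append an optional camera move."""
    if not beats:
        raise ValueError("at least one action beat is required")
    if camera is not None and camera not in CAMERA_MOVES:
        raise ValueError(f"unknown camera move: {camera!r}")
    text = ", then ".join(beat.strip() for beat in beats)
    prompt = text[0].upper() + text[1:] + "."
    if camera:
        prompt += f" Camera: {camera}."  # the "Camera:" label is an assumption
    return prompt


result = sequence_prompt(
    ["the athlete crouches", "explodes forward", "the crowd erupts"],
    camera="tracking",
)
print(result)
# The athlete crouches, then explodes forward, then the crowd erupts. Camera: tracking.
```

Keeping the beats in a list makes it easy to reorder or trim actions when a shorter clip duration cannot fit the whole sequence.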
Everything you need to know about Grok Imagine Video 1.5 and how to use it on SeedDance.
Experience the #1 image-to-video AI model on SeedDance. Upload any image and watch it come to life with synchronized audio, realistic motion, and Aurora engine physics — in seconds.