The 15-billion-parameter AI video model, reportedly developed by Alibaba’s Taotian Future Life Lab, claims the #1 spot on both Text-to-Video and Image-to-Video leaderboards with native audio-visual generation in a single pass.
WILMINGTON, DE, April 10, 2026 /24-7PressRelease/ — Happy Horse 1.0, an AI video generation model that appeared anonymously on the Artificial Analysis Video Arena in early April 2026, has claimed the #1 position on both the Text-to-Video and Image-to-Video leaderboards, outpacing previously leading models including ByteDance Seedance 2.0.
According to media reports, the 15-billion-parameter model is the work of an independent team within Alibaba’s Taotian Future Life Lab, led by Zhang Di, formerly a vice president at Kuaishou and a technical lead on the Kling AI video model. Alibaba’s share price saw short-term gains following the attribution reports.
WHAT MAKES HAPPY HORSE 1.0 DIFFERENT
Unlike every comparable open-weight video model released to date — including Wan 2.2 A14B and LTX-2 Pro, which output silent clips — Happy Horse 1.0 is reported to generate dialogue, ambient sound, Foley, and music jointly with the visuals in a single forward pass. No separate audio model, no post-hoc lip-sync.
Community-compiled architecture notes describe a unified single-stream Transformer with 40 layers arranged in a “sandwich” structure: a 32-layer shared backbone in the middle, with the remaining 8 modality-specific layers at the two ends. Text, image, video, and audio tokens are processed as one sequence through unified self-attention, which is reportedly stabilized by per-attention-head sigmoid gating. The model is said to use DMD-2 distillation to bring inference down to just 8 steps, generating a 1080p clip in approximately 38 seconds on an NVIDIA H100.
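As a rough illustration of the reported layout, the sketch below labels the 40 layers and applies one simple form of per-head sigmoid gating. It is not the model's code. The even 4 + 4 split of the modality-specific layers, the function names, and the way the gate is applied (one scalar logit per head) are all assumptions, since the community notes give only the layer counts and the name of the technique.

```python
import math


def sandwich_layout(total_layers=40, shared_layers=32):
    """Label each layer as modality-specific or shared.

    Assumption: the non-shared layers are split evenly between the two ends.
    The community notes only say they sit "at the ends".
    """
    edge_total = total_layers - shared_layers
    if edge_total < 0 or edge_total % 2:
        raise ValueError("expected an even, non-negative number of edge layers")
    edge = edge_total // 2
    return (
        ["modality_specific"] * edge
        + ["shared"] * shared_layers
        + ["modality_specific"] * edge
    )


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_heads(head_outputs, gate_logits):
    """Scale each attention head's output vector by sigmoid(gate logit).

    Hypothetical form of per-head gating: one scalar logit per head.
    """
    if len(head_outputs) != len(gate_logits):
        raise ValueError("need exactly one gate logit per head")
    return [
        [sigmoid(logit) * value for value in head]
        for head, logit in zip(head_outputs, gate_logits)
    ]


layout = sandwich_layout()
print(layout.count("shared"), layout.count("modality_specific"))
```

Because the sigmoid output lies between 0 and 1, a gate of this kind can only attenuate a head's contribution, which is one plausible reading of why the notes describe it as a stabilizer.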
Native multilingual lip-sync is another reported highlight, covering English, Mandarin Chinese, Japanese, Korean, German, and French — built into the generation stage rather than bolted on afterward.
LEADERBOARD RESULTS
On the Artificial Analysis Video Arena, which ranks models through blind pairwise user voting:
– Text-to-Video (no audio): Elo 1333–1387, #1
– Image-to-Video (no audio): Elo 1391–1406, #1 (all-time record)
– Text-to-Video (with audio): Elo 1205–1233, #1 or #2
– Image-to-Video (with audio): Elo 1161, #2
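To put Elo gaps like these in perspective, the sketch below converts a rating difference into an expected head-to-head win rate. It assumes the conventional logistic Elo formula with a 400-point scale. The arena's exact rating method is not described in this release, and the opponent rating in the example is hypothetical.

```python
def expected_win_prob(rating_a, rating_b, scale=400.0):
    """Expected probability that A beats B under the standard Elo model.

    Assumption: conventional base-10 logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Reported low end of the Text-to-Video (no audio) range vs. a
# hypothetical rival rated 1300 (not a figure from the leaderboard).
p = expected_win_prob(1333, 1300)
print(f"Expected win rate against a 1300-rated model: {p:.1%}")
```

Under this model, equal ratings give a 50% expected win rate and a 400-point lead gives roughly 91%, so leads of a few dozen points correspond to modest preferences in blind pairwise votes.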
NO OFFICIAL RELEASE YET
As of this writing, there is no official website, no published paper, no open weights, and no public API from the team behind Happy Horse 1.0. Information about the model has spread almost entirely through third-party architecture notes and media coverage, leaving creators and researchers without a central resource.
To help close that gap, a community-run website at https://happy-horse.video offers a place to explore and experience AI video generation in the Happy Horse style, alongside aggregated reference material on the model’s reported capabilities. The site is independent and not affiliated with Alibaba or the Happy Horse 1.0 development team.
For more information, visit https://happy-horse.video
Happy Horse AI is an independent website that aggregates publicly available information about Happy Horse 1.0 and lets users explore AI video generation in the Happy Horse style. The site is not affiliated with Alibaba Group, Taotian, or the Happy Horse 1.0 development team, and does not claim to distribute the official Happy Horse 1.0 model weights.
—
For the original version of this press release, please visit 24-7PressRelease.com.