Kira Models: Lite, Nova, Ultra and More

We are excited to introduce our first generation of in-house multimodal models, designed to power both image and video creation in a single unified system.

Unlike traditional AI tools that rely on separate systems for each media type, Kira models are built as multimodal creative engines. Each model can generate images, create video, produce music, and handle editing workflows within the same system.

The Kira 1.0 model stack includes three models optimized for different stages of the creative process:

Kira Lite 1.0 – fast generation and rapid iteration
Kira Nova 1.0 – balanced performance and high-quality results
Kira Ultra 1.0 – maximum quality and creative fidelity

Together, they form a progressive creative pipeline that allows users to move from ideas to production-ready results.

Our Models

Kira Lite 1.0

Our fastest multimodal model. Built for speed and rapid creative iteration across both image and video.

- Generates images in seconds and produces video with strong baseline motion
- Lowest credit cost per generation
- Handles a wide range of styles from illustration to photography

Best for:
- Brainstorming and ideation
- Quick visual concepts
- Social media content
- Early music or soundtrack ideas
- High-volume creative workflows

Kira Nova 1.0

Our balanced multimodal model, delivering higher visual quality and more refined generation across images, video, and music.

- Smoother motion and improved temporal stability in video
- Handles complex prompts with multiple subjects and layered descriptions

Best for:
- Polished visuals
- Product imagery and marketing content
- Social video production
- Structured music creation and soundtracks
- Detailed creative projects

Kira Ultra 1.0

Our most powerful multimodal model. Maximum quality across image, video, and music generation.

- Highest-resolution images with the sharpest detail and best prompt adherence
- Highest motion fidelity in video with superior subject consistency across frames
- Excels at photorealism, cinematic composition, fine textures, and complex perspectives

Best for:
- Hero visuals
- Final production renders
- Professional video content

Model Performance

Below are the results of Kira Lite 1.0, Kira Nova 1.0, and Kira Ultra 1.0 in internal benchmarks across image and video generation tasks.

All Kira models are in-house multimodal models designed to handle both image generation and video generation workflows.

The evaluation compares Kira models against leading image and video generation systems across several dimensions including prompt following, motion quality, visual fidelity, and editing consistency.

Multi-Dimensional Evaluation

We evaluate Kira models across three major capability categories:

Text-to-Image Generation
Image-to-Video Generation
Multimodal Editing Tasks

These benchmarks compare Kira models against widely used generation systems across both image and video domains.

Text-to-Image Evaluation

Key Results

Kira Lite 1.0

Optimized for speed and rapid iteration. Performs strongly in prompt understanding and style diversity while keeping generation times short.

Kira Nova 1.0

Balanced model with strong performance across prompt adherence, composition accuracy, and image realism.

Kira Ultra 1.0

The highest-performing Kira model for image generation, with its strongest results in:

- Fine texture rendering
- Photorealism and lighting realism
- Complex prompt interpretation
- Material and surface fidelity

Ultra performs competitively with leading image generation systems in high-fidelity tasks.

Image-to-Video Evaluation

Key Results

Kira Lite 1.0

Fastest generation speed, with strong baseline motion suitable for quick social content creation.

Kira Nova 1.0

Balances motion quality and generation speed, producing smoother motion and improved temporal stability.

Kira Ultra 1.0

Highest motion fidelity among Kira models, with improvements in:

- Subject consistency across frames
- Natural camera motion
- Complex movement rendering
- Cinematic scene transitions

Ultra performs competitively with leading video generation models such as Seedance 2.0 on motion fidelity benchmarks.

Multimodal Task Evaluation

Key Results

Kira Lite 1.0

Optimized for lightweight multimodal workflows and rapid creative iteration.

Kira Nova 1.0

Strong editing consistency and reliable prompt following across multimodal tasks.

Kira Ultra 1.0

Best performance in complex editing pipelines, maintaining visual identity and scene coherence across multiple transformations.

Model Comparison

| Model | Speed | Image quality | Motion quality | Prompt accuracy | Best for |
| --- | --- | --- | --- | --- | --- |
| Kira Lite 1.0 | Fastest | Good | Good | Good | Idea exploration |
| Kira Nova 1.0 | Fast | High | High | Strong | Polished content |
| Kira Ultra 1.0 | Standard | Highest | Highest | Best | Final production |
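
For readers who want to automate the choice, the comparison above can be encoded as data. The following is a minimal sketch and not part of any Kira API: the `ModelProfile` class, the `fastest_model_meeting` helper, and the numeric ranks assigned to the labels are all assumptions made for illustration. Prompt accuracy is left out because its labels (Good, Strong, Best) use a different scale.

```python
from dataclasses import dataclass

# Labels are transcribed from the comparison above. The numeric ranks are an
# assumption of this sketch, introduced only so the labels can be ordered.
SPEED_RANK = {"Standard": 1, "Fast": 2, "Fastest": 3}
QUALITY_RANK = {"Good": 1, "High": 2, "Highest": 3}


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical container for one row of the comparison."""
    name: str
    speed: str
    image_quality: str
    motion_quality: str
    best_for: str


PROFILES = [
    ModelProfile("Kira Lite 1.0", "Fastest", "Good", "Good", "idea exploration"),
    ModelProfile("Kira Nova 1.0", "Fast", "High", "High", "polished content"),
    ModelProfile("Kira Ultra 1.0", "Standard", "Highest", "Highest", "final production"),
]


def fastest_model_meeting(min_image: str = "Good", min_motion: str = "Good") -> ModelProfile:
    """Return the fastest profile whose quality labels meet both minimums."""
    candidates = [
        p for p in PROFILES
        if QUALITY_RANK[p.image_quality] >= QUALITY_RANK[min_image]
        and QUALITY_RANK[p.motion_quality] >= QUALITY_RANK[min_motion]
    ]
    # Ultra carries the top label on both axes, so candidates is never empty.
    return max(candidates, key=lambda p: SPEED_RANK[p.speed])


print(fastest_model_meeting(min_image="High").name)  # prints "Kira Nova 1.0"
```

Preferring the fastest model that clears a quality bar mirrors the trade-off in the comparison: speed decreases as quality increases from Lite to Ultra.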

Summary

The Kira model stack is designed as a progressive creative pipeline.

Lite → Nova → Ultra

Users can start with Lite for rapid experimentation, refine results with Nova, and finalize visuals using Ultra for maximum quality.
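
The three-stage workflow described here can be sketched as a small loop. This is an illustrative outline under stated assumptions, not a Kira integration: the stage names, the `run_pipeline` helper, and the `GenerateFn` signature are hypothetical, and `placeholder_generate` only formats a string in place of a real generation call.

```python
from typing import Callable, List, Tuple

# Stage order follows the text (Lite, then Nova, then Ultra). The stage
# names are descriptive labels chosen for this sketch.
PIPELINE: List[Tuple[str, str]] = [
    ("explore", "Kira Lite 1.0"),
    ("refine", "Kira Nova 1.0"),
    ("finalize", "Kira Ultra 1.0"),
]

# Hypothetical signature: (model_name, prompt) -> some result identifier.
GenerateFn = Callable[[str, str], str]


def run_pipeline(prompt: str, generate: GenerateFn) -> List[Tuple[str, str, str]]:
    """Send the prompt through each stage in order and collect the results."""
    return [(stage, model, generate(model, prompt)) for stage, model in PIPELINE]


def placeholder_generate(model: str, prompt: str) -> str:
    # Stand-in for a real generation call, which this sketch does not make.
    return f"{model}: {prompt}"


for stage, model, result in run_pipeline(
    "red sneaker on a white background", placeholder_generate
):
    print(f"{stage:>8} -> {result}")
```

In practice the prompt would likely be revised between stages based on what the earlier, cheaper stage produced; passing the generation function in as an argument keeps the stage order separate from however generation is actually invoked.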

All Kira models support image generation, video creation, and music generation, allowing creators to move seamlessly across formats inside Kira.

Generate an image, turn it into a video or animate it with motion control, and add a soundtrack, all within the same platform.
