Model
Supported Modes
| Mode | Description |
|---|---|
| text_to_video | Text-to-video. You control the aspect ratio. |
| first_last_frame | Start frame required, end frame optional. Aspect ratio auto-detected from input. |
| omni_reference | Up to 9 image/video/audio references (audio max 15s). Use @Image1/@Video1/@Audio1 placeholders in prompt. |
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | No | "" | Text prompt. Use @Image1/@Video1/@Audio1 placeholders in omni_reference mode. |
| mode | string | Yes | — | text_to_video, first_last_frame, or omni_reference |
| aspect_ratio | string | No | "1:1" | 21:9, 16:9, 4:3, 1:1, 3:4, 9:16. Only used for text-to-video. |
| duration | integer | No | 5 | Duration in seconds (4–15) |
| resolution | string | No | "720p" | 720p only |
| seed | integer | No | random | Seed for reproducibility |
| first_frame_url | string | Conditional | null | Start frame image. Required for first_last_frame. |
| last_frame_url | string | No | null | End frame image. Optional for first_last_frame. |
| references | string[] | Conditional | null | Media URLs (images/videos/audio). Audio max 15s. Required for omni_reference. Max 9. |
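
The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `validate_params` is a hypothetical helper, and it assumes the server enforces exactly the rules listed above.

```python
from typing import Any

# Allowed values taken from the Parameters table.
MODES = {"text_to_video", "first_last_frame", "omni_reference"}
ASPECT_RATIOS = {"21:9", "16:9", "4:3", "1:1", "3:4", "9:16"}


def validate_params(params: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means all checks passed."""
    errors: list[str] = []

    mode = params.get("mode")
    if mode not in MODES:
        errors.append("mode must be text_to_video, first_last_frame, or omni_reference")

    # Defaults below mirror the Default column of the table.
    duration = params.get("duration", 5)
    if not isinstance(duration, int) or not 4 <= duration <= 15:
        errors.append("duration must be an integer from 4 to 15")

    if params.get("resolution", "720p") != "720p":
        errors.append("resolution must be 720p")

    if params.get("aspect_ratio", "1:1") not in ASPECT_RATIOS:
        errors.append("aspect_ratio is not one of the supported values")

    # Conditional requirements depend on the selected mode.
    if mode == "first_last_frame" and not params.get("first_frame_url"):
        errors.append("first_frame_url is required for first_last_frame")

    if mode == "omni_reference":
        refs = params.get("references") or []
        if not 1 <= len(refs) <= 9:
            errors.append("references must contain 1 to 9 URLs for omni_reference")

    return errors
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass.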
Example - Text-to-Video
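
A minimal sketch of a request body for this mode, built as a Python dict and serialized to JSON. The field names come from the Parameters table; the prompt text and seed value are illustrative, and how the body is sent (endpoint, authentication) is not specified here.

```python
import json

# Hypothetical text_to_video request body; prompt and seed are made-up values.
payload = {
    "mode": "text_to_video",
    "prompt": "A paper boat drifting down a rain-soaked street at dusk",
    "aspect_ratio": "16:9",  # only this mode lets the caller choose the ratio
    "duration": 8,           # seconds, within the documented 4-15 range
    "resolution": "720p",
    "seed": 42,              # fixed seed for reproducibility
}

body = json.dumps(payload)
```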
Example - First & Last Frame
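
A minimal sketch of a request body for this mode. The field names come from the Parameters table; the `example.com` URLs and the prompt text are placeholders, and the transport details (endpoint, authentication) are not specified here.

```python
import json

# Hypothetical first_last_frame request body; URLs and prompt are placeholders.
payload = {
    "mode": "first_last_frame",
    "prompt": "The camera slowly pushes in as the sun sets",
    "first_frame_url": "https://example.com/start.png",  # required in this mode
    "last_frame_url": "https://example.com/end.png",     # optional; may be omitted
    "duration": 5,
}

body = json.dumps(payload)
```

`aspect_ratio` is left out because this mode detects the aspect ratio from the input frames.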
Example - Omni-Reference
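
A minimal sketch of a request body for this mode. The field names come from the Parameters table; the `example.com` URLs and the prompt text are placeholders, and the transport details (endpoint, authentication) are not specified here.

```python
import json

# Hypothetical omni_reference request body; URLs and prompt are placeholders.
payload = {
    "mode": "omni_reference",
    # Placeholders in the prompt point at entries in "references".
    "prompt": "The character from @Image1 performs the motion in @Video1, set to @Audio1",
    "references": [
        "https://example.com/character.png",  # image
        "https://example.com/motion.mp4",     # video
        "https://example.com/track.mp3",      # audio, at most 15 seconds
    ],
    "duration": 10,
}

body = json.dumps(payload)
```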
First image = @Image1, first video = @Video1, first audio = @Audio1, etc. Images, videos, and audio are numbered independently.
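
Independent numbering can be illustrated with a small helper. This is a sketch under assumptions: `assign_placeholders` is hypothetical, the caller supplies each reference's media type, and numbering is assumed to follow the order of the references list, which the text above implies but does not state outright.

```python
from collections import Counter


def assign_placeholders(kinds: list[str]) -> list[str]:
    """Map media types ("image", "video", "audio") to prompt placeholders.

    Each media type keeps its own counter, so the second image is @Image2
    regardless of how many videos or audio clips precede it.
    """
    counts: Counter[str] = Counter()
    labels: list[str] = []
    for kind in kinds:
        counts[kind] += 1
        labels.append(f"@{kind.capitalize()}{counts[kind]}")
    return labels


labels = assign_placeholders(["image", "video", "image", "audio"])
# labels == ["@Image1", "@Video1", "@Image2", "@Audio1"]
```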
