reference_video_urls.
Model
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| reference_video_urls | string[] | Yes | - | Publicly reachable source video URL. Must contain exactly one URL. Uploaded source videos must meet the limits below. |
| prompt | string | Yes | - | Edit instruction. Use @Video1 to refer to the source video and @ImageN/@CharacterN for extra references. |
| reference_image_urls | string[] | No | - | Publicly reachable image URLs used as edit references. They count toward the total reference limit. |
| reference_characters | array | No | - | Character references for the edit. Max 3 character items. Each character may include 1-10 image_urls. Supports per-character voice or custom_voice. See available voices. |
| seed | integer | No | Random | Seed sent with the edit request. If omitted, a seed is generated automatically and returned when available. |
| start_frame | integer | No | 0 | First source frame index included in the edit range. |
| end_frame | integer | No | Source end | Last source frame index included in the edit range. If omitted, the service uses the detected final frame when available. |
| allow_audio_filtered | boolean | No | false | When true, the generation runs without audio instead of failing when the audio track is filtered. |
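As a rough illustration of how these parameters fit together, the sketch below assembles a request body as a plain dictionary. The helper name `build_edit_request` and the `example.com` URLs are hypothetical, and the endpoint, authentication, and model identifier are not covered here, so the code only builds and prints the JSON body.

```python
import json


def build_edit_request(video_url, prompt, *, image_urls=None, characters=None,
                       seed=None, start_frame=None, end_frame=None,
                       allow_audio_filtered=None):
    """Assemble a request body using the parameter names from the table above.

    Hypothetical helper: it only builds the body and sends nothing.
    """
    body = {
        "reference_video_urls": [video_url],  # must contain exactly one URL
        "prompt": prompt,
    }
    optional = {
        "reference_image_urls": image_urls,
        "reference_characters": characters,
        "seed": seed,
        "start_frame": start_frame,
        "end_frame": end_frame,
        "allow_audio_filtered": allow_audio_filtered,
    }
    # Leave omitted options out so the documented defaults apply.
    body.update({key: value for key, value in optional.items() if value is not None})
    return body


example_body = build_edit_request(
    "https://example.com/source.mp4",  # placeholder URL
    "Replace the jacket in @Video1 with the one shown in @Image1.",
    image_urls=["https://example.com/jacket.png"],  # placeholder URL
    start_frame=0,
    end_frame=120,
    allow_audio_filtered=True,
)
print(json.dumps(example_body, indent=2))
```

Because `seed` is left out, it is absent from the body, and the table states that a seed is then generated automatically.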
Limits
| Limit | Value |
|---|---|
| Video references | 1 |
| Character references | 3 |
| Total video + image + character references | 7 |
| Uploaded source video size | Up to 1 GB |
| Uploaded source video length | Up to 30 seconds |
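The count limits can be checked on the client before sending a request. The sketch below is one way to do that; the function name is hypothetical, and it assumes each character counts as a single item toward the total regardless of how many images it carries. The size and length limits are not checked because they depend on the video file itself.

```python
MAX_VIDEOS = 1
MAX_CHARACTERS = 3
MAX_TOTAL_REFERENCES = 7


def check_reference_counts(body):
    """Return a list of limit violations for a request body (empty if none)."""
    videos = len(body.get("reference_video_urls") or [])
    images = len(body.get("reference_image_urls") or [])
    characters = len(body.get("reference_characters") or [])

    problems = []
    if videos != MAX_VIDEOS:
        problems.append(f"expected exactly {MAX_VIDEOS} video reference, got {videos}")
    if characters > MAX_CHARACTERS:
        problems.append(f"at most {MAX_CHARACTERS} character references, got {characters}")
    # Assumption: one character is one item, however many image_urls it has.
    total = videos + images + characters
    if total > MAX_TOTAL_REFERENCES:
        problems.append(f"at most {MAX_TOTAL_REFERENCES} total references, got {total}")
    return problems
```

For example, one video with four images and three characters gives eight items in total, so the function reports a single violation of the total limit.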
Character References
Each character must be an object with image_urls. Even one-image characters must use an array. Gemini Omni Flash edit also supports per-character preset voices and custom tuned voices.
Preset Voice
Custom Tuned Voice
| Field | Type | Required | Description |
|---|---|---|---|
| image_urls | string[] | Yes* | Publicly reachable character image URLs. Use 1-10 images for a character entity. |
| name | string | No | Display name for the character. Used as the temporary character handle/name. |
| description | string | No | Character notes/personality text. |
| voice | string | No | Preset voice for this character. Do not combine with custom_voice on the same character. |
| custom_voice | object | No | Tuned voice config for this character. Use this instead of voice when you want to modify a base voice. |
| custom_voice.voice | string | Yes | Preset voice used as the base voice for tuning. See available voices. |
| custom_voice.name | string | Yes | Display name for the tuned voice. |
| custom_voice.voice_performance | string | Yes | Direction for the tuned voice delivery, such as tone, energy, accent, pacing, or acting notes. |
| custom_voice.sample_dialogue | string | Yes | Sample line used for tuning. Max 120 characters. |
| custom_voice.speaker | string | No | Speaker label for the tuned voice. Defaults to the custom voice name when omitted. |
\* image_urls is required and must contain at least one URL.
Example
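The following sketch shows what a reference_characters array might look like with one preset-voice character and one custom tuned voice character. The character names, descriptions, and `example.com` URLs are illustrative, and `EXAMPLE_VOICE` is a placeholder rather than a real voice name; substitute a value from the available voices list.

```python
import json

# Placeholder only: replace with a real name from the available voices list.
PRESET_VOICE = "EXAMPLE_VOICE"

# Character with a preset voice. image_urls is an array even for one image.
preset_character = {
    "name": "Mara",
    "description": "Cheerful barista who talks quickly.",
    "image_urls": ["https://example.com/mara-front.png"],
    "voice": PRESET_VOICE,
}

SAMPLE_LINE = "We leave at dawn, whether the storm has passed or not."

# Character with a custom tuned voice. It uses custom_voice instead of voice.
custom_character = {
    "name": "Captain Ilse",
    "image_urls": [
        "https://example.com/ilse-front.png",
        "https://example.com/ilse-profile.png",
    ],
    "custom_voice": {
        "voice": PRESET_VOICE,  # base voice for tuning
        "name": "Ilse calm command",
        "voice_performance": "Low, steady, unhurried, with a slight rasp.",
        "sample_dialogue": SAMPLE_LINE,  # max 120 characters
        # "speaker" is omitted, so it defaults to the custom voice name.
    },
}

reference_characters = [preset_character, custom_character]
print(json.dumps(reference_characters, indent=2))
```

Neither character sets both voice and custom_voice, since the two fields are not meant to be combined on the same character.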
Prompt References
| Token | Refers to |
|---|---|
| @Video1 | The source video for edit requests |
| @Image1, @Image2 | Items from reference_image_urls only |
| @Character1, @Character2 | Items from reference_characters only |
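A prompt can be checked against the request body to catch tokens that point at nothing. The sketch below is a minimal version of that check; the function name is hypothetical, and it assumes tokens are numbered from 1 in the same order as the corresponding array, which the table implies but does not state outright.

```python
import re

# Matches @Video1, @Image2, @Character3 and similar tokens.
TOKEN_RE = re.compile(r"@(Video|Image|Character)(\d+)")


def unresolved_tokens(prompt, body):
    """Return prompt tokens that have no matching item in the request body."""
    available = {
        "Video": len(body.get("reference_video_urls") or []),
        "Image": len(body.get("reference_image_urls") or []),
        "Character": len(body.get("reference_characters") or []),
    }
    missing = []
    for kind, index in TOKEN_RE.findall(prompt):
        number = int(index)
        # Assumption: numbering is 1-based and follows array order.
        if number < 1 or number > available[kind]:
            missing.append(f"@{kind}{number}")
    return missing
```

With one video, one image, and one character in the body, a prompt that mentions @Image2 would be flagged, while @Video1, @Image1, and @Character1 would pass.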
Rejected Combinations
| Input | Why rejected |
|---|---|
| task_id | Edit requests must use reference_video_urls; source task IDs are not supported. |
| Missing or empty reference_video_urls | An edit needs exactly one source video URL. |
| More than one reference_video_urls item | Video edit accepts one source video. |
| end_frame lower than start_frame | Frame range must move forward. |
| More than 7 total video + image + character items | Gemini Omni Flash edit reference limit. |
| image_url on a character | Use image_urls instead, even for one image. |
| Plain string character entries | Character entries must be objects with image_urls. |
| Empty image_urls | A character needs at least one image. |
| More than 10 image_urls on one character | Character entity image limit. |
| voice and custom_voice on the same character | Use a preset voice or a tuned voice, not both. |
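Several of these rejections can be caught locally before a request is sent. The sketch below covers the task_id, frame range, and character shape rows; the function name and message wording are hypothetical, and the actual error responses returned by the service are not documented here.

```python
def find_rejections(body):
    """Return reasons a request body would likely be rejected (empty if none)."""
    reasons = []

    if "task_id" in body:
        reasons.append("task_id is not supported; use reference_video_urls")

    start = body.get("start_frame", 0)  # documented default is 0
    end = body.get("end_frame")
    if end is not None and end < start:
        reasons.append("end_frame is lower than start_frame")

    characters = body.get("reference_characters") or []
    for position, character in enumerate(characters, start=1):
        label = f"character {position}"
        if not isinstance(character, dict):
            reasons.append(f"{label}: must be an object with image_urls")
            continue
        if "image_url" in character:
            reasons.append(f"{label}: use image_urls instead of image_url")
        urls = character.get("image_urls")
        if not isinstance(urls, list) or not urls:
            reasons.append(f"{label}: image_urls must be a non-empty array")
        elif len(urls) > 10:
            reasons.append(f"{label}: more than 10 image_urls")
        if "voice" in character and "custom_voice" in character:
            reasons.append(f"{label}: voice and custom_voice cannot be combined")
    return reasons
```

The video count and total reference count rows are left out of this sketch to keep it short; they amount to simple length checks on the three reference arrays.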
