Create and manage AI generation tasks with a unified interface
The /v1/tasks endpoint provides a unified interface for all AI generation models (video, image, audio).
/v1/tasks
Creates a new generation task for any supported model.
Model identifier in provider/model-name format. See Available Models below.
dry_run: when true, the request is validated and the cost is calculated without creating a task or deducting from your balance. Use it to preview the price of a request before committing.
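
The dry-run behaviour described above can be sketched as a small request builder. This is an illustration only: the top-level `model` and `input` keys, the placement of `dry_run` at the top level, and the helper name `build_task_request` are assumptions, not part of the documented schema. Only the `dry_run` flag and the provider/model-name format come from the text.

```python
import json


def build_task_request(model, params, dry_run=False):
    """Build a JSON-serialisable body for creating a task.

    ASSUMPTION: the body uses top-level "model" and "input" keys and a
    top-level "dry_run" flag; check the real request schema before use.
    """
    provider, sep, name = model.partition("/")
    if not (provider and sep and name):
        raise ValueError(f"model must look like 'provider/model-name', got {model!r}")
    body = {"model": model, "input": dict(params)}
    if dry_run:
        # dry_run=True asks the API to validate and price the request only.
        body["dry_run"] = True
    return body


body = build_task_request(
    "google/veo-3.1-fast",
    {"prompt": "A paper boat drifting across a pond"},
    dry_run=True,
)
print(json.dumps(body, indent=2))
```

Sending the same body without `dry_run` would then create the task for real.
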
/v1/tasks/:task_id
Retrieves the status and output of a task.
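
Because tasks complete asynchronously, a client typically polls this endpoint. The sketch below shows one way to do that. The response shape (a dict with a `status` key), the status strings, and the helper name `wait_for_task` are all assumptions; the source does not document them. The HTTP call is injected as a callable so the example runs without network access.

```python
import time

# ASSUMPTION: these status strings are illustrative, not documented values.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_task(task_id, fetch_task, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_task(task_id) until the task reaches a terminal status.

    fetch_task is any callable that performs GET /v1/tasks/:task_id and
    returns the decoded JSON body as a dict.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_task(task_id)
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        sleep(interval)


# Demo with a stand-in fetcher instead of a real HTTP request.
_fake_responses = iter([
    {"status": "processing"},
    {"status": "completed", "output": {}},
])
result = wait_for_task("task_123", lambda _id: next(_fake_responses), sleep=lambda _s: None)
print(result["status"])
```
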
| Model | Description |
|---|---|
google/veo-3.1-fast | Google Veo 3.1 Fast |
google/veo-3.1-fast-relaxed | Google Veo 3.1 Fast Relaxed |
google/veo-3.1-quality | Google Veo 3.1 Quality |
google/veo-3.1-lite | Google Veo 3.1 Lite |
google/veo-3.1-lite-relaxed | Google Veo 3.1 Lite Relaxed |
google/veo-3.1-extend | Google Veo 3.1 Extend |
google/veo-3.1-upscale | Google Veo 3.1 Upscale |
hailuo/minimax-2.0 | Minimax Hailuo 2.0 |
hailuo/minimax-2.3 | Minimax Hailuo 2.3 |
hailuo/minimax-2.3-fast | Minimax Hailuo 2.3 Fast |
higgsfield-ai/lite | Higgsfield Lite |
higgsfield-ai/turbo | Higgsfield Turbo |
higgsfield-ai/standard | Higgsfield Standard |
higgsfield-ai/cinematic-studio-video-3.5 | Cinematic Studio Video v3.5 |
higgsfield-ai/cinematic-studio-video-3.0 | Cinematic Studio Video v3.0 |
higgsfield-ai/cinematic-studio-video-2.5 | Cinematic Studio Video v2.5 |
kuaishou/kling-3.0-omni-video | Kling v3.0 Omni Video |
kuaishou/kling-3.0-omni-video-edit | Kling v3.0 Omni Video Edit |
kuaishou/kling-o1-video | Kling O1 Video |
kuaishou/kling-o1-video-edit | Kling O1 Video Edit |
kuaishou/kling-3.0-video | Kling v3.0 Video |
kuaishou/kling-2.6-video | Kling v2.6 Video |
kuaishou/kling-2.5-turbo-video | Kling v2.5 Turbo Video |
kuaishou/kling-2.1-video | Kling v2.1 Video |
kuaishou/kling-2.1-master-video | Kling v2.1 Master Video |
kuaishou/kling-2.6-motion-control | Kling v2.6 Motion Control |
kuaishou/kling-3.0-motion-control | Kling v3.0 Motion Control |
xai/grok-imagine-video-extend | Grok Imagine Video Extend |
topaz-labs/video-upscale | Topaz Video Upscale |
| Model | Description |
|---|---|
google/nano-banana | Nano Banana |
google/nano-banana-pro | Nano Banana Pro |
openai/gpt-image-2 | GPT Image 2 (1K/2K/4K, multiple aspect ratios) |
higgsfield-ai/cinematic-studio-image | Cinematic Studio Image |
black-forest-labs/flux.2-pro | Flux.2 Pro |
black-forest-labs/flux.2-flex | Flux.2 Flex |
black-forest-labs/flux.2-max | Flux.2 Max |
kuaishou/kling-o1-image | Kling O1 Image |
kuaishou/kling-3.0-omni-image | Kling v3.0 Omni Image |
kuaishou/kling-3.0-image | Kling v3.0 Image |
kuaishou/kling-2.1-image | Kling v2.1 Image |
topaz-labs/image-upscale | Topaz Image Upscale |
topaz-labs/image-generative | Topaz Image Generative |
alibaba/qwen-image-2.0-pro | Qwen Image 2.0 Pro (T2I + editing) |
alibaba/qwen-image-2.0 | Qwen Image 2.0 (T2I + editing) |
alibaba/qwen-image-max | Qwen Image Max (T2I + editing) |
alibaba/qwen-image-plus | Qwen Image Plus (T2I + editing) |
alibaba/qwen-image | Qwen Image (T2I + editing) |
alibaba/z-image-turbo | Z-Image Turbo (T2I only) |
alibaba/wan-2.7-pro-image | Wan 2.7 Pro Image (T2I + editing, up to 4K) |
alibaba/wan-2.7-image | Wan 2.7 Image (T2I + editing) |
alibaba/wan-2.6-image | Wan 2.6 Image (T2I + editing) |
alibaba/wan-2.5-image | Wan 2.5 Image (T2I + editing) |
alibaba/wan-2.2-image | Wan 2.2 Image (T2I only) |
alibaba/wan-2.2-flash-image | Wan 2.2 Flash Image (T2I only) |
xai/grok-imagine-image | Grok Imagine Image (T2I + editing) |
| Model | Description |
|---|---|
suno-ai/music | Suno Music Generation |
suno-ai/add-vocals | Add Vocals to Track |
suno-ai/add-instrumental | Add Instrumental |
suno-ai/extend | Extend Audio |
suno-ai/cover | Create Cover |
suno-ai/stems | Extract Stems |
suno-ai/stems-all | Extract All Stems |
suno-ai/lyrics | Generate Lyrics |
suno-ai/wav | WAV Export |
higgsfield-ai/text-to-speech | Text-to-Speech |
elevenlabs/text-to-speech | ElevenLabs Text-to-Speech |
elevenlabs/text-to-dialogue | ElevenLabs Multi-Voice Dialogue |
elevenlabs/sound-effect | ElevenLabs Sound Effects |
elevenlabs/voice-isolation | ElevenLabs Voice Isolation |
elevenlabs/speech-to-text | ElevenLabs Speech-to-Text |
google/veo-3.1-fast, google/veo-3.1-fast-relaxed, google/veo-3.1-quality, google/veo-3.1-lite, google/veo-3.1-lite-relaxed
There are two mutually exclusive image modes — the API rejects requests that mix them:
| Mode | Fields | Availability |
|---|---|---|
| Frame mode | start_image_url [+ end_image_url] | All models |
| Reference mode | reference_image_urls [+ voice] | Fast, Fast-relaxed, Lite, and Lite-relaxed |
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt for video generation |
aspect_ratio | string | No | "16:9" (default) or "9:16" |
duration | integer | No | 4, 6, or 8 seconds. Default 4. Must be 8 when reference_image_urls is set. |
seed | integer | No | Reproducibility seed |
start_image_url | string | No | Start frame image URL. Cannot be combined with reference_image_urls |
end_image_url | string | No | End frame image URL. Requires start_image_url. Cannot be combined with reference_image_urls |
reference_image_urls | string[] | No | 1–3 reference image URLs. Fast, Fast-relaxed, Lite, and Lite-relaxed only. Cannot be combined with start_image_url/end_image_url |
voice | string | No | Voice preset ID. Requires at least 1 reference image. Fast, Fast-relaxed, Lite, and Lite-relaxed only. See voices endpoint |
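
The frame-mode and reference-mode rules in the two tables above can be checked client-side before a request is sent. The following is a minimal sketch; the function name `validate_veo_params` and the error strings are hypothetical, and the server remains the authority on what is accepted.

```python
REFERENCE_MODE_MODELS = {
    "google/veo-3.1-fast",
    "google/veo-3.1-fast-relaxed",
    "google/veo-3.1-lite",
    "google/veo-3.1-lite-relaxed",
}


def validate_veo_params(model, params):
    """Return a list of rule violations for a Veo 3.1 generation request."""
    errors = []
    refs = params.get("reference_image_urls") or []
    has_start = "start_image_url" in params
    has_end = "end_image_url" in params
    duration = params.get("duration")

    if not params.get("prompt"):
        errors.append("prompt is required")
    if refs and (has_start or has_end):
        errors.append("frame mode and reference mode cannot be mixed")
    if has_end and not has_start:
        errors.append("end_image_url requires start_image_url")
    if refs:
        if model not in REFERENCE_MODE_MODELS:
            errors.append("reference_image_urls is not available for this model")
        if len(refs) > 3:
            errors.append("at most 3 reference images are allowed")
        # The docs do not say what happens when duration is omitted here,
        # so only an explicit non-8 value is flagged.
        if duration is not None and duration != 8:
            errors.append("duration must be 8 when reference_image_urls is set")
    if "voice" in params and not refs:
        errors.append("voice requires at least 1 reference image")
    if duration is not None and duration not in (4, 6, 8):
        errors.append("duration must be 4, 6, or 8")
    return errors


ok = validate_veo_params(
    "google/veo-3.1-fast",
    {"prompt": "A fox in snow", "reference_image_urls": ["https://example.com/fox.png"], "duration": 8},
)
mixed = validate_veo_params(
    "google/veo-3.1-quality",
    {
        "prompt": "A fox in snow",
        "start_image_url": "https://example.com/start.png",
        "reference_image_urls": ["https://example.com/fox.png"],
        "duration": 6,
    },
)
print(ok)
print(mixed)
```
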
google/veo-3.1-extend
Extend a previously generated video. Aspect ratio is inherited from the source task.
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt for the extended content |
task_id | string | Yes | Task ID of a completed generation |
model | string | Yes | One of: lite, fast, quality, lite-relaxed, fast-relaxed |
duration | integer | No | Must be 8 (only supported value for extend). Default 8. |
seed | integer | No | Reproducibility seed |
google/veo-3.1-upscale
Upscale a completed video to a higher resolution.
| Parameter | Type | Required | Description |
|---|---|---|---|
task_id | string | Yes | Task ID of a completed generation |
resolution | string | Yes | "1080p" or "4k" |
hailuo/minimax-2.0, hailuo/minimax-2.3, hailuo/minimax-2.3-fast
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes* | Max 2000 chars. *Required if no start_image_url |
start_image_url | string | Yes* | Image URL (auto-uploaded). *Required if no prompt; always required for minimax-2.3-fast |
end_image_url | string | No | End frame image URL (minimax-2.0 only, 768p/1080p) |
duration | integer | No | 6 or 10 seconds. 1080p only supports 6 |
resolution | string | No | "768p" (default), "1080p" |
prompt_optimization | boolean | No | Let MiniMax optimize prompt |
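
The conditional requirements in this table are easy to get wrong, so a pre-flight check can help. This is a sketch under the rules as stated above; `validate_hailuo_params` is a hypothetical helper name and the error messages are illustrative.

```python
def validate_hailuo_params(model, params):
    """Return a list of rule violations for a Hailuo request."""
    errors = []
    prompt = params.get("prompt")
    has_start = bool(params.get("start_image_url"))
    resolution = params.get("resolution", "768p")  # documented default
    duration = params.get("duration")

    if not prompt and not has_start:
        errors.append("provide prompt, start_image_url, or both")
    if model == "hailuo/minimax-2.3-fast" and not has_start:
        errors.append("start_image_url is required for minimax-2.3-fast")
    if prompt and len(prompt) > 2000:
        errors.append("prompt must be at most 2000 characters")
    if params.get("end_image_url") and model != "hailuo/minimax-2.0":
        errors.append("end_image_url is supported by minimax-2.0 only")
    if resolution not in ("768p", "1080p"):
        errors.append("resolution must be 768p or 1080p")
    if duration is not None and duration not in (6, 10):
        errors.append("duration must be 6 or 10")
    if resolution == "1080p" and duration == 10:
        errors.append("1080p supports 6-second videos only")
    return errors


print(validate_hailuo_params("hailuo/minimax-2.3-fast", {"prompt": "Waves at dusk"}))
print(validate_hailuo_params("hailuo/minimax-2.0", {"prompt": "Waves at dusk", "resolution": "1080p", "duration": 10}))
```
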
higgsfield-ai/lite, higgsfield-ai/turbo, higgsfield-ai/standard
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text description of the desired video motion/action |
start_image_url | string | Yes | Starting frame image URL |
end_image_url | string | No | Ending frame image URL (for guided transitions) |
enhance_prompt | boolean | No | Let AI enhance your prompt for better results (default false) |
seed | integer | No | 0-999999 for reproducibility |
motion_id | string | No | Motion preset ID (UUID) |
higgsfield-ai/cinematic-studio-video-2.5
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Conditional | Required when shot_mode is single or multi_shot_auto. Supports image references such as <<<image_1>>> |
shot_mode | string | No | single (default), multi_shot_auto, multi_shot_manual |
multi_shots | object[] | Conditional | Required for multi_shot_manual. Each: prompt, duration, optional camera_movement_id |
image_urls | string[] | No | Reference image URLs (max 3) |
start_image_url | string | No | Starting frame image URL |
end_image_url | string | No | Ending frame URL. Forces generate_audio to false |
duration | integer | No | 3-12 seconds (default 5) |
aspect_ratio | string | No | 1:1, 3:4, 2:3, 9:16, 3:2, 4:3, 16:9, 21:9 |
resolution | string | No | 720p (default) or 1080p |
genre | string | No | auto, action, horror, comedy, western, suspense, intimate, spectacle |
camera_movement_id | string | No | Camera movement preset ID |
generate_audio | boolean | No | Enable AI sound effects (default false) |
seed | integer | No | 0-999999 for reproducibility |
higgsfield-ai/cinematic-studio-video-3.0
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Conditional | Required when shot_mode is single or multi_shot_auto |
shot_mode | string | No | single (default), multi_shot_auto, multi_shot_manual |
multi_shots | object[] | Conditional | Required when shot_mode='multi_shot_manual'. Each item carries its own prompt, duration, camera_motion_id, and speedramp (ramp_up, flash_in, flash_out, hero_moment) |
image_urls | string[] | No | Up to 3 reference image URLs |
start_image_url | string | No | Optional first-frame image |
end_image_url | string | No | Optional last-frame image |
duration | integer | No | Total video duration in seconds (4-15) |
aspect_ratio | string | No | auto (default), 1:1, 3:4, 9:16, 4:3, 16:9, 21:9 |
resolution | string | No | 480p, 720p, 1080p |
genre | string | No | general (default), action, horror, comedy, noir, epic |
camera_motion_id | string | No | Top-level camera movement preset id (sent as preset_id) |
generate_audio | boolean | No | Generate accompanying audio |
seed | integer | No | Random seed; auto-generated if omitted |
higgsfield-ai/cinematic-studio-video-3.5
Single-shot only. Provide either a free-form style_prompt (which fully overrides the structured style fields) or the structured trio color_palette / lighting / camera_moveset_style.
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Required text prompt |
image_urls | string[] | No | Up to 3 reference image URLs |
start_image_url | string | No | Optional first-frame image |
end_image_url | string | No | Optional last-frame image |
duration | integer | No | Video duration in seconds (4-15) |
aspect_ratio | string | No | auto (default), 1:1, 3:4, 9:16, 4:3, 16:9, 21:9 |
resolution | string | No | 480p, 720p, 1080p |
genre | string | No | general (default), action, horror, comedy, noir, epic, drama |
generate_audio | boolean | No | Generate accompanying audio |
seed | integer | No | Random seed; auto-generated if omitted |
style_prompt | string | No | Free-form style description. Fully overrides color_palette, lighting, and camera_moveset_style when provided |
color_palette | string | No | Color grading preset (sent as color_grading). Values: auto, naturalistic_clean, bleached_warm, hyper_neon, teal_orange_epic, sodium_decay, cold_steel, bleach_bypass, classic_bw |
lighting | string | No | Lighting scheme (sent as light_scheme). Values: auto, soft_cross, contre_jour, overhead_fall, window, practicals, silhouette |
camera_moveset_style | string | No | Camera moveset style (sent as camera_style). Values: auto, classic_static, silent_machine, one_take, epic_scale, … |
camera_motion_id | string | No | Camera movement preset id (sent as preset_id) |
camera_model_id | string | No | Camera body preset id (from GET /higgsfield/camera-settings) |
camera_lens_id | string | No | Camera lens preset id |
camera_focal_length_id | string | No | Camera focal length preset id |
camera_aperture_id | string | No | Camera aperture preset id |
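
Since style_prompt fully overrides the structured style fields, a client can avoid sending fields that will be ignored. The sketch below assumes a hypothetical helper named `style_fields`. It validates color_palette and lighting against the values listed above, and passes camera_moveset_style through unchecked because its value list is only partially shown.

```python
COLOR_PALETTES = {
    "auto", "naturalistic_clean", "bleached_warm", "hyper_neon", "teal_orange_epic",
    "sodium_decay", "cold_steel", "bleach_bypass", "classic_bw",
}
LIGHTING = {
    "auto", "soft_cross", "contre_jour", "overhead_fall", "window", "practicals", "silhouette",
}


def style_fields(style_prompt=None, color_palette=None, lighting=None, camera_moveset_style=None):
    """Return only the style parameters that will take effect."""
    if style_prompt:
        # style_prompt overrides the structured trio, so drop those fields.
        return {"style_prompt": style_prompt}
    if color_palette is not None and color_palette not in COLOR_PALETTES:
        raise ValueError(f"unknown color_palette: {color_palette!r}")
    if lighting is not None and lighting not in LIGHTING:
        raise ValueError(f"unknown lighting: {lighting!r}")
    fields = {
        "color_palette": color_palette,
        "lighting": lighting,
        # Value list is truncated in the docs, so this is not validated here.
        "camera_moveset_style": camera_moveset_style,
    }
    return {key: value for key, value in fields.items() if value is not None}


print(style_fields(style_prompt="grainy 16mm film", lighting="window"))
print(style_fields(color_palette="cold_steel", lighting="window"))
```
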
kuaishou/kling-3.0-omni-video
| Parameter | Type | Required | Description |
|---|---|---|---|
video_mode | string | No | "elements" (default), "start_end_frame", "transform", "video_reference" |
prompt | string | Conditional | Text prompt. Mutually exclusive with multi_shots |
mode | string | No | "std" (720p) or "pro" (1080p). Default "pro" |
duration | integer | No | 3–15 seconds (default 5) |
aspect_ratio | string | No | "16:9" (default), "9:16", "1:1", "auto" (start_end_frame only) |
native_audio | boolean | No | Generate AI audio (default false) |
keep_audio | boolean | No | Preserve audio from source video (default true) |
image_urls | string[] | No | Up to 7 reference image URLs. Use @Image1, @Image2 in prompt |
start_frame_url | string | No | First frame image URL (start_end_frame mode) |
end_frame_url | string | No | Last frame image URL (start_end_frame mode) |
video_url | string | No | Source video URL (transform/video_reference modes) |
multi_shots | array | No | 2–6 shots, each { "prompt": string, "duration": int }. Mutually exclusive with prompt |
elements | array | No | Character/object elements (IMAGE + VIDEO) |
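
The prompt and multi_shots fields are mutually exclusive, which can be enforced when the request is assembled. This sketch assumes that exactly one of the two must be present (the table marks prompt as conditional but does not spell this out), and `kling_omni_content` is a hypothetical helper name.

```python
def kling_omni_content(prompt=None, multi_shots=None):
    """Return the prompt-or-shots part of a kling-3.0-omni-video request."""
    # ASSUMPTION: exactly one of prompt / multi_shots must be supplied.
    if (prompt is None) == (multi_shots is None):
        raise ValueError("provide exactly one of prompt or multi_shots")
    if prompt is not None:
        return {"prompt": prompt}
    if not 2 <= len(multi_shots) <= 6:
        raise ValueError("multi_shots must contain 2 to 6 shots")
    shots = []
    for shot in multi_shots:
        # Each shot carries its own prompt and integer duration.
        shots.append({"prompt": str(shot["prompt"]), "duration": int(shot["duration"])})
    return {"multi_shots": shots}


params = {
    "video_mode": "elements",
    "mode": "pro",
    **kling_omni_content(multi_shots=[
        {"prompt": "Wide shot of a harbour at dawn", "duration": 4},
        {"prompt": "Close-up of a fisherman casting a net", "duration": 5},
    ]),
}
print(params)
```
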
kuaishou/kling-o1-video
Same parameters as kuaishou/kling-3.0-omni-video, but without multi_shots or native_audio. Maximum duration is 10 seconds.
| Parameter | Type | Required | Description |
|---|---|---|---|
video_mode | string | No | "elements" (default), "start_end_frame", "transform", "video_reference" |
prompt | string | Yes | Text prompt |
mode | string | No | "std" (720p) or "pro" (1080p). Default "pro" |
duration | integer | No | 3–10 seconds (default 5) |
aspect_ratio | string | No | "16:9" (default), "9:16", "1:1", "auto" (start_end_frame only) |
keep_audio | boolean | No | Preserve audio from source video (default true) |
image_urls | string[] | No | Up to 7 reference image URLs. Use @Image1, @Image2 in prompt |
start_frame_url | string | No | First frame image URL (start_end_frame mode) |
end_frame_url | string | No | Last frame image URL (start_end_frame mode) |
video_url | string | No | Source video URL (transform/video_reference modes) |
kuaishou/kling-3.0-omni-video-edit
| Parameter | Type | Required | Description |
|---|---|---|---|
video_url | string | Yes | Source video URL to edit |
prompt | string | Yes | Text prompt describing the edit |
video_mode | string | No | "reference" (default) or "transform" |
keep_audio | boolean | No | Preserve original audio (default false) |
mode | string | No | "std" (default) or "pro" |
aspect_ratio | string | No | "16:9" (default), "9:16", "1:1" |
image_urls | string[] | No | Up to 4 reference image URLs. Use @Image1, @Image2 in prompt |
elements | array | No | Up to 4 character/object elements |
kuaishou/kling-o1-video-edit
Same parameters as kuaishou/kling-3.0-omni-video-edit, but without elements.
| Parameter | Type | Required | Description |
|---|---|---|---|
video_url | string | Yes | Source video URL to edit |
prompt | string | Yes | Text prompt describing the edit |
video_mode | string | No | "reference" (default) or "transform" |
keep_audio | boolean | No | Preserve original audio (default false) |
mode | string | No | "std" (default) or "pro" |
aspect_ratio | string | No | "16:9" (default), "9:16", "1:1" |
image_urls | string[] | No | Up to 4 reference image URLs. Use @Image1, @Image2 in prompt |
kuaishou/kling-3.0-video
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Conditional | Text prompt. Mutually exclusive with multi_shots |
mode | string | No | "std" (720p) or "pro" (1080p). Default "pro" |
duration | integer | No | 3–15 seconds (default 5) |
aspect_ratio | string | No | "16:9" (default), "9:16", "1:1" |
native_audio | boolean | No | Generate AI audio (default true) |
start_frame_url | string | Yes | First frame image URL |
end_frame_url | string | No | Last frame image URL |
elements | array | No | Character/object elements |
multi_shots | array | No | 2–6 shots, each { "prompt": string, "duration": int }. Mutually exclusive with prompt |
kuaishou/kling-2.6-video
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt |
mode | string | No | "std" (720p) or "pro" (1080p). Default "pro" |
duration | integer | No | 5 or 10 seconds |
native_audio | boolean | No | Enable AI audio generation (default false). Requires pro mode |
start_frame_url | string | Yes | First frame image URL |
end_frame_url | string | No | Last frame image URL (not available with native_audio) |
voices | array | No | Voice references (max 5, requires native_audio). Each: { "voice_id": int } or { "voice_url": string } |
kuaishou/kling-2.5-turbo-video
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt |
mode | string | No | "std" (720p) or "pro" (1080p). Default "pro" |
duration | integer | No | 5 or 10 seconds |
aspect_ratio | string | No | "16:9" (default), "9:16", "1:1". Ignored when start_frame_url is set |
start_frame_url | string | No | First frame image URL |
end_frame_url | string | No | Last frame image URL |
sound_effects | object | No | { "sound": string, "music": string, "asmr_mode": boolean }. Omit to disable audio |
kuaishou/kling-2.1-video
Image-to-video only.
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt |
start_frame_url | string | Yes | First frame image URL |
end_frame_url | string | No | Last frame image URL |
duration | integer | No | 5 or 10 seconds |
mode | string | No | "std" or "pro". Default "pro" |
sound_effects | object | No | { "sound": string, "music": string, "asmr_mode": boolean }. Omit to disable audio |
kuaishou/kling-2.1-master-video
Pro mode only. End frames are not supported.
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt |
duration | integer | No | 5 or 10 seconds |
start_frame_url | string | No | First frame image URL (optional) |
sound_effects | object | No | { "sound": string, "music": string, "asmr_mode": boolean }. Omit to disable audio |
kuaishou/kling-3.0-motion-control
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt describing the motion |
image_url | string | Yes | Character/subject image URL |
video_url | string | Yes | Motion reference video URL |
mode | string | No | "std" (default) or "pro" |
keep_audio | boolean | No | Preserve audio from motion video (default true) |
character_orientation | string | No | "video" (default) or "image" |
elements | array | No | Additional character/object elements |
kuaishou/kling-2.6-motion-control
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt describing the motion |
image_url | string | Yes | Character/subject image URL |
video_url | string | Yes | Motion reference video URL |
mode | string | No | "std" (default) or "pro" |
keep_audio | boolean | No | Preserve audio from motion video (default true) |
character_orientation | string | No | "video" (default) or "image" |
xai/grok-imagine-video-extend
Extend a previously generated video via HTTP streaming. Two mutually exclusive modes:
| Mode | How to activate | Behaviour |
|---|---|---|
| Preset | Provide video_preset | The preset controls the video style; prompt, extend_at, extend_duration are ignored |
| Custom | Omit video_preset | You control timing and prompt; prompt, extend_at, extend_duration are required |
| Parameter | Type | Required | Description |
|---|---|---|---|
task_id | string | Yes | Task ID of a completed video generation |
video_preset | string | No | "spicy" or "normal". Enables preset mode |
prompt | string | No | Text prompt to guide the extension. Required in custom mode |
extend_at | number | No | Second to start the extension from. Required in custom mode |
extend_duration | integer | No | 6 or 10 seconds. Required in custom mode |
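
The two modes above can be kept apart with a small builder that emits only the fields each mode uses. This is a sketch; `grok_extend_params` is a hypothetical helper name, and dropping the ignored fields in preset mode is a client-side choice, not a documented requirement.

```python
def grok_extend_params(task_id, video_preset=None, prompt=None, extend_at=None, extend_duration=None):
    """Build parameters for xai/grok-imagine-video-extend in preset or custom mode."""
    if video_preset is not None:
        if video_preset not in ("spicy", "normal"):
            raise ValueError('video_preset must be "spicy" or "normal"')
        # Preset mode: prompt, extend_at and extend_duration are ignored, so omit them.
        return {"task_id": task_id, "video_preset": video_preset}

    # Custom mode: all three timing/prompt fields are required.
    required = (("prompt", prompt), ("extend_at", extend_at), ("extend_duration", extend_duration))
    missing = [name for name, value in required if value is None]
    if missing:
        raise ValueError("custom mode requires: " + ", ".join(missing))
    if extend_duration not in (6, 10):
        raise ValueError("extend_duration must be 6 or 10")
    return {
        "task_id": task_id,
        "prompt": prompt,
        "extend_at": float(extend_at),
        "extend_duration": extend_duration,
    }


print(grok_extend_params("task_123", video_preset="normal"))
print(grok_extend_params("task_123", prompt="The camera pulls back", extend_at=5, extend_duration=6))
```
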
openai/gpt-image-2
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text description |
image_urls | array | No | Reference image URLs for image editing mode |
aspect_ratio | string | No | 1:1, 3:2, 2:3, 16:9. Default: 1:1 |
resolution | string | No | 1K, 2K, 4K. Default: 1K |
google/nano-banana, google/nano-banana-pro
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text description |
aspect_ratio | string | Yes | 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9 |
image_urls | array | No | Reference images |
resolution | string | No | Pro only: 1k, 2k, 4k |
black-forest-labs/flux.2-pro, black-forest-labs/flux.2-flex, black-forest-labs/flux.2-max
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text description |
image_urls | array | No | Reference images (up to 8 for Pro and Max, up to 10 for Flex) |
aspect_ratio | string | No | auto, 1:1, 4:3, 16:9, 3:2, 2:3, 9:16, 3:4 (Max also: 5:4, 21:9) |
quality | string | No | 1K or 2K |
steps | integer | No | Flex only: 1-50 (more = higher quality) |
cfg | number | No | Flex only: 1.5-10 (higher = follows prompt more strictly) |
higgsfield-ai/cinematic-studio-image
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text description |
image_urls | string[] | No | Reference image URLs (max 4) |
aspect_ratio | string | No | 1:1, 3:4, 2:3, 9:16, 3:2, 4:3, 16:9, 21:9 (default: 16:9) |
resolution | string | No | 1k, 2k, 4k (default: 1k) |
seed | integer | No | 0-999999 for reproducibility |
camera_model_id | string | No | Camera body ID |
camera_lens_id | string | No | Lens ID |
camera_aperture_id | string | No | Aperture ID |
camera_focal_length_id | string | No | Focal length ID |
alibaba/qwen-image-2.0-pro — $0.0525/image
Best quality. Text rendering, realistic textures. Automatically switches between T2I and editing based on whether image_urls is provided.
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt (max 800 chars) |
aspect_ratio | string | No | 1:1 (default), 16:9, 9:16, 4:3, 3:4 |
image_urls | string[] | No | Omit for T2I. Provide image URLs for editing |
negative_prompt | string | No | What to avoid (max 500 chars) |
prompt_extend | boolean | No | Smart prompt rewriting (default true) |
seed | integer | No | Seed for reproducibility |
alibaba/qwen-image-2.0 — $0.0245/image
Faster version of 2.0 Pro. Same capabilities and parameters.
alibaba/qwen-image-max — T2I 0.0525/image
Highest realism, fewest AI artifacts. Editing uses a specialized edit model under the hood (industrial design, geometric reasoning, character consistency). Same parameters as Qwen Image 2.0 Pro.
alibaba/qwen-image-plus — T2I 0.021/image
Diverse artistic styles, fast. Editing uses a specialized edit model under the hood. Same parameters as Qwen Image 2.0 Pro.
alibaba/qwen-image — T2I 0.0315/image
Older base model. Editing uses a specialized edit model under the hood. Same parameters as Qwen Image 2.0 Pro.
alibaba/z-image-turbo ($0.021 with prompt rewriting)
Lightweight fast T2I only. Chinese and English text rendering.
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt (max 800 chars) |
aspect_ratio | string | No | 1:1 (default), 2:3, 3:2, 3:4, 4:3, 9:16, 16:9 |
prompt_extend | boolean | No | Prompt rewriting (default false, doubles cost) |
seed | integer | No | Seed for reproducibility |
alibaba/wan-2.7-pro-image — $0.0525/image
Highest quality. Thinking mode for T2I. Supports editing with up to 9 images. Up to 4K resolution for T2I.
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt (max 5000 chars) |
aspect_ratio | string | No | 1:1 (default), 16:9, 9:16, 4:3, 3:4, 3:2, 2:3. Editing preserves input ratio |
image_urls | string[] | No | Omit for T2I. Up to 9 images for editing |
thinking_mode | boolean | No | Better quality, slower (default true). T2I only |
seed | integer | No | Seed for reproducibility |
alibaba/wan-2.7-image — $0.021/image
Faster variant of 2.7 Pro. Same capabilities, max 2K resolution. Same parameters as Wan 2.7 Pro Image.
alibaba/wan-2.6-image — $0.021/image
Automatically selects T2I or editing mode based on image_urls. Supports style transfer with 1–4 reference images.
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt (max 2000 chars) |
aspect_ratio | string | No | 1:1 (default), 2:3, 3:2, 3:4, 4:3, 9:16, 16:9 |
image_urls | string[] | No | Omit for T2I. 1–4 images for editing/style transfer |
negative_prompt | string | No | What to avoid (max 500 chars) |
prompt_extend | boolean | No | Smart prompt rewriting (default true) |
seed | integer | No | Seed for reproducibility |
alibaba/wan-2.5-image — $0.021/image
Automatically selects T2I or editing mode based on image_urls. Supports 1–3 reference images. Same parameters as Wan 2.6 Image.
alibaba/wan-2.2-image — $0.035/image
T2I only. Does not accept image_urls.
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text prompt (max 500 chars) |
aspect_ratio | string | No | 1:1 (default), 3:4, 4:3, 9:16, 16:9 |
negative_prompt | string | No | What to avoid |
seed | integer | No | Seed for reproducibility |
alibaba/wan-2.2-flash-image — $0.0175/image
Fast T2I only. Cheapest Wan image model. Same parameters as Wan 2.2 Image.
xai/grok-imagine-image — Pro mode: $0.025/image
Generate and edit images using xAI’s Grok Imagine model. When image_urls is provided, the model runs in edit mode.
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Text description or edit instruction |
aspect_ratio | string | No | "1:1" (default), "2:3", "3:2", "9:16", "16:9" |
image_urls | string[] | No | 1–5 reference image URLs (triggers edit mode) |
enable_pro | boolean | No | Enable pro mode for higher quality results |
upsample_prompt | boolean | No | Let AI enhance your prompt for better results |
enable_nsfw | boolean | No | Enable NSFW content generation |
suno-ai/music
| Parameter | Type | Required | Description |
|---|---|---|---|
mv | string | Yes | Model version: chirp-v3-5, chirp-v4, chirp-auk, chirp-bluejay, chirp-crow |
custom | boolean | Yes | false for simple mode, true for custom mode |
gpt_description_prompt | string | No | Simple mode: song description with lyrics |
prompt | string | No | Custom mode: detailed lyrics/prompt |
tags | string | No | Custom mode: genre/style tags |
title | string | No | Song title |
make_instrumental | boolean | No | Generate instrumental only |
negative_tags | string | No | Custom mode: styles to avoid |
persona_id | string | No | Custom voice ID from Suno voice creation; music uses that voice for vocals |
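
The table distinguishes simple mode (custom is false) from custom mode (custom is true), each with its own text field. The sketch below builds parameters for either mode; `suno_music_params` is a hypothetical helper name, and silently dropping custom-only fields in simple mode is a choice made for this example.

```python
MODEL_VERSIONS = {"chirp-v3-5", "chirp-v4", "chirp-auk", "chirp-bluejay", "chirp-crow"}


def suno_music_params(mv, custom, text, tags=None, negative_tags=None, title=None, make_instrumental=None):
    """Build suno-ai/music parameters for simple or custom mode."""
    if mv not in MODEL_VERSIONS:
        raise ValueError(f"unknown model version: {mv!r}")
    params = {"mv": mv, "custom": custom}
    if custom:
        # Custom mode: lyrics/prompt plus optional style tags.
        params["prompt"] = text
        if tags:
            params["tags"] = tags
        if negative_tags:
            params["negative_tags"] = negative_tags
    else:
        # Simple mode: a single description; custom-only fields are dropped.
        params["gpt_description_prompt"] = text
    if title:
        params["title"] = title
    if make_instrumental is not None:
        params["make_instrumental"] = make_instrumental
    return params


print(suno_music_params("chirp-v4", False, "An upbeat song about summer rain"))
print(suno_music_params("chirp-v4", True, "[Verse]\nRain on the window", tags="indie folk", title="Rain"))
```
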
suno-ai/add-vocals, suno-ai/add-instrumental, suno-ai/extend, suno-ai/cover
| Parameter | Type | Required | Description |
|---|---|---|---|
mv | string | Yes | Model version |
clip_id | string | Yes* | Existing clip ID. *Provide either clip_id or audio_url |
audio_url | string | Yes* | Audio file URL (alternative to clip_id) |
custom | boolean | Yes | Simple or custom mode |
gpt_description_prompt | string | No | Simple mode description |
prompt | string | No | Custom mode prompt |
continue_at | number | No | Extend: time in seconds to continue from |
start_s | number | No | Start time for overlay |
end_s | number | No | End time for overlay |
suno-ai/stems, suno-ai/stems-all
| Parameter | Type | Required | Description |
|---|---|---|---|
clip_id | string | Yes | Clip ID to extract stems from |
title | string | No | Title for extraction |
suno-ai/lyrics
| Parameter | Type | Required | Description |
|---|---|---|---|
prompt | string | Yes | Description of lyrics to generate |
mv | string | Yes | Lyrics model: remi-v1 or default |
higgsfield-ai/text-to-speech
| Parameter | Type | Required | Description |
|---|---|---|---|
voice_id | string | Yes | Voice ID |
prompt | string | Yes | Text to convert to speech |
sound_id | string | No | Background sound ID |
similarity_boost | integer | No | 0-100 (default 90) |
style | integer | No | 0-100 (default 60) |
speed | number | No | 0-1.2 (default 1.1) |
stability | integer | No | 0-100 (default 30) |
| Code | Description |
|---|---|
| 400 | Bad Request - Invalid parameters |
| 401 | Unauthorized - Invalid or missing API key |
| 402 | Payment Required - Insufficient balance |
| 404 | Not Found - Task or model not found |
| 429 | Too Many Requests - Rate limited |
| 500 | Internal Server Error |
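
A client can turn the codes in the table above into a typed exception. The sketch below is one possible shape: the class name `TaskApiError`, the function `raise_for_task_status`, and the choice to treat 429 and 500 as retryable are assumptions based on common client practice, not documented behaviour.

```python
ERROR_DESCRIPTIONS = {
    400: "Bad Request - Invalid parameters",
    401: "Unauthorized - Invalid or missing API key",
    402: "Payment Required - Insufficient balance",
    404: "Not Found - Task or model not found",
    429: "Too Many Requests - Rate limited",
    500: "Internal Server Error",
}

# ASSUMPTION: treating these as transient is a client-side convention.
RETRYABLE_CODES = {429, 500}


class TaskApiError(Exception):
    """Raised for any error status returned by the tasks endpoints."""

    def __init__(self, status_code, message):
        super().__init__(f"{status_code}: {message}")
        self.status_code = status_code
        self.retryable = status_code in RETRYABLE_CODES


def raise_for_task_status(status_code):
    """Raise TaskApiError for 4xx/5xx codes; return None otherwise."""
    if status_code < 400:
        return
    message = ERROR_DESCRIPTIONS.get(status_code, "Unexpected error")
    raise TaskApiError(status_code, message)


try:
    raise_for_task_status(402)
except TaskApiError as exc:
    print(exc, "| retryable:", exc.retryable)
```
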