Generations

AdvancedControls { depth, face, normals, 2 more }

Per-signal manual conditioning controls for video edits

depth?: DepthControl { blur, enabled } | null

Depth / scene-geometry conditioning control

blur?: number | null

Depth-map blur amount from 0 to 1. Higher values allow more geometric freedom.

format: float

minimum: 0

maximum: 1

enabled?: boolean | null

Enable or disable depth conditioning. Omit to use the model default.

face?: FaceControl { enabled } | null

Face-identity conditioning control

enabled?: boolean | null

Enable or disable face conditioning. Omit to use the model default.

normals?: NormalsControl { augmentation, enabled } | null

Surface-normals conditioning control

augmentation?: number | null

Surface-normals augmentation from 0 to 1. Higher values allow more reinterpretation of surface geometry.

format: float

minimum: 0

maximum: 1

enabled?: boolean | null

Enable or disable normals conditioning. Omit to use the model default.

pose?: PoseControl { enabled, strength } | null

Pose / skeleton conditioning control

enabled?: boolean | null

Enable or disable pose conditioning. Omit to use the model default.

strength?: PoseControlStrength | null

Pose-conditioning strength

One of the following:

"precise"

"coarse"

trajectory?: TrajectoryControl { enabled, sparsity } | null

Motion-trajectory conditioning control

enabled?: boolean | null

Enable or disable trajectory conditioning. Omit to use the model default.

sparsity?: number | null

Point-trajectory sparsity from 0 to 1. Higher values use fewer motion anchors.

format: float

minimum: 0

maximum: 1
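As a rough illustration of the AdvancedControls shape described above, the sketch below builds a controls object as a plain dict and checks the documented 0 to 1 ranges and the pose strength values on the client side. The helper name `validate_controls` and the constants are hypothetical and not part of the API; the service is assumed to run its own validation regardless.

```python
from typing import Any, Dict, List

# Numeric fields documented with a 0..1 range, keyed by control name.
UNIT_RANGE_FIELDS = {
    "depth": "blur",
    "normals": "augmentation",
    "trajectory": "sparsity",
}
POSE_STRENGTHS = {"precise", "coarse"}


def validate_controls(controls: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an AdvancedControls-shaped dict.

    Hypothetical client-side helper; it only mirrors the documented ranges.
    """
    problems: List[str] = []
    for name, field in UNIT_RANGE_FIELDS.items():
        control = controls.get(name) or {}  # a null control is treated as omitted
        value = control.get(field)
        if value is not None and not 0 <= value <= 1:
            problems.append(f"{name}.{field} must be between 0 and 1, got {value}")
    pose = controls.get("pose") or {}
    strength = pose.get("strength")
    if strength is not None and strength not in POSE_STRENGTHS:
        problems.append(f"pose.strength must be one of {sorted(POSE_STRENGTHS)}")
    return problems


controls = {
    "depth": {"enabled": True, "blur": 0.3},
    "face": {"enabled": True},
    "pose": {"strength": "coarse"},  # enabled omitted: the model default applies
    "trajectory": {"enabled": False},
}
print(validate_controls(controls))  # prints []
```

Leaving `enabled` out of a control, as done for `pose` here, follows the documented behavior of falling back to the model default.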

DepthControl { blur, enabled }

Depth / scene-geometry conditioning control

blur?: number | null

Depth-map blur amount from 0 to 1. Higher values allow more geometric freedom.

format: float

minimum: 0

maximum: 1

enabled?: boolean | null

Enable or disable depth conditioning. Omit to use the model default.

FaceControl { enabled }

Face-identity conditioning control

enabled?: boolean | null

Enable or disable face conditioning. Omit to use the model default.

Generation { id, created_at, model, 5 more }

Generation status and output

id: string

Generation identifier

format: uuid

created_at: string

Creation timestamp

model: Model

Model used

One of the following:

"uni-1"

"uni-1-max"

"ray-3.2"

state: "queued" | "processing" | "completed" | "failed"

Current state of the generation

One of the following:

"queued"

"processing"

"completed"

"failed"

type: "image" | "image_edit" | "video" | 2 more

The kind of generation to perform

One of the following:

"image"

"image_edit"

"video"

"video_edit"

"video_reframe"

failure_code?: GenerationFailureCode | null

Machine-readable failure code for programmatic handling

One of the following:

"content_moderated"

"generation_failed"

"budget_exhausted"

"output_not_found"

"image_too_large"

"unsupported_format"

"corrupt_input"

"invalid_request"

"rate_limited"

failure_reason?: string | null

Human-readable failure description

output?: Array<GenerationOutput { type, url } >

Generated outputs (populated on completion)

type: string

Media type (e.g. image, video)

url: string

Presigned URL (expires after 1 hour)

format: uri
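One way a client might interpret a Generation object is sketched below: return the output URLs once the state is completed, return nothing while it is queued or processing, and raise on failed. The function `output_urls`, the `GenerationFailed` exception, and the sample values (UUID, timestamp format, URL) are illustrative assumptions, not part of the API.

```python
from typing import Any, Dict, List, Optional


class GenerationFailed(Exception):
    """Hypothetical exception carrying the documented failure fields."""

    def __init__(self, code: Optional[str], reason: Optional[str]) -> None:
        super().__init__(f"{code}: {reason}")
        self.code = code
        self.reason = reason


def output_urls(generation: Dict[str, Any]) -> List[str]:
    """Return output URLs for a completed generation.

    Returns an empty list while the generation is queued or processing,
    and raises GenerationFailed when the state is "failed".
    """
    state = generation["state"]
    if state == "failed":
        raise GenerationFailed(
            generation.get("failure_code"), generation.get("failure_reason")
        )
    if state != "completed":
        return []
    # output is only populated on completion, so tolerate it being absent.
    return [item["url"] for item in generation.get("output") or []]


generation = {
    "id": "123e4567-e89b-12d3-a456-426614174000",  # placeholder UUID
    "created_at": "2025-01-01T00:00:00Z",  # timestamp format is an assumption
    "model": "ray-3.2",
    "state": "completed",
    "type": "video",
    "output": [{"type": "video", "url": "https://example.com/output.mp4"}],
}
print(output_urls(generation))  # prints ['https://example.com/output.mp4']
```

Because the output URLs are presigned and expire, a caller would likely want to download or copy the media soon after completion rather than store the URL.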

GenerationFailureCode = "content_moderated" | "generation_failed" | "budget_exhausted" | 6 more

Machine-readable failure code for programmatic handling

One of the following:

"content_moderated"

"generation_failed"

"budget_exhausted"

"output_not_found"

"image_too_large"

"unsupported_format"

"corrupt_input"

"invalid_request"

"rate_limited"

GenerationOutput { type, url }

A single generated output

type: string

Media type (e.g. image, video)

url: string

Presigned URL (expires after 1 hour)

format: uri

ImageRef { data, generation_id, media_type, url }

Media reference for guided generation. Provide exactly one of url, inline base64 data, or generation_id. URL and data references accept image media at image positions; video_edit and video_reframe sources also accept source.url or source.data when source.media_type is a video/* MIME type. A generation_id chains image_edit off a prior image output and video_edit/video_reframe off a prior video output; on video.start_frame/end_frame it references a prior generation for video extension.

data?: string | null

Base64-encoded image or video data

generation_id?: string | null

UUID of a prior generation owned by the same caller. Used on source for image_edit, video_edit, and video_reframe chaining and on video.start_frame / video.end_frame for video extension.

format: uuid

media_type?: string | null

MIME type (for example, image/jpeg or video/mp4). Required with data. Required with source.url on video_edit/video_reframe so the route can dispatch video ingest before fetching bytes; optional for image URLs.

url?: string | null

Publicly accessible image URL, or a video URL when used as source for video_edit/video_reframe with media_type=video/*.
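The ImageRef rules above (exactly one of url, data, or generation_id, and media_type required with data or with a video source URL) can be sketched as a small client-side check. The helper `validate_media_ref` and its `video_source` flag are hypothetical names introduced for illustration only.

```python
from typing import Any, Dict, List


def validate_media_ref(ref: Dict[str, Any], *, video_source: bool = False) -> List[str]:
    """Check an ImageRef-shaped dict against the documented rules.

    video_source=True means the ref is used as `source` on a video_edit or
    video_reframe request. Hypothetical helper, not part of the API.
    """
    problems: List[str] = []
    provided = [k for k in ("url", "data", "generation_id") if ref.get(k) is not None]
    if len(provided) != 1:
        problems.append(
            f"provide exactly one of url, data, or generation_id (got {provided})"
        )
    media_type = ref.get("media_type")
    if "data" in provided and not media_type:
        problems.append("media_type is required with data")
    if video_source and "url" in provided and not media_type:
        problems.append("media_type is required with source.url on video_edit/video_reframe")
    return problems


# An image URL needs no media_type.
print(validate_media_ref({"url": "https://example.com/frame.jpg"}))  # prints []

# A video source URL needs media_type so video ingest can be dispatched.
print(validate_media_ref(
    {"url": "https://example.com/clip.mp4", "media_type": "video/mp4"},
    video_source=True,
))  # prints []
```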

Model = "uni-1" | "uni-1-max" | "ray-3.2"

Model identifier. uni-1 is the default image tier; uni-1-max produces higher-quality output than uni-1 at a higher per-image price. ray-3.2 is the public video model for text-to-video, image-to-video, and video-to-video editing.

One of the following:

"uni-1"

"uni-1-max"

"ray-3.2"

NormalsControl { augmentation, enabled }

Surface-normals conditioning control

augmentation?: number | null

Surface-normals augmentation from 0 to 1. Higher values allow more reinterpretation of surface geometry.

format: float

minimum: 0

maximum: 1

enabled?: boolean | null

Enable or disable normals conditioning. Omit to use the model default.

PoseControl { enabled, strength }

Pose / skeleton conditioning control

enabled?: boolean | null

Enable or disable pose conditioning. Omit to use the model default.

strength?: PoseControlStrength | null

Pose-conditioning strength

One of the following:

"precise"

"coarse"

PoseControlStrength = "precise" | "coarse"

Pose-conditioning strength

One of the following:

"precise"

"coarse"

SourcePosition { h_norm, w_norm, x_norm, y_norm }

Normalized source rectangle inside the output canvas for video_reframe. Omit to let the model choose the default centered-fit crop.

h_norm: number

Source rectangle height, as a fraction of canvas height. Up to 2.0 so the source can bleed off-canvas.

format: float

exclusiveMinimum: 0

maximum: 2

w_norm: number

Source rectangle width, as a fraction of canvas width. Up to 2.0 so the source can bleed off-canvas.

format: float

exclusiveMinimum: 0

maximum: 2

x_norm: number

Left edge of the source rectangle, as a fraction of canvas width. May be negative when the source extends off-canvas.

format: float

minimum: -2

maximum: 2

y_norm: number

Top edge of the source rectangle, as a fraction of canvas height. May be negative when the source extends off-canvas.

format: float

minimum: -2

maximum: 2
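To make the normalized coordinates concrete, the sketch below checks a SourcePosition against the documented bounds and converts it to pixel units for a given canvas size. The function names and the 1920x1080 canvas are assumptions chosen for illustration; the API itself only takes the normalized values.

```python
from typing import Dict, List, Tuple


def validate_source_position(pos: Dict[str, float]) -> List[str]:
    """Check the documented bounds: sizes in (0, 2], edges in [-2, 2]."""
    problems: List[str] = []
    for key in ("w_norm", "h_norm"):
        if not 0 < pos[key] <= 2:
            problems.append(f"{key} must be greater than 0 and at most 2")
    for key in ("x_norm", "y_norm"):
        if not -2 <= pos[key] <= 2:
            problems.append(f"{key} must be between -2 and 2")
    return problems


def to_pixels(
    pos: Dict[str, float], canvas_w: int, canvas_h: int
) -> Tuple[float, float, float, float]:
    """Return (left, top, width, height) of the source rectangle in pixels.

    x_norm and w_norm scale with canvas width; y_norm and h_norm with height.
    """
    return (
        pos["x_norm"] * canvas_w,
        pos["y_norm"] * canvas_h,
        pos["w_norm"] * canvas_w,
        pos["h_norm"] * canvas_h,
    )


# A source 1.5x the canvas width, shifted left so it bleeds off both sides.
position = {"x_norm": -0.25, "y_norm": 0.0, "w_norm": 1.5, "h_norm": 1.0}
print(validate_source_position(position))  # prints []
print(to_pixels(position, 1920, 1080))  # prints (-480.0, 0.0, 2880.0, 1080.0)
```

In this example the rectangle starts 480 pixels left of the canvas and extends 480 pixels past its right edge, which is the kind of off-canvas bleed the negative x_norm and the 2.0 upper bound allow.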

TrajectoryControl { enabled, sparsity }

Motion-trajectory conditioning control

enabled?: boolean | null

Enable or disable trajectory conditioning. Omit to use the model default.

sparsity?: number | null

Point-trajectory sparsity from 0 to 1. Higher values use fewer motion anchors.

format: float

minimum: 0

maximum: 1

VideoDuration = "5s" | "10s"

Video duration

One of the following:

"5s"

"10s"

VideoEditOptions { auto_controls, controls, keyframe_indexes, 2 more }

Ray 3.2 video-to-video edit controls. Only valid under video.edit when type is video_edit. The source video must be 18 seconds or shorter; output duration matches the source.

auto_controls?: boolean | null

When true, the model derives the control schedule from the source video. When omitted, supplying strength or controls implies manual mode.

controls?: AdvancedControls { depth, face, normals, 2 more } | null

Per-signal manual conditioning controls for video edits

depth?: DepthControl { blur, enabled } | null

Depth / scene-geometry conditioning control

blur?: number | null

Depth-map blur amount from 0 to 1. Higher values allow more geometric freedom.

format: float

minimum: 0

maximum: 1

enabled?: boolean | null

Enable or disable depth conditioning. Omit to use the model default.

face?: FaceControl { enabled } | null

Face-identity conditioning control

enabled?: boolean | null

Enable or disable face conditioning. Omit to use the model default.

normals?: NormalsControl { augmentation, enabled } | null

Surface-normals conditioning control

augmentation?: number | null

Surface-normals augmentation from 0 to 1. Higher values allow more reinterpretation of surface geometry.

format: float

minimum: 0

maximum: 1

enabled?: boolean | null

Enable or disable normals conditioning. Omit to use the model default.

pose?: PoseControl { enabled, strength } | null

Pose / skeleton conditioning control

enabled?: boolean | null

Enable or disable pose conditioning. Omit to use the model default.

strength?: PoseControlStrength | null

Pose-conditioning strength

One of the following:

"precise"

"coarse"

trajectory?: TrajectoryControl { enabled, sparsity } | null

Motion-trajectory conditioning control

enabled?: boolean | null

Enable or disable trajectory conditioning. Omit to use the model default.

sparsity?: number | null

Point-trajectory sparsity from 0 to 1. Higher values use fewer motion anchors.

format: float

minimum: 0

maximum: 1

keyframe_indexes?: Array<number> | null

Parallel list of non-negative, unique frame positions in the source video's frame grid where each keyframes[i] is anchored. Must match keyframes in length.

keyframes?: Array<ImageRef { data, generation_id, media_type, url } > | null

Multi-anchor guide-frame images at arbitrary source-frame positions (parallel with keyframe_indexes). Up to 64 anchors. Mutually exclusive with video.start_frame (the single-anchor case). Each entry takes the same ImageRef shape as source / image_ref[].

data?: string | null

Base64-encoded image or video data

generation_id?: string | null

UUID of a prior generation owned by the same caller. Used on source for image_edit, video_edit, and video_reframe chaining and on video.start_frame / video.end_frame for video extension.

format: uuid

media_type?: string | null

MIME type (for example, image/jpeg or video/mp4). Required with data. Required with source.url on video_edit/video_reframe so the route can dispatch video ingest before fetching bytes; optional for image URLs.

url?: string | null

Publicly accessible image URL, or a video URL when used as source for video_edit/video_reframe with media_type=video/*.
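The pairing rules for keyframes and keyframe_indexes under video.edit (same length, non-negative unique positions, at most 64 anchors) could be checked before sending a request roughly as follows. `validate_edit_keyframes` is a hypothetical helper, and the URLs and UUID are placeholders.

```python
from typing import Any, Dict, List

MAX_ANCHORS = 64  # documented upper bound on keyframes


def validate_edit_keyframes(edit: Dict[str, Any]) -> List[str]:
    """Check keyframes / keyframe_indexes on a VideoEditOptions-shaped dict."""
    keyframes = edit.get("keyframes") or []
    indexes = edit.get("keyframe_indexes") or []
    problems: List[str] = []
    if len(keyframes) != len(indexes):
        problems.append("keyframes and keyframe_indexes must have the same length")
    if len(keyframes) > MAX_ANCHORS:
        problems.append(f"at most {MAX_ANCHORS} keyframes are allowed")
    if any(i < 0 for i in indexes):
        problems.append("keyframe_indexes must be non-negative")
    if len(set(indexes)) != len(indexes):
        problems.append("keyframe_indexes must be unique")
    return problems


edit = {
    "keyframes": [
        {"url": "https://example.com/anchor-0.png"},  # placeholder URL
        {"generation_id": "123e4567-e89b-12d3-a456-426614174000"},  # placeholder UUID
    ],
    # keyframes[0] is anchored at source frame 0, keyframes[1] at frame 48.
    "keyframe_indexes": [0, 48],
}
print(validate_edit_keyframes(edit))  # prints []
```

The check does not verify that each index falls inside the source video's frame grid, since that depends on the source length, which this sketch does not know.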

strength?: VideoEditStrength | null

How much a video edit preserves or reimagines the source

One of the following:

"adhere_1"

"adhere_2"

"adhere_3"

"flex_1"

"flex_2"

"flex_3"

"reimagine_1"

"reimagine_2"

"reimagine_3"

VideoEditStrength = "adhere_1" | "adhere_2" | "adhere_3" | 6 more

How much a video edit preserves or reimagines the source

One of the following:

"adhere_1"

"adhere_2"

"adhere_3"

"flex_1"

"flex_2"

"flex_3"

"reimagine_1"

"reimagine_2"

"reimagine_3"

VideoOptions { duration, edit, end_frame, 8 more }

Ray 3.2 video request options. Common output settings live at the top level for type=video, type=video_edit, and type=video_reframe; video-to-video conditioning lives under edit.

duration?: VideoDuration | null

Video duration

One of the following:

"5s"

"10s"

edit?: VideoEditOptions { auto_controls, controls, keyframe_indexes, 2 more } | null

Ray 3.2 video-to-video edit controls. Only valid under video.edit when type is video_edit. The source video must be 18 seconds or shorter; output duration matches the source.

auto_controls?: boolean | null

When true, the model derives the control schedule from the source video. When omitted, supplying strength or controls implies manual mode.

controls?: AdvancedControls { depth, face, normals, 2 more } | null

Per-signal manual conditioning controls for video edits

depth?: DepthControl { blur, enabled } | null

Depth / scene-geometry conditioning control

blur?: number | null

Depth-map blur amount from 0 to 1. Higher values allow more geometric freedom.

format: float

minimum: 0

maximum: 1

enabled?: boolean | null

Enable or disable depth conditioning. Omit to use the model default.

face?: FaceControl { enabled } | null

Face-identity conditioning control

enabled?: boolean | null

Enable or disable face conditioning. Omit to use the model default.

normals?: NormalsControl { augmentation, enabled } | null

Surface-normals conditioning control

augmentation?: number | null

Surface-normals augmentation from 0 to 1. Higher values allow more reinterpretation of surface geometry.

format: float

minimum: 0

maximum: 1

enabled?: boolean | null

Enable or disable normals conditioning. Omit to use the model default.

pose?: PoseControl { enabled, strength } | null

Pose / skeleton conditioning control

enabled?: boolean | null

Enable or disable pose conditioning. Omit to use the model default.

strength?: PoseControlStrength | null

Pose-conditioning strength

One of the following:

"precise"

"coarse"

trajectory?: TrajectoryControl { enabled, sparsity } | null

Motion-trajectory conditioning control

enabled?: boolean | null

Enable or disable trajectory conditioning. Omit to use the model default.

sparsity?: number | null

Point-trajectory sparsity from 0 to 1. Higher values use fewer motion anchors.

format: float

minimum: 0

maximum: 1

keyframe_indexes?: Array<number> | null

Parallel list of non-negative, unique frame positions in the source video's frame grid where each keyframes[i] is anchored. Must match keyframes in length.

keyframes?: Array<ImageRef { data, generation_id, media_type, url } > | null

Multi-anchor guide-frame images at arbitrary source-frame positions (parallel with keyframe_indexes). Up to 64 anchors. Mutually exclusive with video.start_frame (the single-anchor case). Each entry takes the same ImageRef shape as source / image_ref[].

data?: string | null

Base64-encoded image or video data

generation_id?: string | null

UUID of a prior generation owned by the same caller. Used on source for image_edit, video_edit, and video_reframe chaining and on video.start_frame / video.end_frame for video extension.

format: uuid

media_type?: string | null

MIME type (for example, image/jpeg or video/mp4). Required with data. Required with source.url on video_edit/video_reframe so the route can dispatch video ingest before fetching bytes; optional for image URLs.

url?: string | null

Publicly accessible image URL, or a video URL when used as source for video_edit/video_reframe with media_type=video/*.

strength?: VideoEditStrength | null

How much a video edit preserves or reimagines the source

One of the following:

"adhere_1"

"adhere_2"

"adhere_3"

"flex_1"

"flex_2"

"flex_3"

"reimagine_1"

"reimagine_2"

"reimagine_3"

end_frame?: ImageRef { data, generation_id, media_type, url } | null

Media reference for guided generation. Provide exactly one of url, inline base64 data, or generation_id. URL and data references accept image media at image positions; video_edit and video_reframe sources also accept source.url or source.data when source.media_type is a video/* MIME type. A generation_id chains image_edit off a prior image output and video_edit/video_reframe off a prior video output; on video.start_frame/end_frame it references a prior generation for video extension.

data?: string | null

Base64-encoded image or video data

generation_id?: string | null

UUID of a prior generation owned by the same caller. Used on source for image_edit, video_edit, and video_reframe chaining and on video.start_frame / video.end_frame for video extension.

format: uuid

media_type?: string | null

MIME type (for example, image/jpeg or video/mp4). Required with data. Required with source.url on video_edit/video_reframe so the route can dispatch video ingest before fetching bytes; optional for image URLs.

url?: string | null

Publicly accessible image URL, or a video URL when used as source for video_edit/video_reframe with media_type=video/*.

exr_export?: boolean | null

Export EXR alongside the MP4. Requires hdr=true.

hdr?: boolean | null

Generate HDR video. Requires HDR access. Not supported for video_reframe.

keyframe_indexes?: Array<number> | null

Parallel list of non-negative, unique output-frame positions where each keyframes[i] is anchored, in the duration × 24 fps grid (5s → 0..120, 10s → 0..240). Must match keyframes in length.

keyframes?: Array<ImageRef { data, generation_id, media_type, url } > | null

Image-to-video guide frames (type=video only), each pinned to an output-frame position via the parallel keyframe_indexes. Accepts 1 to 64 anchors: a single anchor is a valid start-pinned image-to-video request (an alternative to start_frame), and any count up to 64 places guide frames at arbitrary positions. Unlike start_frame/end_frame (the legacy 2-frame surface), this supports arbitrary positions, 10s durations, and HDR. Mutually exclusive with start_frame / end_frame / loop. Only supported on model ray-3.2. For video-to-video keyframes use video.edit.keyframes on type=video_edit instead.

data?: string | null

Base64-encoded image or video data

generation_id?: string | null

UUID of a prior generation owned by the same caller. Used on source for image_edit, video_edit, and video_reframe chaining and on video.start_frame / video.end_frame for video extension.

format: uuid

media_type?: string | null

MIME type (for example, image/jpeg or video/mp4). Required with data. Required with source.url on video_edit/video_reframe so the route can dispatch video ingest before fetching bytes; optional for image URLs.

url?: string | null

Publicly accessible image URL, or a video URL when used as source for video_edit/video_reframe with media_type=video/*.

loop?: boolean | null

Generate a seamlessly looping video. Only valid for type=video; not supported with duration=10s or hdr=true.

resolution?: VideoResolution | null

Ray 3.2 video output resolution. 360p is the draft tier (fast, low-cost previews), accepted on type=video, video_edit, and video_reframe; on type=video it is SDR-only (not valid with hdr=true). 1080p is public for video generation; video_reframe 1080p is still rolling out and may return a coming-soon validation error until enabled for the caller.

One of the following:

"360p"

"540p"

"720p"

"1080p"

source_position?: SourcePosition { h_norm, w_norm, x_norm, y_norm } | null

Normalized source rectangle inside the output canvas for video_reframe. Omit to let the model choose the default centered-fit crop.

h_norm: number

Source rectangle height, as a fraction of canvas height. Up to 2.0 so the source can bleed off-canvas.

format: float

exclusiveMinimum: 0

maximum: 2

w_norm: number

Source rectangle width, as a fraction of canvas width. Up to 2.0 so the source can bleed off-canvas.

format: float

exclusiveMinimum: 0

maximum: 2

x_norm: number

Left edge of the source rectangle, as a fraction of canvas width. May be negative when the source extends off-canvas.

format: float

minimum: -2

maximum: 2

y_norm: number

Top edge of the source rectangle, as a fraction of canvas height. May be negative when the source extends off-canvas.

format: float

minimum: -2

maximum: 2

start_frame?: ImageRef { data, generation_id, media_type, url } | null

Media reference for guided generation. Provide exactly one of url, inline base64 data, or generation_id. URL and data references accept image media at image positions; video_edit and video_reframe sources also accept source.url or source.data when source.media_type is a video/* MIME type. A generation_id chains image_edit off a prior image output and video_edit/video_reframe off a prior video output; on video.start_frame/end_frame it references a prior generation for video extension.

data?: string | null

Base64-encoded image or video data

generation_id?: string | null

UUID of a prior generation owned by the same caller. Used on source for image_edit, video_edit, and video_reframe chaining and on video.start_frame / video.end_frame for video extension.

format: uuid

media_type?: string | null

MIME type (for example, image/jpeg or video/mp4). Required with data. Required with source.url on video_edit/video_reframe so the route can dispatch video ingest before fetching bytes; optional for image URLs.

url?: string | null

Publicly accessible image URL, or a video URL when used as source for video_edit/video_reframe with media_type=video/*.
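Several VideoOptions fields constrain one another (exr_export and hdr, loop and duration, keyframes and start_frame/end_frame, the 24 fps keyframe grid). The sketch below collects those documented rules into one client-side check. The function `validate_video_options` and its message strings are hypothetical, and the grid check only runs when duration is set explicitly because this reference does not state the default duration.

```python
from typing import Any, Dict, List

FPS = 24  # keyframe_indexes are documented on a duration x 24 fps grid
DURATION_SECONDS = {"5s": 5, "10s": 10}
MAX_ANCHORS = 64


def validate_video_options(gen_type: str, video: Dict[str, Any]) -> List[str]:
    """Check documented cross-field rules on a VideoOptions-shaped dict."""
    problems: List[str] = []
    hdr = video.get("hdr") is True
    loop = video.get("loop") is True

    if video.get("exr_export") and not hdr:
        problems.append("exr_export requires hdr=true")
    if hdr and gen_type == "video_reframe":
        problems.append("hdr is not supported for video_reframe")
    if hdr and gen_type == "video" and video.get("resolution") == "360p":
        problems.append("360p is SDR-only on type=video")
    if loop and gen_type != "video":
        problems.append("loop is only valid for type=video")
    if loop and (video.get("duration") == "10s" or hdr):
        problems.append("loop is not supported with duration=10s or hdr=true")
    if video.get("edit") is not None and gen_type != "video_edit":
        problems.append("edit is only valid when type is video_edit")

    keyframes = video.get("keyframes")
    if keyframes is not None:
        indexes = video.get("keyframe_indexes") or []
        if gen_type != "video":
            problems.append("keyframes is only valid for type=video")
        if not 1 <= len(keyframes) <= MAX_ANCHORS:
            problems.append("keyframes must contain 1 to 64 anchors")
        if len(indexes) != len(keyframes):
            problems.append("keyframe_indexes must match keyframes in length")
        if len(set(indexes)) != len(indexes):
            problems.append("keyframe_indexes must be unique")
        if loop or video.get("start_frame") is not None or video.get("end_frame") is not None:
            problems.append("keyframes is mutually exclusive with start_frame / end_frame / loop")
        seconds = DURATION_SECONDS.get(video.get("duration"))
        if seconds is not None:
            last = seconds * FPS  # 5s -> 120, 10s -> 240
            if any(not 0 <= i <= last for i in indexes):
                problems.append(f"keyframe_indexes must lie in 0..{last}")
    return problems


video = {
    "duration": "10s",
    "hdr": True,
    "resolution": "720p",
    "keyframes": [
        {"url": "https://example.com/first.png"},  # placeholder URLs
        {"url": "https://example.com/last.png"},
    ],
    "keyframe_indexes": [0, 240],
}
print(validate_video_options("video", video))  # prints []
```

Returning a list of messages rather than raising on the first problem is a design choice made here so that a caller can report every conflict at once.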

VideoResolution = "360p" | "540p" | "720p" | "1080p"

Ray 3.2 video output resolution. 360p is the draft tier (fast, low-cost previews), accepted on type=video, video_edit, and video_reframe; on type=video it is SDR-only (not valid with hdr=true). 1080p is public for video generation; video_reframe 1080p is still rolling out and may return a coming-soon validation error until enabled for the caller.

One of the following:

"360p"

"540p"

"720p"

"1080p"

