While the existing answers clearly explain how to update the intrinsic matrix for standard scaling/cropping operations, I want to add a related perspective: how to construct the intrinsics for a rendering projection matrix when cropping, padding, and scaling are applied, so that standard rendering pipelines correctly project 3D objects onto the edited image.
When an image is cropped, padded, or scaled, the camera projection matrix must be adjusted accordingly, or 3D objects will no longer be rendered at the correct position and size in the modified image.
Pixel space vs. Homogeneous space
Pixel space (CV): Uses actual pixel coordinates, e.g., a 1920×1080 image has x ∈ [0,1920] and y ∈ [0,1080].
Homogeneous space (Graphics): Normalized coordinates where x ∈ [-1,1] and y ∈ [-1,1], regardless of image size.
This distinction affects how image augmentations influence projections. For example, adding padding on the right:
In pixel space, left-side pixels do not move.
In homogeneous space, the entire x-axis is compressed because the total width increased.
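A minimal numeric sketch of this asymmetry (the image sizes and the pixel chosen here are illustrative): pad a 1920-wide image by 480 px on the right and map the same pixel to normalized coordinates before and after.

```python
# Hypothetical example: a 1920x1080 image padded by 480 px on the right
# (new width 2400). A fixed pixel keeps its pixel coordinate, but its
# normalized device coordinate (NDC, x in [-1, 1]) changes.

def pixel_to_ndc_x(x_pixel, width):
    """Map a pixel x-coordinate to NDC x in [-1, 1]."""
    return 2.0 * x_pixel / width - 1.0

x = 960.0                          # center of the original 1920-wide image
before = pixel_to_ndc_x(x, 1920)   # 0.0 (dead center)
after = pixel_to_ndc_x(x, 2400)    # -0.2 (shifted left: x-axis compressed)
print(before, after)
```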
A camera intrinsic matrix contains four main parameters:
fx, fy: focal lengths along x and y axes
cx, cy: principal point offsets along x and y axes
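For reference, here is how these four parameters form the standard 3x3 pinhole intrinsic matrix and project a camera-space point to pixels (the numeric values are illustrative, not from the post):

```python
import numpy as np

def make_intrinsics(fx, fy, cx, cy):
    """Standard 3x3 pinhole intrinsic matrix (zero skew)."""
    return np.array([[fx, 0.0, cx],
                     [0.0, fy, cy],
                     [0.0, 0.0, 1.0]])

K = make_intrinsics(fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)

# Project a camera-space point (X, Y, Z) to pixel coordinates:
p = K @ np.array([0.5, -0.25, 2.0])
u, v = p[:2] / p[2]     # u = fx*X/Z + cx, v = fy*Y/Z + cy
print(u, v)             # 1210.0 415.0
```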
Translation (cx, cy)
Only cropping/padding on the left/top edges shifts the principal point; edits on the right/bottom edges leave it unchanged.
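This rule can be sketched as a one-liner (the function name and signed-offset convention are my own, for illustration):

```python
# Sketch: shifting the principal point in pixel space.
# Offsets are signed: positive = pixels padded on that edge,
# negative = pixels cropped from that edge.
def shift_principal_point(cx, cy, pad_left, pad_top):
    # Only left/top edits move (cx, cy); right/bottom edits do not appear here.
    return cx + pad_left, cy + pad_top

# Crop 100 px from the left and pad 50 px on the top:
cx_new, cy_new = shift_principal_point(960.0, 540.0, pad_left=-100, pad_top=50)
print(cx_new, cy_new)   # 860.0 590.0
```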
Scaling (fx, fy)
In CV pixel space: only scaling changes fx/fy.
In homogeneous space: crop and padding also affect fx/fy, because padding changes the image aspect ratio, which changes the mapping to normalized [-1,1] coordinates.
Pixel space rules:
Cropping/padding:
cx, cy decrease by the number of pixels cropped from the left/top (and increase by the number of pixels padded there)
Cropping/padding on the right/bottom has no effect
fx, fy remain unchanged
Scaling:
fx, fy multiplied by scale s
cx, cy multiplied by scale s
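The pixel-space rules above can be sketched as one update to the intrinsic matrix (assuming a left/top crop followed by uniform scaling; the helper name and values are illustrative):

```python
import numpy as np

# Sketch of the pixel-space rules: crop/pad only shifts (cx, cy);
# uniform scaling by s multiplies fx, fy, cx, cy.
def adjust_intrinsics_pixel_space(K, crop_left=0, crop_top=0, scale=1.0):
    K = K.copy()
    K[0, 2] -= crop_left    # cx shifts by the left crop
    K[1, 2] -= crop_top     # cy shifts by the top crop
    K[:2] *= scale          # fx, fy, cx, cy all scale by s
    return K

K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
K2 = adjust_intrinsics_pixel_space(K, crop_left=100, scale=0.5)
print(K2[0, 0], K2[0, 2])   # 500.0 430.0
```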
Homogeneous space rules:
Cropping/padding changes image aspect ratio → requires extra scaling compensation
Compute compensation factors:
sx = s * (original_width / padded_width)
sy = s * (original_height / padded_height)
fx_new = fx * sx
fy_new = fy * sy
cx_new = cx * sx
cy_new = cy * sy
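Putting those four formulas into code (the parameter names mirror the formulas above; the numbers are an illustrative 1920x1080 image padded to 2400x1080):

```python
# Sketch of the homogeneous-space compensation above.
def compensate_homogeneous(fx, fy, cx, cy, s,
                           original_width, original_height,
                           padded_width, padded_height):
    sx = s * (original_width / padded_width)
    sy = s * (original_height / padded_height)
    return fx * sx, fy * sy, cx * sx, cy * sy

# 1920x1080 image padded on the right to 2400x1080, rendered at scale s = 1.0:
fx_new, fy_new, cx_new, cy_new = compensate_homogeneous(
    1000.0, 1000.0, 960.0, 540.0, 1.0, 1920, 1080, 2400, 1080)
print(fx_new, cx_new)   # 800.0 768.0
```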
Note: This compensation only adjusts for the normalized coordinate system and does not change physical camera parameters.
To compute FOV consistent with the original image:
fov_x = 2 * arctan((original_width * s) / (2 * fx_new))
fov_y = 2 * arctan((original_height * s) / (2 * fy_new))
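Continuing the padded-image example (fx_new = 800 from s = 1.0 and 1920 → 2400 padding; values illustrative), the FOV formula in code:

```python
import math

# Sketch of the FOV formula above: fov = 2 * arctan((size * s) / (2 * f_new)).
def fov_from_intrinsics(size, s, f_new):
    return 2.0 * math.atan((size * s) / (2.0 * f_new))

fov_x = fov_from_intrinsics(1920, 1.0, 800.0)
print(math.degrees(fov_x))
```

Note that the widened FOV here reflects the padded image covering a larger viewport than the original, which is exactly what the compensation is meant to capture.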
This ensures that rendering with crop, padding, and scaling produces objects at the correct location and scale, without relying on viewport adjustments.