I think the previous answer is partly incorrect. The translation vector gives the marker's coordinates in the camera's coordinate system, so the distance from the camera to the ArUco marker is not just the z coordinate of tvec; it is the Euclidean norm of tvec:

import cv2
import numpy as np

img = cv2.imread('img.png')  # replace with your path to image
# Replace with your camera matrix
camera_matrix = np.array([
    [580.77518, 0.0, 724.75002], 
    [0.0, 580.77518, 570.98956], 
    [0.0, 0.0, 1.0]
])
# Replace with your distortion coefficients
dist_coeffs = np.array([
    0.927077, 0.141438, 0.000196, -8.7e-05, 
    0.001695, 1.257216, 0.354688, 0.015954
])
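# Replace with your ArUco dictionary.
# Note: Dictionary_get and DetectorParameters_create belong to the legacy
# aruco API (OpenCV before 4.7).
dictionary = cv2.aruco.Dictionary_get(cv2.aruco.DICT_4X4_50)
parameters = cv2.aruco.DetectorParameters_create()
marker_size = 0.8  # marker side length, in the units you want the distance in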

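corners, ids, _ = cv2.aruco.detectMarkers(
    img, dictionary, parameters=parameters
)
if ids is None:
    raise RuntimeError('No ArUco markers detected in the image')
# rvec and tvec hold one pose per detected marker, shape (N, 1, 3)
rvec, tvec, _ = cv2.aruco.estimatePoseSingleMarkers(
    corners, marker_size, camera_matrix, dist_coeffs
)
# Distance to the first detected marker, in the same units as marker_size
distance = np.linalg.norm(tvec[0][0])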

Also, as @Simon mentioned, you need to calibrate your camera first to get the camera matrix and distortion coefficients.
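
To see why the norm and the z coordinate differ, here is a small sketch in plain Python that needs no OpenCV. The tvec values are made up for illustration, and `marker_distance` is a hypothetical helper name, not part of any library:

```python
import math


def marker_distance(tvec):
    """Euclidean norm of a 3-element translation vector (x, y, z)."""
    x, y, z = tvec
    return math.sqrt(x * x + y * y + z * z)


# Hypothetical tvec: marker 0.3 to the right, 0.4 down, 1.2 in front of the camera
tvec = (0.3, 0.4, 1.2)

depth = tvec[2]                   # z only: distance along the optical axis
distance = marker_distance(tvec)  # straight-line distance to the marker

print(f"z coordinate: {depth:.3f}, euclidean distance: {distance:.3f}")
```

The two values only agree when the marker sits exactly on the optical axis (x = y = 0). The further the marker is from the image center, the more the z coordinate underestimates the real distance.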
