Vertex AI requires two endpoints from your container: a health endpoint (e.g., /health) that returns 200 OK once the model is ready, and a prediction endpoint (e.g., /predict) that handles inference requests.
Add both routes to your FastAPI app: /health (returns 200 OK when ready) and /predict. Then update your gcloud ai models upload command with --container-health-route=/health --container-predict-route=/predict --container-ports=8080, and make sure the server inside the container actually listens on port 8080. Redeploy and check Cloud Logging for any remaining errors.