What is your overall publish throughput and what is the average message size? Pub/Sub can deliver up to 10 MB/s per stream, though load will be balanced across streams so you may not see any individual stream saturated if you have many open.
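As a rough back-of-the-envelope check, the sketch below converts a publish rate and an average message size into throughput and compares it with the 10 MB/s per-stream figure above. The function name and the sample numbers are illustrative assumptions, not values from your setup, and 1 MB is taken to mean 10^6 bytes.

```python
import math

# 10 MB/s per stream, assuming 1 MB = 10^6 bytes (an assumption for this sketch).
PER_STREAM_LIMIT_BYTES_PER_SEC = 10 * 1000 * 1000


def min_streams_needed(messages_per_sec, avg_message_bytes):
    """Lower bound on the number of streams needed to carry the publish throughput."""
    throughput = messages_per_sec * avg_message_bytes  # bytes per second
    return max(1, math.ceil(throughput / PER_STREAM_LIMIT_BYTES_PER_SEC))


# Hypothetical workload: 20,000 messages/s at 700 bytes each is 14 MB/s.
print(min_streams_needed(20_000, 700))
```

This is only a lower bound: because load is balanced across streams, having more streams open than this does not mean any one of them is saturated.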
What do you see for the subscription/oldest_unacked_message_age and subscription/num_undelivered_messages metrics? If you don't see a backlog for the latter, then your subscribers are generally keeping up with the publish throughput.
You have also configured maxOutstandingElementCount to 5000 and maxOutstandingByteCount to 5000 * 700: are your clients hitting these limits and getting flow controlled? You can check whether your streams are flow controlled with the subscription/open_streaming_pulls metric.
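To see which of the two limits is likely to be reached first, here is a minimal sketch that uses the configured values from the question. It is a simplified model of flow control, not the client library's actual logic, and the helper name binding_limit is hypothetical.

```python
# Values taken from the configuration discussed above.
MAX_OUTSTANDING_ELEMENT_COUNT = 5000
MAX_OUTSTANDING_BYTE_COUNT = 5000 * 700  # 3,500,000 bytes


def binding_limit(avg_message_bytes):
    """Return (limit_name, max_outstanding_messages) for a given average message size.

    Simplified model: assumes every message has exactly the average size.
    """
    fit_by_bytes = MAX_OUTSTANDING_BYTE_COUNT // avg_message_bytes
    if fit_by_bytes < MAX_OUTSTANDING_ELEMENT_COUNT:
        return ("bytes", fit_by_bytes)
    return ("elements", MAX_OUTSTANDING_ELEMENT_COUNT)


print(binding_limit(700))   # both limits coincide at 5000 messages
print(binding_limit(1400))  # the byte limit is reached first
```

Under this model, if your average message is larger than 700 bytes, the byte limit is hit before the element count, so fewer than 5000 messages can be outstanding at once.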
> but see many modifyAckDeadline requests from the GCP Pub/Sub metrics and graphs which doesn't make sense to me
Pub/Sub client libraries send ModifyAckDeadline requests upon receipt of messages, as well as periodically for unacked messages to extend their leases up to the "Maximum acknowledgment extension period", so it would be expected to see ModifyAckDeadline requests even if you are acknowledging quickly.
This page has tips on monitoring and debugging subscription health, and the subscription/delivery_latency_health_score metric can help you more easily identify factors contributing to increased delivery latency. If the metric does not indicate any issues with your subscription, you can create a support case so that someone can look at the subscription from the backend perspective.