I tried to reproduce the high latency you reported with my own Cloud Run service. Here is my very simple service code:
    const express = require('express');
    const {PubSub} = require('@google-cloud/pubsub');

    const app = express();
    const pubsub = new PubSub();
    const topic_name = // ... my topic name
    const topic = pubsub.topic(topic_name);

    app.post('/', (req, res) => {
      const data = JSON.stringify(Math.random());
      const message = {
        data: Buffer.from(data),
      };
      console.log('Publish %s start', data);
      topic.publishMessage(message)
        .then(() => {
          console.log('Publish %s done', data);
          res.status(200).end();
        })
        .catch(e => {
          console.log('Publish %s failed', data);
          res.status(500).end();
        });
    });

    const port = parseInt(process.env.PORT) || 8080;
    app.listen(port, () => {
      console.log(`cloud-run-service: listening on port ${port}`);
    });
What I observe is that the first request to the Cloud Run service incurs high latency, but subsequent publishes are faster. For the first request, the time between the "Publish ... start" and "Publish ... done" logs is ~400ms. When I continue to POST to my Cloud Run service at a very slow rate (1 request per minute), the subsequent publishes all complete much faster (~50ms).
This is still very low throughput for Pub/Sub and the advice from [1] still applies:
> Pub/Sub is designed for low-latency, high-throughput delivery. If the topic has low throughput, the resources associated with the topic could take longer to initialize.
But the publish latency for subsequent publish requests is much better than the "Cold Start" latency for the Cloud Run Instance / Publisher object.
With regards to your question:
> I have read that pubsub performs poorly under low throughput, but is there a way to make it behave better?
Pub/Sub is optimized for high throughput, but even at the very low QPS of my test (1 request per minute), publishes completed in ~50ms.
You can get lower latencies by publishing consistently, but it is a latency/cost tradeoff. If you consistently publish "heartbeat" messages to your topic to keep the Cloud Run Instance and Pub/Sub resources "warm", you will get lower single request latencies when you send a real publish request.
You can do this without having to handle those additional meaningless "heartbeat" messages at your subscriber by using filters with your subscription [2]. If you publish messages with an attribute indicating it is a "heartbeat" message, you can create a subscription whose filter drops those messages before they reach your subscriber. Your single-request publish latency from your Cloud Run service should then be consistently lower, but you would have to pay for the extra publish traffic and the filtered-out "heartbeat" messages [3].
[1] https://cloud.google.com/pubsub/docs/topic-troubleshooting#high_publish_latency
[2] https://cloud.google.com/pubsub/docs/subscription-message-filter
[3] https://cloud.google.com/pubsub/pricing#pubsub