If you need to compute something per key in parallel on different machines and also compute something across all keys (such as the average of the counts, or the n largest and n smallest values), then direct the stream of computed counts, as key-value pairs, into another topic. That topic is handled by a single JVM instance running a second stream, which computes the statistics over the counts. Streams are meant for per-key computation, where the key carries some business meaning. If you need a two-level computation, you need a two-level topology.
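
To make the two levels concrete, here is a minimal sketch in plain Java. It only simulates the idea and does not use any streaming library: ordinary lists stand in for topics, and every name in it (`TwoLevelTopology`, `partitionByKey`, `countPerKey`, `latestCounts`, `averageCount`, `topN`) is hypothetical. Level 1 counts per key on several independent workers. Level 2 is a single consumer that reads all the count updates and computes statistics across keys.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TwoLevelTopology {

    // Routing: the same key always lands on the same worker, so each
    // worker owns the full count for the keys it receives.
    static List<List<String>> partitionByKey(List<String> events, int workers) {
        List<List<String>> parts = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            parts.add(new ArrayList<>());
        }
        for (String key : events) {
            parts.get(Math.floorMod(key.hashCode(), workers)).add(key);
        }
        return parts;
    }

    // Level 1 (runs once per worker): count events per key and emit a
    // (key, runningCount) update for every event seen.
    static List<Map.Entry<String, Long>> countPerKey(List<String> partition) {
        Map<String, Long> counts = new HashMap<>();
        List<Map.Entry<String, Long>> updates = new ArrayList<>();
        for (String key : partition) {
            long current = counts.merge(key, 1L, Long::sum);
            updates.add(Map.entry(key, current));
        }
        return updates;
    }

    // Level 2 (single instance): keep only the latest count per key.
    static Map<String, Long> latestCounts(List<Map.Entry<String, Long>> intermediateTopic) {
        Map<String, Long> latest = new HashMap<>();
        for (Map.Entry<String, Long> update : intermediateTopic) {
            latest.put(update.getKey(), update.getValue());
        }
        return latest;
    }

    // Statistic across all keys: average of the per-key counts.
    static double averageCount(Map<String, Long> latest) {
        return latest.values().stream().mapToLong(Long::longValue).average().orElse(0.0);
    }

    // Statistic across all keys: the n largest counts, in descending order.
    static List<Long> topN(Map<String, Long> latest, int n) {
        return latest.values().stream()
                .sorted(Comparator.reverseOrder())
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> events = List.of("a", "b", "a", "c", "a", "b");

        // The intermediate "topic" collects the output of every level-1 worker.
        List<Map.Entry<String, Long>> intermediateTopic = new ArrayList<>();
        for (List<String> partition : partitionByKey(events, 2)) {
            intermediateTopic.addAll(countPerKey(partition));
        }

        Map<String, Long> latest = latestCounts(intermediateTopic);
        System.out.println("average count = " + averageCount(latest));
        System.out.println("top 2 counts  = " + topN(latest, 2));
    }
}
```

The second level is deliberately not parallel: a statistic over all keys needs to see every count, so it runs in one place, while the per-key work in the first level can be spread over as many workers as you like.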