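According to the Java documentation, distinct() is a stateful intermediate operation: it has to keep track of the elements it has already seen, and on an ordered parallel stream it must also preserve encounter order, which acts as a full barrier with substantial buffering overhead. For parallel streams, filter(seen::add) backed by a ConcurrentHashMap.newKeySet() can be faster because filter itself is a stateless operation and needs no such barrier. Keep in mind that the predicate seen::add is stateful, so this only works with a thread-safe set, and on a parallel stream it does not guarantee which of several equal elements is kept.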
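
To make the comparison concrete, here is a minimal sketch of both variants. The class name DistinctDemo, the method names, and the sample data are made up for illustration; only distinct(), filter(), and ConcurrentHashMap.newKeySet() come from the JDK.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class, not taken from the original answer.
final class DistinctDemo {

    // Variant 1: the built-in stateful operation.
    // On an ordered parallel stream it keeps the first occurrence of each element.
    static List<Integer> withDistinct(List<Integer> input) {
        return input.parallelStream()
                .distinct()
                .collect(Collectors.toList());
    }

    // Variant 2: a filter step whose predicate records elements in a concurrent set.
    // Set.add returns true only the first time a value is inserted, so each value
    // passes the filter exactly once. Which duplicate survives is not deterministic
    // in parallel, and the set rejects null elements with a NullPointerException.
    static List<Integer> withConcurrentSet(List<Integer> input) {
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        return input.parallelStream()
                .filter(seen::add)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 100,000 elements containing only 1,000 distinct values.
        List<Integer> input = IntStream.range(0, 100_000)
                .map(i -> i % 1_000)
                .boxed()
                .collect(Collectors.toList());

        System.out.println(withDistinct(input).size());      // 1000
        System.out.println(withConcurrentSet(input).size()); // 1000
    }
}
```

Both methods return the same set of values, but only withDistinct guarantees the order of the result. If order does not matter, calling unordered() before distinct() is another option worth measuring, since it removes the ordering constraint without introducing a stateful predicate.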