I recently had to count the number of unique subarrays in a 32-bit 2D integer array, so neither pandas.unique nor pandas.value_counts was an option for me.
I found the following solution to be about 3 times faster for an array with 10e7 items and a fairly non-uniform distribution:
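import collections
# Transposing and unpacking into zip yields one hashable tuple per row;
# Counter then tallies how often each distinct row occurs.
unique_counts = collections.Counter(zip(*sequences_array.T))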
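For anyone who wants to try this, here is a minimal self-contained sketch on a tiny made-up array. The sample data and the names n_unique, most_common_row and most_common_count are only for illustration. The np.unique(..., axis=0, return_counts=True) call at the end is a separate NumPy technique, included here only as a cross-check and not as part of the approach above.

```python
import collections

import numpy as np

# Hypothetical sample data: 5 rows of 3 columns, stored as 32-bit integers.
sequences_array = np.array(
    [[1, 2, 3],
     [4, 5, 6],
     [1, 2, 3],
     [7, 8, 9],
     [1, 2, 3]],
    dtype=np.int32,
)

# Each row becomes a tuple, which is hashable and can serve as a Counter key.
unique_counts = collections.Counter(zip(*sequences_array.T))

n_unique = len(unique_counts)  # number of distinct rows
most_common_row, most_common_count = unique_counts.most_common(1)[0]

# Cross-check with NumPy's own row-wise unique (a different technique).
rows, counts = np.unique(sequences_array, axis=0, return_counts=True)
```

The Counter keys are plain tuples, so a lookup such as unique_counts[(1, 2, 3)] works directly, and len(unique_counts) gives the number of unique rows.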
See also: How to efficiently convert 2d numpy array into 1d numpy array of tuples?