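If possible, first sort both datasets by visitor_id, then bucket them by that same column (visitor_id) into the same number of buckets (1024). When both datasets are sorted by the same column and bucketed into the same number of buckets, rows with the same visitor_id end up in matching buckets, so a join on that column can generally proceed bucket by bucket without a full shuffle.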
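
To make the idea concrete, here is a minimal sketch in plain Python. It is not the API of any particular engine. It assumes integer visitor_id values, uses a simple modulo as the bucket function, and the helper names (`bucket_and_sort`, `merge_join`, `bucketed_join`) are made up for illustration. It only shows why the same key and the same bucket count on both sides let each pair of matching buckets be joined independently.

```python
from collections import defaultdict

NUM_BUCKETS = 1024  # same bucket count on both sides, as suggested above


def bucket_and_sort(rows, num_buckets=NUM_BUCKETS):
    """Group (visitor_id, value) rows into buckets, each sorted by visitor_id."""
    buckets = defaultdict(list)
    for visitor_id, value in rows:
        # Assumption: integer ids, so modulo serves as the bucket function.
        buckets[visitor_id % num_buckets].append((visitor_id, value))
    for bucket in buckets.values():
        bucket.sort(key=lambda row: row[0])
    return buckets


def merge_join(left, right):
    """Sort-merge join of two lists that are already sorted by their key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == left_key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Emit every pairing of the two runs.
            for _, left_value in left[i:i_end]:
                for _, right_value in right[j:j_end]:
                    out.append((left_key, left_value, right_value))
            i, j = i_end, j_end
    return out


def bucketed_join(left_rows, right_rows, num_buckets=NUM_BUCKETS):
    """Join two datasets on visitor_id, one pair of matching buckets at a time."""
    left = bucket_and_sort(left_rows, num_buckets)
    right = bucket_and_sort(right_rows, num_buckets)
    result = []
    # Only buckets present on both sides can produce matches.
    for bucket_id in sorted(left.keys() & right.keys()):
        result.extend(merge_join(left[bucket_id], right[bucket_id]))
    return result
```

Because both sides use the same key and the same number of buckets, bucket N on one side only ever needs to be compared with bucket N on the other. If the bucket counts differed, a given visitor_id could land in different bucket numbers on each side and this shortcut would no longer hold.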