While there are other ways to check for duplicates, such as comparing `df.height` to `df.unique(subset).height`, your implementation is better suited to an "early exit": the height comparison always has to process the whole dataset, whereas yours can finish much faster if a duplicate is found early.
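To make the trade-off concrete, here is a minimal plain-Python sketch of the two strategies. It is not your code and it does not use the DataFrame API. The function names `has_duplicates_full` and `has_duplicates_early_exit` are hypothetical and chosen only for illustration, and the keys stand in for the values of the `subset` columns.

```python
from typing import Hashable, Iterable


def has_duplicates_full(keys: Iterable[Hashable]) -> bool:
    """Count comparison: analogous to comparing the height before and after
    deduplication. Every key is processed, even if the first two are equal."""
    items = list(keys)
    return len(items) != len(set(items))


def has_duplicates_early_exit(keys: Iterable[Hashable]) -> bool:
    """Early exit: stop as soon as a key is seen for the second time."""
    seen = set()
    for key in keys:
        if key in seen:
            return True  # duplicate found, the remaining keys are never read
        seen.add(key)
    return False


if __name__ == "__main__":
    data = ["a", "a", "b", "c", "d"]
    print(has_duplicates_full(data))        # reads all five keys
    print(has_duplicates_early_exit(data))  # stops after the second key
```

Both functions return the same answer. The difference is only in how much of the input they have to read when a duplicate exists, which is the advantage described above. When there are no duplicates, both end up scanning everything.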
I think your code is already at a good point.