Another option, if you're okay with scanning the array twice: first use vpmaxsd/vpminsd to find the maximum/minimum of the high 32 bits of each element, then make a second pass with a vpcmpeqd/vptest loop to pick out the elements whose high half matches and compare their low halves (as unsigned). Probably only a win if the array fits in L1.
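
To make the two-pass idea concrete, here is a minimal scalar sketch in C of the maximum case. It is not vectorized: each loop only models what the corresponding SIMD pass would compute, and the names `high_half`, `low_half`, and `max_i64_two_pass` are made up for illustration. It assumes signed 64-bit elements, so the high half is compared as signed (what vpmaxsd does) and the low half as unsigned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* High 32 bits, reinterpreted as signed: the part vpmaxsd would compare.
   The uint32_t -> int32_t conversion wraps on all mainstream compilers. */
static int32_t high_half(int64_t x) {
    return (int32_t)(uint32_t)((uint64_t)x >> 32);
}

/* Low 32 bits, always treated as unsigned. */
static uint32_t low_half(int64_t x) {
    return (uint32_t)(uint64_t)x;
}

/* Hypothetical helper: maximum of a non-empty int64 array in two passes. */
static int64_t max_i64_two_pass(const int64_t *a, size_t n) {
    assert(n > 0);

    /* Pass 1: signed maximum over the high halves only. */
    int32_t best_hi = high_half(a[0]);
    for (size_t i = 1; i < n; i++) {
        int32_t h = high_half(a[i]);
        if (h > best_hi) best_hi = h;
    }

    /* Pass 2: among elements whose high half equals best_hi (the
       vpcmpeqd/vptest filter), take the unsigned maximum low half.
       At least one element matches, so starting from 0 is safe. */
    uint32_t best_lo = 0;
    for (size_t i = 0; i < n; i++) {
        if (high_half(a[i]) == best_hi && low_half(a[i]) > best_lo) {
            best_lo = low_half(a[i]);
        }
    }

    /* Recombine: value = high * 2^32 + low, which always fits in int64_t. */
    return (int64_t)best_hi * INT64_C(4294967296) + (int64_t)best_lo;
}
```

The minimum is symmetric: take the signed minimum of the high halves in the first pass, then the unsigned minimum of the matching low halves, starting from `UINT32_MAX`. In a real AVX2 version the second pass would presumably skip most vectors quickly, since vptest only needs to report whether any lane matched.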