For most modern laptop/desktop CPUs this seems like a good situation to use one of the "population count" instructions.
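I would loop over the block one word at a time, take the "popcount" of each word, and AND the results together.
A word of 8, 16, 32 or 64 bits that is all ones has a popcount equal to its width, and the binary representation of that number is 1 set bit and all other 0s. No smaller popcount has that bit set, so the AND of all the popcounts equals the word width iff the memory block consists of only 1 bits. OR:ing the popcounts is not enough: an all-zero word contributes 0 and goes unnoticed, and words with, say, exactly 32 of 64 bits set would also OR together to a value with a single set bit.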
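A minimal sketch of such a loop in C, assuming 64-bit words and the GCC/Clang builtin `__builtin_popcountll` (not standard C); the function name `all_bits_set` is made up for illustration. Note that it combines the popcounts with AND rather than OR, so that an all-zero word cannot slip through: the result is 64 only when every word had all 64 bits set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every bit in the n_words 64-bit words at block is set, else 0.
   An empty block is treated as "all ones" (an assumption of this sketch). */
static int all_bits_set(const uint64_t *block, size_t n_words)
{
    assert(block != NULL || n_words == 0);

    /* Start from the word width: 64 is 0b1000000, a single set bit. */
    unsigned acc = 64u;

    for (size_t i = 0; i < n_words; ++i) {
        /* __builtin_popcountll is a GCC/Clang builtin. A popcount of
           0..63 has bit 6 clear, so it clears acc for good. */
        acc &= (unsigned)__builtin_popcountll(block[i]);
    }
    return acc == 64u;
}
```

On x86 with GCC or Clang, the builtin should compile to the hardware instruction when the target supports it (for example with `-mpopcnt`); otherwise the compiler falls back to a software routine.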