Thanks to everyone who made suggestions, especially MTO and no-comment.
In the end the following code worked:
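
    def substring_sieve(data):
        # Give every path exactly one trailing '/' before sorting, so a
        # parent always sorts directly ahead of its own children.
        paths = sorted(value.rstrip('/') + '/' for value in data)
        if not paths:
            return []
        prev, *remaining = paths
        output = [prev]
        for value in remaining:
            # Skip duplicates and anything nested under the last kept path.
            if not value.startswith(prev):
                output.append(value)
                prev = value
        return output
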
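This handles two edge cases: a duplicate entry in the input list, and a path that is a string prefix of another without being its parent directory, for example ['/home/greatlon/test_site2', '/home/greatlon/test_site']. Comparing with a trailing '/' keeps both of those paths, since '/home/greatlon/test_site2/' does not start with '/home/greatlon/test_site/'.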
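For anyone who would rather avoid the string-prefix pitfall altogether, here is a rough sketch that compares path components instead of raw strings. It is an alternative, not the code above: `sieve_by_parts` is a made-up name, and it assumes POSIX-style paths.

```python
from pathlib import PurePosixPath


def sieve_by_parts(data):
    # Hypothetical helper (not the function from this answer).
    # Split each path into a tuple of components; the set drops duplicates,
    # including ones that differ only by a trailing '/'.
    parts_list = sorted({PurePosixPath(p).parts for p in data})
    output = []
    prev = None
    for parts in parts_list:
        # Keep a path unless its leading components equal the last kept path.
        if prev is None or parts[:len(prev)] != prev:
            output.append(str(PurePosixPath(*parts)))
            prev = parts
    return output


paths = [
    '/home/greatlon/test_site2',
    '/home/greatlon/test_site',
    '/home/greatlon/test_site/',
    '/home/greatlon/test_site/wp-content',
]
print(sieve_by_parts(paths))
```

Because whole components are compared, 'test_site' can never swallow 'test_site2', so no trailing slash is needed. The returned paths come back without one.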
Thanks again all!