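One step is missing: collect() returns a list of Row objects rather than plain values, so you still have to pull the column value out of each row: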
row_list = df.select('Column_header').collect()
result = [row['Column_header'] for row in row_list]
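
If you need this in more than one place, the same two steps can be wrapped in a small helper. This is only a sketch: the names column_to_list and demo, the sample data, and the local[1] session settings are made up for illustration and are not part of the original answer.

```python
def column_to_list(df, column_name):
    """Return the values of one DataFrame column as a plain Python list."""
    # collect() brings the selected column to the driver as a list of Row objects.
    rows = df.select(column_name).collect()
    # Each Row supports lookup by field name, so extract the single value.
    return [row[column_name] for row in rows]


def demo():
    """Hypothetical usage against a throwaway local Spark session."""
    # Imported here so the helper above does not itself depend on pyspark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("column-to-list")
        .getOrCreate()
    )
    # Made-up sample data with the column name used in the answer.
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "Column_header"])
    print(column_to_list(df, "Column_header"))
    spark.stop()
```

Call demo() to try it if you have pyspark installed. Keep in mind that collect() loads every selected row into the driver's memory, so this only makes sense for columns small enough to fit there.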