I solved the problem above by changing the runner from DirectRunner to DataflowRunner. It seems that DirectRunner, which is meant for local testing, does not fully support all of the to_dataframe functionality when streaming is set to True (in the code above I am streaming data from a Pub/Sub topic).
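For anyone wanting to try the same switch, here is a minimal sketch of the pipeline flags involved. The helper names (dataflow_streaming_args, build_options) and the project, region, and bucket values are placeholders I made up for illustration, not taken from the original code; the flags themselves are the standard Beam options for selecting a runner and enabling streaming.

```python
def dataflow_streaming_args(project, region, temp_location):
    """Build Beam pipeline flags that select DataflowRunner in streaming mode.

    All three arguments are placeholders for your own GCP settings.
    """
    return [
        "--runner=DataflowRunner",  # previously --runner=DirectRunner
        "--streaming",              # same as setting streaming to True
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
    ]


def build_options(args):
    """Wrap the flags in PipelineOptions (requires apache-beam[gcp])."""
    # Imported lazily so the flag helper above works without Beam installed.
    from apache_beam.options.pipeline_options import PipelineOptions

    return PipelineOptions(args)


if __name__ == "__main__":
    # Hypothetical values; replace them with your own project settings.
    args = dataflow_streaming_args(
        "my-project", "us-central1", "gs://my-bucket/tmp"
    )
    print(args)
```

The object returned by build_options(args) would then be passed as beam.Pipeline(options=...) in place of the DirectRunner options. Running on Dataflow also needs GCP credentials and a writable temp bucket, which this sketch does not set up.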