I think this would be simpler and more efficient:
df['historical_rank_new'] = df['historical_rank'].str.extract(r'(\d{4})', expand=False)
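
To see what that line does, here is a small sketch on made-up data. The sample values are only an assumption, since the real contents of `historical_rank` aren't shown here. The raw string (`r'...'`) keeps Python from treating `\d` as a string escape, and `expand=False` makes `str.extract` return a Series rather than a one-column DataFrame.

```python
import pandas as pd

# Hypothetical sample values; the actual column contents are an assumption.
df = pd.DataFrame({"historical_rank": ["Rank 12 (1998)", "2005: 3rd", "n/a"]})

# Capture the first run of four consecutive digits in each string.
# Rows without such a run get a missing value.
df["historical_rank_new"] = df["historical_rank"].str.extract(r"(\d{4})", expand=False)

print(df)
```

Note that the extracted values are still strings, and that `\d{4}` will also match the first four digits of a longer number (for example `1234` out of `123456`), so a stricter pattern may be needed if the column can contain those.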