You need to use spherical k-means. Yes, minimizing the Euclidean distance between two L2-normalized vectors is the same as minimizing their cosine distance (for unit vectors, ||a - b||^2 = 2 - 2 cos(a, b)), but you also have to L2-normalize the centroids after every update step, and sklearn's KMeans doesn't give you access to that step.
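
To make the difference concrete, here is a minimal NumPy sketch of spherical k-means. It is not an sklearn API; the function name `spherical_kmeans` and its parameters are my own for illustration, and it assumes every row of `X` is non-zero. The only real change from ordinary k-means is that each centroid is projected back onto the unit sphere after it is recomputed.

```python
import numpy as np

def spherical_kmeans(X, n_clusters, n_iter=100, seed=0):
    """Illustrative spherical k-means (hypothetical helper, not part of sklearn)."""
    rng = np.random.default_rng(seed)
    # L2-normalize the data; assumes no all-zero rows
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    # Initialize centroids from randomly chosen data points (already unit length)
    centroids = X[rng.choice(len(X), size=n_clusters, replace=False)]
    labels = None
    for _ in range(n_iter):
        # For unit vectors the dot product is the cosine similarity
        sims = X @ centroids.T
        new_labels = sims.argmax(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members) == 0:
                continue  # empty cluster: keep the previous centroid
            total = members.sum(axis=0)
            norm = np.linalg.norm(total)
            if norm > 0:
                # The step plain k-means skips: re-normalize the centroid
                centroids[k] = total / norm
    return labels, centroids
```

Calling `labels, centroids = spherical_kmeans(X, n_clusters=5)` returns the cluster assignments and unit-length centroids. Random initialization from data points is just the simplest choice here; a smarter seeding scheme would likely converge better on real data.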