sklearn does its downloads through urllib, so if you configure the proxy the way urllib expects, you solve the problem without having to download the docs separately.
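As a rough sketch of what "the urllib way" can look like: install a global opener with a `ProxyHandler` before calling sklearn. The proxy address and the `install_proxy` helper name below are placeholders I made up for illustration, so substitute your own values.

```python
import urllib.request

# Hypothetical proxy address (an assumption, not from the question); replace it.
PROXY_URL = "http://proxy.example.com:8080"


def install_proxy(proxy_url):
    """Send later urllib.request.urlopen/urlretrieve calls through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(handler)
    # install_opener makes this the process-wide default opener.
    urllib.request.install_opener(opener)
    return opener


opener = install_proxy(PROXY_URL)

# Any sklearn download started after this point in the same process
# should go through the proxy, since it relies on urllib.request.
```

If you would rather not touch the code, urllib also picks up the `http_proxy` and `https_proxy` environment variables by default, so exporting those before starting Python should have the same effect.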