I’m training with sparse feature vectors that have roughly 10 non-zero entries out of a possible 50 million features. The problem is that the sparse-to-dense conversion is exhausting the heap. Is there a way to disable it and keep the vectors sparse end to end?
Right now I can’t train on even a small batch of vectors without running out of memory, and I ultimately need to train on 200 million rows.
Any help would be greatly appreciated. I’m using XGBoost4j-Spark version 3.0.0 with the Java API.
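
To make the setup concrete, here is a stripped-down sketch of what I’m doing. The class name, the `local[2]` master, the two toy rows, and the `numRound`/`numWorkers` values are all placeholders for this post (my real data has ~10 non-zeros per row at 50M width); the estimator setters follow the XGBoost4j-Spark examples as I understand them and may differ slightly in 3.0.0:

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.ml.linalg.SQLDataTypes;
import org.apache.spark.ml.linalg.Vector;
import org.apache.spark.ml.linalg.Vectors;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

import ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier;

public class SparseRepro {  // hypothetical repro class, not my real pipeline
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("xgboost-sparse-repro")
                .master("local[2]")
                .getOrCreate();

        // Feature space is ~50M wide, but each row has only ~10 non-zero entries.
        int numFeatures = 50_000_000;
        Vector v1 = Vectors.sparse(numFeatures,
                new int[]{3, 101, 5_000_000},    // sorted non-zero indices
                new double[]{1.0, 0.5, 2.0});    // corresponding values
        Vector v2 = Vectors.sparse(numFeatures,
                new int[]{7, 42, 49_999_999},
                new double[]{1.0, 1.0, 3.0});

        List<Row> rows = Arrays.asList(
                RowFactory.create(1.0, v1),
                RowFactory.create(0.0, v2));
        StructType schema = new StructType(new StructField[]{
                new StructField("label", DataTypes.DoubleType, false, Metadata.empty()),
                new StructField("features", SQLDataTypes.VectorType(), false, Metadata.empty())
        });
        Dataset<Row> train = spark.createDataFrame(rows, schema);

        // Even a tiny batch like this blows the heap, presumably because the
        // 50M-wide vectors are densified somewhere between Spark and XGBoost.
        XGBoostClassifier xgb = new XGBoostClassifier()
                .setFeaturesCol("features")
                .setLabelCol("label")
                .setNumRound(10)
                .setNumWorkers(1);
        xgb.fit(train);

        spark.stop();
    }
}
```
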
Thanks!