On Android, the per-app memory limit applies to the Java VM heap, not to memory allocated by native code. Therefore, if you run the LLMs from C++ code, much more memory can be utilized than the Java heap limit would allow.
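To see the limit in question, you can ask the runtime how large the Java heap is allowed to grow. The snippet below is a minimal sketch using the standard `Runtime.getRuntime().maxMemory()` call. The class name `HeapLimit` and the helper `maxHeapBytes` are made up for illustration, and the value you get depends on the device and its configuration.

```java
public class HeapLimit {
    // Upper bound, in bytes, that the Java VM will try to use for its heap.
    // Memory allocated by native (C++) code is not counted against this value.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long mib = maxHeapBytes() / (1024 * 1024);
        System.out.println("Max Java heap: " + mib + " MiB");
    }
}
```

If the reported number is far smaller than the model you want to load, that suggests the weights should be allocated on the native side rather than in Java objects. Native allocations are still bounded by the physical RAM available on the device.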