It is NOT slow; it only appears to be slow.
The CLI starts printing output word by word immediately after you hit Enter. In contrast, LangChain collects the entire output first, which takes 15-20 seconds depending on the length of the response, and then prints it all at once... boom. Even subprocess.run() behaves the same way, because it waits for the process to finish before returning its output.
Workaround:
import os
os.system('ollama run llama3.2:1b "what is water short answer"')

Then run the script from the terminal: python main.py
Here, you can see the output almost immediately, as a stream.
You can also save the output to a text file so it can be used later in your Python script:
os.system('ollama run llama3.2:1b "what is water short answer" > output.txt')
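To read the saved answer back into Python, the redirect-then-read pattern can be sketched as below. A placeholder echo command stands in for the ollama invocation (swap it back in when the CLI is installed); the file name output.txt and the echoed text are only illustrative:

```python
import os

# Placeholder for:
#   ollama run llama3.2:1b "what is water short answer" > output.txt
os.system('echo "water is H2O" > output.txt')

# Read the saved answer back into the script
with open('output.txt') as f:
    answer = f.read()
print(answer)
```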
To append to the text file instead of overwriting it:
os.system('ollama run llama3.2:1b "what is water short answer" >> output.txt')

I have posted this answer on GitHub as well.