I have been using conda for the last ~5 years, building many conda envs that were used by thousands of folks from many different teams. So far, from the discussion, I am not seeing a compelling need to use poetry in conjunction with conda. From what I can read and gather, poetry is solving the issues with pip rather than with conda. But let me summarize the historical timeline and highlight the key differences between conda and pip:
- First there is pip for package management. By itself it has no concept of separate environments, so if you want to run one version of torch for one app and another version of torch for another app on the same machine, it becomes a headache.
- Then comes conda, from Anaconda Inc. It offers a novel solution to the mix-and-match package requirement problem by introducing the concept of a conda env. Each conda env maps to a different physical directory, so env1 can have a different torch version from env2.
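  A minimal sketch of what that looks like (env names and versions are illustrative, not from the discussion above):

  ```
  # two isolated envs, each pinned to its own pytorch version
  conda create -n app1 python=3.11 pytorch=2.4 -c pytorch
  conda create -n app2 python=3.11 pytorch=2.1 -c pytorch
  conda activate app2    # switch between them as needed
  ```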
- As part of env management, it also has to solve what packages to install into each env and how, so conda is also a package manager.
- As a package manager, it lets one specify conda packages and their versions (say pytorch>=2.4 together with a matching torchvision). One can specify variants and their dependency packages as well (such as a GPU build of torch for CUDA 12.4), or let conda figure out the dependencies for you, transitive dependencies included; conda solves these against the whole environment, which pip does not attempt in the same way.
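  A hedged sketch of that on the command line (the pytorch-cuda=12.4 variant pin and the channel list follow the usual pytorch.org conda instructions and may need adjusting for your setup):

  ```
  # pin versions and a GPU variant; conda resolves the rest, transitively
  conda install "pytorch>=2.4" torchvision pytorch-cuda=12.4 -c pytorch -c nvidia
  ```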
- The nice thing about conda as a package manager is that once it figures out all the dependencies, it lists the packages to be installed or updated and prompts you. This is very useful, as you want to inspect all the proposed changes before going ahead. (Many times I aborted the operation pending further analysis.)
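  If you only want to see the proposed transaction, conda also has a --dry-run flag (the package name here is just an example):

  ```
  # preview what would be installed/upgraded without changing anything
  conda install --dry-run transformers
  ```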
- Conda as a package manager works well when the number of packages in your env is small, but it slows down significantly when the number of packages, or the number of available versions of those packages, gets large. This is where mamba comes in: its dependency resolver works a lot faster than conda's. (I had an experience where conda took 20 hours while mamba took less than 10 minutes, though that env had around 600 packages.)
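  A typical way to pick it up, assuming the conda-forge channel (recent conda releases have also adopted the same libmamba solver by default, which narrows the gap):

  ```
  # install mamba into the base env, then use it as a near drop-in replacement
  conda install -n base -c conda-forge mamba
  mamba install "pytorch>=2.4" torchvision    # same syntax, much faster solve
  ```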
- While most Python packages are available for installation through both conda and pip, some packages are still pip-only, so conda/mamba environment files do have a section for pip installs. The recommendation is to install all conda packages first, then the pip ones. Once you have an env that is good, you can generate a conda env export listing the packages and their versions, and optionally the package build ids, for quick replication of the conda env.
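  Roughly what that workflow looks like (file and env names are just the conventional ones; the exported environment.yml also carries a pip: subsection for any pip-installed packages):

  ```
  conda env export > environment.yml               # pins versions and build ids
  conda env export --no-builds > environment.yml   # versions only, more portable
  conda env create -f environment.yml -n llm-torch-copy
  ```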
- On the pip side, there is virtualenv (and the built-in venv module), which provides isolated environments for pip packages only. It works similarly to a separate conda env, in that each virtualenv maps to its own directory.
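  The equivalent workflow looks roughly like this (path and package are illustrative):

  ```
  python -m venv .venv            # create an isolated env under ./.venv
  source .venv/bin/activate
  pip install "torch>=2.4"        # installs into .venv only
  ```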
- Pip's dependency resolution is also much lighter-weight than conda's: it does not re-solve the constraints of everything already installed in the environment, so it runs a lot faster, but a consistent result is not 100% guaranteed.
- Another issue with pip install is that a compilation step is sometimes required (when no prebuilt wheel is available), and users often get stuck at this step (not having the right versions of header files, etc.). Conda packages, on the other hand, are prebuilt: installation is usually just downloading the .tar.bz2 (or newer .conda) archive and unpacking it. No compilation needed, so it is quicker, more reliable, and less error-prone.
- So poetry looks like it is trying to solve the package management issues around pip and virtualenv. With conda envs it is less compelling to use poetry, although I can see one scenario: keeping track of the key packages required for a particular conda env. For example, to build a conda env, say llm-torch, I only need/want to specify the following: conda create -p /opt/conda/envs/llm-torch python=3.11 pytorch=2.5.1 cudnn cudatoolkit=11.8 transformers=4.46.3 pandas numpy jupyterlab, and the resultant env may pull in over 100 packages. This recipe of python, pytorch, cudatoolkit, etc. needs to be captured somewhere. Normally you check it into your git source code repo, but poetry's pyproject.toml could also provide a way to document/specify this step.
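  A conda-native way to capture exactly that recipe (as opposed to the full 100+ package export) is a small hand-written environment.yml checked into git. The sketch below simply mirrors the packages from the command above; the channel list is my assumption and may need adjusting:

  ```
  name: llm-torch
  channels:
    - pytorch
    - nvidia
    - conda-forge
  dependencies:
    - python=3.11
    - pytorch=2.5.1
    - cudnn
    - cudatoolkit=11.8
    - transformers=4.46.3
    - pandas
    - numpy
    - jupyterlab
  ```
  Running conda env create -f environment.yml -p /opt/conda/envs/llm-torch then rebuilds the env anywhere.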
- BTW, while using conda/mamba I found a utility called conda-tree. Given a package, it can list what packages it depends on, and also which other packages in your conda env depend on it. It can also dump the whole dependency tree. (Unfortunately, this works only for conda packages, not for pip ones.)
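  The subcommands in question look like this (package names are just examples):

  ```
  conda install -c conda-forge conda-tree   # or: pip install conda-tree
  conda-tree depends pytorch     # what pytorch depends on
  conda-tree whoneeds numpy      # which packages in this env need numpy
  conda-tree deptree             # dump the whole dependency tree
  ```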