79406705

Date: 2025-02-02 13:48:54
Score: 0.5

The textbook answer is that threads share the memory space of their parent process. This makes them faster to spawn and lets them share memory and exchange information very quickly.
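As a minimal sketch of what "sharing memory" means in practice (Python is used here only for illustration, and names such as `results` and `worker` are made up for the example): every thread appends to the same list object, with no copying or message passing involved. The lock is there because the list is shared.

```python
import threading

results = []                      # one list, visible to every thread in the process
results_lock = threading.Lock()   # protects the shared list


def worker(worker_id):
    # Each thread writes directly into the same object.
    with results_lock:
        results.append(worker_id * 10)


threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread sees every write without any explicit transfer of data.
shared_view = sorted(results)
```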

That also means that a crash or a memory leak in a single thread affects the whole process, and therefore every other thread in it. Depending on your code, you could also have threads fighting over the memory bus and the CPU caches (L1/L2/L3), and causing a lot of context switches, which can lead to a significant slowdown.

Processes, on the other hand, are each given a fresh, private memory space. This means that they take longer to start, but they are a lot more isolated. Linux also gives you much finer control over processes, for example limiting their memory and CPU usage. This can make them a lot easier to debug.
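A rough sketch of that isolation, assuming a POSIX system where `os.fork()` is available (the `state` dictionary and the function name are hypothetical): the child changes its own copy of the data and then dies hard, and the parent neither sees the change nor goes down with it.

```python
import os
import signal

state = {"value": 1}  # hypothetical piece of program state


def mutate_then_crash_in_child():
    pid = os.fork()  # POSIX-only: the child gets its own copy of memory
    if pid == 0:
        state["value"] = 999                  # changes only the child's copy
        os.kill(os.getpid(), signal.SIGKILL)  # simulate a hard crash
        os._exit(1)                           # fallback, normally not reached
    # Parent: wait for the child and inspect how it ended.
    _, status = os.waitpid(pid, 0)
    return os.WIFSIGNALED(status), state["value"]


child_was_killed, parent_value = mutate_then_crash_in_child()
```

With threads, the equivalent write would have been visible to everyone, and the equivalent crash would have taken the whole process down.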

The flip side is that inter-process communication (IPC) is slower. It has to go through pipes, sockets, or explicitly set-up shared memory, which can be complicated. But because processes tend to share minimal data, they also tend to produce fewer synchronization errors (around mutexes, locks, etc.).
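To make the IPC cost concrete, here is a small sketch using a pipe, again assuming a POSIX system with `os.fork()` (the function name `square_in_child` is invented for the example). The result cannot simply be read from the child's memory; it has to be serialized, written to the pipe, and deserialized on the other side.

```python
import os
import pickle


def square_in_child(numbers):
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: compute, serialize, and send the result through the pipe.
        try:
            os.close(read_fd)
            payload = pickle.dumps([n * n for n in numbers])
            with os.fdopen(write_fd, "wb") as writer:
                writer.write(payload)
        finally:
            os._exit(0)
    # Parent: close the unused write end, then read until the child closes its end.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    return pickle.loads(data)


squares = square_in_child([1, 2, 3])
```

The only data that crosses the process boundary is what is explicitly sent, which is also why there is nothing to lock here.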

Beyond that, two things that are not usually mentioned:

Processes are your only option for distributed, multi-node deployment.

Processes can give you a lot more control and better utilization on NUMA systems, which are a lot more common in servers.
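As one illustration of that kind of control, here is a sketch that pins the caller to a single CPU using `os.sched_setaffinity`, which Python exposes on Linux only. This is just CPU affinity; real NUMA placement (binding memory to a node) would need something like `numactl` or libnuma, which is not shown. The function name is made up for the example.

```python
import os


def pin_to_one_cpu():
    allowed = os.sched_getaffinity(0)   # 0 refers to the caller
    target = min(allowed)               # pick any CPU we are allowed to run on
    os.sched_setaffinity(0, {target})   # restrict scheduling to that CPU
    pinned = os.sched_getaffinity(0)
    os.sched_setaffinity(0, allowed)    # restore the original mask
    return allowed, pinned


allowed_cpus, pinned_cpus = pin_to_one_cpu()
```

In a multi-process design, each worker process could be pinned like this to cores on the NUMA node that holds its data.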

In general though, if you have to parallelize a section of your code (a function, a loop, etc.), you use threads. If you need to parallelize a problem space, where every entity does the same work on a different chunk of data, you tend to use processes.

That is just a rule of thumb; people can disagree with it.

P.S.: if you are looking at Python, this changes, because Python threads are not really parallel under most Python implementations (CPython included) due to something called the GIL (global interpreter lock).

Reasons:
  • Long answer (-1):
  • No code block (0.5):
  • Low reputation (1):
Posted by: jcernuda95