A thread mixes two different things, which is why it is hard to understand. First, there is a processor that executes something. Second, there is an instruction that needs to be executed. In the very early days a processor was given an instruction and ran it to the end. There was no point in running multiple instructions at once.
Reason: if we have jobs A and B and each takes 5 minutes, then if we do them one after the other, A will be ready in 5 minutes and B in 10. But if we somehow switch between them every minute, then A will be ready in 9 minutes and B in 10. So what is the point of switching? And this is even if we assume that switching itself is instantaneous.
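To make the arithmetic concrete, here is a tiny sketch of my own (not anything from a real scheduler) that replays the two schedules and prints when each job finishes:

```c
#include <stdio.h>

int main(void) {
    /* One after the other: A occupies minutes 1-5, B occupies minutes 6-10. */
    printf("sequential:  A done at minute 5, B done at minute 10\n");

    /* Switching every minute: A, B, A, B, ... in 1-minute slices. */
    int remaining[2] = {5, 5};        /* minutes of work left for A and B */
    int minute = 0, job = 0;          /* job 0 is A, job 1 is B           */
    while (remaining[0] > 0 || remaining[1] > 0) {
        if (remaining[job] > 0) {
            minute++;
            remaining[job]--;
            if (remaining[job] == 0)
                printf("round-robin: %c done at minute %d\n", 'A' + job, minute);
        }
        job = 1 - job;                /* switch to the other job          */
    }
    return 0;
}
```

It prints that under round-robin A finishes at minute 9 and B at minute 10, so nothing is gained.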
Then computers got additional processors. Those were specialized; for example, they helped to service disk requests. As a result the situation changed: the main processor is doing something, then makes a request to a specialized processor to do something special, say read or write data. That processor will do it on its own, but it will take some time. During that time the main processor has nothing to do. Now this becomes wasteful; it could be running some other instruction as well.
The instructions are unrelated, so the simplest and most semantically sound way to organize this is to write each instruction as if it were the sole instruction run by a single processor, and let the processor handle the switching transparently to the instruction. So this is how it was done. The processor runs an instruction and then, at a suitable moment, stops it, places a bookmark, and puts it aside. Then it picks another bookmarked instruction, reads the bookmark, and continues from where it was. An instruction has no notion that it shares the processor with any other instruction.
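This is, roughly, what an operating system still does for us today. A minimal sketch, assuming POSIX threads (compile with -pthread): each task below is written as plain sequential code with no notion of the other; the bookmarking and switching happen behind its back.

```c
#include <pthread.h>
#include <stdio.h>

/* A task written as if it were the sole instruction on the machine. */
static void *task(void *arg) {
    const char *name = arg;
    for (int step = 1; step <= 3; step++)
        printf("%s: step %d\n", name, step);   /* straight-line work, no yields */
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, task, "A");
    pthread_create(&b, NULL, task, "B");
    /* The operating system stops each task at suitable moments, places the
       bookmark (saves its registers), and resumes the other one. */
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}
```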
The core idea of a modern thread is exactly this: an independent instruction that is assumed to run sequentially from start to finish. It rarely exists in such a pure form, though. I would love to give SQL as an example: although in most cases it actually runs concurrently, there is absolutely no notion of concurrency in SQL itself. But SQL is not a good example because it has no instructions either, and I cannot think of a similar procedural language.
In most other cases the notion of concurrency seeps in: in the form of special resources that need to be locked and unlocked, of certain values that may change on their own, or even in the nearly explicit form of asynchronous functions, and so on. There are quite a few such concepts.
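A lock is the most common way this seeping in looks in practice. A minimal sketch, again assuming POSIX threads: the counter below may change "on its own" from each thread's point of view, so the otherwise sequential code has to admit, via the lock, that it is not alone.

```c
#include <pthread.h>
#include <stdio.h>

static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *add_thousand(void *arg) {
    (void)arg;
    for (int i = 0; i < 1000; i++) {
        pthread_mutex_lock(&counter_lock);    /* concurrency seeps in here */
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

int main(void) {
    pthread_t a, b;
    pthread_create(&a, NULL, add_thousand, NULL);
    pthread_create(&b, NULL, add_thousand, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    printf("counter = %ld\n", counter);   /* reliably 2000 only because of the lock */
    return 0;
}
```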
So a thread is: a) an instruction that is written as if it were the sole instruction to be run; b) a bookmark in that instruction.
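In code the pair can be made almost embarrassingly literal. A sketch of my own, not any real threading API: the "thread" is just the code plus an integer bookmark, and the "processor" is a loop that advances whichever bookmark it picks up.

```c
#include <stdio.h>

struct thread {
    const char *name;
    int bookmark;     /* where to resume the instruction next time */
    int done;
};

/* The "instruction": three sequential steps, resumed from the bookmark. */
static void advance(struct thread *t) {
    switch (t->bookmark++) {
    case 0: printf("%s: step one\n", t->name);                break;
    case 1: printf("%s: step two\n", t->name);                break;
    case 2: printf("%s: step three\n", t->name); t->done = 1; break;
    }
}

int main(void) {
    struct thread a = {"A", 0, 0}, b = {"B", 0, 0};
    /* The "processor": pick a bookmarked instruction, run it a little,
       put it aside, pick the other one. */
    while (!a.done || !b.done) {
        if (!a.done) advance(&a);
        if (!b.done) advance(&b);
    }
    return 0;
}
```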
Does a thread need a stack? Not really; the stack comes from the processor. A processor needs some memory to lay out the data for the next step, and that memory could be in the form of a stack.
But first, it does not have to be a stack. For example, in Pascal the size of a stack frame is precalculated at compile time (a frame may include an internal stack of fixed size), and it is possible to give the processor memory in the form of individual frames. We can place these frames on a stack, or we can just as well place them anywhere and simply link them into a list. This is actually a good solution for concurrent programs because memory is not reserved in relatively large stacks but is doled out in small frames as needed. (Concurrent Pascal worked this way, with a QuickFit-like allocator.)
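Here is a rough sketch of that idea (my own illustration in C, not Concurrent Pascal itself): each call takes a fixed-size frame from the allocator and links it back to its caller's frame, so the frames can live anywhere in memory.

```c
#include <stdio.h>
#include <stdlib.h>

struct frame {
    struct frame *caller;   /* link back to the caller's frame       */
    int local;              /* stand-in for the precalculated locals */
};

static struct frame *enter(struct frame *caller, int local) {
    struct frame *f = malloc(sizeof *f);   /* could come from a QuickFit-like pool */
    f->caller = caller;
    f->local  = local;
    return f;
}

static struct frame *leave(struct frame *f) {
    struct frame *caller = f->caller;
    free(f);                               /* the frame returns to the pool */
    return caller;
}

int main(void) {
    struct frame *top = enter(NULL, 1);    /* outer procedure              */
    top = enter(top, 2);                   /* a nested call                */
    printf("innermost local: %d\n", top->local);
    top = leave(top);                      /* back in the outer procedure  */
    printf("outer local: %d\n", top->local);
    top = leave(top);
    return 0;
}
```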
Second, even if we used a stack for working memory, we could have a single stack per processor, provided we do not switch between threads arbitrarily. If every job had a unique priority and we always ran the one with the highest priority, then we would interrupt a job only to do a more urgent one, and by the time we resumed it, the stack above it would be clear again and we could just continue the previous job using the same stack.
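A sketch of this second point (again my own illustration, not a real scheduler): the more urgent job runs as an ordinary nested call on the very same stack and finishes before the interrupted job continues, so the interrupted job's frames are never disturbed.

```c
#include <stdio.h>

static int pending_high = 0;     /* a more urgent job has been requested */

static void high_priority_job(void) {
    printf("  high-priority job: borrows the stack, runs to completion\n");
}

/* Called at "suitable moments"; if something more urgent is pending,
   run it right here, as an ordinary function call. */
static void preemption_point(void) {
    if (pending_high) {
        pending_high = 0;
        high_priority_job();
    }
}

static void low_priority_job(void) {
    printf("low-priority job: part one\n");
    pending_high = 1;            /* pretend an interrupt requested the job */
    preemption_point();          /* the stack above this frame is reused   */
    printf("low-priority job: part two, its frame was never disturbed\n");
}

int main(void) {
    low_priority_job();
    return 0;
}
```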
So the reason a thread normally gets its own stack is not inherent to the concept of a thread, but is more like a specific implementation of a specific strategy.