I have a similar situation in my app, and I don't use any mutexes or semaphores. I simply don't let threads access shared data directly.
My main thread allocates an array of state structures (the exact layout is up to you), one per child thread, and each child is given a pointer to its own structure when it is started.
The structure should contain:
a ready-to-send flag (data-ready is arguably a better name), set by the child thread when data is ready and cleared by the parent before processing;
a ready-to-receive flag, set by the parent once the data has been taken and cleared by the child before it puts new data;
the thread id, so the parent keeps control over the thread;
probably a pointer to the prepared data;
a health flag for the child thread (running, paused, finished, finished with error, etc.);
the exit code.
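As a rough sketch, such a structure could look like this in C. The names (`thread_slot`, `slot_health`, `slot_init`) are made up for illustration, and I'm assuming C11 atomics for the flags so that a write by one thread is guaranteed to become visible to the other; a plain `bool` would not give that guarantee.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical health states; the names are illustrative only. */
enum slot_health { SLOT_UNUSED, SLOT_RUNNING, SLOT_PAUSED, SLOT_FINISHED, SLOT_FAILED };

/* One slot per child thread; the parent owns the array of slots. */
struct thread_slot {
    atomic_bool data_ready;     /* "ready-to-send": set by child, cleared by parent */
    atomic_bool ready_to_recv;  /* set by parent, cleared by child */
    pthread_t   tid;            /* filled in by pthread_create() */
    void       *data;           /* pointer to the prepared data */
    _Atomic int health;         /* one of enum slot_health */
    int         exit_code;      /* meaningful once health is FINISHED/FAILED */
};

/* Put a slot into its initial (or reusable) state before a thread starts. */
void slot_init(struct thread_slot *s)
{
    assert(s != NULL);
    atomic_init(&s->data_ready, false);
    atomic_init(&s->ready_to_recv, true);  /* slot is empty, child may write */
    s->data = NULL;
    atomic_init(&s->health, SLOT_UNUSED);
    s->exit_code = 0;
}
```

The parent would call `slot_init()` on a slot before passing its address as the argument to `pthread_create()`.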
Each child thread prepares some data, then waits until the ready-to-receive flag in its structure is set. It clears that flag itself, puts the new results into the structure, and then sets the ready-to-send flag.
The main thread walks through the array and checks each ready-to-send flag. If a flag is set, the main thread clears it, collects the child's data into its own queue, and finally sets the ready-to-receive flag.
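Here is a minimal sketch of that handshake, not my actual code. It uses a trimmed-down slot with only the two flags, a `finished` flag and an `int` payload; `NTHREADS`, `NITEMS`, `child_main` and `run_demo` are hypothetical names. I'm assuming POSIX threads and C11 atomics (the default sequentially consistent loads and stores), and both sides poll with `sched_yield()` rather than block.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NTHREADS 4     /* hypothetical number of child threads */
#define NITEMS   100   /* hypothetical number of results per child */

struct slot {
    atomic_bool data_ready;     /* "ready-to-send" */
    atomic_bool ready_to_recv;
    atomic_bool finished;       /* simplified health flag */
    pthread_t   tid;
    int         data;           /* payload; only touched by the flag owner */
};

void *child_main(void *arg)
{
    struct slot *s = arg;
    for (int i = 1; i <= NITEMS; i++) {
        int result = i;                            /* "prepare some data" */
        while (!atomic_load(&s->ready_to_recv))    /* wait for the parent */
            sched_yield();
        atomic_store(&s->ready_to_recv, false);    /* child clears it */
        s->data = result;                          /* put the new result */
        atomic_store(&s->data_ready, true);        /* hand over to parent */
    }
    atomic_store(&s->finished, true);
    return NULL;
}

long run_demo(void)
{
    struct slot slots[NTHREADS];
    long total = 0;                                /* stands in for the queue */

    for (int i = 0; i < NTHREADS; i++) {
        atomic_init(&slots[i].data_ready, false);
        atomic_init(&slots[i].ready_to_recv, true);
        atomic_init(&slots[i].finished, false);
        int rc = pthread_create(&slots[i].tid, NULL, child_main, &slots[i]);
        assert(rc == 0);
        (void)rc;
    }

    for (;;) {
        int done = 0;
        for (int i = 0; i < NTHREADS; i++) {
            struct slot *s = &slots[i];
            /* Read 'finished' first so the last item cannot be missed. */
            bool fin = atomic_load(&s->finished);
            if (atomic_load(&s->data_ready)) {
                atomic_store(&s->data_ready, false);   /* parent clears it */
                total += s->data;                      /* collect the data */
                atomic_store(&s->ready_to_recv, true); /* child may go on */
            } else if (fin) {
                done++;
            }
        }
        if (done == NTHREADS)
            break;
        sched_yield();
    }

    for (int i = 0; i < NTHREADS; i++)
        pthread_join(slots[i].tid, NULL);
    return total;
}
```

Calling `run_demo()` from `main()` and building with something like `cc -std=c11 -pthread` should be enough to try it. Each slot has exactly one writer and one reader at any moment, which is why no lock is needed; the cost is that both sides spin while waiting.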
When a child thread has finished, its structure can be reused for a new thread.
That's basically all. I have no race conditions in my app.