When you set `n_jobs > 1`, Optuna runs your objective function in multiple threads within the same Python process.
Hugging Face models (like GPT-2) and PyTorch are not designed to be driven from multiple threads in the same Python process. Model loading and device placement mutate shared internal state, and concurrent threads end up stepping on each other's toes. That's why you get the weird `meta` tensor error. Once it happens, the Python session is "polluted" until you restart it, because the corrupted shared state is still there.
That's why:
- With `n_jobs=1` → it works (only one thread runs).
- With `n_jobs=2` → it fails (the threads clash).
- Even after switching back to `n_jobs=1` → it still fails until you restart, because the clash already broke the shared state.
Instead of running trials in threads, you need to run them in separate processes (so they don’t share memory/state).
There are two simple ways:
Keep `n_jobs=1` in Optuna, but run multiple copies of your script:

```bash
# terminal 1
python tune.py

# terminal 2
python tune.py
```

Both processes will write their results into the same Optuna storage (e.g., a SQLite database).
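If you don't want to open terminals by hand, the same pattern can be scripted with a shell loop (a sketch, assuming your script is called `tune.py` as above):

```shell
# Launch 4 worker processes in the background, then wait for all of them.
# Each worker is its own Python interpreter, so nothing is shared in memory.
for i in 1 2 3 4; do
    python tune.py &
done
wait
```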
Example in code:

```python
import optuna

def objective(trial):
    # your Hugging Face model code here
    ...

if __name__ == "__main__":
    study = optuna.create_study(
        storage="sqlite:///optuna.db",  # shared DB file
        study_name="gpt2_tuning",
        load_if_exists=True,
    )
    study.optimize(objective, n_trials=10, n_jobs=1)  # <- keep n_jobs=1
```
Now you can run as many parallel processes as you want, and they won’t interfere.
In short:
- `n_jobs > 1` → Optuna uses threads → Hugging Face breaks.
- Solution: use processes instead of threads.
- The easiest way: keep `n_jobs=1` and launch the script multiple times, all writing to the same Optuna storage (a SQLite file or database).