I may be wrong, but I understand that you'd like to optimize the kernel parameters so that the GPR (Gaussian Process Regression) prediction evaluated on your X_train data is never lower than 90% of the lowest value in y_train. Please let me know if that's not the case or if I misinterpreted your code.
More broadly, the function a GPR is meant to optimize when fitting its kernel parameters is the (log marginal) likelihood. I don't think you'll be able to achieve your goal by replacing that objective the way you are doing.
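For reference, here is a minimal sketch (assuming scikit-learn, with hypothetical toy data standing in for your X_train / y_train) showing that fitting a GPR already tunes the kernel hyperparameters by maximizing the log marginal likelihood:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Hypothetical toy data standing in for your X_train / y_train
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 10.0, size=(30, 1))
y_train = np.sin(X_train).ravel() + 0.1 * rng.normal(size=30)

# alpha adds a small noise term for numerical stability
gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=0.1)
gpr.fit(X_train, y_train)  # the internal optimizer maximizes the log marginal likelihood

print(gpr.kernel_)                         # kernel with optimized hyperparameters
print(gpr.log_marginal_likelihood_value_)  # objective value at the optimum
```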
If you want to work with positive data and GPR, I suggest two approaches:

1. Implementing a GPR with positive coefficients: I can provide code for this, but it might be slow. A sketch of one possible implementation follows after this list.

2. Transforming and inverse-transforming your data: apply an invertible function f, such as `np.where(x < t, 2*x - t, x)`, which expands the positive space to include some negative values. Then apply your GPR and use the inverse function f⁻¹, here `np.where(x < t, 0.5*(x + t), x)`, to map the results back to the positive space. This gives the GPR some flexibility and, as long as its predictions stay above -t, you end up with positive values without the hard thresholding of the previous approach (see the second sketch below).
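For the first approach, here is a minimal sketch of one possible interpretation (an assumption on my part about what "positive coefficients" means): constrain the dual coefficients of a kernel fit to be non-negative via `scipy.optimize.nnls`. With a non-negative kernel such as the RBF, non-negative coefficients guarantee non-negative predictions; note this is no longer an exact GP posterior, and NNLS gets slow for large training sets. The toy data is hypothetical:

```python
import numpy as np
from scipy.optimize import nnls
from sklearn.gaussian_process.kernels import RBF

# Hypothetical positive toy data
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 10.0, size=(30, 1))
y_train = np.abs(np.sin(X_train)).ravel() + 0.1

kernel = RBF(length_scale=1.0)
K = kernel(X_train)          # n x n Gram matrix; RBF entries are all >= 0
coef, _ = nnls(K, y_train)   # non-negative dual coefficients (slow for large n)

X_test = np.linspace(0.0, 10.0, 200).reshape(-1, 1)
K_star = kernel(X_test, X_train)  # cross-kernel between test and train points
y_pred = K_star @ coef            # >= 0, since both factors are non-negative
```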
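And a minimal sketch of the second approach, assuming scikit-learn's GaussianProcessRegressor; the threshold `t` and the toy data are hypothetical and should be adapted to the scale of your y values:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

t = 1.0  # hypothetical threshold; choose it relative to the scale of your y values

def f(y):
    # expand [0, t) down to [-t, t) so the GP is free to dip near zero
    return np.where(y < t, 2.0 * y - t, y)

def f_inv(y):
    # inverse of f: map (-t, t) back to (0, t); identity at and above t
    return np.where(y < t, 0.5 * (y + t), y)

# Hypothetical positive toy data
rng = np.random.default_rng(0)
X_train = rng.uniform(0.0, 10.0, size=(40, 1))
y_train = np.abs(np.sin(X_train)).ravel() + 0.05

gpr = GaussianProcessRegressor(kernel=RBF(), alpha=0.01, normalize_y=True)
gpr.fit(X_train, f(y_train))  # fit in the expanded space

X_test = np.linspace(0.0, 10.0, 200).reshape(-1, 1)
y_pred = f_inv(gpr.predict(X_test))  # positive as long as predictions stay above -t
```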