I'm also stuck on this step of the process. I already figured out which Content-Type to use through trial and error, but I'm now getting the following error: "Invalid JWE format".
Currently, I'm building the payload in the form of a hash, then converting it to a string.
Then I take the exponent and modulus (e and n) from the JWK embedded in the capture context and build an RSA public key from them. I then use this key and the payload to generate a JWE.
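To make the key step concrete, here is a minimal sketch of how the e and n values can be turned into integers before building the RSA key. It is in Python purely for illustration and uses only the standard library. The helper names (`b64url_decode`, `jwk_rsa_numbers`) and the example JWK are hypothetical and not from any SDK. The one thing it relies on is that JWK fields are base64url-encoded without padding, so decoding them as plain base64 or treating them as decimal strings gives the wrong key.

```python
import base64
from typing import Tuple


def b64url_decode(value: str) -> bytes:
    """Decode base64url text that may omit '=' padding, as JWK fields do."""
    padding = "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(value + padding)


def jwk_rsa_numbers(jwk: dict) -> Tuple[int, int]:
    """Return (n, e) as integers from an RSA JWK's base64url fields."""
    if jwk.get("kty") != "RSA":
        raise ValueError("expected an RSA JWK")
    n = int.from_bytes(b64url_decode(jwk["n"]), "big")
    e = int.from_bytes(b64url_decode(jwk["e"]), "big")
    return n, e


# Hypothetical JWK for illustration only: a real modulus is far longer.
toy_modulus = base64.urlsafe_b64encode((3233).to_bytes(2, "big")).rstrip(b"=").decode()
example_jwk = {"kty": "RSA", "kid": "example-kid", "e": "AQAB", "n": toy_modulus}

n, e = jwk_rsa_numbers(example_jwk)
print(n, e)
```

The resulting integers are what an RSA library would take as the public numbers. Which library call builds the key from them depends on the language and library in use, so that part is left out here.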
It looks like you're using the literal key ID (kid) from the embedded JWK to do your encryption. Did you end up changing that? Let me know if you see anything wrong with my approach.
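In case it helps anyone debugging the same error, below is a rough sketch of a sanity check on the token itself, again in Python with only the standard library. It assumes the JWE is in compact serialization, which per RFC 7516 is five base64url parts separated by dots (protected header, encrypted key, IV, ciphertext, tag), and it decodes the protected header so you can see which `alg`, `enc`, and `kid` were actually sent. The function name `inspect_compact_jwe` and the sample header values are placeholders. I don't know which algorithms the endpoint expects.

```python
import base64
import json


def b64url_decode(value: str) -> bytes:
    """Decode base64url text that may omit '=' padding."""
    padding = "=" * (-len(value) % 4)
    return base64.urlsafe_b64decode(value + padding)


def b64url_encode(raw: bytes) -> str:
    """Encode bytes as unpadded base64url, as used in compact JWE."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def inspect_compact_jwe(token: str) -> dict:
    """Check the five-part compact layout and return the decoded protected header."""
    parts = token.split(".")
    if len(parts) != 5:
        raise ValueError(f"expected 5 dot-separated parts, got {len(parts)}")
    header = json.loads(b64url_decode(parts[0]))
    for field in ("alg", "enc"):
        if field not in header:
            raise ValueError(f"protected header is missing '{field}'")
    return header


# Placeholder token: the header values are assumptions and the other four
# parts are dummy bytes, not real key material or ciphertext.
sample_header = {"alg": "RSA-OAEP", "enc": "A256GCM", "kid": "example-kid"}
sample_token = ".".join(
    [b64url_encode(json.dumps(sample_header).encode())]
    + [b64url_encode(part) for part in (b"key", b"iv", b"ciphertext", b"tag")]
)

print(inspect_compact_jwe(sample_token))
```

If the check fails on your real token (wrong number of parts, or a header that is not base64url-encoded JSON), that would point at how the JWE is being serialized rather than at the key or the kid.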