The error "Could not create inference context" typically occurs when running Core ML models on the iOS Simulator. Some Core ML models (particularly those using newer layer types or relying on hardware acceleration) require the Neural Engine or GPU, which are only available on a physical iOS device.
If the same code works with one of Apple's sample models (such as MobileNetV2) but fails with your custom model, try running it on a real device.
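If you still need the model to load in the Simulator (for UI testing, for example), one common workaround is to restrict Core ML to CPU-only execution via `MLModelConfiguration`, so it never tries to acquire the Neural Engine or GPU. A minimal sketch, where `YourModel` is a placeholder for your Xcode-generated model class:

```swift
import CoreML

// Force CPU-only execution so the model can load on the Simulator,
// which has no Neural Engine and limited GPU support.
let config = MLModelConfiguration()
config.computeUnits = .cpuOnly

do {
    // "YourModel" stands in for the class Xcode generates from your .mlmodel.
    let model = try YourModel(configuration: config)
    // ... run predictions as usual
} catch {
    print("Failed to load model: \(error)")
}
```

Note that CPU-only inference is slower, so this is best used as a Simulator fallback rather than the default configuration on device.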