It seems that PrioritizedEpisodeReplayBuffer can't handle complex observations when used with the custom RLModule, in both the CTCE and CTDE setups. As a workaround, switching to EpisodeReplayBuffer (CTCE/CTDE) or MultiAgentEpisodeReplayBuffer (DTDE), and setting replay_buffer_config={"replay_sequence_length": 1, "replay_burn_in": 0, "replay_zero_init_states": True} together with env_runners(batch_mode="complete_episodes"), works correctly.
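
For anyone who wants to reproduce this, here is a rough sketch of the settings as plain Python dicts. The three sequence settings, the buffer class names, and the batch mode are taken from the description above. The helper name make_replay_buffer_config, the constant names, and the "type" key used to select the buffer class are assumptions added for illustration, so check them against your RLlib version.

```python
# Sequence settings quoted from the workaround described above.
SEQUENCE_SETTINGS = {
    "replay_sequence_length": 1,
    "replay_burn_in": 0,
    "replay_zero_init_states": True,
}

# Buffer class names from the comment, keyed by training setup.
BUFFER_BY_SETUP = {
    "CTCE": "EpisodeReplayBuffer",
    "CTDE": "EpisodeReplayBuffer",
    "DTDE": "MultiAgentEpisodeReplayBuffer",
}

# Env runner setting from the comment.
ENV_RUNNER_SETTINGS = {"batch_mode": "complete_episodes"}


def make_replay_buffer_config(setup: str) -> dict:
    """Hypothetical helper: build a replay_buffer_config dict for one setup.

    The "type" key is an assumption about how the buffer class is selected.
    """
    if setup not in BUFFER_BY_SETUP:
        raise ValueError(f"Unknown setup {setup!r}; expected one of {sorted(BUFFER_BY_SETUP)}")
    # Merge the buffer choice with the shared sequence settings.
    return {"type": BUFFER_BY_SETUP[setup], **SEQUENCE_SETTINGS}


if __name__ == "__main__":
    for setup in BUFFER_BY_SETUP:
        print(setup, make_replay_buffer_config(setup))
    print("env_runners:", ENV_RUNNER_SETTINGS)
```

Assuming config is your algorithm config object, the result would then be passed along the lines of config.training(replay_buffer_config=make_replay_buffer_config("CTDE")) and config.env_runners(**ENV_RUNNER_SETTINGS).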