It seems the PrioritizedEpisodeReplayBuffer can't handle complex observations with the custom RLModule for both CTCE and CTDE. However, by using EpisodeReplayBuffer (CTCE/CTDE) and MultiAgentEpisodeReplayBuffer (DTDE), and setting replay_buffer_config={"replay_sequence_length": 1, "replay_burn_in": 0, "replay_zero_init_states": True} and env_runners(batch_mode="complete_episodes"), it works correctly.
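
For anyone who wants to reproduce this, here is a minimal sketch of how the settings could be assembled. The helper name make_replay_buffer_config and the scheme-to-buffer mapping dict are hypothetical, written only for illustration. Selecting the buffer class through the "type" key is an assumption about how the config is passed, and the commented-out lines assume `config` is an off-policy AlgorithmConfig on the new API stack.

```python
# Hypothetical helper: maps each training scheme to the buffer class
# that worked in my tests, and attaches the three replay settings.
BUFFER_BY_SCHEME = {
    "CTCE": "EpisodeReplayBuffer",
    "CTDE": "EpisodeReplayBuffer",
    "DTDE": "MultiAgentEpisodeReplayBuffer",
}


def make_replay_buffer_config(scheme: str) -> dict:
    """Return a replay_buffer_config dict for the given training scheme."""
    if scheme not in BUFFER_BY_SCHEME:
        raise ValueError(f"Unknown training scheme: {scheme!r}")
    return {
        # Assumption: the buffer class is chosen via the "type" key.
        "type": BUFFER_BY_SCHEME[scheme],
        "replay_sequence_length": 1,
        "replay_burn_in": 0,
        "replay_zero_init_states": True,
    }


replay_cfg = make_replay_buffer_config("CTDE")

# Assumed usage with an existing off-policy AlgorithmConfig named `config`:
# config = (
#     config
#     .training(replay_buffer_config=replay_cfg)
#     .env_runners(batch_mode="complete_episodes")
# )
```

Swapping "CTDE" for "DTDE" only changes the buffer class; the other three settings stay the same in all schemes.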