Replace `tfds.load('imdb_reviews/subwords8k', ...)` with `tfds.load('imdb_reviews', ...)`, then build your own tokenizer with `SubwordTextEncoder.build_from_corpus` on the training split. Because the encoder's `encode` method is plain Python, map it over the dataset with `tf.py_function` to turn each review into a sequence of integer IDs, and finally use `padded_batch` so the variable-length sequences can be batched before feeding them into your model (sketch below).
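Here is a minimal sketch of that pipeline. It assumes TF 2.x with `SubwordTextEncoder` under `tfds.deprecated.text` (where it was moved from `tfds.features.text`); the `target_vocab_size=2**13` and the batch/buffer sizes are illustrative choices that mirror the old `subwords8k` config, not requirements:

```python
import tensorflow as tf
import tensorflow_datasets as tfds

# Load the plain-text IMDB dataset (no built-in tokenizer).
(train_ds, test_ds), info = tfds.load(
    'imdb_reviews',
    split=['train', 'test'],
    as_supervised=True,
    with_info=True,
)

# Build a subword tokenizer from the training corpus.
# ~8k subwords, roughly matching the old subwords8k config.
tokenizer = tfds.deprecated.text.SubwordTextEncoder.build_from_corpus(
    (text.numpy() for text, _ in train_ds),
    target_vocab_size=2**13,
)

def encode(text, label):
    # tokenizer.encode is plain Python, so this runs eagerly.
    return tokenizer.encode(text.numpy()), label

def encode_map_fn(text, label):
    # Wrap the Python encoder so it can run inside the tf.data graph.
    encoded, label = tf.py_function(
        encode, inp=[text, label], Tout=(tf.int64, tf.int64)
    )
    # tf.py_function loses static shapes; restore them for downstream layers.
    encoded.set_shape([None])
    label.set_shape([])
    return encoded, label

BUFFER_SIZE = 10_000  # illustrative
BATCH_SIZE = 64       # illustrative

# Pad each batch to the length of its longest sequence.
train_batches = (
    train_ds
    .map(encode_map_fn, num_parallel_calls=tf.data.AUTOTUNE)
    .shuffle(BUFFER_SIZE)
    .padded_batch(BATCH_SIZE)
)
test_batches = (
    test_ds
    .map(encode_map_fn, num_parallel_calls=tf.data.AUTOTUNE)
    .padded_batch(BATCH_SIZE)
)
```

Note that `padded_batch(BATCH_SIZE)` with no explicit `padded_shapes` (supported since TF 2.2) pads each component to the longest element in that batch; if your model expects a fixed sequence length, pass an explicit `padded_shapes` instead.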