GANs are not the best choice for text-to-image generation. Theyre unstable and outdated for this task. Instead, consider lightweight diffusion models like Kandinsky, which offer better quality, are easier to fine-tune, and require less computer.
* Sorry for my English. I’m still learning, so there might be some grammar or syntax mistakes.