Text-to-Image Generation

Text-to-Image Generation is the task of generating an image conditioned on the input text.

Try it for yourself

1. Enter a caption (or choose one from the examples)

2. Run a model

X-LXMERT

X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers. Jaemin Cho, Jiasen Lu, Dustin Schwenk, Hannaneh Hajishirzi, Aniruddha Kembhavi. EMNLP 2020.

An extension of LXMERT with three training refinements: discretizing visual representations, using uniform masking with a large range of masking ratios, and aligning the right pre-training datasets to the right objectives. Together, these refinements enable the model to paint, that is, to generate an image from input text.
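
The first two refinements can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that discretization means assigning each visual feature vector to its nearest centroid in a fixed codebook, and that uniform masking means drawing a fresh masking ratio uniformly from [0, 1) for each example. The names `quantize`, `uniform_mask`, and `MASK_ID`, and the toy codebook, are hypothetical.

```python
import random

MASK_ID = -1  # hypothetical placeholder id for a masked visual token


def quantize(features, centroids):
    """Map each feature vector to the index of its nearest centroid
    (squared Euclidean distance). Assumed form of discretization."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(centroids)), key=lambda k: sqdist(f, centroids[k]))
        for f in features
    ]


def uniform_mask(codes, rng):
    """Mask a random subset of codes. The masking ratio itself is drawn
    uniformly from [0, 1), so it varies widely between examples."""
    ratio = rng.random()
    n_mask = round(ratio * len(codes))
    positions = set(rng.sample(range(len(codes)), n_mask))
    masked = [MASK_ID if i in positions else c for i, c in enumerate(codes)]
    return masked, positions


# Toy 2-D "visual features" and a 3-entry codebook (illustrative values only).
centroids = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
features = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.8), (1.0, 0.9)]

codes = quantize(features, centroids)  # [0, 1, 2, 1]
masked, positions = uniform_mask(codes, random.Random(0))
```

During pre-training under these assumptions, the model would be asked to predict the original code at each masked position. Because the ratio ranges from almost nothing masked to almost everything masked, the same objective covers both light in-filling and near-complete generation.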