Based on trial and error, the limit appears to be 4096 tokens. Exceeding it produces the error: `Failed to run inference: Context length of 4096 was exceeded`.
(This seems pretty basic, but I couldn't find the answer on Google, so I figured I'd document it here.)
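One practical consequence is that long prompts need trimming before inference. Below is a minimal sketch of pre-truncating a prompt to fit the 4096-token window; `count_tokens` is a hypothetical whitespace approximation, not the runtime's real tokenizer, so substitute whatever tokenizer your inference stack actually uses for accurate counts.

```python
# Sketch: keep a prompt within a 4096-token context window before inference.
# count_tokens is a crude whitespace stand-in -- real token counts must come
# from the model's own tokenizer.

CONTEXT_LIMIT = 4096

def count_tokens(text: str) -> int:
    # Hypothetical approximation: one whitespace-separated word = one token.
    return len(text.split())

def truncate_to_fit(prompt: str, reserve_for_output: int = 512) -> str:
    """Drop the oldest words until the prompt plus reserved output fits."""
    budget = CONTEXT_LIMIT - reserve_for_output
    words = prompt.split()
    while len(words) > budget:
        words.pop(0)  # drop oldest context first
    return " ".join(words)

long_prompt = ("word " * 5000).strip()
fitted = truncate_to_fit(long_prompt)
print(count_tokens(fitted))  # 3584
```

Reserving part of the window for the model's output matters because the 4096-token limit covers prompt and generated tokens together.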