decode/swscale directly into the buffer
That would be fantastic, but how? Do you mean through the get_buffer2 callback in AVCodecContext, i.e.
int (*get_buffer2)(struct AVCodecContext *s, AVFrame *frame, int flags);
?