Perhaps you can rewrite your JOIN as an IN clause so that Spark is able to push it down to the database? If the IN list is too long for a single query, partition it into multiple sub-lists and run one query per sub-list. Painful as it is, it may be a better option than creating a temp table in the RDBMS.
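Here is a rough sketch of the sub-list idea in PySpark, in case it helps. It assumes the join keys are integers and that the small side of the join fits on the driver. The names `in_predicates`, `read_matching_rows`, `orders`, and `customer_id` are made up for illustration. It uses the `predicates` argument of `spark.read.jdbc`, where each predicate becomes the WHERE clause of one partition, so each sub-list is sent to the database as its own query.

```python
from typing import List, Sequence


def in_predicates(column: str, keys: Sequence[int], chunk_size: int) -> List[str]:
    """Split keys into sub-lists and build one `column IN (...)` predicate per sub-list.

    Assumption: keys are integers, so they can be rendered into SQL text
    without quoting or escaping. String keys would need proper escaping.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # De-duplicate so no key is listed twice; int() rejects non-numeric input.
    unique_keys = sorted({int(k) for k in keys})
    predicates = []
    for start in range(0, len(unique_keys), chunk_size):
        chunk = unique_keys[start:start + chunk_size]
        id_list = ", ".join(str(k) for k in chunk)
        predicates.append(f"{column} IN ({id_list})")
    return predicates


def read_matching_rows(spark, jdbc_url, table, column, keys, chunk_size, properties):
    """Read only the rows whose `column` value is in `keys`.

    `spark` is an existing SparkSession; `jdbc_url`, `table`, and `properties`
    are placeholders for your own connection settings.
    """
    predicates = in_predicates(column, keys, chunk_size)
    # One JDBC partition (and one database query) per predicate.
    return spark.read.jdbc(
        url=jdbc_url, table=table, predicates=predicates, properties=properties
    )


# Hypothetical usage, with `small_df` being the small side of the original join:
#   keys = [row[0] for row in small_df.select("customer_id").distinct().collect()]
#   orders = read_matching_rows(spark, jdbc_url, "orders", "customer_id",
#                               keys, 500, {"user": "...", "password": "..."})
```

The chunk size is up to you and your database. Some databases cap the list length (Oracle, for example, allows at most 1000 expressions in an IN list), so staying below that is a reasonable starting point.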