Caching a table keeps the whole table in memory, but because Spark evaluates lazily, nothing is materialized right away. The transformations leading up to the cache are only executed when the first action runs; at that point Spark follows the DAG, computes the data, and stores it in memory, so subsequent actions read from the cache instead of recomputing from the source.
There is also a setting that controls how the cached data is stored. When it is true (the default), Spark SQL keeps cached tables in a compressed in-memory columnar format:
spark.sql.inMemoryColumnarStorage.compressed = true
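A minimal PySpark sketch of the flow described above; the path /data/events, the table name events, and the column user_id are placeholders, not from the original answer:

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("cache-example")
             # When true (the default), Spark SQL compresses the
             # in-memory columnar cache column by column.
             .config("spark.sql.inMemoryColumnarStorage.compressed", "true")
             .getOrCreate())

    df = spark.read.parquet("/data/events")   # hypothetical source
    df.createOrReplaceTempView("events")

    # Marks the table for caching; lazy, so nothing is materialized yet.
    spark.catalog.cacheTable("events")

    # The first action triggers the DAG and fills the in-memory cache...
    spark.sql("SELECT count(*) FROM events").show()

    # ...and later queries against the table read from that cache
    # instead of recomputing from the source files.
    spark.sql("SELECT count(DISTINCT user_id) FROM events").show()

Note that spark.catalog.cacheTable is lazy, matching the behaviour described above, whereas the SQL statement CACHE TABLE is eager unless you write CACHE LAZY TABLE.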