cache() always uses the default storage level, which for an RDD is MEMORY_ONLY (for a Dataset or DataFrame the default is MEMORY_AND_DISK). With persist(), you can specify which storage level you want. So cache() is the same as calling persist() with no arguments. With the default MEMORY_ONLY level, an RDD is stored in the JVM heap as deserialized objects. Data written to disk, on the other hand, is always serialized.