Calling cacheTable() on a table t multiple times causes t to be cached multiple times. These semantics differ from RDD.cache(), which is idempotent.
We can check whether a table is already cached by checking:
- whether the underlying logical plan of the table matches the pattern Subquery(_, SparkLogicalPlan(inMem @ InMemoryColumnarTableScan(_, _)))
- whether inMem.cachedColumnBuffers.getStorageLevel.useMemory is true
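The two checks above could be combined into a single predicate, roughly as sketched below. This is only an illustration: the case classes are simplified stand-ins written for this example, not the real Spark SQL classes, and the names CacheCheckSketch, CachedBuffers, Relation and isCached are hypothetical.

```scala
// Self-contained sketch. Every type here is a simplified stand-in
// (an assumption for illustration), not the actual Spark SQL class.
object CacheCheckSketch {
  // Stand-in for the storage level reported by the cached column buffers.
  final case class StorageLevel(useMemory: Boolean)

  // Stand-in for the RDD holding the cached column buffers.
  final class CachedBuffers(level: StorageLevel) {
    def getStorageLevel: StorageLevel = level
  }

  sealed trait SparkPlan
  final case class InMemoryColumnarTableScan(
      attributes: Seq[String],
      cachedColumnBuffers: CachedBuffers) extends SparkPlan

  sealed trait LogicalPlan
  final case class Relation(name: String) extends LogicalPlan
  final case class Subquery(alias: String, child: LogicalPlan) extends LogicalPlan
  final case class SparkLogicalPlan(alreadyPlanned: SparkPlan) extends LogicalPlan

  // Hypothetical helper: true only if the plan has the in-memory columnar
  // shape AND its buffers are actually stored in memory.
  def isCached(plan: LogicalPlan): Boolean = plan match {
    case Subquery(_, SparkLogicalPlan(inMem @ InMemoryColumnarTableScan(_, _))) =>
      inMem.cachedColumnBuffers.getStorageLevel.useMemory
    case _ =>
      false
  }
}
```

With a predicate like this, cacheTable() could return early when isCached holds for the table's current plan, which would make repeated calls a no-op in the same way RDD.cache() is.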