Spark / SPARK-19798

Query returns stale results when tables are modified on other sessions

Details

    • Type: Bug
    • Status: In Progress
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.2.0, 3.0.0
    • Fix Version/s: None
    • Component/s: SQL
    • Labels: None

    Description

      I observed the problem on the master branch with the Thrift server in multi-session mode (the default), and I was also able to reproduce it with spark-shell (see the sequence below).
      Changes made in one session (table insert, table rename) are not visible to other derived sessions (created with session.newSession).

      The problem seems to be that each session has its own tableRelationCache, which is not refreshed when another session modifies a table.
      IMO tableRelationCache should be shared in sharedState, maybe in the cacheManager, so that cache refresh is centralized for data that is not session-specific (unlike, for example, temporary tables).
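
      The suspected mechanism can be illustrated with a toy model that uses no Spark APIs. This is only a sketch under stated assumptions: Metastore, RelationCache, Session and CacheDemo are hypothetical names invented for illustration, not Spark classes, and the real tableRelationCache is more involved. It shows a writer invalidating only the cache it holds, so a second session with a separate cache keeps serving its old snapshot, while a shared cache does not.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the table storage that all sessions share.
final class Metastore {
  private val tables = mutable.Map.empty[String, Vector[Int]]
  def create(name: String): Unit = tables(name) = Vector.empty
  def insert(name: String, rows: Int*): Unit = tables(name) = tables(name) ++ rows
  def read(name: String): Vector[Int] = tables(name)
}

// Hypothetical relation cache: table name -> snapshot taken at first read.
final class RelationCache {
  private val entries = mutable.Map.empty[String, Vector[Int]]
  def getOrLoad(name: String)(load: => Vector[Int]): Vector[Int] =
    entries.getOrElseUpdate(name, load)
  def invalidate(name: String): Unit = entries.remove(name)
}

// A session reads through its cache and invalidates only that cache on write.
final class Session(store: Metastore, cache: RelationCache) {
  def select(name: String): Vector[Int] = cache.getOrLoad(name)(store.read(name))
  def insert(name: String, rows: Int*): Unit = {
    store.insert(name, rows: _*)
    cache.invalidate(name)
  }
}

object CacheDemo {
  // Returns what the second session sees after the first session inserts rows.
  def secondSessionView(shareCache: Boolean): Vector[Int] = {
    val store = new Metastore
    val cache1 = new RelationCache
    val cache2 = if (shareCache) cache1 else new RelationCache
    val session1 = new Session(store, cache1)
    val session2 = new Session(store, cache2)

    store.create("test")
    session2.select("test")          // session2 caches the empty table
    session1.insert("test", 1, 2, 3) // invalidates only session1's cache
    session2.select("test")          // stale when the caches are separate
  }
}
```

      In this model, CacheDemo.secondSessionView(shareCache = false) returns the stale empty snapshot, while CacheDemo.secondSessionView(shareCache = true) returns the inserted rows, which is the behavior I would expect from a cache held in sharedState.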

      Spark shell script:

      val spark2 = spark.newSession
      spark.sql("CREATE TABLE test (a int) using parquet")
      spark2.sql("select * from test").show // OK returns empty
      spark.sql("select * from test").show // OK returns empty
      spark.sql("insert into TABLE test values 1,2,3")
      spark2.sql("select * from test").show // ERROR returns empty
      spark.sql("select * from test").show // OK returns 3,2,1
      spark.sql("create table test2 (a int) using parquet")
      spark.sql("insert into TABLE test2 values 4,5,6")
      spark2.sql("select * from test2").show // OK returns 6,4,5
      spark.sql("select * from test2").show // OK returns 6,4,5
      spark.sql("alter table test rename to test3")
      spark.sql("alter table test2 rename to test")
      spark.sql("alter table test3 rename to test2")
      spark2.sql("select * from test").show // ERROR returns empty
      spark.sql("select * from test").show // OK returns 6,4,5
      spark2.sql("select * from test2").show // ERROR throws java.io.FileNotFoundException
      spark.sql("select * from test2").show // OK returns 3,1,2

          People

            Assignee: Unassigned
            Reporter: gbloisi Giambattista
            Votes: 0
            Watchers: 4
