  1. Spark
  2. SPARK-33507 Improve and fix cache behavior in v1 and v2
  3. SPARK-34262

ALTER TABLE .. SET LOCATION doesn't refresh v1 table cache

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.2, 3.1.1, 3.2.0
    • Fix Version/s: 3.0.2, 3.1.1
    • Component/s: SQL

    Description

      The example below illustrates the issue:
      1. Create a source table:

      spark-sql> CREATE TABLE src_tbl (c0 int, part int) USING hive PARTITIONED BY (part);
      spark-sql> INSERT INTO src_tbl PARTITION (part=0) SELECT 0;
      spark-sql> SHOW TABLE EXTENDED LIKE 'src_tbl' PARTITION (part=0);
      default	src_tbl	false	Partition Values: [part=0]
      Location: file:/Users/maximgekk/proj/refresh-cache-set-location/spark-warehouse/src_tbl/part=0
      ...
      

      2. Load data from the source table to a cached destination table:

      spark-sql> CREATE TABLE dst_tbl (c0 int, part int) USING hive PARTITIONED BY (part);
      spark-sql> ALTER TABLE dst_tbl ADD PARTITION (part=0);
      spark-sql> INSERT INTO dst_tbl PARTITION (part=1) SELECT 1;
      spark-sql> CACHE TABLE dst_tbl;
      spark-sql> SELECT * FROM dst_tbl;
      1	1
      spark-sql> ALTER TABLE dst_tbl PARTITION (part=0) SET LOCATION '/Users/maximgekk/proj/refresh-cache-set-location/spark-warehouse/src_tbl/part=0';
      spark-sql> SELECT * FROM dst_tbl;
      1	1
      

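      The last query still returns only the cached row (1, 1). It does not show the data at the new location of partition part=0, which comes from the source table, because the cache of dst_tbl was not refreshed by ALTER TABLE .. SET LOCATION.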
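
      On affected versions, one possible workaround is to invalidate the cache by hand after changing the partition location. The statements below are a minimal sketch that reuses dst_tbl from the steps above. REFRESH TABLE, UNCACHE TABLE and CACHE TABLE are standard Spark SQL commands, but the report does not confirm that they fully work around this issue on every affected version, so treat that as an assumption.

      ```sql
      -- Assumed workaround, not part of the original report:
      -- invalidate the cached data and metadata of the table after SET LOCATION.
      REFRESH TABLE dst_tbl;

      -- Alternative sketch: drop the cache entry and rebuild it explicitly.
      UNCACHE TABLE dst_tbl;
      CACHE TABLE dst_tbl;

      -- Re-run the query to check whether the rows under the new
      -- location of partition part=0 are now visible.
      SELECT * FROM dst_tbl;
      ```

      Either variant alone should be enough. REFRESH TABLE is the lighter option because it keeps the table registered as cached.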

          People

            Assignee: Max Gekk (maxgekk)
            Reporter: Max Gekk (maxgekk)