Spark / SPARK-18752

"isSrcLocal" parameter to Hive loadTable / loadPartition should come from user

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.1.0
    • Fix Version/s: 2.2.0
    • Component/s: SQL
    • Labels: None

    Description

      We ran into an issue with the HiveShim code that calls "loadTable" and "loadPartition" while testing with some recent changes in upstream Hive.

      The semantics in Hive changed slightly: if you provide the wrong value for "isSrcLocal", you can now end up with an invalid table, because the Hive code moves the temp directory itself to the final destination instead of moving its children.

      The problem in Spark is that HiveShim.scala tries to infer the value of "isSrcLocal" from where the source and target directories are located, which is not correct. "isSrcLocal" should be set based on the user query (e.g. "LOAD DATA LOCAL" would set it to "true"), so that information needs to be propagated from the user query down to HiveShim.
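
      The following is a minimal, self-contained sketch of the idea, not the actual Spark or Hive code. It assumes a toy parser for a small subset of the "LOAD DATA" syntax, and every name in it (IsSrcLocalSketch, LoadDataCommand, guessFromPath, and this simplified loadTable) is hypothetical and chosen only for illustration. It shows the flag being taken from the LOCAL keyword in the query and passed down explicitly, next to a path-based guess of the kind described above.

```scala
import java.net.URI

object IsSrcLocalSketch {
  // Hypothetical, simplified stand-in for a parsed LOAD DATA statement.
  final case class LoadDataCommand(path: String, table: String, isLocal: Boolean)

  // Toy parser (an assumption for this sketch): case-insensitive, handles only
  //   LOAD DATA [LOCAL] INPATH '<path>' INTO TABLE <name>
  private val LoadData =
    """(?i)\s*LOAD\s+DATA\s+(LOCAL\s+)?INPATH\s+'([^']+)'\s+INTO\s+TABLE\s+(\w+)\s*""".r

  def parse(sql: String): Option[LoadDataCommand] = sql match {
    // The optional LOCAL group is null when the keyword is absent.
    case LoadData(local, path, table) =>
      Some(LoadDataCommand(path, table, isLocal = local != null))
    case _ => None
  }

  // A path-based guess, similar in spirit to what the issue calls incorrect:
  // treat the source as local when the path has no scheme or a "file" scheme.
  def guessFromPath(path: String): Boolean = {
    val scheme = new URI(path).getScheme
    scheme == null || scheme == "file"
  }

  // Hypothetical stand-in for the shim call: it receives isSrcLocal from the
  // caller instead of inferring it, and only returns a description of the call.
  def loadTable(path: String, table: String, isSrcLocal: Boolean): String =
    s"loadTable(path=$path, table=$table, isSrcLocal=$isSrcLocal)"

  // Propagate the user's intent from the query down to the load call.
  def run(sql: String): Option[String] =
    parse(sql).map(c => loadTable(c.path, c.table, c.isLocal))
}
```

      In this sketch, a query such as "LOAD DATA INPATH '/tmp/staging/data' INTO TABLE t" has no LOCAL keyword, so the parsed flag is false, while the path-based guess would report true because the path carries no scheme. That disagreement is the situation the description warns about.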

          People

            Assignee: Marcelo Masiero Vanzin (vanzin)
            Reporter: Marcelo Masiero Vanzin (vanzin)
            Votes: 0
            Watchers: 3
