SPARK-24400

Issue with Spark while accessing managed table with partitions across multiple namespaces - HDFS Federation


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Incomplete
    • Affects Version/s: 2.3.0
    • Fix Version/s: None
    • Component/s: Spark Submit

    Description

      Spark fails when accessing a managed table whose partitions span multiple namespaces in an HDFS federated cluster.

      Test steps (see the PySpark sketch after this list):
      1) Create an HDFS federated cluster with two namespaces.
      2) Create a managed table whose location is in the default namespace (CREATE TABLE test_managed_tbl (id int, name string, dept string) PARTITIONED BY (year int)).
      3) Insert a row into the table and verify that the operation succeeds.
      4) Alter the table and set a new location that is in the second namespace (ALTER TABLE test_managed_tbl SET LOCATION 'hdfs://ns2/apps/hive/warehouse/test_managed_tbl').
      5) Try to insert a new row into the table (INSERT INTO test_managed_tbl PARTITION (year=2017) VALUES (9,'Harris','CSE')).
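
      A minimal PySpark sketch of the steps above (this is an assumption about what the attached federation_managed_table.py does; the values inserted in step 3 are placeholders, everything else is taken from the steps):

      from pyspark.sql import SparkSession

      # Assumes an HDFS federated cluster whose default namespace is ns1 and whose
      # second namespace is ns2, with a Hive-backed catalog
      # (spark.sql.catalogImplementation=hive).
      spark = SparkSession.builder \
          .appName("federation_managed_table") \
          .enableHiveSupport() \
          .getOrCreate()

      # Step 2: managed table created under the default namespace (ns1)
      spark.sql("CREATE TABLE IF NOT EXISTS test_managed_tbl (id int, name string, dept string) "
                "PARTITIONED BY (year int)")

      # Step 3: this insert succeeds while the table location is still on ns1
      # (placeholder values)
      spark.sql("INSERT INTO test_managed_tbl PARTITION (year=2016) VALUES (1,'Alice','ECE')")

      # Step 4: move the table location to the second namespace
      spark.sql("ALTER TABLE test_managed_tbl SET LOCATION "
                "'hdfs://ns2/apps/hive/warehouse/test_managed_tbl'")

      # Step 5: this insert fails with 'Wrong FS: ..., expected: hdfs://ns1'
      spark.sql("INSERT INTO test_managed_tbl PARTITION (year=2017) VALUES (9,'Harris','CSE')")

      spark.stop()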

      The insert in step 5 fails with the following error:

      18/05/23 02:50:59 INFO FileUtils: Creating directory if it doesn't exist: hdfs://ns2/apps/hive/warehouse/test_managed_tbl/year=2017
      Traceback (most recent call last):
        File "/tmp/federation_managed.py", line 17, in <module>
          spark.sql("INSERT INTO test_managed_tbl PARTITION (year=2017) VALUES (9,'Harris','CSE')")
        File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/sql/session.py", line 714, in sql
        File "/usr/hdp/current/spark2-client/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
        File "/usr/hdp/current/spark2-client/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco
      pyspark.sql.utils.AnalysisException: u'java.lang.IllegalArgumentException: Wrong FS: hdfs://ns2/apps/hive/warehouse/test_managed_tbl/.hive-staging_hive_2018-05-23_02-50-56_484_3662347267719413000-1/-ext-10000/part-00000-5ee3003b-d41f-41d8-adaa-8937919f896d-c000, expected: hdfs://ns1;'
      18/05/23 02:50:59 INFO SparkContext: Invoking stop() from shutdown hook 
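
      The 'Wrong FS' message appears to come from Hadoop's FileSystem path validation, which rejects any path whose authority differs from the namespace the FileSystem instance was created for. The following hypothetical sketch triggers the same exception directly through PySpark's JVM gateway (it assumes the SparkSession `spark` from the script above, running on the same federated cluster, and is analogous to what happens when the ns2 staging directory is resolved against the default filesystem ns1):

      # Hypothetical illustration only: validate an ns2 path against a FileSystem
      # instance bound to ns1.
      jvm = spark._jvm
      hconf = spark._jsc.hadoopConfiguration()
      ns1_fs = jvm.org.apache.hadoop.fs.FileSystem.get(jvm.java.net.URI("hdfs://ns1"), hconf)
      ns2_path = jvm.org.apache.hadoop.fs.Path("hdfs://ns2/apps/hive/warehouse/test_managed_tbl")
      # Raises java.lang.IllegalArgumentException: Wrong FS: hdfs://ns2/..., expected: hdfs://ns1
      ns1_fs.makeQualified(ns2_path)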
      

      Spark-submit command:

      spark-submit --master yarn-client --conf spark.sql.catalogImplementation=hive /tmp/federation_managed_table.py
      

      Attaching federation_managed_table.py.

      Attachments

        1. federation_managed_table.py (1 kB, uploaded by Supreeth Sharma)

          People

            Assignee: Unassigned
            Reporter: Supreeth Sharma (ssharma@hortonworks.com)
