IMPALA-11194

Unable to LOAD DATA from HDFS configured with ranger

Details

    • Improvement
    • Status: Closed
    • Major
    • Resolution: Duplicate
    • Impala 4.0.0
    • None
    • None
    • None

    Description

      When I run LOAD DATA from an HDFS path whose access is managed by Ranger, the following exception occurs:

      // sql
      LOAD DATA INPATH 'hdfs://...' OVERWRITE INTO TABLE tbl PARTITION(status='origin');
      
      // impalad exception
      org.apache.impala.common.AnalysisException: Unable to LOAD DATA from hdfs://... because Impala does not have READ permissions on this file
              at org.apache.impala.analysis.LoadDataStmt.analyzePaths(LoadDataStmt.java:194)
              at org.apache.impala.analysis.LoadDataStmt.analyze(LoadDataStmt.java:122)
              at org.apache.impala.analysis.AnalysisContext.analyze(AnalysisContext.java:491)
              at org.apache.impala.analysis.AnalysisContext.analyzeAndAuthorize(AnalysisContext.java:451)
              at org.apache.impala.service.Frontend.doCreateExecRequest(Frontend.java:1736)
              at org.apache.impala.service.Frontend.getTExecRequest(Frontend.java:1702)
              at org.apache.impala.service.Frontend.createExecRequest(Frontend.java:1672)
              at org.apache.impala.service.JniFrontend.createExecRequest(JniFrontend.java:164) 
      
      

      According to `org.apache.hadoop.fs.FileStatus#permission`, the impalad process user is not the owner of the HDFS file and does not have permission to read it.

      [hdfs@hybrid02 ~]$ hdfs dfs -ls -R /user_tag/import_staging/user_tag/
      -rw-------   1 hdfs hdfs        270 2022-03-17 20:13 /user_tag/import_staging/user_tag/user_tag_p19_data_3
      

      However, I have already authorized the impalad process user in Ranger, so that user actually has read and write permissions.

      [hdfs@hybrid02 ~]$ klist
      Ticket cache: FILE:/tmp/krb5cc_7007
      Default principal: impala/hybrid02@SENSORSDATA
      
      [hdfs@hybrid02 ~]$ hdfs dfs -ls -R /user_tag/import_staging/user_tag/ 
      -rw-------   1 hdfs hdfs        270 2022-03-17 20:13 /user_tag/import_staging/user_tag/user_tag_p19_data_3
      
      [hdfs@hybrid02 ~]$ hdfs dfs -get /tmp/user_tag_p19_data_3_test 
      [hdfs@hybrid02 ~]$ ll -f user_tag_p19_data_3_test 
      user_tag_p19_data_3_test
      

      In my opinion, the exception occurs because the file permission check in `org.apache.impala.analysis.LoadDataStmt#analyzePaths` is based mainly on `org.apache.hadoop.fs.FileStatus#permission`, which does not reflect access granted through Ranger.
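
      To illustrate the reasoning above, here is a minimal, self-contained sketch of a check that looks only at the owner, group, and mode bits of a file. It is not Impala's actual code and does not use the Hadoop API. The class name, the method `canReadByModeBits`, and the assumption that the impalad process user is `impala` with group `impala` (taken from the short name of the Kerberos principal) are all hypothetical. With the file metadata from the listing (`-rw-------`, owner `hdfs`, group `hdfs`), such a check denies read access no matter what Ranger policy exists.

```java
import java.util.Set;

public class ModeBitReadCheck {
    // Hypothetical helper: decides read access purely from owner/group/mode,
    // the way a client-side check on file-status metadata would.
    // It has no knowledge of any Ranger policy.
    static boolean canReadByModeBits(String user, Set<String> userGroups,
                                     String owner, String group, int mode) {
        if (user.equals(owner)) {
            return (mode & 0400) != 0; // owner read bit
        }
        if (userGroups.contains(group)) {
            return (mode & 0040) != 0; // group read bit
        }
        return (mode & 0004) != 0;     // other read bit
    }

    public static void main(String[] args) {
        int mode = 0600; // -rw------- as shown in the hdfs dfs -ls output

        // Assumed impalad process user "impala": neither owner nor group member.
        System.out.println(
            canReadByModeBits("impala", Set.of("impala"), "hdfs", "hdfs", mode)); // prints false

        // The file owner "hdfs" passes the same check.
        System.out.println(
            canReadByModeBits("hdfs", Set.of("hdfs"), "hdfs", "hdfs", mode)); // prints true
    }
}
```

      The first call returns false even though the same user can read the file through `hdfs dfs -get`, which matches the mismatch described in this issue.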

            People

              lipenglin Li Penglin
              lipenglin Li Penglin