HCATALOG-554 (sub-task of HCATALOG-571: Hadoop namenode federation support in HCatalog)

Loading data using HCatLoader() from a table on non default namenode fails


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.4, 0.5
    • Fix Version/s: 0.5, 0.4.1, 0.6
    • None
    • Environment: Hadoop 0.23.3, hcatalog 0.4, hive 0.9

    Description

      1. Create hive table:

      CREATE TABLE small_table(
        id                int,
        score             int
      )
      stored as SequenceFile
      location "viewfs:///database/small_table";
      

      2. Data:

      1,32
      2,235
      3,32532
      4,23
      5,2
      

      3. Load data into the HCatalog table:

      DATA = LOAD '/tmp/data.csv' USING PigStorage(',') AS (id:int, score:int);
      STORE DATA INTO 'default.small_table' USING org.apache.hcatalog.pig.HCatStorer();

      4. Confirm that the data has been stored in the table:

      hadoopqa@gsbl90385:/tmp$ hive -e "select * from default.small_table"
      Logging initialized using configuration in file:/grid/0/homey/libexec/hive/conf/hive-log4j.properties
      Hive history file=/homes/hadoopqa/hivelogs/hive_job_log_hadoopqa_201211212228_1532947518.txt
      OK
      1       32
      2       235
      3       32532
      4       23
      5       2 
      

      5. Now try to read the same table using HCatLoader():

      a = load 'default.small_table' using org.apache.hcatalog.pig.HCatLoader();
      dump a;
      

      Exception seen is:

      2012-11-21 22:30:50,087 [Thread-6] ERROR org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as:hadoopqa@DEV.YGRID.YAHOO.COM (auth:KERBEROS) cause:org.apache.pig.backend.executionengine.ExecException: ERROR 2118: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
      2012-11-21 22:30:50,088 [Thread-6] INFO  org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob - PigLatin:select_in_pig.pig got an error while submitting
      org.apache.pig.backend.executionengine.ExecException: ERROR 2118: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
              at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288)
              at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:449)
              at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:466)
              at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:358)
              at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1216)
              at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1213)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:396)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1212)
              at org.apache.hadoop.mapreduce.Job.submit(Job.java:1213)
              at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
              at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:233)
              at java.lang.Thread.run(Thread.java:619)
              at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:260)
      Caused by: java.io.IOException: viewfs://gsbl90897.blue.ygrid.yahoo.com:8020/
              at org.apache.hadoop.fs.viewfs.InodeTree.<init>(InodeTree.java:338)
              at org.apache.hadoop.fs.viewfs.ViewFileSystem$1.<init>(ViewFileSystem.java:164)
              at org.apache.hadoop.fs.viewfs.ViewFileSystem.initialize(ViewFileSystem.java:164)
              at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2190)
              at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:84)
              at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2224)
              at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2206)
              at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:305)
              at org.apache.hadoop.fs.Path.getFileSystem(Path.java:194)
              at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:98)
              at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:81)
              at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:187)
              at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:45)
              at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:251)
              at org.apache.hcatalog.mapreduce.HCatBaseInputFormat.getSplits(HCatBaseInputFormat.java:149)
              at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
              ... 13 more
      

      Here viewfs:///database/ resolves to the non-default namenode through a link defined in the client-side mount table.
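
      For illustration, a client-side mount table link of this kind could look roughly like the core-site.xml fragment below. This is a sketch under assumptions, not the configuration from this report: the host nn2.example.com and the target path are hypothetical placeholders. The empty authority in viewfs:/// selects the mount table named "default".

      ```xml
      <configuration>
        <!-- Link /database in the "default" ViewFs mount table
             to a namenode other than the default one.
             nn2.example.com is a hypothetical host name. -->
        <property>
          <name>fs.viewfs.mounttable.default.link./database</name>
          <value>hdfs://nn2.example.com:8020/database</value>
        </property>
      </configuration>
      ```

      With a link like this, any job that reads viewfs:///database/small_table needs credentials for the namenode behind the link, not only for the default namenode.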

      Observation:

      This issue seems similar to HCATALOG-553; HCatLoader() probably doesn't have the right token for the non-default namenode.

      Attachments

        1. HCATALOG-554-branch_0.patch (2 kB, Arup Malakar)
        2. HCATALOG-554-branch-0.4_1.patch (0.7 kB, Arup Malakar)
        3. HCATALOG-554-trunk_0.patch (1 kB, Arup Malakar)
        4. HCATALOG-554-trunk_1.patch (1 kB, Arup Malakar)
        5. HCATALOG-554-trunk_2.patch (1 kB, Arup Malakar)

          People

            Assignee: Arup Malakar (amalakar)
            Reporter: Arup Malakar (amalakar)