Phoenix / PHOENIX-6153

Table Map Reduce job after a Snapshot based job fails with CorruptedSnapshotException


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.15.0, 4.14.3
    • Fix Version/s: 5.1.0, 4.16.0
    • Component/s: core
    • Labels: None

    Description

      Different MR jobs that reach MapReduceParallelScanGrouper.getRegionBoundaries currently share a single Configuration object, and getRegionBoundaries reads the snapshot name from that shared configuration. A snapshot name set by one job therefore remains visible to every job that runs after it.

      Example job sequence: the first two jobs run over snapshots and the third job runs over a regular table.

      Printing the hash codes of the objects on entering: https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/iterate/MapReduceParallelScanGrouper.java#L65

      Job 1 (over a snapshot of ABC_TABLE_1; succeeds):

      context.getConnection(): 521093916
      ConnectionQueryServices: 1772519705
      Configuration conf: 813285994
          conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): ABC_TABLE_1

       

      Job 2 (over a snapshot of ABC_TABLE_2; succeeds):

      context.getConnection(): 1928017473
      ConnectionQueryServices: 961279422
      Configuration conf: 813285994
          conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): ABC_TABLE_2

       

      Job 3 (over the regular table ABC_TABLE_3; fails with CorruptedSnapshotException even though it has nothing to do with a snapshot):

      context.getConnection(): 28889670
      ConnectionQueryServices: 424389847
      Configuration conf: 813285994
          conf.get(PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY): ABC_TABLE_2
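
      The three hash-code dumps above show the same Configuration instance (813285994) in every job, which is why Job 3 still reads ABC_TABLE_2. A minimal sketch of that leak follows. It is a toy model, not Phoenix code: a plain java.util.Map stands in for the Hadoop Configuration, and the class name, the configureJob helper, and the key string "demo.snapshot.name" are hypothetical names introduced only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedConfLeakDemo {
    // Hypothetical key; stands in for PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY.
    static final String SNAPSHOT_NAME_KEY = "demo.snapshot.name";

    /**
     * Models job setup against a configuration shared by all jobs.
     * snapshotName is null for a job that runs over a regular table.
     * Returns the snapshot name a later reader of the configuration would see.
     */
    static String configureJob(Map<String, String> conf, String snapshotName) {
        if (snapshotName != null) {
            conf.put(SNAPSHOT_NAME_KEY, snapshotName);
        }
        // Nothing clears the key for a table job, so a stale value survives.
        return conf.get(SNAPSHOT_NAME_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>(); // one instance for all jobs

        System.out.println("Job 1 sees: " + configureJob(shared, "ABC_TABLE_1"));
        System.out.println("Job 2 sees: " + configureJob(shared, "ABC_TABLE_2"));
        // Job 3 is a regular table job, yet it still sees Job 2's snapshot name.
        System.out.println("Job 3 sees: " + configureJob(shared, null));
    }
}
```

      In this model Job 3 reads the value left behind by Job 2, which matches the ABC_TABLE_2 value logged for Job 3 above.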

       

      Exception thrown when Job 3 is submitted:
      [2020:08:18 20:56:17.409] [MigrationRetryPoller-Executor-1] [ERROR] [c.s.hgrate.mapreduce.MapReduceImpl] - Error submitting M/R job for Job 3
      java.lang.RuntimeException: org.apache.hadoop.hbase.snapshot.CorruptedSnapshotException: Couldn't read snapshot info from:hdfs://.../hbase/.hbase-snapshot/ABC_TABLE_2_1597687413477/.snapshotinfo
      at org.apache.phoenix.iterate.MapReduceParallelScanGrouper.getRegionBoundaries(MapReduceParallelScanGrouper.java:81) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.iterate.BaseResultIterators.getRegionBoundaries(BaseResultIterators.java:541) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:893) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.iterate.BaseResultIterators.getParallelScans(BaseResultIterators.java:641) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.iterate.BaseResultIterators.<init>(BaseResultIterators.java:511) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.iterate.ParallelIterators.<init>(ParallelIterators.java:62) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.execute.ScanPlan.newIterator(ScanPlan.java:278) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:367) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:218) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.execute.BaseQueryPlan.iterator(BaseQueryPlan.java:213) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansWithScanGrouper(PhoenixInputFormat.java:252) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.mapreduce.PhoenixInputFormat.setupParallelScansFromQueryPlan(PhoenixInputFormat.java:235) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.mapreduce.PhoenixInputFormat.generateSplits(PhoenixInputFormat.java:94) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.phoenix.mapreduce.PhoenixInputFormat.getSplits(PhoenixInputFormat.java:89) ~[phoenix-core-4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT.jar:4.14.3-hbase-1.6-sfdc-1.0.9-SNAPSHOT]
      at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:301) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
      at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:318) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
      at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
      at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
      at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287) ~[hadoop-mapreduce-client-core-2.7.7-sfdc-1.0.18.jar:2.7.7-sfdc-1.0.18]
      at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_172]
      at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_172]
       

       

      Change required:

      1. Because the snapshot name is set in a shared configuration, we also need a mechanism that removes it when a job is not snapshot-related:

      https://github.com/apache/phoenix/blob/f9e304754bad886344a856dd2565e3f24e345ed2/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/PhoenixInputFormat.java#L210
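
      The proposed change can be sketched as follows. This is an illustration under assumptions, not the contents of the attached patches: a java.util.Map again stands in for the shared Hadoop Configuration, and the class name, the configureJob helper, and the key string are hypothetical. The only difference from a setup that leaks is the else branch, which clears the key for jobs that do not use a snapshot.

```java
import java.util.HashMap;
import java.util.Map;

public class SnapshotKeyResetDemo {
    // Hypothetical key; stands in for PhoenixConfigurationUtil.SNAPSHOT_NAME_KEY.
    static final String SNAPSHOT_NAME_KEY = "demo.snapshot.name";

    /**
     * Models job setup against a shared configuration.
     * snapshotName is null for a job that runs over a regular table.
     * Returns the snapshot name a later reader of the configuration would see.
     */
    static String configureJob(Map<String, String> conf, String snapshotName) {
        if (snapshotName != null) {
            conf.put(SNAPSHOT_NAME_KEY, snapshotName);
        } else {
            // Remove any value left behind by an earlier snapshot-based job.
            conf.remove(SNAPSHOT_NAME_KEY);
        }
        return conf.get(SNAPSHOT_NAME_KEY);
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>(); // one instance for all jobs

        System.out.println("Job 1 sees: " + configureJob(shared, "ABC_TABLE_1"));
        System.out.println("Job 2 sees: " + configureJob(shared, "ABC_TABLE_2"));
        // The table job now finds no snapshot name and takes the regular path.
        System.out.println("Job 3 sees: " + configureJob(shared, null));
    }
}
```

      With a real Hadoop Configuration, the counterpart of Map.remove would be Configuration.unset(String). Giving each job its own copy through the Configuration copy constructor would be another way to avoid the sharing; whether either matches the committed fix is not stated in this issue text.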


      Attachments

        1. PHOENIX-6153.4.x.v1.patch
          13 kB
          Saksham Gangwar
        2. PHOENIX-6153.master.v1.patch
          10 kB
          Saksham Gangwar
        3. PHOENIX-6153.master.v2.patch
          7 kB
          Saksham Gangwar
        4. PHOENIX-6153.master.v3.patch
          2 kB
          Saksham Gangwar
        5. PHOENIX-6153.master.v4.patch
          13 kB
          Saksham Gangwar
        6. PHOENIX-6153.master.v5.patch
          15 kB
          Saksham Gangwar
        7. Screen Shot 2020-09-30 at 4.00.58 AM.png
          850 kB
          Saksham Gangwar
        8. Screen Shot 2020-09-30 at 4.01.10 AM.png
          414 kB
          Saksham Gangwar
        9. Screen Shot 2020-09-30 at 4.01.10 AM.png
          414 kB
          Saksham Gangwar
        10. Screen Shot 2020-09-30 at 4.01.19 AM.png
          428 kB
          Saksham Gangwar
        11. Screen Shot 2020-09-30 at 4.01.19 AM.png
          428 kB
          Saksham Gangwar
        12. Screen Shot 2020-09-30 at 4.01.19 AM.png
          428 kB
          Saksham Gangwar
        13. Screen Shot 2020-09-30 at 4.01.34 AM.png
          504 kB
          Saksham Gangwar
        14. Screen Shot 2020-09-30 at 4.01.52 AM.png
          468 kB
          Saksham Gangwar
        15. Screen Shot 2020-09-30 at 4.01.52 AM.png
          468 kB
          Saksham Gangwar
        16. Screen Shot 2020-09-30 at 9.30.06 AM.png
          486 kB
          Saksham Gangwar


            People

              Assignee: saksham.gangwar Saksham Gangwar
              Reporter: saksham.gangwar Saksham Gangwar