Apache Ozone › HDDS-6517 Snapshot support for Ozone › HDDS-8072

[snapshot] OM error due to 'Error during Snapshot sst filtering'


Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Resolved
    • Component: Ozone Manager

    Description

      After creating 13525 snapshots on a cluster across various volumes and buckets, the OM fails with 'Error during Snapshot sst filtering'.

      ozone-om.log stack trace:

      2023-03-02 20:24:13,850 ERROR org.apache.hadoop.ozone.om.SstFilteringService: Error during Snapshot sst filtering
      java.io.IOException: Failed init RocksDB, db path : /var/lib/hadoop-ozone/om/data/db.snapshots/om.db-efd0f99b-3689-4ed6-a671-d2cb4d5fa982, exception :org.rocksdb.RocksDBException While open a file for random read: /var/lib/hadoop-ozone/om/data/db.snapshots/om.db-efd0f99b-3689-4ed6-a671-d2cb4d5fa982/000152.sst: Too many open files
              at org.apache.hadoop.hdds.utils.db.RDBStore.<init>(RDBStore.java:180)
              at org.apache.hadoop.hdds.utils.db.DBStoreBuilder.build(DBStoreBuilder.java:219)
              at org.apache.hadoop.ozone.om.OmMetadataManagerImpl.loadDB(OmMetadataManagerImpl.java:481)
              at org.apache.hadoop.ozone.om.SstFilteringService$SstFilteringTask.call(SstFilteringService.java:147)
              at org.apache.hadoop.hdds.utils.BackgroundService$PeriodicalTask.lambda$run$0(BackgroundService.java:121)
              at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
              at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
              at java.util.concurrent.FutureTask.run(FutureTask.java:266)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
              at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
              at java.lang.Thread.run(Thread.java:748)
      Caused by: java.io.IOException: class org.apache.hadoop.hdds.utils.db.RocksDatabase: Failed to open /var/lib/hadoop-ozone/om/data/db.snapshots/om.db-efd0f99b-3689-4ed6-a671-d2cb4d5fa982; status : IOError; message : While open a file for random read: /var/lib/hadoop-ozone/om/data/db.snapshots/om.db-efd0f99b-3689-4ed6-a671-d2cb4d5fa982/000152.sst: Too many open files
              at org.apache.hadoop.hdds.utils.HddsServerUtil.toIOException(HddsServerUtil.java:576)
              at org.apache.hadoop.hdds.utils.db.RocksDatabase.toIOException(RocksDatabase.java:85)
              at org.apache.hadoop.hdds.utils.db.RocksDatabase.open(RocksDatabase.java:162)
              at org.apache.hadoop.hdds.utils.db.RDBStore.<init>(RDBStore.java:116)
              ... 12 more
      Caused by: org.rocksdb.RocksDBException: While open a file for random read: /var/lib/hadoop-ozone/om/data/db.snapshots/om.db-efd0f99b-3689-4ed6-a671-d2cb4d5fa982/000152.sst: Too many open files
              at org.rocksdb.RocksDB.openROnly(Native Method)
              at org.rocksdb.RocksDB.openReadOnly(RocksDB.java:488)
              at org.rocksdb.RocksDB.openReadOnly(RocksDB.java:443)
              at org.apache.hadoop.hdds.utils.db.managed.ManagedRocksDB.openReadOnly(ManagedRocksDB.java:45)
              at org.apache.hadoop.hdds.utils.db.RocksDatabase.open(RocksDatabase.java:145)
              ... 13 more 
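
      The "Too many open files" error above is the OS refusing to hand out another file descriptor when SstFilteringService opens yet another snapshot's RocksDB instance. As an illustration of the failure mode (this is a hedged sketch, not the actual Ozone code: SnapshotScanSketch and openSnapshotDb are invented stand-ins for the real OmMetadataManagerImpl.loadDB path), the pattern below closes each per-snapshot handle via try-with-resources before moving to the next snapshot, so descriptor usage stays flat no matter how many snapshots exist:

      ```java
      import java.io.IOException;
      import java.nio.file.Path;
      import java.util.ArrayList;
      import java.util.List;

      // Illustrative sketch only: each snapshot check opens an on-disk store;
      // if the handle is not released before the next iteration, descriptors
      // accumulate until the process hits the OS limit ("Too many open files").
      public class SnapshotScanSketch {

          // Stand-in for opening a snapshot's RocksDB checkpoint read-only.
          // The returned AutoCloseable releases the simulated descriptor.
          static AutoCloseable openSnapshotDb(Path dbPath, List<Path> openHandles)
                  throws IOException {
              openHandles.add(dbPath);                  // simulate acquiring an fd
              return () -> openHandles.remove(dbPath);  // released on close()
          }

          // Visits every snapshot, closing each handle before the next iteration;
          // returns how many handles are still open afterwards (0 if none leaked).
          static int scanAll(List<Path> snapshots) throws Exception {
              List<Path> openHandles = new ArrayList<>();
              for (Path snapshot : snapshots) {
                  try (AutoCloseable db = openSnapshotDb(snapshot, openHandles)) {
                      // ... filter this snapshot's SST files here ...
                  }
              }
              return openHandles.size();
          }

          public static void main(String[] args) throws Exception {
              List<Path> snapshots = new ArrayList<>();
              for (int i = 0; i < 13525; i++) {
                  snapshots.add(Path.of("db.snapshots", "om.db-" + i));
              }
              System.out.println(scanAll(snapshots)); // prints 0: nothing leaked
          }
      }
      ```

      Raising the OM process's open-file ulimit can postpone the symptom, but bounding how many snapshot DB handles are live at once is what keeps 13525 snapshots from exhausting descriptors.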


      Attachments

      Issue Links

      Activity

      People

        dteng Dave Teng
        jyosin Jyotirmoy Sinha
        Votes: 0
        Watchers: 2

      Dates

        Created:
        Updated:
        Resolved: