Apache Ozone / HDDS-3217

Datanode startup is slow due to iterating container DB 2-3 times


Details

    Description

      During Datanode startup, we iterate over each container's entire DB twice:
      1. To set the block length (bytes used).
      2. To find the count of keys pending deletion.

      For open containers, step 1 is performed again, which makes three iterations.

      Code Snippet:
      ContainerReader.java:

      For setting Bytes Used:

            List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
                .getRangeKVs(null, Integer.MAX_VALUE,
                    MetadataKeyFilters.getNormalKeyFilter());
      
            bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
              BlockData blockData;
              try {
                blockData = BlockUtils.getBlockData(e.getValue());
                return blockData.getSize();
              } catch (IOException ex) {
                return 0L;
              }
            }).sum();
            kvContainerData.setBytesUsed(bytesUsed);
      

      For setting the pending-deletion key count:

                MetadataKeyFilters.KeyPrefixFilter filter =
                    new MetadataKeyFilters.KeyPrefixFilter()
                        .addFilter(OzoneConsts.DELETING_KEY_PREFIX);
                int numPendingDeletionBlocks =
                    containerDB.getStore().getSequentialRangeKVs(null,
                        Integer.MAX_VALUE, filter)
                        .size();
                kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
      

      For open containers:

                if (kvContainer.getContainerState()
                    == ContainerProtos.ContainerDataProto.State.OPEN) {
                  // commitSpace for Open Containers relies on usedBytes
                  initializeUsedBytes(kvContainer);
                }
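
The two scans above read the same keys, so in principle both values can be gathered in one iteration. The following is only a minimal sketch of that idea, not the actual fix: it uses an in-memory `TreeMap` as a stand-in for the container DB, with block sizes as values instead of serialized `BlockData`. The class `SinglePassContainerScan`, the `ScanResult` type, and the `scan` method are hypothetical names, the prefix value is assumed to mirror `OzoneConsts.DELETING_KEY_PREFIX`, and skipping other `#`-prefixed keys is an assumption about what the normal key filter excludes.

```java
import java.util.Map;
import java.util.TreeMap;

public class SinglePassContainerScan {

  // Assumption: stands in for OzoneConsts.DELETING_KEY_PREFIX.
  static final String DELETING_KEY_PREFIX = "#deleting#";

  /** Both startup statistics, collected in a single iteration. */
  static final class ScanResult {
    final long bytesUsed;
    final int pendingDeletionBlocks;

    ScanResult(long bytesUsed, int pendingDeletionBlocks) {
      this.bytesUsed = bytesUsed;
      this.pendingDeletionBlocks = pendingDeletionBlocks;
    }
  }

  /**
   * Walks the (stand-in) container DB once. Keys with the deleting prefix
   * are counted; other '#'-prefixed keys are skipped (assumption); every
   * remaining key is treated as a live block whose size is summed.
   */
  static ScanResult scan(Map<String, Long> containerDb) {
    long bytesUsed = 0L;
    int pendingDeletionBlocks = 0;
    for (Map.Entry<String, Long> entry : containerDb.entrySet()) {
      String key = entry.getKey();
      if (key.startsWith(DELETING_KEY_PREFIX)) {
        pendingDeletionBlocks++;
      } else if (!key.startsWith("#")) {
        bytesUsed += entry.getValue();
      }
    }
    return new ScanResult(bytesUsed, pendingDeletionBlocks);
  }

  public static void main(String[] args) {
    Map<String, Long> db = new TreeMap<>();
    db.put("1001", 4096L);
    db.put("1002", 8192L);
    db.put(DELETING_KEY_PREFIX + "1003", 1024L);

    ScanResult result = scan(db);
    System.out.println(result.bytesUsed + " bytes used, "
        + result.pendingDeletionBlocks + " blocks pending deletion");
  }
}
```

Under the same assumptions, an open container could reuse the `bytesUsed` value from this pass rather than scanning a third time.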
      

      jstack output of the Datanode (DN) during startup:

      "Thread-8" #34 prio=5 os_prio=0 tid=0x00007f5df5070000 nid=0x8ee runnable [0x00007f4d840f3000]
         java.lang.Thread.State: RUNNABLE
              at org.rocksdb.RocksIterator.next0(Native Method)
              at org.rocksdb.AbstractRocksIterator.next(AbstractRocksIterator.java:70)
              at org.apache.hadoop.hdds.utils.RocksDBStore.getRangeKVs(RocksDBStore.java:195)
              at org.apache.hadoop.hdds.utils.RocksDBStore.getRangeKVs(RocksDBStore.java:155)
              at org.apache.hadoop.ozone.container.keyvalue.helpers.KeyValueContainerUtil.parseKVContainerData(KeyValueContainerUtil.java:158)
              at org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyAndFixupContainerData(ContainerReader.java:191)
              at org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyContainerFile(ContainerReader.java:168)
              at org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.readVolume(ContainerReader.java:146)
              at org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.run(ContainerReader.java:101)
              at java.lang.Thread.run(Thread.java:748)
      

      Attachments

        1. Datanode restart problem.pdf
          56 kB
          Bharat Viswanadham

            People

              bharat Bharat Viswanadham

                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 10m