Hadoop Distributed Data Store / HDDS-3217

Datanode startup is slow due to iterating container DB 2-3 times

    Details

      Description

      During Datanode startup, we iterate over the entire container DB twice for each container:
      1. To set the block length (bytes used).
      2. To find the pending-deletion key count.

      For open containers, step 1 is performed again, which makes a third full iteration.

      Code Snippet:
      ContainerReader.java:

      For setting Bytes Used:

            List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
                .getRangeKVs(null, Integer.MAX_VALUE,
                    MetadataKeyFilters.getNormalKeyFilter());
      
            bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
              BlockData blockData;
              try {
                blockData = BlockUtils.getBlockData(e.getValue());
                return blockData.getSize();
              } catch (IOException ex) {
                return 0L;
              }
            }).sum();
            kvContainerData.setBytesUsed(bytesUsed);
      

      For setting the pending-deletion key count:

                MetadataKeyFilters.KeyPrefixFilter filter =
                    new MetadataKeyFilters.KeyPrefixFilter()
                        .addFilter(OzoneConsts.DELETING_KEY_PREFIX);
                int numPendingDeletionBlocks =
                    containerDB.getStore().getSequentialRangeKVs(null,
                        Integer.MAX_VALUE, filter)
                        .size();
                kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
      

      For open containers:

                if (kvContainer.getContainerState()
                    == ContainerProtos.ContainerDataProto.State.OPEN) {
                  // commitSpace for Open Containers relies on usedBytes
                  initializeUsedBytes(kvContainer);
                }
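
      The three scans above read the same keys, so one way to picture the
      saving is a single pass that updates both counters at once. The sketch
      below is only an illustration, not the change made for this issue: a
      TreeMap stands in for the container's RocksDB store, the stored Long
      stands in for the size that BlockData.getSize() would return, the
      "#deleting#" value for DELETING_KEY_PREFIX is an assumption, and the
      class and method names (SinglePassContainerScan, Stats, scan) are
      hypothetical. It also treats every key without the deleting prefix as a
      live block, which may be looser than the real normal-key filter.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Sketch only: a TreeMap stands in for the container DB. None of these names
 * exist in Ozone; they are hypothetical and for illustration.
 */
final class SinglePassContainerScan {
  // Assumed value for OzoneConsts.DELETING_KEY_PREFIX.
  static final String DELETING_KEY_PREFIX = "#deleting#";

  /** Both startup counters, filled in by one iteration. */
  static final class Stats {
    long bytesUsed;
    int pendingDeletionBlocks;
  }

  /** Visits every entry exactly once and updates both counters. */
  static Stats scan(Map<String, Long> db) {
    Stats stats = new Stats();
    for (Map.Entry<String, Long> e : db.entrySet()) {
      if (e.getKey().startsWith(DELETING_KEY_PREFIX)) {
        // Block is pending deletion: count it, do not add its size.
        stats.pendingDeletionBlocks++;
      } else {
        // Live block: its size contributes to bytes used.
        stats.bytesUsed += e.getValue();
      }
    }
    return stats;
  }

  public static void main(String[] args) {
    Map<String, Long> db = new TreeMap<>();
    db.put("1", 100L);
    db.put("2", 200L);
    db.put(DELETING_KEY_PREFIX + "3", 50L);
    Stats stats = scan(db);
    // prints "300 1"
    System.out.println(stats.bytesUsed + " " + stats.pendingDeletionBlocks);
  }
}
```

      With this shape, an open container could reuse the bytesUsed value from
      the same pass instead of triggering another scan.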
      

      jstack output of the Datanode (DN) during startup:

      "Thread-8" #34 prio=5 os_prio=0 tid=0x00007f5df5070000 nid=0x8ee runnable [0x00007f4d840f3000]
         java.lang.Thread.State: RUNNABLE
              at org.rocksdb.RocksIterator.next0(Native Method)
              at org.rocksdb.AbstractRocksIterator.next(AbstractRocksIterator.java:70)
              at org.apache.hadoop.hdds.utils.RocksDBStore.getRangeKVs(RocksDBStore.java:195)
              at org.apache.hadoop.hdds.utils.RocksDBStore.getRangeKVs(RocksDBStore.java:155)
              at org.apache.hadoop.ozone.container.keyvalue.helpers.KeyValueContainerUtil.parseKVContainerData(KeyValueContainerUtil.java:158)
              at org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyAndFixupContainerData(ContainerReader.java:191)
              at org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyContainerFile(ContainerReader.java:168)
              at org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.readVolume(ContainerReader.java:146)
              at org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.run(ContainerReader.java:101)
              at java.lang.Thread.run(Thread.java:748)
      

        Attachments

        1. Datanode restart problem.pdf
          56 kB
          Bharat Viswanadham

              People

              • Assignee: Bharat Viswanadham
              • Reporter: Bharat Viswanadham


                  Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 10m