HBASE-28399

region size can be wrong from RegionSizeCalculator

Details

    Description

      The RegionSizeCalculator computes each region's size in bytes as follows:

      private static final long MEGABYTE = 1024L * 1024L;
      
      long regionSizeBytes =
        ((long) regionLoad.getStoreFileSize().get(Size.Unit.MEGABYTE)) * MEGABYTE; 

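      However, this computation loses precision: Size.get returns a double, and the cast to long discards the fractional megabytes before the value is scaled back to bytes. For example, the result of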

      ((long) new Size(1, Size.Unit.BYTE).get(Size.Unit.MEGABYTE)) * MEGABYTE 

      is 0. This produces a TableInputSplit with a length of 0, even though the split actually contains a small amount of data.

       

      If spark.hadoopRDD.ignoreEmptySplits is enabled, Spark ignores this TableInputSplit, so the data it contains is never read.
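
      The truncation described above can be reproduced without HBase on the classpath. The following is a minimal sketch that mirrors only the arithmetic: the class RegionSizeTruncationSketch and the helpers toMegabytes, truncatedBytes and roundedBytes are hypothetical names introduced for illustration, and roundedBytes shows just one possible way to avoid the loss, not necessarily the fix applied for this issue.

```java
// Standalone sketch: mirrors the arithmetic only and does not use any HBase classes.
final class RegionSizeTruncationSketch {
    private static final long MEGABYTE = 1024L * 1024L;

    // Hypothetical stand-in for Size.get(Size.Unit.MEGABYTE):
    // a byte count expressed as fractional megabytes.
    static double toMegabytes(long bytes) {
        return (double) bytes / MEGABYTE;
    }

    // Same shape as the expression in the description:
    // cast to long first (dropping the fraction), then scale back to bytes.
    static long truncatedBytes(long actualBytes) {
        return ((long) toMegabytes(actualBytes)) * MEGABYTE;
    }

    // One possible alternative (an assumption, not the committed fix):
    // scale back to bytes first and round only once at the end.
    static long roundedBytes(long actualBytes) {
        return Math.round(toMegabytes(actualBytes) * MEGABYTE);
    }

    public static void main(String[] args) {
        long[] samples = {1L, MEGABYTE - 1, MEGABYTE, MEGABYTE + 512L * 1024L};
        for (long bytes : samples) {
            System.out.println(bytes + " bytes -> truncated " + truncatedBytes(bytes)
                + ", rounded " + roundedBytes(bytes));
        }
    }
}
```

      In this sketch any region smaller than 1 MB comes out as 0 bytes under the truncating expression, which is the case that leads to a zero-length TableInputSplit.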

            People

              frostruan ruanhui