Phoenix / PHOENIX-2970

SpoolingResultIterator uses memory too conservatively, which leads to unnecessary temp file use


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 4.7.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

Even though SpoolingResultIterator will be deprecated, HBase versions older than 0.98.17 will continue to use it, and I think the DeferredByteBufferSegmentQueue class, which is similar to SpoolingResultIterator, may have the same problem in future versions.

In SpoolingResultIterator's constructor, it tries to allocate at most "phoenix.query.spoolThresholdBytes" bytes from the MemoryManager, and uses the allocated MemoryChunk's size as the DeferredFileOutputStream's threshold, as shown in the following code:

              final MemoryChunk chunk = mm.allocate(0, thresholdBytes);
              long waitTime = System.currentTimeMillis() - startTime;
              GLOBAL_MEMORY_WAIT_TIME.update(waitTime);
              memoryMetrics.getMemoryWaitTimeMetric().change(waitTime);
              DeferredFileOutputStream spoolTo = null;
              try {
                  // Can't be bigger than int, since it's the max of the above allocation
                  int size = (int)chunk.getSize();
                  spoolTo = new DeferredFileOutputStream(size, "ResultSpooler",".bin", new File(spoolDirectory)) {
                      @Override
                      protected void thresholdReached() throws IOException {
                          try {
                              super.thresholdReached();
                          } finally {
                              chunk.close();
                          }
                      }
                  };
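For context, DeferredFileOutputStream (from Apache Commons IO) buffers writes in memory until the byte count exceeds its threshold, then switches to a temp file, so a threshold of 0 spills on the very first write. The following is a minimal, stdlib-only stand-in written to illustrate that semantics; `ThresholdBuffer` is a hypothetical simplification, not the real commons-io class:

```java
import java.io.ByteArrayOutputStream;

// Simplified stand-in for commons-io's DeferredFileOutputStream:
// buffers in memory until more than `threshold` bytes have been written,
// then "spills" (here we just flip a flag instead of opening a temp file).
class ThresholdBuffer {
    private final int threshold;
    private final ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private boolean spilled = false;
    private long written = 0;

    ThresholdBuffer(int threshold) { this.threshold = threshold; }

    void write(byte[] data) {
        if (!spilled && written + data.length > threshold) {
            spilled = true; // the real class calls thresholdReached() and opens the temp file here
        }
        memory.write(data, 0, data.length);
        written += data.length;
    }

    boolean isInMemory() { return !spilled; }

    public static void main(String[] args) {
        ThresholdBuffer healthy = new ThresholdBuffer(1024);
        healthy.write(new byte[]{42});             // 1 byte, well under the threshold
        ThresholdBuffer starved = new ThresholdBuffer(0);
        starved.write(new byte[]{42});             // 1 > 0: spills immediately
        System.out.println(healthy.isInMemory());  // true
        System.out.println(starved.isInMemory());  // false
    }
}
```

This is why a 0-byte MemoryChunk is so harmful: the stream never gets a chance to hold even a single row in memory.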
             

SpoolingResultIterator assumes that the wrapped ResultIterator will always fetch "phoenix.query.spoolThresholdBytes" bytes, but most of the time it won't. For example, if we execute "select count(1) from table" on a big table with many regions, the ScanPlan will parallelize across many SpoolingResultIterators to fetch the result, and each SpoolingResultIterator tries to allocate up to "phoenix.query.spoolThresholdBytes" bytes from the MemoryManager. If memory is tight, many SpoolingResultIterators will be allocated 0 bytes from the MemoryManager, so the corresponding DeferredFileOutputStream's threshold will be 0 and it will unnecessarily spill the results to a temp file, even if the result is just a single count value. This behavior slows down the query.
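To make the contention concrete, here is a hypothetical sketch of a shared pool serving `allocate(0, threshold)` requests; the `Pool` class and all numbers are illustrative, not Phoenix's actual MemoryManager. Once the budget is drained, every remaining iterator receives a 0-byte chunk and therefore a 0-byte spool threshold:

```java
// Hypothetical shared pool: allocate(minBytes, maxBytes) grants whatever is
// still available up to maxBytes, mirroring MemoryManager.allocate(0, threshold).
class Pool {
    private long available;
    Pool(long total) { this.available = total; }
    synchronized long allocate(long min, long max) {
        long granted = Math.min(max, available); // may be far less than max, down to min
        available -= granted;
        return granted;
    }
}

public class SpoolDemo {
    public static void main(String[] args) {
        long spoolThresholdBytes = 20L * 1024 * 1024; // 20 MB requested per iterator
        Pool pool = new Pool(100L * 1024 * 1024);     // 100 MB global memory budget
        int regions = 50;                             // 50 parallel region scans
        int spilled = 0;
        for (int i = 0; i < regions; i++) {
            long chunk = pool.allocate(0, spoolThresholdBytes);
            if (chunk == 0) spilled++; // threshold 0: temp file on the first write
        }
        // Only the first 5 iterators get a full 20 MB chunk; the other 45 spool
        // to disk even if each carries only a single count value.
        System.out.println(spilled); // 45
    }
}
```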

Can we remove the MemoryManager from SpoolingResultIterator, just as MappedByteBufferSortedQueue does?


People

    Assignee: Unassigned
    Reporter: chenglei (comnetwork)
    Votes: 0
    Watchers: 4
