IMPALA-4415: DCHECK failure in SimpleScheduler::CreateScanInstances()

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend
    • Labels: None

      Description

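      Due to a floating-point rounding error, CreateScanInstances() can fail to assign the last scan range. avg_bytes_per_instance is a single-precision float, so avg_bytes_per_instance * num_instances can round to a value below total_size. The threshold for the last instance then falls short of total_size, the inner loop exits with scan ranges still unassigned, and the final DCHECK_EQ fails: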

          // BUG: avg_bytes_per_instance * num_instances may be < total_size
          float avg_bytes_per_instance = static_cast<float>(total_size) / num_instances;
          int64_t total_assigned_bytes = 0;
          int params_idx = 0;  // into params_list
          for (int i = 0; i < num_instances; ++i) {
            fragment_params->instance_exec_params.emplace_back(
                schedule->GetNextInstanceId(), host, i, *fragment_params);
            FInstanceExecParams& instance_params = fragment_params->instance_exec_params.back();
      
            // Threshold beyond which we want to assign to the next instance.
            // Bug: when i+1 = num_instances, threshold_total_bytes may be < total_size
            int64_t threshold_total_bytes = avg_bytes_per_instance * (i + 1);
      
            // Assign each scan range in params_list. When the per-instance threshold is
            // reached, move onto the next instance.
            // Bug: loop may exit before all bytes have been assigned for last instance
            while (params_idx < params_list.size() && total_assigned_bytes < threshold_total_bytes) {
              const TScanRangeParams& scan_range_params = params_list[params_idx];
              instance_params.per_node_scan_ranges[leftmost_scan_id].push_back(
                  scan_range_params);
              if (scan_range_params.scan_range.__isset.hdfs_file_split) {
                total_assigned_bytes += scan_range_params.scan_range.hdfs_file_split.length;
              } else {
                // For Kudu and HBase, every split has length 1
                ++total_assigned_bytes;
              }
              ++params_idx;
            }
            if (params_idx == params_list.size()) break; // nothing left to assign
          }
          // Bug: loop exits before params_list.size() ranges have been assigned.
          DCHECK_EQ(params_idx, params_list.size());  // everything got assigned
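
The rounding problem can be reproduced outside Impala with a stripped-down version of the loop. The sketch below is only an illustration: CountAssignedRanges and its clamp_last_threshold flag are hypothetical names, the Thrift and scheduler types are replaced by a plain vector of range lengths, and the clamping workaround is one possible remedy, not necessarily what the committed fix does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the assignment loop in CreateScanInstances().
// Returns the number of ranges that were assigned to some instance.
// If clamp_last_threshold is true, the last instance's threshold is forced
// to total_size (an illustrative workaround, not the actual Impala fix).
std::size_t CountAssignedRanges(const std::vector<int64_t>& range_lengths,
                                int num_instances, bool clamp_last_threshold) {
  assert(num_instances > 0);
  int64_t total_size = 0;
  for (int64_t len : range_lengths) total_size += len;

  // Same single-precision computation as in the bug report.
  float avg_bytes_per_instance = static_cast<float>(total_size) / num_instances;
  int64_t total_assigned_bytes = 0;
  std::size_t params_idx = 0;
  for (int i = 0; i < num_instances; ++i) {
    int64_t threshold_total_bytes =
        static_cast<int64_t>(avg_bytes_per_instance * (i + 1));
    if (clamp_last_threshold && i + 1 == num_instances) {
      threshold_total_bytes = total_size;  // last instance takes the remainder
    }
    while (params_idx < range_lengths.size() &&
           total_assigned_bytes < threshold_total_bytes) {
      total_assigned_bytes += range_lengths[params_idx];
      ++params_idx;
    }
    if (params_idx == range_lengths.size()) break;  // nothing left to assign
  }
  return params_idx;
}
```

With range lengths {16777216, 1} and a single instance, total_size is 2^24 + 1, which a float cannot represent and which rounds to 2^24. The unclamped loop therefore stops after the first range and returns 1, while the clamped variant returns 2. Trailing zero-length ranges would still be skipped by the clamped variant, so a production fix would need to handle that case separately.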
      

        Activity

        Henry Robinson (henryr) added a comment: Fixed by https://github.com/apache/incubator-impala/commit/dead771d1172b6095d19a64cef48ab9adf7ccb09

          People

          • Assignee: Henry Robinson (henryr)
          • Reporter: Henry Robinson (henryr)
          • Votes: 0
          • Watchers: 1
