IMPALA-4415

DCHECK failure in SimpleScheduler::CreateScanInstances()

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Backend
    • Labels:
      None

      Description

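      CreateScanInstances() computes avg_bytes_per_instance as a float. A float carries only 24 bits of precision, so avg_bytes_per_instance * num_instances can be less than total_size. When that happens, the threshold for the last instance falls short of total_size, the assignment loop exits with scan ranges still unassigned, and the DCHECK at the end of the function fails: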

          // BUG: avg_bytes_per_instance * num_instances may be < total_size
          float avg_bytes_per_instance = static_cast<float>(total_size) / num_instances;
          int64_t total_assigned_bytes = 0;
          int params_idx = 0;  // into params_list
          for (int i = 0; i < num_instances; ++i) {
            fragment_params->instance_exec_params.emplace_back(
                schedule->GetNextInstanceId(), host, i, *fragment_params);
            FInstanceExecParams& instance_params = fragment_params->instance_exec_params.back();
      
            // Threshold beyond which we want to assign to the next instance.
            // Bug: when i + 1 == num_instances, threshold_total_bytes may be < total_size
            int64_t threshold_total_bytes = avg_bytes_per_instance * (i + 1);
      
            // Assign each scan range in params_list. When the per-instance threshold is
            // reached, move onto the next instance.
            // Bug: loop may exit before all bytes have been assigned for last instance
            while (params_idx < params_list.size() && total_assigned_bytes < threshold_total_bytes) {
              const TScanRangeParams& scan_range_params = params_list[params_idx];
              instance_params.per_node_scan_ranges[leftmost_scan_id].push_back(
                  scan_range_params);
              if (scan_range_params.scan_range.__isset.hdfs_file_split) {
                total_assigned_bytes += scan_range_params.scan_range.hdfs_file_split.length;
              } else {
                // For Kudu and HBase, every split has length 1.
                ++total_assigned_bytes;
              }
              ++params_idx;
            }
            if (params_idx == params_list.size()) break; // nothing left to assign
          }
          // Bug: the for loop may exit before all params_list.size() ranges have been assigned.
          DCHECK_EQ(params_idx, params_list.size());  // everything got assigned
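
The rounding problem can be reproduced outside Impala with a stripped-down version of the loop. The sketch below is illustrative only: AssignRanges, range_lengths, and last_takes_rest are hypothetical names, each scan range is reduced to its byte length, and float is assumed to be an IEEE 754 single-precision value. The last_takes_rest flag shows one possible workaround (the last instance ignores the threshold and takes every remaining range). It is not necessarily the fix that was committed for this issue.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in for the loop in CreateScanInstances(). Each entry of
// range_lengths plays the role of one scan range's byte length. Returns the
// number of ranges that were assigned to some instance.
// All names in this sketch are hypothetical.
std::size_t AssignRanges(const std::vector<std::int64_t>& range_lengths,
                         int num_instances, bool last_takes_rest) {
  std::int64_t total_size = 0;
  for (std::int64_t len : range_lengths) total_size += len;

  // Same computation as in the report: the float conversion may round
  // total_size down, so the final threshold can end up below total_size.
  float avg_bytes_per_instance = static_cast<float>(total_size) / num_instances;
  std::int64_t total_assigned_bytes = 0;
  std::size_t params_idx = 0;
  for (int i = 0; i < num_instances; ++i) {
    std::int64_t threshold_total_bytes =
        static_cast<std::int64_t>(avg_bytes_per_instance * (i + 1));
    // Possible workaround: the last instance drains whatever is left,
    // regardless of the threshold.
    bool drain = last_takes_rest && (i + 1 == num_instances);
    while (params_idx < range_lengths.size() &&
           (drain || total_assigned_bytes < threshold_total_bytes)) {
      total_assigned_bytes += range_lengths[params_idx];
      ++params_idx;
    }
    if (params_idx == range_lengths.size()) break;  // nothing left to assign
  }
  return params_idx;
}
```

With range lengths {16777216, 16777216, 1} and two instances, total_size is 2^25 + 1, which rounds to 2^25 when converted to float. Calling AssignRanges with last_takes_rest = false therefore assigns only two of the three ranges, which is the condition the DCHECK catches. With last_takes_rest = true all three are assigned.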
      

            People

            • Assignee: Henry Robinson (henryr)
            • Reporter: Henry Robinson (henryr)