Lucene - Core / LUCENE-8558

Adding NumericDocValuesFields is slowing down the indexing process significantly

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 7.4, 7.5
    • Fix Version/s: 7.6, 8.0
    • Component/s: core/index
    • Lucene Fields: New

    Description

      The indexing time for my ~2M documents went up significantly when I started adding fields of type NumericDocValuesField.

      While debugging, I found the bottleneck to be in the PerFieldMergeState#FilterFieldInfos constructor. The contains check in the code snippet below was the culprit: it is called on the filterFields argument instead of the filteredNames HashSet that was just built from it.

      this.filteredNames = new HashSet<>(filterFields);
      this.filtered = new ArrayList<>(filterFields.size());
      for (FieldInfo fi : src) {
        if (filterFields.contains(fi.name)) {
      

      A simple change, shown below, seems to have fixed my issue:

      this.filteredNames = new HashSet<>(filterFields);
      this.filtered = new ArrayList<>(filterFields.size());
      for (FieldInfo fi : src) {
        if (this.filteredNames.contains(fi.name)) {
      
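      The difference between the two snippets can be sketched in isolation as follows. This is a minimal illustration, not the Lucene code: it assumes the caller passes filterFields as a list-like collection whose contains is a linear scan, it uses plain strings in place of FieldInfo objects, and the class and method names (FilterContainsSketch, filterWithCollection, filterWithHashSet) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FilterContainsSketch {

    // Mirrors the original loop: contains() runs on whatever collection the
    // caller passed in. For a List this is a linear scan per field.
    static List<String> filterWithCollection(List<String> src, Collection<String> filterFields) {
        List<String> filtered = new ArrayList<>(filterFields.size());
        for (String name : src) {
            if (filterFields.contains(name)) {
                filtered.add(name);
            }
        }
        return filtered;
    }

    // Mirrors the proposed change: copy the names into a HashSet once and
    // run contains() against that set (expected constant time per lookup).
    static List<String> filterWithHashSet(List<String> src, Collection<String> filterFields) {
        Set<String> filteredNames = new HashSet<>(filterFields);
        List<String> filtered = new ArrayList<>(filterFields.size());
        for (String name : src) {
            if (filteredNames.contains(name)) {
                filtered.add(name);
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        // Hypothetical sizes, chosen only to make the difference visible.
        int n = 10_000;
        List<String> src = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            src.add("field" + i);
        }
        List<String> filterFields = new ArrayList<>();
        for (int i = 0; i < n; i += 2) {
            filterFields.add("field" + i);
        }

        long t0 = System.nanoTime();
        List<String> slow = filterWithCollection(src, filterFields);
        long t1 = System.nanoTime();
        List<String> fast = filterWithHashSet(src, filterFields);
        long t2 = System.nanoTime();

        System.out.println("same result: " + slow.equals(fast));
        System.out.println("collection contains: " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("hash set contains:   " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

      Both methods return the same filtered names in the same order; only the cost of each lookup differs. With a list argument the first version does work proportional to the number of source fields times the number of filter fields, which would be consistent with the slowdown reported here if the constructor is invoked many times during merges.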

       

      Attachments

        1. LUCENE-8558.patch
          0.8 kB
          Kranthi

          People

            Assignee: simonw Simon Willnauer
            Reporter: Chalasani Kranthi
            Votes: 0
            Watchers: 5
