Lucene - Core / LUCENE-8558

Adding NumericDocValuesFields is slowing down the indexing process significantly

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 7.4, 7.5
    • Fix Version/s: 7.6, 8.0
    • Component/s: core/index
    • Labels:
    • Lucene Fields: New

      Description

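      The indexing time for my ~2M documents went up significantly after I started adding fields of type NumericDocValuesField.

      While debugging, I found the bottleneck to be in the PerFieldMergeState#FilterFieldInfos constructor. The contains check in the code snippet below was the culprit: it runs against filterFields, the collection passed in, rather than against the HashSet built from it on the first line, so its cost depends on the caller's collection type.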

      this.filteredNames = new HashSet<>(filterFields);
      this.filtered = new ArrayList<>(filterFields.size());
      for (FieldInfo fi : src) {
        if (filterFields.contains(fi.name)) { // probes the passed-in collection, not the HashSet above
      

      A simple change, shown below, seems to have fixed my issue. The check now uses the filteredNames HashSet:

      this.filteredNames = new HashSet<>(filterFields);
      this.filtered = new ArrayList<>(filterFields.size());
      for (FieldInfo fi : src) {
        if (this.filteredNames.contains(fi.name)) { // probes the HashSet built above
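
To illustrate why the lookup target can matter, here is a minimal, self-contained sketch. It is not Lucene code: the class ContainsLookupSketch, its two methods, and the input size are hypothetical, and it assumes the caller passes a List, in which case each contains call is a linear scan. Plain strings stand in for FieldInfo names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the loop in the issue; not Lucene code.
class ContainsLookupSketch {

    // Probes the caller's collection directly. If it is a List, every
    // contains() call is a linear scan, so the loop costs O(src * filter).
    static List<String> filterViaCollection(List<String> src, Collection<String> filterFields) {
        List<String> filtered = new ArrayList<>(filterFields.size());
        for (String name : src) {
            if (filterFields.contains(name)) {
                filtered.add(name);
            }
        }
        return filtered;
    }

    // Copies the names into a HashSet once, then probes the set instead.
    static List<String> filterViaHashSet(List<String> src, Collection<String> filterFields) {
        Set<String> filteredNames = new HashSet<>(filterFields);
        List<String> filtered = new ArrayList<>(filterFields.size());
        for (String name : src) {
            if (filteredNames.contains(name)) {
                filtered.add(name);
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        int n = 10_000; // arbitrary size chosen for illustration
        List<String> src = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            src.add("field" + i);
        }
        // Assumption: the filter arrives as a List, not a Set.
        List<String> filterFields = new ArrayList<>(src.subList(0, n / 2));

        long t0 = System.nanoTime();
        List<String> viaList = filterViaCollection(src, filterFields);
        long t1 = System.nanoTime();
        List<String> viaSet = filterViaHashSet(src, filterFields);
        long t2 = System.nanoTime();

        System.out.println("same result: " + viaList.equals(viaSet));
        System.out.println("list probe ms: " + (t1 - t0) / 1_000_000);
        System.out.println("set probe ms:  " + (t2 - t1) / 1_000_000);
    }
}
```

Both methods return the same names in the same order; only the cost of each lookup differs. The timings depend on the JVM and input size, so treat them as a rough indication rather than a benchmark.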
      

       

        Attachments

        1. LUCENE-8558.patch
          0.8 kB
          Kranthi

            People

            • Assignee: Simon Willnauer (simonw)
            • Reporter: Chalasani Kranthi
            • Votes: 0
            • Watchers: 5
