Solr / SOLR-1000

DIH FileListEntityProcessor fileName filters directory names and stops recursion

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.3
    • Fix Version/s: 1.4
    • Labels:
      None

      Description

      I have been trying to find out why DIH in FileListEntityProcessor mode did not appear to be recursing into subdirectories. Going through FileListEntityProcessor.java, I eventually realized that the fileName filter set in my data-config.xml was also being applied to directory names.

      Now, I feel that the fileName filter should be applied to the files fed into the parser; it should not be applied to the names of the directories we are recursing through. I bodged the code so that the "fileName" and "excludes" attributes of "entity" apply only to file names and not to directory names. It now recurses through my directory tree, indexing only the appropriate files! I think the new behavior is more standard.

      I will submit a patch once I have constructed one!
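
      The behavior proposed above can be sketched in Java roughly as follows. This is a minimal illustration and not the code from SOLR-1000.patch: the class name `FileNameFilterSketch` and the method `listFiles` are hypothetical, and testing the patterns with `Matcher.find()` is an assumption. The point is only that the "fileName" and "excludes" patterns are checked against regular files, while directories are always descended into when recursion is enabled.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative sketch only; hypothetical names, not the DIH implementation.
class FileNameFilterSketch {

    // Collects files under baseDir. The fileName and excludes patterns are
    // tested against regular files only, so a directory is descended into
    // (when recursive is true) regardless of what it is called.
    static List<File> listFiles(File baseDir, Pattern fileName,
                                Pattern excludes, boolean recursive) {
        List<File> matches = new ArrayList<>();
        collect(baseDir, fileName, excludes, recursive, matches);
        return matches;
    }

    private static void collect(File dir, Pattern fileName, Pattern excludes,
                                boolean recursive, List<File> matches) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            if (child.isDirectory()) {
                // Directory names are never filtered; only recursion is checked.
                if (recursive) {
                    collect(child, fileName, excludes, recursive, matches);
                }
                continue;
            }
            String name = child.getName();
            // Assumption: patterns are applied with find(), not matches().
            if (fileName != null && !fileName.matcher(name).find()) {
                continue; // file name does not match the include pattern
            }
            if (excludes != null && excludes.matcher(name).find()) {
                continue; // file name matches the exclude pattern
            }
            matches.add(child);
        }
    }

    public static void main(String[] args) {
        // Lists every *.xml file below the given directory (default: ".").
        File baseDir = new File(args.length > 0 ? args[0] : ".");
        Pattern xmlOnly = Pattern.compile(".*\\.xml$");
        for (File f : listFiles(baseDir, xmlOnly, null, true)) {
            System.out.println(f.getPath());
        }
    }
}
```

      With a pattern such as `.*\.xml$`, a subdirectory named `docs` does not match the pattern but is still traversed, so XML files below it are found. Applying the same pattern to directory names would stop the walk at `docs`, which is the behavior this issue reports.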

        Attachments

        1. SOLR-1000.patch
          6 kB
          Fergus McMenemie
        2. SOLR-1000.patch
          5 kB
          Fergus McMenemie

            People

            • Assignee:
              Shalin Shekhar Mangar (shalinmangar)
            • Reporter:
              Fergus McMenemie (fergus)
            • Votes:
              0
            • Watchers:
              0

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                24h
                Remaining:
                24h
                Logged:
                Not Specified