SOLR-1973

Empty fields in update messages confuse DataImportHandler

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4, 1.4.1
    • Fix Version/s: 3.1, 4.0-ALPHA
    • Labels: None
    • Environment: CentOS 5, Java 1.6, Tomcat 6

      Description

      I seem to be running into an issue with Solr (maybe just the DataImportHandler?) not liking empty field elements in the docs, and getting the wrong values into the fields of the index. Here's the entity declaration from data-config.xml for my isolated example:

      <document>
        <entity name="contentAsSolrAdd"
                dataSource="xml"
                processor="XPathEntityProcessor"
                stream="true"
                url="http://example.com/Content.xml"
                useSolrAddSchema="true">
        </entity>
      </document>

      And here's the Content.xml being pulled in by the DIH:

      <add>
        <doc>
          <field name="empty"></field>
          <field name="full">Lorem Ipsum Dolor</field>
          <field name="other">Some content is me!</field>
        </doc>
      </add>

      And here's the relevant portion of the output from the DIH in debug mode:

      <lst name="document#1">
        <str name="query">
          http://example.com/Content.xml
        </str>
        <str name="time-taken">0:0:0.6</str>
        <str>----------- row #1-------------</str>
        <str name="full">Some content is me!</str>
        <str name="empty">Lorem Ipsum Dolor</str>
        <str>---------------------------------------------</str>
      </lst>

      Notice that the values are shifted by one field: "empty" has the content that was there for "full", and "full" has the content that was there for "other". The "other" field, which was non-empty and preceded by a non-empty field, doesn't appear at all.
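
The report doesn't say where the shift comes from. As a minimal sketch, assume (nothing in the report confirms this) that field names and field values are collected as two separate lists and then paired by position. Under that assumption, dropping the empty value reproduces the debug output above. The function names `pair_by_position` and `pair_per_element` are hypothetical and exist only for this illustration; this is Python, not the DIH code itself.

```python
import xml.etree.ElementTree as ET

# The Content.xml from the report.
CONTENT_XML = """
<add>
  <doc>
    <field name="empty"></field>
    <field name="full">Lorem Ipsum Dolor</field>
    <field name="other">Some content is me!</field>
  </doc>
</add>
""".strip()


def pair_by_position(doc):
    """Hypothetical faulty pairing: names and values are gathered
    separately, and an empty element contributes a name but no value."""
    names = [f.get("name") for f in doc.findall("field")]
    values = [f.text for f in doc.findall("field") if f.text]
    # zip() stops at the shorter list, so the last name is dropped.
    return dict(zip(names, values))


def pair_per_element(doc):
    """Pair each name with the text of its own element, so an empty
    field cannot shift the values of its neighbours."""
    return {f.get("name"): (f.text or "") for f in doc.findall("field")}


doc = ET.fromstring(CONTENT_XML).find("doc")
shifted = pair_by_position(doc)
aligned = pair_per_element(doc)
print(shifted)
print(aligned)
```

With this input, `shifted` contains only "empty" and "full", each holding the next field's content, as in the debug output. `aligned` keeps all three fields with their own values.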

      1. SOLR-1973.patch (4 kB, Koji Sekiguchi)
      2. SOLR-1973.patch (2 kB, Koji Sekiguchi)
      3. SOLR-1973-test.patch (2 kB, Koji Sekiguchi)

        Activity

        Koji Sekiguchi added a comment -

        I can reproduce this.

        Koji Sekiguchi added a comment -

        Attached test code to reproduce the problem.

        Koji Sekiguchi added a comment -

        Attached the patch that fixes the problem. All tests in dataimport package pass.

        Koji Sekiguchi added a comment -

        Added more tests in the attached patch.

        I'll commit in a few days if no one objects.

        Koji Sekiguchi added a comment -

        trunk: Committed revision 1032433.
        branch_3x: Committed revision 1032438.

        Grant Ingersoll added a comment -

        Bulk close for 3.1.0 release


          People

          • Assignee: Koji Sekiguchi
          • Reporter: Sixten Otto
          • Votes: 1
          • Watchers: 1
