Solr / SOLR-1498

RegexTransformer: sourceColName version not handling multiValued fields correctly


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 1.4
    • Labels:
      None
    • Environment:

      Windows XP, JDK 6, Tomcat 6
      Linux (RedHat), JDK, Tomcat 5

      Description

      Versions in use/compared:
      Solr 1.3
      (Nightly 5th August)
      Nightly 22nd September

      RegexTransformer is identical in the two nightlies, so the
      issue was probably introduced before the 5th August nightly.

      ISSUE:
      Using RegexTransformer with the 'sourceColName' notation does not
      populate multiValued fields (ones that actually contain multiple values)
      with a list. Only one value is added per document.

      The 'groupNames' notation populates them correctly.

      Configuration that worked in 1.3 (regression):
      <field column="participant" sourceColName="person" regex="([^\|]+)|.*" />
      <field column="role" sourceColName="person"
      regex="[^\|]|\d,\d+,\d+,(.*)" />

      Configuration that works with the 22nd September nightly:
      <field column="person" groupNames="participant,role"
      regex="([^\|])|\d,\d+,\d+,(.*)" />

      (Both fields are of type solr.StrField and multiValued.)
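
To make the expected result concrete, here is a minimal sketch using plain java.util.regex, outside Solr. It is not the DataImportHandler code. The class name, the pattern, and the sample "person" values are assumptions chosen for illustration, based on a "name|n,n,n,role" layout. It shows that a multiValued source should yield one extracted value per source value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupExtractSketch {
    // Apply the pattern to every source value and collect the requested capture group.
    static List<String> extract(Pattern pattern, int group, List<String> values) {
        List<String> out = new ArrayList<>();
        for (String value : values) {
            Matcher m = pattern.matcher(value);
            if (m.find()) {
                out.add(m.group(group));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical pattern and sample data (assumptions, not taken from the report).
        Pattern p = Pattern.compile("([^|]+)\\|\\d+,\\d+,\\d+,(.*)");
        List<String> person = Arrays.asList("Alice|1,2,3,director", "Bob|4,5,6,actor");

        System.out.println(extract(p, 1, person)); // [Alice, Bob]
        System.out.println(extract(p, 2, person)); // [director, actor]
    }
}
```

With two values in the source column, both target fields end up with two values, which is the behaviour the report expects from the 'sourceColName' notation as well.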

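      Comparing the source code of RegexTransformer in 1.3 with the 22nd
      September nightly, I found this loop (lines 106-107 of transformRow()
      in the 22nd September nightly):

      for (Object result : results)
        row.put(col, result);

      Each put() replaces the value previously stored under col, so only the
      last result remains in the row.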
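
The effect of the quoted loop can be reproduced with an ordinary java.util.Map. The sketch below is not the attached patch and not the RegexTransformer source. The class and method names and the sample values are hypothetical. It contrasts calling put() once per result with storing the whole list under the column name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MultiValuedPutSketch {
    // Mirrors the quoted loop: every put() overwrites the previous value for col.
    static Map<String, Object> putEach(String col, List<Object> results) {
        Map<String, Object> row = new HashMap<>();
        for (Object result : results) {
            row.put(col, result);
        }
        return row;
    }

    // One possible alternative (an assumption, not the actual fix): store the list itself.
    static Map<String, Object> putList(String col, List<Object> results) {
        Map<String, Object> row = new HashMap<>();
        row.put(col, new ArrayList<>(results));
        return row;
    }

    public static void main(String[] args) {
        List<Object> results = Arrays.<Object>asList("alice", "bob");

        System.out.println(putEach("participant", results)); // {participant=bob}
        System.out.println(putList("participant", results)); // {participant=[alice, bob]}
    }
}
```

The first variant leaves a single value in the row, which matches the symptom described in this report.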

        Attachments

        1. SOLR-1498.patch
          3 kB
          Noble Paul


            People

            • Assignee:
              Noble Paul
            • Reporter:
              Chantal Ackermann