Solr
  1. Solr
  2. SOLR-6257

More than two "!"-s in a doc ID throws an ArrayIndexOutOfBoundsException when using the composite id router

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.10, 6.0
    • Component/s: None
    • Labels:
      None

      Description

      Since CompositeIdRouter is the default router, it has to be able to deal with any ID string without throwing an exception.

      The following test (added to TestHashPartitioner) currently fails:

        public void testNonConformingCompositeId() throws Exception {
          DocRouter router = DocRouter.getDocRouter(CompositeIdRouter.NAME);
          DocCollection coll = createCollection(4, router);
          Slice targetSlice = coll.getRouter().getTargetSlice("A!B!C!D", null, null, coll);
          assertNotNull(targetSlice);
        }
      

      with the following output:

         [junit4] Suite: org.apache.solr.cloud.TestHashPartitioner
         [junit4]   2> log4j:WARN No such property [conversionPattern] in org.apache.solr.util.SolrLogLayout.
         [junit4]   2> Creating dataDir: /Users/sarowe/svn/lucene/dev/trunk/solr/build/solr-core/test/J0/./temp/solr.cloud.TestHashPartitioner-19514036FB5C5E56-001/init-core-data-001
         [junit4]   2> 1233 T11 oas.SolrTestCaseJ4.buildSSLConfig Randomized ssl (false) and clientAuth (false)
         [junit4]   2> 1296 T11 oas.SolrTestCaseJ4.setUp ###Starting testNonConformingCompositeId
         [junit4]    > Throwable #1: java.lang.ArrayIndexOutOfBoundsException: 2
         [junit4]    > 	at __randomizedtesting.SeedInfo.seed([19514036FB5C5E56:3A131EC016F531A4]:0)
         [junit4]    > 	at org.apache.solr.common.cloud.CompositeIdRouter$KeyParser.getHash(CompositeIdRouter.java:296)
         [junit4]    > 	at org.apache.solr.common.cloud.CompositeIdRouter.sliceHash(CompositeIdRouter.java:58)
         [junit4]    > 	at org.apache.solr.common.cloud.HashBasedRouter.getTargetSlice(HashBasedRouter.java:33)
         [junit4]    > 	at org.apache.solr.cloud.TestHashPartitioner.testNonConformingCompositeId(TestHashPartitioner.java:205)
         [junit4]    > 	at java.lang.Thread.run(Thread.java:745)
      
      1. SOLR-6257.patch
        7 kB
        Steve Rowe
      2. SOLR-6257.patch
        6 kB
        Steve Rowe

        Issue Links

          Activity

          Hide
          Steve Rowe added a comment - - edited

          Patch with two tests and a fix.

          The first test routes ID strings known not to conform to the composite ID format, and the second routes ID strings containing a random sequence of "!"-s, "/"-s, random numbers, and random Unicode strings. Both fail without the fix.

          The included fix changes CompositeIdRouter.KeyParser to treat the first two "!"-s in an ID as separators, but then treat any further "!"-s as regular text.

          With the fix, the random test passed when I set it to test 5,000,000 IDs, but it's dialed down to 10,000 in the patch.

          Running all Solr tests now.

          Anshum Gupta, I'd appreciate a review if you have time.

          Show
          Steve Rowe added a comment - - edited Patch with two tests and a fix. The first test routes ID strings known not to conform to the composite ID format, and the second routes ID strings containing a random sequence of "!"-s, "/"-s, random numbers, and random Unicode strings. Both fail without the fix. The included fix changes CompositeIdRouter.KeyParser to treat the first two "!"-s in an ID as separators, but then treat any further "!"-s as regular text. With the fix, the random test passed when I set it to test 5,000,000 IDs, but it's dialed down to 10,000 in the patch. Running all Solr tests now. Anshum Gupta , I'd appreciate a review if you have time.
          Hide
          Steve Rowe added a comment - - edited

          The random test included in the patch would have caught SOLR-5502 too.

          Show
          Steve Rowe added a comment - - edited The random test included in the patch would have caught SOLR-5502 too.
          Hide
          Steve Rowe added a comment -

          Thinking about back-compat: this will change how docs are routed, but the behavior is only different for documents with IDs that would currently trigger an AIOOBE and thus fail to index, so I think it's fine.

          There is one corner case I can think of that does not currently throw an exception, but that hashes differently: two trailing "!"-s - String.split("!") (used currently to break up IDs into pieces, but not by my fixed version) only returns one piece for "A!!", but my changes would return two pieces (the second being empty). KeyParser adds another "piece" if it finds a trailing "!" in the key, so the current behavior will operate with pieces=2, my changes with pieces=3.

          I confirmed the problem: the hash produced by KeyParser.getHash() in the fixed version for "A!!" is different from the currentlly produced hash. (By contrast, the hash for "A!" is the same.)

          Should be fairly easy to change this corner case back to the previous behavior. I'll post a patch shortly.

          Show
          Steve Rowe added a comment - Thinking about back-compat: this will change how docs are routed, but the behavior is only different for documents with IDs that would currently trigger an AIOOBE and thus fail to index, so I think it's fine. There is one corner case I can think of that does not currently throw an exception, but that hashes differently: two trailing "!"-s - String.split("!") (used currently to break up IDs into pieces, but not by my fixed version) only returns one piece for "A!!", but my changes would return two pieces (the second being empty). KeyParser adds another "piece" if it finds a trailing "!" in the key, so the current behavior will operate with pieces=2, my changes with pieces=3. I confirmed the problem: the hash produced by KeyParser.getHash() in the fixed version for "A!!" is different from the currentlly produced hash. (By contrast, the hash for "A!" is the same.) Should be fairly easy to change this corner case back to the previous behavior. I'll post a patch shortly.
          Hide
          Steve Rowe added a comment -

          Modified patch fixing the back-compat issue.

          Added a CHANGES.txt entry.

          I think it's ready to go.

          Show
          Steve Rowe added a comment - Modified patch fixing the back-compat issue. Added a CHANGES.txt entry. I think it's ready to go.
          Hide
          Anshum Gupta added a comment -

          This looks good to me. The only suggestion is to move the test to TriLevelCompositeIdRoutingTest perhaps?
          But yes, keep it here if you think this works better.

          Show
          Anshum Gupta added a comment - This looks good to me. The only suggestion is to move the test to TriLevelCompositeIdRoutingTest perhaps? But yes, keep it here if you think this works better.
          Hide
          Steve Rowe added a comment -

          Thanks Anshum.

          The only suggestion is to move the test to TriLevelCompositeIdRoutingTest perhaps? But yes, keep it here if you think this works better.

          The two new tests are unit tests and don't require indexing/querying, so I put them with the other (non-cloud) tests in TestHashPartitioner - there were already a couple of other composite id router tests in there.

          I'll commit tomorrow if there are no objections.

          Show
          Steve Rowe added a comment - Thanks Anshum. The only suggestion is to move the test to TriLevelCompositeIdRoutingTest perhaps? But yes, keep it here if you think this works better. The two new tests are unit tests and don't require indexing/querying, so I put them with the other (non-cloud) tests in TestHashPartitioner - there were already a couple of other composite id router tests in there. I'll commit tomorrow if there are no objections.
          Hide
          Anshum Gupta added a comment -

          +1. LGTM!

          Show
          Anshum Gupta added a comment - +1. LGTM!
          Hide
          Steve Rowe added a comment - - edited

          Committed:

          Show
          Steve Rowe added a comment - - edited Committed: trunk: r1611934 branch_4x: r1611940

            People

            • Assignee:
              Steve Rowe
              Reporter:
              Steve Rowe
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development