Uploaded image for project: 'Solr'
  1. Solr
  2. SOLR-8266

Remove Java Serialization from the Streaming API

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Implemented
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      This is being done mainly for security reasons but it's also architecturally the right thing to do.

      Going forward only Streaming Expressions will be used to serialize Streaming API Objects.

      1. SOLR-8266.patch
        8 kB
        Jason Gerlowski
      2. SOLR-8266-comment-out-testParallelEOF.patch
        11 kB
        Jason Gerlowski

        Issue Links

          Activity

          Hide
          gerlowskija Jason Gerlowski added a comment -

          I can write a patch to try and get the ball rolling on this, unless you've started on it Joel?

          Show
          gerlowskija Jason Gerlowski added a comment - I can write a patch to try and get the ball rolling on this, unless you've started on it Joel?
          Hide
          joel.bernstein Joel Bernstein added a comment - - edited

          I haven't started. Feel free to start working on a patch. Here are the places that will need to be changed:

          1) StreamHandler: Remove code that handles Java de-serialization. Always use Streaming Expressions.
          2) ParallelStream: Remove code that handles Java serialization. Always use Streaming Expressions.
          3) StreamingTest: Change all parallels tests so that the StreamFactory is properly configured and added to the StreamContext. Then set the StreamContext on to the TupleStream before calling the getTuples() method. The StreamHandler does this logic so you can copy the behavior from there.

          I'll be happy to review your work.

          Show
          joel.bernstein Joel Bernstein added a comment - - edited I haven't started. Feel free to start working on a patch. Here are the places that will need to be changed: 1) StreamHandler: Remove code that handles Java de-serialization. Always use Streaming Expressions. 2) ParallelStream: Remove code that handles Java serialization. Always use Streaming Expressions. 3) StreamingTest: Change all parallels tests so that the StreamFactory is properly configured and added to the StreamContext. Then set the StreamContext on to the TupleStream before calling the getTuples() method. The StreamHandler does this logic so you can copy the behavior from there. I'll be happy to review your work.
          Hide
          gerlowskija Jason Gerlowski added a comment -

          Quick update- I haven't had time to get around to this yet. I'm still happy to take a look, but I won't be able to pick it up in the next few days, and I don't want to deter anyone else from working on it in the meantime.

          I'll comment again if I'm able to start on it.

          Show
          gerlowskija Jason Gerlowski added a comment - Quick update- I haven't had time to get around to this yet. I'm still happy to take a look, but I won't be able to pick it up in the next few days, and I don't want to deter anyone else from working on it in the meantime. I'll comment again if I'm able to start on it.
          Hide
          gerlowskija Jason Gerlowski added a comment -

          After having some time to read up on the Streaming API/Expression stuff, I've now got enough context to pick this up again. Should have a patch up later this morning.

          Show
          gerlowskija Jason Gerlowski added a comment - After having some time to read up on the Streaming API/Expression stuff, I've now got enough context to pick this up again. Should have a patch up later this morning.
          Hide
          gerlowskija Jason Gerlowski added a comment -

          Making the first two changes Joel laid out (StreamHandler and ParallelStream) seemed easy.

          I'm having a little more trouble with the third piece of this patch- tweaking StreamingTest's use of parallel streams to use a properly configured StreamFactory/StreamContext.

          The current patch gets through several of the parallel-stream tests, before hitting testParallelEOF and failing with the follow stack trace:

          java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: --> https://127.0.0.1:35540/collection1/:Invalid stream expression count(merge(search(collection1,q="id:(4 1 8 7 9)",fl="id,a_s,a_i",sort="a_i asc",partitionKeys=a_i,zkHost="127.0.0.1:36053/solr"),search(collection1,q="id:(0 2 3 6)",fl="id,a_s,a_i",sort="a_i asc",partitionKeys=a_i,zkHost="127.0.0.1:36053/solr"),on="a_i asc")) - function 'count' is unknown (not mapped to a valid TupleStream)
          	at __randomizedtesting.SeedInfo.seed([19FC8C86B647325A:E5A7AB021C6ECB2F]:0)
          	at org.apache.solr.client.solrj.io.stream.CloudSolrStream.openStreams(CloudSolrStream.java:353)
          	at org.apache.solr.client.solrj.io.stream.CloudSolrStream.open(CloudSolrStream.java:234)
          	at org.apache.solr.client.solrj.io.stream.StreamingTest.getTuples(StreamingTest.java:1821)
          	at org.apache.solr.client.solrj.io.stream.StreamingTest.testParallelEOF(StreamingTest.java:1697)
          	at org.apache.solr.client.solrj.io.stream.StreamingTest.streamTests(StreamingTest.java:1795)
          
          

          I've still got some investigation to do here; probably just overlooking something simple. But it's late here, and I'd like to upload my progress so I can continue from a different computer in the morning. (And if anyone chimes in with thoughts/advice, all the better).

          Show
          gerlowskija Jason Gerlowski added a comment - Making the first two changes Joel laid out (StreamHandler and ParallelStream) seemed easy. I'm having a little more trouble with the third piece of this patch- tweaking StreamingTest's use of parallel streams to use a properly configured StreamFactory/StreamContext. The current patch gets through several of the parallel-stream tests, before hitting testParallelEOF and failing with the follow stack trace: java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: --> https: //127.0.0.1:35540/collection1/:Invalid stream expression count(merge(search(collection1,q= "id:(4 1 8 7 9)" ,fl= "id,a_s,a_i" ,sort= "a_i asc" ,partitionKeys=a_i,zkHost= "127.0.0.1:36053/solr" ),search(collection1,q= "id:(0 2 3 6)" ,fl= "id,a_s,a_i" ,sort= "a_i asc" ,partitionKeys=a_i,zkHost= "127.0.0.1:36053/solr" ),on= "a_i asc" )) - function 'count' is unknown (not mapped to a valid TupleStream) at __randomizedtesting.SeedInfo.seed([19FC8C86B647325A:E5A7AB021C6ECB2F]:0) at org.apache.solr.client.solrj.io.stream.CloudSolrStream.openStreams(CloudSolrStream.java:353) at org.apache.solr.client.solrj.io.stream.CloudSolrStream.open(CloudSolrStream.java:234) at org.apache.solr.client.solrj.io.stream.StreamingTest.getTuples(StreamingTest.java:1821) at org.apache.solr.client.solrj.io.stream.StreamingTest.testParallelEOF(StreamingTest.java:1697) at org.apache.solr.client.solrj.io.stream.StreamingTest.streamTests(StreamingTest.java:1795) I've still got some investigation to do here; probably just overlooking something simple. But it's late here, and I'd like to upload my progress so I can continue from a different computer in the morning. (And if anyone chimes in with thoughts/advice, all the better).
          Hide
          joel.bernstein Joel Bernstein added a comment - - edited

          Jason,

          The key part of the error message is: " function 'count' is unknown (not mapped to a valid TupleStream)"

          The StreamFactory has to have each function mapped.

          Show
          joel.bernstein Joel Bernstein added a comment - - edited Jason, The key part of the error message is: " function 'count' is unknown (not mapped to a valid TupleStream)" The StreamFactory has to have each function mapped.
          Hide
          gerlowskija Jason Gerlowski added a comment -

          Yeah, I'd keyed in to that part of the error message too. Looking at StreamingTest, it looks like the StreamFactory does have a "count" mapping specified already: .withFunctionName("count", RecordCountStream.class).

          So either that line doesn't do what I thought it did, or something else is up here.

          But in any case this is a good excuse to dig a little deeper into the streaming code. Will be taking another pass this afternoon.

          Show
          gerlowskija Jason Gerlowski added a comment - Yeah, I'd keyed in to that part of the error message too. Looking at StreamingTest , it looks like the StreamFactory does have a "count" mapping specified already: .withFunctionName("count", RecordCountStream.class) . So either that line doesn't do what I thought it did, or something else is up here. But in any case this is a good excuse to dig a little deeper into the streaming code. Will be taking another pass this afternoon.
          Hide
          dpgove Dennis Gove added a comment - - edited

          While that mapping is setup in the test you've got to remember that in a parallel situation the expression is passed off to a different process which doesn't use the factory created in the test. Instead it uses a factory created in StreamHandler. That one has some defaults set and also reads solrconfig.xml to get any new mappings. Check that factory to make sure count is mapped there. I've tried to keep the default up to date with all default stream classes but may have missed some.

          This setup is there so that people can add their own implementation of both streams and expressions (and really anything else that is in the Expressible hierarchy).

          Show
          dpgove Dennis Gove added a comment - - edited While that mapping is setup in the test you've got to remember that in a parallel situation the expression is passed off to a different process which doesn't use the factory created in the test. Instead it uses a factory created in StreamHandler. That one has some defaults set and also reads solrconfig.xml to get any new mappings. Check that factory to make sure count is mapped there. I've tried to keep the default up to date with all default stream classes but may have missed some. This setup is there so that people can add their own implementation of both streams and expressions (and really anything else that is in the Expressible hierarchy).
          Hide
          joel.bernstein Joel Bernstein added a comment -

          Just checked and the StreamHandler is using the CountMetric. It may be that this test is incompatible with the current StreamHandler setup.

          Probably best to just comment it out for now. If you run into other test errors dealing with count comment them out as well.

          I'll take a look a those tests when I review the patch.

          Show
          joel.bernstein Joel Bernstein added a comment - Just checked and the StreamHandler is using the CountMetric. It may be that this test is incompatible with the current StreamHandler setup. Probably best to just comment it out for now. If you run into other test errors dealing with count comment them out as well. I'll take a look a those tests when I review the patch.
          Hide
          gerlowskija Jason Gerlowski added a comment - - edited

          Thanks for the early look/help guys. I've updated the patch to align with your suggestions. The tests all pass now, with the caveat that I had to comment out the testParallelEOF call in testStreams(). Everything else seems to work fine.

          While I'm confident in the code I wrote, I do have some questions about how this code is working:

          1.) I just wanted to double-check my understanding of the StreamFactory configuration discrepancies causing the stack trace I mentioned above.

          StreamFactory is used to convert between TupleStreams and StreamExpressionParameters and are used in both SolrJ ( ParallelStream for example) and SOLR (StreamHandler). These factories need to be in sync. If not, SolrJ can end up sending serialized values to Solr that it doesn't know what to do with (as in the conflicting "count" types from a few comments ago).

          Just trying to make sure I understand the big picture, so I can make edits with more confidence next time.

          2.) I noticed that StreamFactory.withFunctionName() takes any Class as it's second argument. Is there any value in narrowing the type of that parameter, so that it's harder to misconfigure StreamFactory? (i.e. withFunctionName(String name, Class<? extends Expressible> clazz)).

          3.) StreamingTest has a single test that triggers 20ish independent sub-tests (e.g. testParallelEOF, testStatsStream). Updating these tests to use the StreamFactory was a bit more painful that it probably could've been. With the sub-tests all bundled together, the process for fixing them looked like: (a) run them all, (b) wait for a failure, (c) scroll back through the output to see which clause failed, (d) fix and repeat. Had the sub-tests been separate JUnit tests, identifying the failing pieces and fixing them would've been easier. Is there a historical reason theses tests have all been grouped together (performance/overhead for example)? If not, do you think there would be any pushback on splitting these test clauses apart (as a part of a separate JIRA).

          Show
          gerlowskija Jason Gerlowski added a comment - - edited Thanks for the early look/help guys. I've updated the patch to align with your suggestions. The tests all pass now, with the caveat that I had to comment out the testParallelEOF call in testStreams(). Everything else seems to work fine. While I'm confident in the code I wrote, I do have some questions about how this code is working: 1.) I just wanted to double-check my understanding of the StreamFactory configuration discrepancies causing the stack trace I mentioned above. StreamFactory is used to convert between TupleStreams and StreamExpressionParameters and are used in both SolrJ ( ParallelStream for example) and SOLR ( StreamHandler ). These factories need to be in sync. If not, SolrJ can end up sending serialized values to Solr that it doesn't know what to do with (as in the conflicting "count" types from a few comments ago). Just trying to make sure I understand the big picture, so I can make edits with more confidence next time. 2.) I noticed that StreamFactory.withFunctionName() takes any Class as it's second argument. Is there any value in narrowing the type of that parameter, so that it's harder to misconfigure StreamFactory ? (i.e. withFunctionName(String name, Class<? extends Expressible> clazz) ). 3.) StreamingTest has a single test that triggers 20ish independent sub-tests (e.g. testParallelEOF, testStatsStream). Updating these tests to use the StreamFactory was a bit more painful that it probably could've been. With the sub-tests all bundled together, the process for fixing them looked like: (a) run them all, (b) wait for a failure, (c) scroll back through the output to see which clause failed, (d) fix and repeat. Had the sub-tests been separate JUnit tests, identifying the failing pieces and fixing them would've been easier. Is there a historical reason theses tests have all been grouped together (performance/overhead for example)? If not, do you think there would be any pushback on splitting these test clauses apart (as a part of a separate JIRA).
          Hide
          joel.bernstein Joel Bernstein added a comment -

          #1: You've got the big picture right.

          #2: Probably a good idea.

          #3: The rather unwieldy StreamingTest will eventually be split into smaller test cases, one for each stream. Same with the StreamingExpressions tests. We should probably add a ticket for this.

          Show
          joel.bernstein Joel Bernstein added a comment - #1: You've got the big picture right. #2: Probably a good idea. #3: The rather unwieldy StreamingTest will eventually be split into smaller test cases, one for each stream. Same with the StreamingExpressions tests. We should probably add a ticket for this.
          Hide
          joel.bernstein Joel Bernstein added a comment -

          Jason Gerlowski, thanks for patch!

          I'll will review it in the next day or two. You didn't mention but I'm assuming this is a trunk patch?

          Also typically you would update the same patch file name (SOLR-8266.patch) and jira will display them all. I would then pick your latest patch to apply. Don't worry about it in this case, I'll just take your latest patch file.

          Show
          joel.bernstein Joel Bernstein added a comment - Jason Gerlowski , thanks for patch! I'll will review it in the next day or two. You didn't mention but I'm assuming this is a trunk patch? Also typically you would update the same patch file name ( SOLR-8266 .patch) and jira will display them all. I would then pick your latest patch to apply. Don't worry about it in this case, I'll just take your latest patch file.
          Hide
          gerlowskija Jason Gerlowski added a comment -

          No problem; hopefully you're still as thankful after giving it a review :-p

          I did write it against trunk, though I can put it on top of another branch if you'd prefer.

          I was aware of the patch naming convention. I broke it in an effort to call out that this patch contains a commented out test, and (probably? maybe?) shouldn't be committed as-is. If that break from convention isn't worthwhile though, I'll stop doing that sort of thing. Still learning the ropes here a bit.

          As for the questions/change-suggestions in my earlier comment, I'll throw together JIRAs for those and toss up a starter patch on them.

          Show
          gerlowskija Jason Gerlowski added a comment - No problem; hopefully you're still as thankful after giving it a review :-p I did write it against trunk, though I can put it on top of another branch if you'd prefer. I was aware of the patch naming convention. I broke it in an effort to call out that this patch contains a commented out test, and (probably? maybe?) shouldn't be committed as-is. If that break from convention isn't worthwhile though, I'll stop doing that sort of thing. Still learning the ropes here a bit. As for the questions/change-suggestions in my earlier comment, I'll throw together JIRAs for those and toss up a starter patch on them.
          Hide
          mdrob Mike Drob added a comment -

          I was aware of the patch naming convention. I broke it in an effort to call out that this patch contains a commented out test, and (probably? maybe?) shouldn't be committed as-is. If that break from convention isn't worthwhile though, I'll stop doing that sort of thing. Still learning the ropes here a bit.

          I've seen folks put a //nocommit tag to intentionally fail the validation check in cases like these.

          Show
          mdrob Mike Drob added a comment - I was aware of the patch naming convention. I broke it in an effort to call out that this patch contains a commented out test, and (probably? maybe?) shouldn't be committed as-is. If that break from convention isn't worthwhile though, I'll stop doing that sort of thing. Still learning the ropes here a bit. I've seen folks put a //nocommit tag to intentionally fail the validation check in cases like these.
          Hide
          joel.bernstein Joel Bernstein added a comment -

          The patch looks great. I changed the testParallelEOF method to not test for count. Now it just tests that an EOF Tuple is collected from each worker. No current streams actually use the EOF Tuples from the workers so this is really only a place holder until we have real functionality.

          Running all tests and precommit and then I plan to commit.

          Show
          joel.bernstein Joel Bernstein added a comment - The patch looks great. I changed the testParallelEOF method to not test for count. Now it just tests that an EOF Tuple is collected from each worker. No current streams actually use the EOF Tuples from the workers so this is really only a place holder until we have real functionality. Running all tests and precommit and then I plan to commit.
          Hide
          joel.bernstein Joel Bernstein added a comment -

          As this ticket also removes the security risk associated with Object serialization, I plan to also uncomment the /stream handler in the sample solrconfig.xml so the SQL interface etc... will be operational out of the box.

          The /stream handler now only excepts a Streaming Expression query and returns json.

          Show
          joel.bernstein Joel Bernstein added a comment - As this ticket also removes the security risk associated with Object serialization, I plan to also uncomment the /stream handler in the sample solrconfig.xml so the SQL interface etc... will be operational out of the box. The /stream handler now only excepts a Streaming Expression query and returns json.
          Hide
          jira-bot ASF subversion and git services added a comment -

          Commit 1718947 from Joel Bernstein in branch 'dev/trunk'
          [ https://svn.apache.org/r1718947 ]

          SOLR-8266: Remove Java Serialization from the Streaming API

          Show
          jira-bot ASF subversion and git services added a comment - Commit 1718947 from Joel Bernstein in branch 'dev/trunk' [ https://svn.apache.org/r1718947 ] SOLR-8266 : Remove Java Serialization from the Streaming API
          Hide
          joel.bernstein Joel Bernstein added a comment -
          Show
          joel.bernstein Joel Bernstein added a comment - Thanks Jason Gerlowski !
          Hide
          gerlowskija Jason Gerlowski added a comment -

          I've added a patch addressing #2 in SOLR-8385.

          I've added a patch addressing #3 in SOLR-8432.

          Show
          gerlowskija Jason Gerlowski added a comment - I've added a patch addressing #2 in SOLR-8385 . I've added a patch addressing #3 in SOLR-8432 .

            People

            • Assignee:
              Unassigned
              Reporter:
              joel.bernstein Joel Bernstein
            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development