Solr
  1. Solr
  2. SOLR-1817

Fix Solr error reporting to work correctly with multicore

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 4.0
    • Component/s: None
    • Labels:
      None

      Description

      Here is a rough patch that attempts to fix how error reporting works with multi-core (not in terms of logs, but what you see on an http request).

      The patch is not done - more to consider and havn't worked with how this changes solrconfigs abortOnConfigurationError, but the basics are here.

      If you attempt to access the path of a core that could not load, you are shown the errors that kept the core from properly loading.

      1. SOLR-1817.patch
        13 kB
        Mark Miller
      2. SOLR-1817.patch
        10 kB
        Mark Miller
      3. SOLR-1817.patch
        10 kB
        Mark Miller

        Issue Links

          Activity

          Hide
          Mark Miller added a comment -

          current patch - still doesn't work right if the problem is with the default core

          Show
          Mark Miller added a comment - current patch - still doesn't work right if the problem is with the default core
          Hide
          Mark Miller added a comment -

          Okay, prob done playing for a bit - this patch mostly fixes the issue where there is an error loading the default core - it will show you the error and other cores will work.

          Main prob with it is that at /solr, instead seeing a list of cores that did load (like if the error was not with the default core), you see the default core error.

          Getting closer though - need to resolve that, as well as some other cruft - certain situations still use SolrConfig.severErrors - hopefully just issues with solr.xml - or anything else higher than a Core level problem - that all still needs to be checked and fleshed out though.

          Show
          Mark Miller added a comment - Okay, prob done playing for a bit - this patch mostly fixes the issue where there is an error loading the default core - it will show you the error and other cores will work. Main prob with it is that at /solr, instead seeing a list of cores that did load (like if the error was not with the default core), you see the default core error. Getting closer though - need to resolve that, as well as some other cruft - certain situations still use SolrConfig.severErrors - hopefully just issues with solr.xml - or anything else higher than a Core level problem - that all still needs to be checked and fleshed out though.
          Hide
          Mark Miller added a comment -

          More of the solution. If you access localhost:8983/solr it will now show the cores list (those that loaded) rather than the default cores errors (if it didn't load).

          That takes care of most of the core path handling from what I can see - trying to access cores that don't exist gives 404 - trying to access cores that couldn't load shows the error, trying to access anything that should work, should work. Accessing /solr shows the core list.

          The open points remain.

          Show
          Mark Miller added a comment - More of the solution. If you access localhost:8983/solr it will now show the cores list (those that loaded) rather than the default cores errors (if it didn't load). That takes care of most of the core path handling from what I can see - trying to access cores that don't exist gives 404 - trying to access cores that couldn't load shows the error, trying to access anything that should work, should work. Accessing /solr shows the core list. The open points remain.
          Hide
          Mark Miller added a comment -

          Just did a check - solr.xml errors will also be displayed - no matter what path you access.

          So the main thing to work out is the continue stuff. I'm kind of thinking it should apply to each core (like it used to), and there should be another setting in solr.xml that continues if a core can't load, or fails if any can't load.

          Show
          Mark Miller added a comment - Just did a check - solr.xml errors will also be displayed - no matter what path you access. So the main thing to work out is the continue stuff. I'm kind of thinking it should apply to each core (like it used to), and there should be another setting in solr.xml that continues if a core can't load, or fails if any can't load.
          Hide
          Hoss Man added a comment -

          Hey mark: i've only had a chance to skim your patch so far, and i'm still not sure if i have jury duty today, so i don't know if i'll have any time to really test it out this afternoon, but here are my quick impressions (mixed with my thoughts on how to do this before i saw your patch):

          1) fundementally we have two differnet kinds of initialization exceptions – the ones SolrCore can deal with and keep going, and the ones that are complete show stoppers. Regardless of what the "abortOnServerCOnfError" configuration looks like, it seems like these exceptions should be tracked separately. We should let SolrCore catch and keep track of any exceptions that it can "ignore" while still providing functionality; but if anything it can't deal with occurs it should just throw it and then let caller (ie: CoreContainer) keep track of it.

          That way SOlrCore (and the errors it's tracking) are still usable by embedded users who may not even be using coreContainer (i think there's an NPE possibility there in your current patch ... if people construct a SolrCore without a CoreDescriptor)

          2) It looks like you still have SolrDispatchFilter looking at SolrConfig.severeErrors. It seems like the logic there should be something like...

          SolrCore core = coreContainer.getCoreNyName(corepath)
          if (null == core) {
            Throwable err = coreContainer.getCoreInitError(corepath)
            if (null == err) {
              write_init_errors_for_core(corepath, err)
            }
          }
          if (core.abortOnConfigError() && 0 < core.getSevereErrors().size()) {
             write_init_errors_for_core(corepath, err)
          }
          

          3) we should think about how the no-arg behavior of the CoreAdminHandler should deal with reporting about cores that couldn't be initialized

          Show
          Hoss Man added a comment - Hey mark: i've only had a chance to skim your patch so far, and i'm still not sure if i have jury duty today, so i don't know if i'll have any time to really test it out this afternoon, but here are my quick impressions (mixed with my thoughts on how to do this before i saw your patch): 1) fundementally we have two differnet kinds of initialization exceptions – the ones SolrCore can deal with and keep going, and the ones that are complete show stoppers. Regardless of what the "abortOnServerCOnfError" configuration looks like, it seems like these exceptions should be tracked separately. We should let SolrCore catch and keep track of any exceptions that it can "ignore" while still providing functionality; but if anything it can't deal with occurs it should just throw it and then let caller (ie: CoreContainer) keep track of it. That way SOlrCore (and the errors it's tracking) are still usable by embedded users who may not even be using coreContainer (i think there's an NPE possibility there in your current patch ... if people construct a SolrCore without a CoreDescriptor) 2) It looks like you still have SolrDispatchFilter looking at SolrConfig.severeErrors. It seems like the logic there should be something like... SolrCore core = coreContainer.getCoreNyName(corepath) if ( null == core) { Throwable err = coreContainer.getCoreInitError(corepath) if ( null == err) { write_init_errors_for_core(corepath, err) } } if (core.abortOnConfigError() && 0 < core.getSevereErrors().size()) { write_init_errors_for_core(corepath, err) } 3) we should think about how the no-arg behavior of the CoreAdminHandler should deal with reporting about cores that couldn't be initialized
          Hide
          Mark Miller added a comment -

          1 and 2) I agree - this is part of the big open issue I think is left here - how to properly deal with abortOnServerConfError. So far I havn't really dabbled there. Currently, I don't think any of that works in a manner that is very friendly with Embedded Solr. Its really focused on http - its not great for tests either.

          I do think we should differentiate between the two types of errors as well - in fact, I think most of the errors where you can still continue still use SolrConfig.severeErrors because some of them don't have access to the core name. That is the issue where that possible NPE is - getting access to the core name. I'm not sure what the right solution for these cases is yet. I think figuring all this out will help with the issue yonik recently filed as well - where it will continue if it couldn't load a filter and possibly index incorrectly. That should be a show stopper error.

          3) yes - I agree - it would be great to see a list of the cores that could not load - then you could click them or something and see the exception page.

          Show
          Mark Miller added a comment - 1 and 2) I agree - this is part of the big open issue I think is left here - how to properly deal with abortOnServerConfError. So far I havn't really dabbled there. Currently, I don't think any of that works in a manner that is very friendly with Embedded Solr. Its really focused on http - its not great for tests either. I do think we should differentiate between the two types of errors as well - in fact, I think most of the errors where you can still continue still use SolrConfig.severeErrors because some of them don't have access to the core name. That is the issue where that possible NPE is - getting access to the core name. I'm not sure what the right solution for these cases is yet. I think figuring all this out will help with the issue yonik recently filed as well - where it will continue if it couldn't load a filter and possibly index incorrectly. That should be a show stopper error. 3) yes - I agree - it would be great to see a list of the cores that could not load - then you could click them or something and see the exception page.
          Hide
          Hoss Man added a comment -

          Some rore comments now that i've read things a little more in depth...

          • I should have read your comments more carefully, you already noted the remaining usages of SolrConfig.severErrors
          • two of the places you removed "adds" to SolrConfig.severErrors are in IndexSchema, where exceptions are logged, but not thrown (so your new code never sees them).
            • Personally i think this is fine, because I don't think those two situations really fit the definition of a "severe init error" as it was designed (ie: a plugin like a request handler which might not be used in many situations can't be initialized).
            • I think errors in IndexSchema init should either be fatal (ie: thrown by the constructor and prevent the core from ever working) or just logged as being "bad news"
            • FWIW, the two code places i'm talking about are...
              • when a field is declared twice
              • when a dynamicfield is declared twice
          • why is "admin" suddenly a magic alias for "" in SolrDispatchFilter? (line 196)
          • the big comment about servlet container behavior if you throw an error during init doesn't make sense where you copied it (in doFilter, line 292)
          Show
          Hoss Man added a comment - Some rore comments now that i've read things a little more in depth... I should have read your comments more carefully, you already noted the remaining usages of SolrConfig.severErrors two of the places you removed "adds" to SolrConfig.severErrors are in IndexSchema, where exceptions are logged, but not thrown (so your new code never sees them). Personally i think this is fine, because I don't think those two situations really fit the definition of a "severe init error" as it was designed (ie: a plugin like a request handler which might not be used in many situations can't be initialized). I think errors in IndexSchema init should either be fatal (ie: thrown by the constructor and prevent the core from ever working) or just logged as being "bad news" FWIW, the two code places i'm talking about are... when a field is declared twice when a dynamicfield is declared twice why is "admin" suddenly a magic alias for "" in SolrDispatchFilter? (line 196) the big comment about servlet container behavior if you throw an error during init doesn't make sense where you copied it (in doFilter, line 292)
          Hide
          Mark Miller added a comment -

          two of the places you removed "adds" to SolrConfig.severErrors are in IndexSchema,

          Yeah, actually locally I had put those back for now - I didnt really mean to leave that in the patch - I originally started by removing SolrConfig.severErrors and trying to work stop using it, and I just removed those - but easy to lose track of fixing those points so I had put them back.

          why is "admin" suddenly a magic alias for "" in SolrDispatchFilter? (line 196)

          I had to do that to get /solr from displaying exceptions when the default core couldn't load I think - that and another change.
          That is all kind of weird right now anyway - when the dispatchFilter sees /solr/admin, it actually looks for a core named admin - and just doesn't (hopefully) find it, so that it can continue and actually load the admin page. But anyway, at the bottom of that method, if /solr/admin could not be loaded, I actually need to look for severe errors from the "" core - not the admin core. So right now it's simply a workaround for that case. It would normally fall into that if statement anyway - because trying to get the 'admin' core would return null. But in this case, I only want it to fall in if it sees admin - otherwise - ugg, I can't remember, but something doesn't work right - I'd have to get back into it - I spent a bunch of time using the debugger and trying all the different paths to figure out how to get each case to work. I setup a default core and another core and took turns breaking one or the other.

          the big comment about servlet container behavior...

          Indeed, that needs to go.

          Show
          Mark Miller added a comment - two of the places you removed "adds" to SolrConfig.severErrors are in IndexSchema, Yeah, actually locally I had put those back for now - I didnt really mean to leave that in the patch - I originally started by removing SolrConfig.severErrors and trying to work stop using it, and I just removed those - but easy to lose track of fixing those points so I had put them back. why is "admin" suddenly a magic alias for "" in SolrDispatchFilter? (line 196) I had to do that to get /solr from displaying exceptions when the default core couldn't load I think - that and another change. That is all kind of weird right now anyway - when the dispatchFilter sees /solr/admin, it actually looks for a core named admin - and just doesn't (hopefully) find it, so that it can continue and actually load the admin page. But anyway, at the bottom of that method, if /solr/admin could not be loaded, I actually need to look for severe errors from the "" core - not the admin core. So right now it's simply a workaround for that case. It would normally fall into that if statement anyway - because trying to get the 'admin' core would return null. But in this case, I only want it to fall in if it sees admin - otherwise - ugg, I can't remember, but something doesn't work right - I'd have to get back into it - I spent a bunch of time using the debugger and trying all the different paths to figure out how to get each case to work. I setup a default core and another core and took turns breaking one or the other. the big comment about servlet container behavior... Indeed, that needs to go.
          Hide
          Mark Miller added a comment -

          why is "admin" suddenly a magic alias for "" in SolrDispatchFilter? (line 196)

          Okay, looking closer, I think its this - if you go to a path for a core name eg /solr/core1, and it there is no core, and so there where no errors, if its drops in the there, the core name will change to "" - and at the bottom of that method when it prints out the erorrs, it will get the errors for core "". If if the default core couldnt load, any path that should get you a 404, will actually show the default core errors.

          That check is actually just there so that if you ask for solr/admin, you will end up getting the "" core - so it makes sense to only allow it in if the corename is admin anyway.

          Though I have never really liked that logic where it looks for the admin core and when it can't find it it drops to the "" core.

          Show
          Mark Miller added a comment - why is "admin" suddenly a magic alias for "" in SolrDispatchFilter? (line 196) Okay, looking closer, I think its this - if you go to a path for a core name eg /solr/core1, and it there is no core, and so there where no errors, if its drops in the there, the core name will change to "" - and at the bottom of that method when it prints out the erorrs, it will get the errors for core "". If if the default core couldnt load, any path that should get you a 404, will actually show the default core errors. That check is actually just there so that if you ask for solr/admin, you will end up getting the "" core - so it makes sense to only allow it in if the corename is admin anyway. Though I have never really liked that logic where it looks for the admin core and when it can't find it it drops to the "" core.
          Hide
          Hoss Man added a comment -

          this is part of the big open issue I think is left here - how to properly deal with abortOnServerConfError.

          Here's what i think makes the most sense in a multi-core world, and is the most in the "spirit" of what that options was ment to do when it was added for single cores.

          a) SolrCore should itself maintain a list of "Severe Initialization Exceptions" that it was able to "get passed" when initializing itself. specificly: when a plugin could not be initialized, and it therefore is ignoring that plugin declaration.

          b) SolrCore should expose an easy way of asking it for it's list of initialization exceptions

          c) SolrCore should pay attention to wether it's solrconfig.xml file indicats if the core should be usable if there were severe initialization exceptions.

          d) SolrCore should refuse to "execute" any requests if (a) contains Exceptions and (c) is true

          e) SolrCore should throw any exceptions it can't "get passed"

          f) CoreContainer should keep track of which core names completely failed to initialize, and what exception was encountered while trying (ie: Map<SolrCore,Throwable> ... no List needed). This should be the first exception involved – even if it came from trying to instantiate the IndexSchema, or parse the solrcofig.xml file before it ever got to the SolrCore. CoreContainer shouldn't know/care about (a) or (c)

          g) CoreContainer should provide an easy way to query for (f) by core name

          h) If SolrDispatchFilter asks CoreContainer for a corename, and no SolrCore is found with that name, it should then use (g) to generate an error message

          i) SolrDispatchFilter shouldn't know/care about (a) or (c) ... it should just ask SolrCore to execute a request, and SolrCore should fail as needed based on it's settings (this will potentially allow things like SOLR-141 to work even with init errors, as long as the ResponseWriter was initialized successfully)

          j) SolrConfig.severeErrors should be deprecated, but for back-compat SolrCore and CoreContainer can add to it whenever they add an exception to their own internal state.

          k) CoreContainer.Initializer.*AbortOnConfigurationError should be deprecated, but can still continue to provide the same behavior they do on the trunk (ie: influence the "default" value for each core prior to init, and return true if any of the cores have a value of "true" for that property after init)

          l) we could concievable make solr.xml have it's own "abortOnConfigError" type property, but frankly i think if there are any errors in solr.xml, that should just be a stop the world type situation, where CoreContainer.Initializer.initialize() just throws a big fat error and CoreContainer deals with it ... i can't think of any good that could possibly come from letting solr proceed when it encounteres an error in solr.xml

          Show
          Hoss Man added a comment - this is part of the big open issue I think is left here - how to properly deal with abortOnServerConfError. Here's what i think makes the most sense in a multi-core world, and is the most in the "spirit" of what that options was ment to do when it was added for single cores. a) SolrCore should itself maintain a list of "Severe Initialization Exceptions" that it was able to "get passed" when initializing itself. specificly: when a plugin could not be initialized, and it therefore is ignoring that plugin declaration. b) SolrCore should expose an easy way of asking it for it's list of initialization exceptions c) SolrCore should pay attention to wether it's solrconfig.xml file indicats if the core should be usable if there were severe initialization exceptions. d) SolrCore should refuse to "execute" any requests if (a) contains Exceptions and (c) is true e) SolrCore should throw any exceptions it can't "get passed" f) CoreContainer should keep track of which core names completely failed to initialize, and what exception was encountered while trying (ie: Map<SolrCore,Throwable> ... no List needed). This should be the first exception involved – even if it came from trying to instantiate the IndexSchema, or parse the solrcofig.xml file before it ever got to the SolrCore. CoreContainer shouldn't know/care about (a) or (c) g) CoreContainer should provide an easy way to query for (f) by core name h) If SolrDispatchFilter asks CoreContainer for a corename, and no SolrCore is found with that name, it should then use (g) to generate an error message i) SolrDispatchFilter shouldn't know/care about (a) or (c) ... it should just ask SolrCore to execute a request, and SolrCore should fail as needed based on it's settings (this will potentially allow things like SOLR-141 to work even with init errors, as long as the ResponseWriter was initialized successfully) j) SolrConfig.severeErrors should be deprecated, but for back-compat SolrCore and CoreContainer can add to it whenever they add an exception to their own internal state. k) CoreContainer.Initializer.*AbortOnConfigurationError should be deprecated, but can still continue to provide the same behavior they do on the trunk (ie: influence the "default" value for each core prior to init, and return true if any of the cores have a value of "true" for that property after init) l) we could concievable make solr.xml have it's own "abortOnConfigError" type property, but frankly i think if there are any errors in solr.xml, that should just be a stop the world type situation, where CoreContainer.Initializer.initialize() just throws a big fat error and CoreContainer deals with it ... i can't think of any good that could possibly come from letting solr proceed when it encounteres an error in solr.xml
          Hide
          Hoss Man added a comment -

          Ugh... lots of "cross talk", sorry still procesisng some of your earlier comments...

          That check is actually just there so that if you ask for solr/admin, you will end up getting the "" core - so it makes sense to only allow it in if the corename is admin anyway.

          Though I have never really liked that logic where it looks for the admin core and when it can't find it it drops to the "" core.

          Hmmm... so then this is special behavior needed to make the "admin/*.jsp" type URLs work with the default core?

          then why was this check only added as part of your patch for this issue? how do the admin JSPs work on the current trunk?

          In either case: why do we need to specificly test for "admin" shouldn't that code path just fall through regardless of the patch? ie:

          if (core == null && errors == null) { 
            corename=""; 
            core=cores.getCore(corename); 
          }
          

          ?

          Show
          Hoss Man added a comment - Ugh... lots of "cross talk", sorry still procesisng some of your earlier comments... That check is actually just there so that if you ask for solr/admin, you will end up getting the "" core - so it makes sense to only allow it in if the corename is admin anyway. Though I have never really liked that logic where it looks for the admin core and when it can't find it it drops to the "" core. Hmmm... so then this is special behavior needed to make the "admin/*.jsp" type URLs work with the default core? then why was this check only added as part of your patch for this issue? how do the admin JSPs work on the current trunk? In either case: why do we need to specificly test for "admin" shouldn't that code path just fall through regardless of the patch? ie: if (core == null && errors == null ) { corename=""; core=cores.getCore(corename); } ?
          Hide
          Hoss Man added a comment -

          That is the issue where that possible NPE is - getting access to the core name.

          Just ot be clear: it's not just the core name – you've got code that assumes a SolrCore.getCoreDescriptor() will allways be non null, but that's not allways going to be true.

          Show
          Hoss Man added a comment - That is the issue where that possible NPE is - getting access to the core name. Just ot be clear: it's not just the core name – you've got code that assumes a SolrCore.getCoreDescriptor() will allways be non null, but that's not allways going to be true.
          Hide
          Mark Miller added a comment -

          In either case: why do we need to specificly test for "admin" shouldn't that code path just fall through regardless of the patch? ie:

          Just a simple way to get what I needed working. I use that corename later to get the exceptions for the core. The second code block below that comes later.
          So normally, any core that cannot be found, it will then look for the "" core. This includes when you goto solr/admin. When you try to go to solr/admin it first tries to
          get the admin core - when it cannot find it, it falls into the first block below and gets the "" core. This is so that it can put it in the request for the jsp to access. Then when
          it doesn't find the handler, it just falls through and the jsp ends up handling it. But that fall through only makes sense when the path is solr/admin. It just didn't hurt when it
          was /solr/somecorethatdoesntexist - cause you end up with a 404 anyway - so it doesn't matter that the wrong core was put into the request.

          if (core == null && errors == null) { 
            corename=""; 
            core=cores.getCore(corename); 
          }
          
          List<Throwable> errors = cores.getSevereErrors(corename);
          

          So it doesn't really affect anything if you only drop in there and get the "" if the core it was looking for was admin. but with my current logic - if you have multiple cores, and the "" core could not load (so it has errors), and you then try and go to solr/corethatdoesnotexist - if it drops into that first block and sets the corename to "", later it will print the "" core errors. When you really wanted to see a 404.

          So its a simple work around for that - and because that first block was only helpful in the solr/admin case, it should still work.

          I don't know that this is all how things should end up - I think this was all originally kind of messy - like I said, my first step was getting every url to do what I would expect given a combination of default cores and other cores, and one or the other not working - and trying to access cores that don't exist.

          Hope thats a better explanation.

          Show
          Mark Miller added a comment - In either case: why do we need to specificly test for "admin" shouldn't that code path just fall through regardless of the patch? ie: Just a simple way to get what I needed working. I use that corename later to get the exceptions for the core. The second code block below that comes later. So normally, any core that cannot be found, it will then look for the "" core. This includes when you goto solr/admin. When you try to go to solr/admin it first tries to get the admin core - when it cannot find it, it falls into the first block below and gets the "" core. This is so that it can put it in the request for the jsp to access. Then when it doesn't find the handler, it just falls through and the jsp ends up handling it. But that fall through only makes sense when the path is solr/admin. It just didn't hurt when it was /solr/somecorethatdoesntexist - cause you end up with a 404 anyway - so it doesn't matter that the wrong core was put into the request. if (core == null && errors == null ) { corename=""; core=cores.getCore(corename); } List<Throwable> errors = cores.getSevereErrors(corename); So it doesn't really affect anything if you only drop in there and get the "" if the core it was looking for was admin. but with my current logic - if you have multiple cores, and the "" core could not load (so it has errors), and you then try and go to solr/corethatdoesnotexist - if it drops into that first block and sets the corename to "", later it will print the "" core errors. When you really wanted to see a 404. So its a simple work around for that - and because that first block was only helpful in the solr/admin case, it should still work. I don't know that this is all how things should end up - I think this was all originally kind of messy - like I said, my first step was getting every url to do what I would expect given a combination of default cores and other cores, and one or the other not working - and trying to access cores that don't exist. Hope thats a better explanation.
          Hide
          Mark Miller added a comment -

          That is the issue where that possible NPE is - getting access to the core name.

          Just ot be clear: it's not just the core name - you've got code that assumes a SolrCore.getCoreDescriptor() will allways be non null, but that's not allways going to be true.

          Right - I'm just saying that I'm doing that to get the corename - I realize the issue with it, but I need the corename. So when we solve how we are going to handle needing the core name where we don't have it and need to register an error, hopefully that will apply here too and we won't need the core descriptor.

          Thats what I was thinking at least - now I realize that's dumb - I can just do solrcore.getName() - wasn't thinking clearly when I put that in I guess. My bad.

          Show
          Mark Miller added a comment - That is the issue where that possible NPE is - getting access to the core name. Just ot be clear: it's not just the core name - you've got code that assumes a SolrCore.getCoreDescriptor() will allways be non null, but that's not allways going to be true. Right - I'm just saying that I'm doing that to get the corename - I realize the issue with it, but I need the corename. So when we solve how we are going to handle needing the core name where we don't have it and need to register an error, hopefully that will apply here too and we won't need the core descriptor. Thats what I was thinking at least - now I realize that's dumb - I can just do solrcore.getName() - wasn't thinking clearly when I put that in I guess. My bad.
          Hide
          Mark Miller added a comment -

          Ugggg - I'm sorry - I'm being confusing - it wasn't to get the corename - it was to get the corecontainer the register the error.

          This will end up fine though - because thats an error that will end up being tracked by the core rather than the container based upon what you laid out above.

          Show
          Mark Miller added a comment - Ugggg - I'm sorry - I'm being confusing - it wasn't to get the corename - it was to get the corecontainer the register the error. This will end up fine though - because thats an error that will end up being tracked by the core rather than the container based upon what you laid out above.
          Hide
          Hoss Man added a comment -

          I started looking a little more closely at the singleton SolrCOnfig.severErrors, since eliminating usage of it is really the key to being able to support abortOnConfigurationError=true/false for multiple cores independently. The more I looked at it, the more this whole thing seems futile.

          For starters: I discovered SOLR-1832 ... in a nutshell a failure to init most types of plugins in 1.4 causes SolrCore to fail to init, regardless of whether abortOnConfigurationError=false. The only types of plugins where initialization failures are logged but the remaining instances are loaded anyway is RequestHandlers and schema related classes (ie: FieldType and Token*Factories) ... but as noted in SOLR-1824 it's actually a really, really, REALLY bad thing for IndexSchema to "ignore" when a FieldType or analysis factory can't be initialized, because it could result in incorrect values getting indexed.

          So we could:

          • fix SOLR-1824 so that any init error in IndexSchema caused a hard fail.
          • fix SOLR-1832 so that SolrCore.initPlugins "skipped" any instances that failed to init and recorded the exceptions directly with the SolrCore.
          • officially deprecated the *PluginLoader classes, and remove the spots where it adds to SolrCOnfig.severErrors

          ...so then we wouldn't have anyone writting to SolrCOnfig.severErrors anymore.

          But should we even bother?

          I'm starting to think the whole idea of abortOnConfigurationError is a bad idea ... especially if no one noticed SOLR-1832 before now. Maybe we should just kill the whole concept, and have SolrCore initialization fail fast and propogate any type of Exception up to CoreContainer?

          Show
          Hoss Man added a comment - I started looking a little more closely at the singleton SolrCOnfig.severErrors, since eliminating usage of it is really the key to being able to support abortOnConfigurationError=true/false for multiple cores independently. The more I looked at it, the more this whole thing seems futile. For starters: I discovered SOLR-1832 ... in a nutshell a failure to init most types of plugins in 1.4 causes SolrCore to fail to init, regardless of whether abortOnConfigurationError=false. The only types of plugins where initialization failures are logged but the remaining instances are loaded anyway is RequestHandlers and schema related classes (ie: FieldType and Token*Factories) ... but as noted in SOLR-1824 it's actually a really, really, REALLY bad thing for IndexSchema to "ignore" when a FieldType or analysis factory can't be initialized, because it could result in incorrect values getting indexed. So we could: fix SOLR-1824 so that any init error in IndexSchema caused a hard fail. fix SOLR-1832 so that SolrCore.initPlugins "skipped" any instances that failed to init and recorded the exceptions directly with the SolrCore. officially deprecated the *PluginLoader classes, and remove the spots where it adds to SolrCOnfig.severErrors ...so then we wouldn't have anyone writting to SolrCOnfig.severErrors anymore. But should we even bother? I'm starting to think the whole idea of abortOnConfigurationError is a bad idea ... especially if no one noticed SOLR-1832 before now. Maybe we should just kill the whole concept, and have SolrCore initialization fail fast and propogate any type of Exception up to CoreContainer?
          Hide
          Yonik Seeley added a comment -

          I'm starting to think the whole idea of abortOnConfigurationError is a bad idea ... especially if no one noticed SOLR-1832 before now. Maybe we should just kill the whole concept, and have SolrCore initialization fail fast and propogate any type of Exception up to CoreContainer?

          Well, we are bumping major version numbers!
          abortOnConfigurationError=false seems like a very minor feature to lose. We could always poll solr-user to see though.

          Show
          Yonik Seeley added a comment - I'm starting to think the whole idea of abortOnConfigurationError is a bad idea ... especially if no one noticed SOLR-1832 before now. Maybe we should just kill the whole concept, and have SolrCore initialization fail fast and propogate any type of Exception up to CoreContainer? Well, we are bumping major version numbers! abortOnConfigurationError=false seems like a very minor feature to lose. We could always poll solr-user to see though.
          Hide
          Mark Miller added a comment -

          Maybe we should just kill the whole concept, and have SolrCore initialization fail fast and propogate any type of Exception up to CoreContainer?

          Heh - I can't say I'm against this idea. It really does seem safer to force the user to address the problem. Take a problem plugin out of the config - or fix it. It has always been hard for me to imagine the case where I just didn't care if some things could not load (who knows which ones), but I just want to run anyway with what I can. It really seems like a niche need. It just becomes a lot harder to realize that there is a problem - something that I wanted to work is now not working - other things might be. Who knows when I will be alerted. But if the whole thing is just down ...

          So I'm on board with whatever you'd like to do. Attempt to fix these issues, or just drop it all together. I kind of like the idea of dropping it.

          Show
          Mark Miller added a comment - Maybe we should just kill the whole concept, and have SolrCore initialization fail fast and propogate any type of Exception up to CoreContainer? Heh - I can't say I'm against this idea. It really does seem safer to force the user to address the problem. Take a problem plugin out of the config - or fix it. It has always been hard for me to imagine the case where I just didn't care if some things could not load (who knows which ones), but I just want to run anyway with what I can. It really seems like a niche need. It just becomes a lot harder to realize that there is a problem - something that I wanted to work is now not working - other things might be. Who knows when I will be alerted. But if the whole thing is just down ... So I'm on board with whatever you'd like to do. Attempt to fix these issues, or just drop it all together. I kind of like the idea of dropping it.
          Hide
          Hoss Man added a comment -

          Tangential Comment...

          If we do decide that it's worth keeping abortOnConfigurationError, then my earlier suggestion of how it should work was overly complicated...

          a) SolrCore should itself maintain a list of "Severe Initialization Exceptions" that it was able to "get passed" when initializing itself. specificly: when a plugin could not be initialized, and it therefore is ignoring that plugin declaration.

          b) SolrCore should expose an easy way of asking it for it's list of initialization exceptions

          c) SolrCore should pay attention to wether it's solrconfig.xml file indicats if the core should be usable if there were severe initialization exceptions.

          d) SolrCore should refuse to "execute" any requests if (a) contains Exceptions and (c) is true

          There's really no reason for SolrCore to maintain/expose a special list of Exceptions and fail to execute if solrconfig says it should.

          Instead: SolrCore can maintain a list of Exception during is initialization and then if solrconfig.xml says abortOnConfigurationError=true, the the last line of the SolrCore constructor can check check if the list is empty, and throw a nice fat SOlrException wrapping that List if it's not, which CoreContainer can keep track of (just like any solrconfig.xml or schema.xml parse exceptions that it might encounter before that.)

          (this would change the behavior of "new SolrCore" when abortOnConfigurationError=true for embedded users constructing it themselves – but frankly i think it changes it in a good way)

          Show
          Hoss Man added a comment - Tangential Comment... If we do decide that it's worth keeping abortOnConfigurationError, then my earlier suggestion of how it should work was overly complicated... a) SolrCore should itself maintain a list of "Severe Initialization Exceptions" that it was able to "get passed" when initializing itself. specificly: when a plugin could not be initialized, and it therefore is ignoring that plugin declaration. b) SolrCore should expose an easy way of asking it for it's list of initialization exceptions c) SolrCore should pay attention to wether it's solrconfig.xml file indicats if the core should be usable if there were severe initialization exceptions. d) SolrCore should refuse to "execute" any requests if (a) contains Exceptions and (c) is true There's really no reason for SolrCore to maintain/expose a special list of Exceptions and fail to execute if solrconfig says it should. Instead: SolrCore can maintain a list of Exception during is initialization and then if solrconfig.xml says abortOnConfigurationError=true, the the last line of the SolrCore constructor can check check if the list is empty, and throw a nice fat SOlrException wrapping that List if it's not, which CoreContainer can keep track of (just like any solrconfig.xml or schema.xml parse exceptions that it might encounter before that.) (this would change the behavior of "new SolrCore" when abortOnConfigurationError=true for embedded users constructing it themselves – but frankly i think it changes it in a good way)
          Hide
          Hoss Man added a comment -

          linking to related issues

          Show
          Hoss Man added a comment - linking to related issues
          Hide
          Hoss Man added a comment -

          This should be a lot easier if we take care of SLR-1846 first.

          Show
          Hoss Man added a comment - This should be a lot easier if we take care of SLR-1846 first.
          Hide
          Hoss Man added a comment -

          Bulk updating 240 Solr issues to set the Fix Version to "next" per the process outlined in this email...

          http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3Calpine.DEB.1.10.1005251052040.24672@radix.cryptio.net%3E

          Selection criteria was "Unresolved" with a Fix Version of 1.5, 1.6, 3.1, or 4.0. email notifications were suppressed.

          A unique token for finding these 240 issues in the future: hossversioncleanup20100527

          Show
          Hoss Man added a comment - Bulk updating 240 Solr issues to set the Fix Version to "next" per the process outlined in this email... http://mail-archives.apache.org/mod_mbox/lucene-dev/201005.mbox/%3Calpine.DEB.1.10.1005251052040.24672@radix.cryptio.net%3E Selection criteria was "Unresolved" with a Fix Version of 1.5, 1.6, 3.1, or 4.0. email notifications were suppressed. A unique token for finding these 240 issues in the future: hossversioncleanup20100527
          Hide
          Robert Muir added a comment -

          Bulk move 3.2 -> 3.3

          Show
          Robert Muir added a comment - Bulk move 3.2 -> 3.3
          Hide
          Robert Muir added a comment -

          3.4 -> 3.5

          Show
          Robert Muir added a comment - 3.4 -> 3.5
          Hide
          Hoss Man added a comment -

          Bulk changing fixVersion 3.6 to 4.0 for any open issues that are unassigned and have not been updated since March 19.

          Email spam suppressed for this bulk edit; search for hoss20120323nofix36 to identify all issues edited

          Show
          Hoss Man added a comment - Bulk changing fixVersion 3.6 to 4.0 for any open issues that are unassigned and have not been updated since March 19. Email spam suppressed for this bulk edit; search for hoss20120323nofix36 to identify all issues edited
          Hide
          Hoss Man added a comment -

          bulk fixing the version info for 4.0-ALPHA and 4.0 all affected issues have "hoss20120711-bulk-40-change" in comment

          Show
          Hoss Man added a comment - bulk fixing the version info for 4.0-ALPHA and 4.0 all affected issues have "hoss20120711-bulk-40-change" in comment
          Hide
          Robert Muir added a comment -

          rmuir20120906-bulk-40-change

          Show
          Robert Muir added a comment - rmuir20120906-bulk-40-change
          Hide
          Hoss Man added a comment -

          this issue languished for a long time, but i'm resolving as "fixed" in 4.0 since there have been a lot of improvements since the last time anyone looked at this jira in terms of how multicore erors are handled...

          1) aboutonConfigError no longer even exists
          2) solr can now start up and server some cores even if other cores fail
          3) CoreAdminHandler and the admin UI can now display errors about cores that failed to init - either on startup, or because of an explicit CREATE/RELOAD request

          Show
          Hoss Man added a comment - this issue languished for a long time, but i'm resolving as "fixed" in 4.0 since there have been a lot of improvements since the last time anyone looked at this jira in terms of how multicore erors are handled... 1) aboutonConfigError no longer even exists 2) solr can now start up and server some cores even if other cores fail 3) CoreAdminHandler and the admin UI can now display errors about cores that failed to init - either on startup, or because of an explicit CREATE/RELOAD request
          Hide
          Uwe Schindler added a comment -

          Closed after release.

          Show
          Uwe Schindler added a comment - Closed after release.

            People

            • Assignee:
              Unassigned
              Reporter:
              Mark Miller
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development