Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.10, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      This has always been a mess: analyzers are easy enough to make on your own, we don't need to "take responsibility" for the users analysis chain for 2 major releases.

      The code maintenance is horrible here.

      This creates a huge usability issue too, and as seen from numerous mailing list issues, users don't even understand how this versioning works anyway.

      I'm sure someone will whine if i try to remove these constants, but we can at least make no-arg ctors forwarding to VERSION_CURRENT so that people who don't care about back compat (e.g. just prototyping) don't have to deal with the horribly complex versioning system.

      If you want to make the argument that doing this is "trappy" (i heard this before), i think thats bogus, and ill counter by trying to remove them. Either way, I'm personally not going to add any of this kind of back compat logic myself ever again.

      Updated: description of the issue updated as expected. We should remove this API completely. No one else on the planet has APIs that require a mandatory version parameter.

      1. LUCENE-5859_dead_code.patch
        597 kB
        Robert Muir
      2. LUCENE-5859.patch
        808 kB
        Ryan Ernst

        Activity

        Hide
        Michael McCandless added a comment -

        +1

        Show
        Michael McCandless added a comment - +1
        Hide
        Shai Erera added a comment -

        What about the proposal that at some point someone made – if you want to work w/ the 4.8 FooAnalyzer, you use the one from the 4.8 jar, and so we remove the Version parameter altogether from Analyzers? Or, we can name Analyzers like Codecs: Foo48Analyzer, Foo49Analyzer...?

        Show
        Shai Erera added a comment - What about the proposal that at some point someone made – if you want to work w/ the 4.8 FooAnalyzer, you use the one from the 4.8 jar, and so we remove the Version parameter altogether from Analyzers? Or, we can name Analyzers like Codecs: Foo48Analyzer, Foo49Analyzer...?
        Hide
        Robert Muir added a comment -

        I don't want to make the back compat even more intrusive than it is today. I think the user can just take care of this instead of us. Thats the point of this issue. if you want to reconstruct your old analyzer as it worked 5 years ago, you can do it. We dont have to do it for you.

        Show
        Robert Muir added a comment - I don't want to make the back compat even more intrusive than it is today. I think the user can just take care of this instead of us. Thats the point of this issue. if you want to reconstruct your old analyzer as it worked 5 years ago, you can do it. We dont have to do it for you.
        Hide
        Uwe Schindler added a comment -

        We should carefully decide this for Lucene 5.0. We should not start to fuck this up again in a minor release.

        Show
        Uwe Schindler added a comment - We should carefully decide this for Lucene 5.0. We should not start to fuck this up again in a minor release.
        Hide
        Shai Erera added a comment -

        The thing that worries me is that if you upgrade your app from 4.8 to 4.9 and a FooAnalyzer changed runtime behavior, and you didn't read the CHANGES, your app will break silently or loudly, depending on the back-compat break. And you may not notice that break until it's too late.

        At least with Codecs, you cannot continue to index w/ a Foo48Codec because we remove the indexing classes and you will get a RuntimeException. That's why I proposed the Foo48Analyzer, so a Foo49Analyzer does not silently break your app, and if you upgrade your app to use it, then it's your responsibility to understand the changes and determine what needs to be done in your app.

        I'm not against the proposal here, just raising some concerns that I have. Going w/ a no-Version ctor does not address the silent break you may experience if you don't read the CHANGES ...

        Show
        Shai Erera added a comment - The thing that worries me is that if you upgrade your app from 4.8 to 4.9 and a FooAnalyzer changed runtime behavior, and you didn't read the CHANGES, your app will break silently or loudly, depending on the back-compat break. And you may not notice that break until it's too late. At least with Codecs, you cannot continue to index w/ a Foo48Codec because we remove the indexing classes and you will get a RuntimeException. That's why I proposed the Foo48Analyzer, so a Foo49Analyzer does not silently break your app, and if you upgrade your app to use it, then it's your responsibility to understand the changes and determine what needs to be done in your app. I'm not against the proposal here, just raising some concerns that I have. Going w/ a no-Version ctor does not address the silent break you may experience if you don't read the CHANGES ...
        Hide
        Robert Muir added a comment -

        That's why I proposed the Foo48Analyzer, so a Foo49Analyzer does not silently break your app

        This argument is bogus. As i said, if you want back compat, you can use the ctor parameter with the version constant, otherwise you dont have to.

        I tried to be reasonable here and open an issue with a compromise. Its clear this is a mistake. I'm now going down the path of removing Version completely. This will at the same time address my other concerns with having multiple version apis.

        Show
        Robert Muir added a comment - That's why I proposed the Foo48Analyzer, so a Foo49Analyzer does not silently break your app This argument is bogus. As i said, if you want back compat, you can use the ctor parameter with the version constant, otherwise you dont have to. I tried to be reasonable here and open an issue with a compromise. Its clear this is a mistake. I'm now going down the path of removing Version completely. This will at the same time address my other concerns with having multiple version apis.
        Hide
        Shai Erera added a comment -

        I tried to be reasonable here and open an issue with a compromise. Its clear this is a mistake

        Why is this a mistake? Are we not allowed to discuss the proposal? I raised some concerns around runtime back-compat, not as a means to oppress your proposal, but to note something to think about. It's fine if you want to say "I don't care about it, users should RTFM". I personally won't be against it. But maybe others in the community will want to say something about it ...

        Show
        Shai Erera added a comment - I tried to be reasonable here and open an issue with a compromise. Its clear this is a mistake Why is this a mistake? Are we not allowed to discuss the proposal? I raised some concerns around runtime back-compat, not as a means to oppress your proposal, but to note something to think about. It's fine if you want to say "I don't care about it, users should RTFM". I personally won't be against it. But maybe others in the community will want to say something about it ...
        Hide
        ASF subversion and git services added a comment -

        Commit 1614698 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1614698 ]

        LUCENE-5859: remove Version param from LuceneTestCase.newIndexWriterConfig, the grand sum of 2 tests making use of it can use the 3-arg version and reduce the noise everywhere else

        Show
        ASF subversion and git services added a comment - Commit 1614698 from Robert Muir in branch 'dev/trunk' [ https://svn.apache.org/r1614698 ] LUCENE-5859 : remove Version param from LuceneTestCase.newIndexWriterConfig, the grand sum of 2 tests making use of it can use the 3-arg version and reduce the noise everywhere else
        Hide
        Hoss Man added a comment -

        we can at least make no-arg ctors forwarding to VERSION_CURRENT so that people who don't care about back compat (e.g. just prototyping) don't have to deal with the horribly complex versioning system.

        +1 .. As long as we have some boilerplate javadocs on all of these constructors ("The runtime behavior of this class may change btween releases when using this constructor, please consider the 'Version' constructor to maximize compatibility between releases") this seems like a good idea.

        I tried to be reasonable here and open an issue with a compromise. Its clear this is a mistake. I'm now going down the path of removing Version completely.

        Your proposal seems very reasonable – but you also seem to have gotten very antagonistic and unreasonable when Shai suggested his own, reasonable, alternative proposal for discussion.

        I happen to prefer your proposal to the one Shai made, but you are not coming across, to me, as being very reasonable in your response.

        Show
        Hoss Man added a comment - we can at least make no-arg ctors forwarding to VERSION_CURRENT so that people who don't care about back compat (e.g. just prototyping) don't have to deal with the horribly complex versioning system. +1 .. As long as we have some boilerplate javadocs on all of these constructors ("The runtime behavior of this class may change btween releases when using this constructor, please consider the 'Version' constructor to maximize compatibility between releases") this seems like a good idea. I tried to be reasonable here and open an issue with a compromise. Its clear this is a mistake. I'm now going down the path of removing Version completely. Your proposal seems very reasonable – but you also seem to have gotten very antagonistic and unreasonable when Shai suggested his own, reasonable, alternative proposal for discussion. I happen to prefer your proposal to the one Shai made, but you are not coming across, to me, as being very reasonable in your response.
        Hide
        ASF subversion and git services added a comment -

        Commit 1614713 from Robert Muir in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1614713 ]

        LUCENE-5859: remove Version param from LuceneTestCase.newIndexWriterConfig, the grand sum of 2 tests making use of it can use the 3-arg version and reduce the noise everywhere else

        Show
        ASF subversion and git services added a comment - Commit 1614713 from Robert Muir in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1614713 ] LUCENE-5859 : remove Version param from LuceneTestCase.newIndexWriterConfig, the grand sum of 2 tests making use of it can use the 3-arg version and reduce the noise everywhere else
        Hide
        Robert Muir added a comment -

        I don't think there is a general understanding of how bad the bugs are i fixed in LUCENE-5850, probably because of how confusing the versioning is.

        If we had issued a broken release with these, people would be freaking out and acting unreasonable. I'm just fixing it before it happens. Perpetuating the current brokenness even more into the APIs is the wrong direction.

        Show
        Robert Muir added a comment - I don't think there is a general understanding of how bad the bugs are i fixed in LUCENE-5850 , probably because of how confusing the versioning is. If we had issued a broken release with these, people would be freaking out and acting unreasonable. I'm just fixing it before it happens. Perpetuating the current brokenness even more into the APIs is the wrong direction.
        Hide
        Hoss Man added a comment -

        I happen to prefer your proposal to the one Shai made, but you are not coming across, to me, as being very reasonable in your response.

        I realize now that my entire comment, in total, may have been miscounstrued.

        Just to be clear: this is the proposal i agree with...

        ...we can at least make no-arg ctors forwarding to VERSION_CURRENT so that people who don't care about back compat ...

        ...I am -1 on this newer proposal...

        ... I'm now going down the path of removing Version completely. ...

        ...since it would effectively eliminate the ability for anyone to confidently upgrade Lucene minor versions, w/o re-indexing all data.

        Show
        Hoss Man added a comment - I happen to prefer your proposal to the one Shai made, but you are not coming across, to me, as being very reasonable in your response. I realize now that my entire comment, in total, may have been miscounstrued. Just to be clear: this is the proposal i agree with... ...we can at least make no-arg ctors forwarding to VERSION_CURRENT so that people who don't care about back compat ... ...I am -1 on this newer proposal... ... I'm now going down the path of removing Version completely. ... ...since it would effectively eliminate the ability for anyone to confidently upgrade Lucene minor versions, w/o re-indexing all data.
        Hide
        Mark Miller added a comment -

        but we can at least make no-arg ctors forwarding to VERSION_CURRENT

        +1

        I'm not against the proposal here, just raising some concerns that I have.

        I tried to be reasonable here and open an issue with a compromise. Its clear this is a mistake.

        but you also seem to have gotten very antagonistic and unreasonable

        +1. Less coffee Rob

        Doesn't seem like much is in your way yet, keep the pin in the grenade for later.

        Let's start with some sane constructors before we nuke the whole system. I half way know the versioning system, and every time I've tried to toss together some quick Lucene code it has really annoyed me to deal with this. Good doc issue IMO.

        Show
        Mark Miller added a comment - but we can at least make no-arg ctors forwarding to VERSION_CURRENT +1 I'm not against the proposal here, just raising some concerns that I have. I tried to be reasonable here and open an issue with a compromise. Its clear this is a mistake. but you also seem to have gotten very antagonistic and unreasonable +1. Less coffee Rob Doesn't seem like much is in your way yet, keep the pin in the grenade for later. Let's start with some sane constructors before we nuke the whole system. I half way know the versioning system, and every time I've tried to toss together some quick Lucene code it has really annoyed me to deal with this. Good doc issue IMO.
        Hide
        Robert Muir added a comment -

        I havent pulled out the grenade. The grenade is when i get so frustrated with this crap that i do the following:

        svn copy https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x https://svn.apache.org/repos/asf/lucene/dev/branches/branch_5x

        Call a vote for 5.0 (and get 2 other people to agree with me), and then make trunk 6.0 and remove all the back compat mess.

        Show
        Robert Muir added a comment - I havent pulled out the grenade. The grenade is when i get so frustrated with this crap that i do the following: svn copy https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x https://svn.apache.org/repos/asf/lucene/dev/branches/branch_5x Call a vote for 5.0 (and get 2 other people to agree with me), and then make trunk 6.0 and remove all the back compat mess.
        Hide
        Robert Muir added a comment -

        ...I am -1 on this newer proposal...

        ... I'm now going down the path of removing Version completely. ...

        ...since it would effectively eliminate the ability for anyone to confidently upgrade Lucene minor versions, w/o re-indexing all data.

        This is already the case though. So I suppose you are volunteering to fix StandardAnalyzer on trunk to be backwards compatible with 4.0-4.6?

        We should at least be consistent. Its hilarious it has the Version parameter right now, doing nothing, providing no back compat, and yet there are -1's to clean this stuff up

        The back compat here is clearly too complex.

        Show
        Robert Muir added a comment - ...I am -1 on this newer proposal... ... I'm now going down the path of removing Version completely. ... ...since it would effectively eliminate the ability for anyone to confidently upgrade Lucene minor versions, w/o re-indexing all data. This is already the case though. So I suppose you are volunteering to fix StandardAnalyzer on trunk to be backwards compatible with 4.0-4.6? We should at least be consistent. Its hilarious it has the Version parameter right now, doing nothing, providing no back compat, and yet there are -1's to clean this stuff up The back compat here is clearly too complex.
        Hide
        Mark Miller added a comment -

        and yet there are -1's to clean this stuff up

        Let's not over-blow a -1. It has gotten so conflated with a veto these days. By my reading, it means, "I don't currently agree with this".

        We can't skip the step of trying to convince people of an argument and just go to throwing up our hands and saying "there is no convincing anybody". I think people want to be on board with a lot of these ideas, but you are making it harder than it needs to be.

        I'm easily sold on the better constructers. I'm willing to be convinced on the broader version thing in general.

        So I suppose you are volunteering to fix StandardAnalyzer on trunk to be backwards compatible with 4.0-4.6?

        I'm pretty open to the idea that everything can break over major versions...

        Show
        Mark Miller added a comment - and yet there are -1's to clean this stuff up Let's not over-blow a -1. It has gotten so conflated with a veto these days. By my reading, it means, "I don't currently agree with this". We can't skip the step of trying to convince people of an argument and just go to throwing up our hands and saying "there is no convincing anybody". I think people want to be on board with a lot of these ideas, but you are making it harder than it needs to be. I'm easily sold on the better constructers. I'm willing to be convinced on the broader version thing in general. So I suppose you are volunteering to fix StandardAnalyzer on trunk to be backwards compatible with 4.0-4.6? I'm pretty open to the idea that everything can break over major versions...
        Hide
        Robert Muir added a comment -

        Why should I waste my time trying to convince anybody to remove a parameter that does absolutely nothing today?

        This is called "dead code". I will make a commit shortly to trunk to remove such dead code. Don't worry, I won't remove any Version logic that is actually doing something.

        Show
        Robert Muir added a comment - Why should I waste my time trying to convince anybody to remove a parameter that does absolutely nothing today? This is called "dead code". I will make a commit shortly to trunk to remove such dead code. Don't worry, I won't remove any Version logic that is actually doing something .
        Hide
        Mark Miller added a comment -

        Why should I waste my time trying to convince anybody to remove a parameter that does absolutely nothing today?

        It seems like we basically have consensus on that issue already, but the answer is, because this is a community project. That's why there is veto power, that's just how things work if you want to make progress

        Don't worry, I won't remove any Version logic that is actually doing something.

        That's the broader issue that you seem to be after and the one that you probably would need to convince people of. You are generally not bad at it, so it seems silly to avoid it when a contentious issue comes up. This issue was filed this morning. You threw up your hands on it hours later. A little intense man

        Show
        Mark Miller added a comment - Why should I waste my time trying to convince anybody to remove a parameter that does absolutely nothing today? It seems like we basically have consensus on that issue already, but the answer is, because this is a community project. That's why there is veto power, that's just how things work if you want to make progress Don't worry, I won't remove any Version logic that is actually doing something. That's the broader issue that you seem to be after and the one that you probably would need to convince people of. You are generally not bad at it, so it seems silly to avoid it when a contentious issue comes up. This issue was filed this morning. You threw up your hands on it hours later. A little intense man
        Hide
        Robert Muir added a comment -

        It seems like we basically have consensus on that issue already, but the answer is, because this is a community project. That's why there is veto power, that's just how things work if you want to make progress

        Veto must have technical justification. Good luck with the technical justification of me not removing dead code.

        Show
        Robert Muir added a comment - It seems like we basically have consensus on that issue already, but the answer is, because this is a community project. That's why there is veto power, that's just how things work if you want to make progress Veto must have technical justification. Good luck with the technical justification of me not removing dead code.
        Hide
        Uwe Schindler added a comment - - edited

        I think we should all calm down here. We are all strong personalities, so it is always a bit hard if different opinions crush against each other. I had the same problem today in LUCENE-5850, two different opinions. Yes, I am now disappointed, because I spent hard work on the issue and instead of looking at the key parts of the patch and the process of moving forward and improving the stuff, only quibbling around stupid ALPHA/BETA versions was the response - just disappointing. I don't think the other issue participant ever looked at the patch, he just read what I wrote down as patch description and then did not like it because of a small detail. I just stopped working on the issue and maybe I will work on it tomorrow, maybe.

        We should stop heavy committing here and all calm down!

        Veto must have technical justification. Good luck with the technical justification of me not removing dead code.

        There is indeed a technical justification: The Version parameter may be useless now, but once we add new features to StandardAnalyzer, the Version parameter gets useful again, so we need to re-add it - leading to the known backwards problems. This again brings the problem that uses, who started at 5.0 will not have a defined value for it in 5.1 and then we have problems. In my personal opinion, every Analyzer must have the Version parameter mandatory.

        On the other hand I know the problems with the mandatory parameter. I have seen users tripping in the same trap like without the parameter: Once they upgrade Lucene, they just search/replace every occurence of the old Version parameter, just to get the "latest and best features" (documentation) and to remove deprecations, because all previous versions are deprecated. But they don't reindex - and boom they have the same problem like without the parameter. So it had no good effect. Other projects like Elasticsearch don't care about the version parameter. They have a global static constant in every release. Users just upgrade - boom, same problem.

        In my opinion, the documentation is much too general. We should only ever use Version in the analyzers. Version should also be only part of the Analyzers module, not Lucene core. In my opinion - and I agree with Robert - it is only useful there! And in the analysis moudle it should be correctly documented, why it is there and what it does. And why it is wrong to just use the latest and best without reindexing.

        Show
        Uwe Schindler added a comment - - edited I think we should all calm down here. We are all strong personalities, so it is always a bit hard if different opinions crush against each other. I had the same problem today in LUCENE-5850 , two different opinions. Yes, I am now disappointed, because I spent hard work on the issue and instead of looking at the key parts of the patch and the process of moving forward and improving the stuff, only quibbling around stupid ALPHA/BETA versions was the response - just disappointing. I don't think the other issue participant ever looked at the patch, he just read what I wrote down as patch description and then did not like it because of a small detail. I just stopped working on the issue and maybe I will work on it tomorrow, maybe. We should stop heavy committing here and all calm down! Veto must have technical justification. Good luck with the technical justification of me not removing dead code. There is indeed a technical justification: The Version parameter may be useless now, but once we add new features to StandardAnalyzer, the Version parameter gets useful again, so we need to re-add it - leading to the known backwards problems. This again brings the problem that uses, who started at 5.0 will not have a defined value for it in 5.1 and then we have problems. In my personal opinion, every Analyzer must have the Version parameter mandatory. On the other hand I know the problems with the mandatory parameter. I have seen users tripping in the same trap like without the parameter: Once they upgrade Lucene, they just search/replace every occurence of the old Version parameter, just to get the "latest and best features" (documentation) and to remove deprecations, because all previous versions are deprecated. But they don't reindex - and boom they have the same problem like without the parameter. So it had no good effect. Other projects like Elasticsearch don't care about the version parameter. They have a global static constant in every release. Users just upgrade - boom, same problem. In my opinion, the documentation is much too general. We should only ever use Version in the analyzers. Version should also be only part of the Analyzers module, not Lucene core. In my opinion - and I agree with Robert - it is only useful there! And in the analysis moudle it should be correctly documented, why it is there and what it does. And why it is wrong to just use the latest and best without reindexing.
        Hide
        Robert Muir added a comment -

        There is indeed a technical justification: The Version parameter may be useless now, but once we add new features to StandardAnalyzer, the Version parameter gets useful again, so we need to re-add it - leading to the known backwards problems. This again brings the problem that uses, who started at 5.0 will not have a defined value for it in 5.1 and then we have problems. In my personal opinion, every Analyzer must have the Version parameter mandatory.

        This is so bogus, Im not even going to have this conversation with you.

        We cannot corrupt our API with useless parameters that do nothing.

        As soon as I have tests passing, I am removing all this dead code!

        Show
        Robert Muir added a comment - There is indeed a technical justification: The Version parameter may be useless now, but once we add new features to StandardAnalyzer, the Version parameter gets useful again, so we need to re-add it - leading to the known backwards problems. This again brings the problem that uses, who started at 5.0 will not have a defined value for it in 5.1 and then we have problems. In my personal opinion, every Analyzer must have the Version parameter mandatory. This is so bogus, Im not even going to have this conversation with you. We cannot corrupt our API with useless parameters that do nothing. As soon as I have tests passing, I am removing all this dead code!
        Hide
        Robert Muir added a comment -

        Here is a patch removing all the dead code from trunk.

        It doesnt "remove version.java completely" or anything like that. It changes no runtime behavior at all.

        All it does is remove useless parameters: parameters that were added way back in the day like in Lucene 2.4 when a bug was found, that are now totally irrelevant.

        This is really simple dead code removal. There is no reason for CharArraySet to require a Version parameter anymore, its a simple collection that just stores strings.

        Sure, its possible some of this could have a bug in the future, but we can't add Version parameters to every api "in case it might have a bug". If it happens, we deprecate and do the right thing if we must.

        But for now, we should remove this dead code. It was a simple omission when I tried to remove deprecations in trunk previously, I removed some of them, but I also missed some, I didn't get all of them. Here i just do a more thorough job.

        We should still rethink this version mechanism: e.g. make it an optional parameter for those who care about it, or remove it completely. It seriously hurts the usability of the api.

        But there is no way in hell I will just leave dead code in trunk. We have to refactor away these relics.

        Show
        Robert Muir added a comment - Here is a patch removing all the dead code from trunk. It doesnt "remove version.java completely" or anything like that. It changes no runtime behavior at all. All it does is remove useless parameters: parameters that were added way back in the day like in Lucene 2.4 when a bug was found, that are now totally irrelevant. This is really simple dead code removal. There is no reason for CharArraySet to require a Version parameter anymore, its a simple collection that just stores strings. Sure, its possible some of this could have a bug in the future, but we can't add Version parameters to every api "in case it might have a bug". If it happens, we deprecate and do the right thing if we must. But for now, we should remove this dead code. It was a simple omission when I tried to remove deprecations in trunk previously, I removed some of them, but I also missed some, I didn't get all of them. Here i just do a more thorough job. We should still rethink this version mechanism: e.g. make it an optional parameter for those who care about it, or remove it completely. It seriously hurts the usability of the api. But there is no way in hell I will just leave dead code in trunk. We have to refactor away these relics.
        Hide
        ASF subversion and git services added a comment -

        Commit 1614778 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1614778 ]

        LUCENE-5859: remove dead code: changes no runtime behavior, these are all unused variables

        Show
        ASF subversion and git services added a comment - Commit 1614778 from Robert Muir in branch 'dev/trunk' [ https://svn.apache.org/r1614778 ] LUCENE-5859 : remove dead code: changes no runtime behavior, these are all unused variables
        Hide
        Robert Muir added a comment -

        And you will notice all the backcompat is preserved: again no runtime behavior changed here. I didnt change the back compat.

        ThaiAnalyzer, TurkishAnalyzer, NGrams, unfortunately they all still have this silly parameter, because they actually make use of it, for back compat purposes since they had serious changes in 4.x.

        But the vast majority of these version variables were just unnecessary confusion to the API, relics from a past life, that did nothing at all (you could even pass null if you like, didnt matter).

        Show
        Robert Muir added a comment - And you will notice all the backcompat is preserved: again no runtime behavior changed here. I didnt change the back compat. ThaiAnalyzer, TurkishAnalyzer, NGrams, unfortunately they all still have this silly parameter, because they actually make use of it, for back compat purposes since they had serious changes in 4.x. But the vast majority of these version variables were just unnecessary confusion to the API, relics from a past life, that did nothing at all (you could even pass null if you like, didnt matter).
        Hide
        Hoss Man added a comment -

        Commit 1614778 from Robert Muir in branch 'dev/trunk'

        Please revert this until there can be more discussion.

        And you will notice all the backcompat is preserved: again no runtime behavior changed here. I didnt change the back compat.

        This commit may not change the backcompat behavior of any existing class, but you hobbled our ability to make future changes w/o jumping through hoops to avoid breaking back compat.

        You seem to be conflating a general API convention that allows for variations in behavior with the idea that if every class following this convention doesn't already take advantage of this behavior then it's "dead code" that is in some way broken.

        ThaiAnalyzer, TurkishAnalyzer, NGrams, unfortunately they all still have this silly parameter, because they actually make use of it, for back compat purposes since they had serious changes in 4.x.

        It boggles my mind how you can point out that these classes remain unchanged, and still call the param "silly" while you gut it from all other classes baffles me.

        Let's look at TurkishAnalyzer for example – it uses the matchVersion param to decide whether or not to include "ApostropheFilter" in the stream, because in 4.8, Ahmet suggested adding the ApostropheFilter and you thought it was a great idea and should be part of the default behavior of TurkishAnalyzer.

        The fact that TurkishAnalyzer already had a constructor that took in a Version param is the reason we could make this change to the default behavior of the analyzer, and improve the out of the box experience for all future users of the class, w/o breaking any back compact for existing users.

        if you had committed r1614778 prior to LUCENE-5482 being opened, and included TurkishAnalyzer in the list of classes you purged Version from, we would not have been able to change the default behavior of TurkishAnalyzer w/o breaking backcompat for existing users. This wasn't about a "bug fix" it was about an improved default user experience moving forward, w/o breaking things for existing users who had working systems

        What you have done now is put is in the position that for any similar improvements that anyone ever suggests to any analyzer (except ThaiAnalyzer and TurkishAnalyzer which you spared) we either can't change the default behavior, or we have to break backcompat for existing users.

        (you may argue that this is only a trunk change, but that just kicks the can down the road – w/o these constructors in place when 5.0 is released, we'll face all the same problems in the 5.x releases that the existence of Version has helped us avoid in the 4.x release)


        • Your initial suggestion of adding non-Version constructors for people who don't care about backcompat was a good one.
        • your r1614698 commit to cleanup and simplify 99.9% of the calls to newIndexWriterConfig from tests was also a great simplification
        • r1614778 is a terrible, crippling, blow to our ability to make changes in the future w/o breaking back compat to existing users.
        Show
        Hoss Man added a comment - Commit 1614778 from Robert Muir in branch 'dev/trunk' Please revert this until there can be more discussion. And you will notice all the backcompat is preserved: again no runtime behavior changed here. I didnt change the back compat. This commit may not change the backcompat behavior of any existing class, but you hobbled our ability to make future changes w/o jumping through hoops to avoid breaking back compat. You seem to be conflating a general API convention that allows for variations in behavior with the idea that if every class following this convention doesn't already take advantage of this behavior then it's "dead code" that is in some way broken. ThaiAnalyzer, TurkishAnalyzer, NGrams, unfortunately they all still have this silly parameter, because they actually make use of it, for back compat purposes since they had serious changes in 4.x. It boggles my mind how you can point out that these classes remain unchanged, and still call the param "silly" while you gut it from all other classes baffles me. Let's look at TurkishAnalyzer for example – it uses the matchVersion param to decide whether or not to include "ApostropheFilter" in the stream, because in 4.8, Ahmet suggested adding the ApostropheFilter and you thought it was a great idea and should be part of the default behavior of TurkishAnalyzer. The fact that TurkishAnalyzer already had a constructor that took in a Version param is the reason we could make this change to the default behavior of the analyzer, and improve the out of the box experience for all future users of the class, w/o breaking any back compact for existing users. if you had committed r1614778 prior to LUCENE-5482 being opened, and included TurkishAnalyzer in the list of classes you purged Version from, we would not have been able to change the default behavior of TurkishAnalyzer w/o breaking backcompat for existing users. This wasn't about a "bug fix" it was about an improved default user experience moving forward, w/o breaking things for existing users who had working systems What you have done now is put is in the position that for any similar improvements that anyone ever suggests to any analyzer (except ThaiAnalyzer and TurkishAnalyzer which you spared) we either can't change the default behavior, or we have to break backcompat for existing users. (you may argue that this is only a trunk change, but that just kicks the can down the road – w/o these constructors in place when 5.0 is released, we'll face all the same problems in the 5.x releases that the existence of Version has helped us avoid in the 4.x release) Your initial suggestion of adding non-Version constructors for people who don't care about backcompat was a good one. your r1614698 commit to cleanup and simplify 99.9% of the calls to newIndexWriterConfig from tests was also a great simplification r1614778 is a terrible, crippling, blow to our ability to make changes in the future w/o breaking back compat to existing users.
        Hide
        Robert Muir added a comment -

        I'm not going to revert it. You just want to make Lucene harder to use, so more people will use apache solr instead.

        I removed dead code. There are plenty of lucene classes (including analyzers etc) that don't have Version params. If we need to add one, we just deprecate the default ctor and add it.

        Show
        Robert Muir added a comment - I'm not going to revert it. You just want to make Lucene harder to use, so more people will use apache solr instead. I removed dead code. There are plenty of lucene classes (including analyzers etc) that don't have Version params. If we need to add one, we just deprecate the default ctor and add it.
        Hide
        Hoss Man added a comment -

        I removed dead code. There are plenty of lucene classes (including analyzers etc) that don't have Version params. If we need to add one, we just deprecate the default ctor and add it.

        I want to make sure we're crystal clear about what you are suggesting, because it seems to be the exact opposite of what you were advocating for when you opened this issue.

        You seem to be saying that moving forward, starting with 5.0, some class "FooAnalyzer" should not have any Version param in it's constructor. if at some point (hypothetically: after 5.2 is released, and before 5.3) and issue like LUCENE-5482 pops up that adds improvement to the behavior of FooAnalyzer, but would break backcompat for anyone who had already used FooAnalzyer to build an existing index, then what should happen is:

        • we should mark the FooAnalzyer() constructor as deprecated
        • we should add a FooAnalzyer(Version matchVersion) constructor that adds the improved behavior when 5.3 <= matchVersion otherwise uses the previous, non-improved, behavior
        • we (presumably) update the FooAnalyzer class level javadocs to point out that for the best behavior, users need to use "new FooAnalyzer(5.3)"

        ...leaving us in a position where "people who don't care about back compat (e.g. just prototyping)" will in fact "have to deal with the horribly complex versioning system." in order to get the best possible recommended behavior from the analyzer.

        ...is that really what you suggest moving forward?


        I'm not going to revert it. You just want to make Lucene harder to use, so more people will use apache solr instead.

        I'm making arguments about the merrits of an API that leaves room for future changes, while supporting your (initial) proposal to simplify the Java API for new users.

        If you truly believe that I would argue a position with the primary motivation of making/keeping an API (any API) hard to use, then you should be asking the PMC to remove me – because anyone who would act with that kind of intentional malice towards any part of the code base doesn't deserve to have commit privileges.

        If that's really what you think of me then apparently you don't know me as well as i thought – the fact that you would even type that sentence makes it extremely clear to me that i don't know you as well as i thought.

        Frankly, aside from school yard taunts as a kid, that's the most hurtful comment i can ever remember being directed at me. If a stranger had made that comment I wouldn't care and could easily brush it off, but from someone i admire and respect ... that really fucking sucks.

        Fuck you too Rob. Fuck you too.


        I'm done with this issue, i won't be reading anymore comments from it...
        https://people.apache.org/~hossman/#personal

        Show
        Hoss Man added a comment - I removed dead code. There are plenty of lucene classes (including analyzers etc) that don't have Version params. If we need to add one, we just deprecate the default ctor and add it. I want to make sure we're crystal clear about what you are suggesting, because it seems to be the exact opposite of what you were advocating for when you opened this issue. You seem to be saying that moving forward, starting with 5.0, some class "FooAnalyzer" should not have any Version param in it's constructor. if at some point (hypothetically: after 5.2 is released, and before 5.3) and issue like LUCENE-5482 pops up that adds improvement to the behavior of FooAnalyzer, but would break backcompat for anyone who had already used FooAnalzyer to build an existing index, then what should happen is: we should mark the FooAnalzyer() constructor as deprecated we should add a FooAnalzyer(Version matchVersion) constructor that adds the improved behavior when 5.3 <= matchVersion otherwise uses the previous, non-improved, behavior we (presumably) update the FooAnalyzer class level javadocs to point out that for the best behavior, users need to use "new FooAnalyzer(5.3)" ...leaving us in a position where "people who don't care about back compat (e.g. just prototyping)" will in fact "have to deal with the horribly complex versioning system." in order to get the best possible recommended behavior from the analyzer. ...is that really what you suggest moving forward? I'm not going to revert it. You just want to make Lucene harder to use, so more people will use apache solr instead. I'm making arguments about the merrits of an API that leaves room for future changes, while supporting your (initial) proposal to simplify the Java API for new users. If you truly believe that I would argue a position with the primary motivation of making/keeping an API ( any API) hard to use, then you should be asking the PMC to remove me – because anyone who would act with that kind of intentional malice towards any part of the code base doesn't deserve to have commit privileges. If that's really what you think of me then apparently you don't know me as well as i thought – the fact that you would even type that sentence makes it extremely clear to me that i don't know you as well as i thought. Frankly, aside from school yard taunts as a kid, that's the most hurtful comment i can ever remember being directed at me. If a stranger had made that comment I wouldn't care and could easily brush it off, but from someone i admire and respect ... that really fucking sucks. Fuck you too Rob. Fuck you too. I'm done with this issue, i won't be reading anymore comments from it... https://people.apache.org/~hossman/#personal
        Hide
        Robert Muir added a comment -

        If you truly believe that I would argue a position with the primary motivation of making/keeping an API (any API) hard to use

        For the record, I absolutely do believe this, or I would not have said it.

        If I opened an issue to add a mandatory "matchVersion" parameter to every HTTP solr api "just in case we need the mechanism", thats when people would start backpedaling extremely quickly. I'm sure there would be lengthy explanations about how "its solr but its different" but its not. Its an API either way.

        And Lucene's doesnt have to suck, by having garbage dead parameters that do absolutely nothing.

        Show
        Robert Muir added a comment - If you truly believe that I would argue a position with the primary motivation of making/keeping an API (any API) hard to use For the record, I absolutely do believe this, or I would not have said it. If I opened an issue to add a mandatory "matchVersion" parameter to every HTTP solr api "just in case we need the mechanism", thats when people would start backpedaling extremely quickly. I'm sure there would be lengthy explanations about how "its solr but its different" but its not. Its an API either way. And Lucene's doesnt have to suck, by having garbage dead parameters that do absolutely nothing.
        Hide
        Mark Miller added a comment -

        Wow, this is a sad issue. Thanks Robert.

        Show
        Mark Miller added a comment - Wow, this is a sad issue. Thanks Robert.
        Hide
        Robert Muir added a comment -

        I agree, I will leave the entire topic alone.

        I hope someone else can take over the issues of fixing the versioning/back compat/etc.

        I'm too frustrated with the situation, because i think its too complicated and almost created a broken release, and has the potential to almost do it again.

        You can back out any of my changes if you want (this issue, or any of the other ones). Seriously. I just hope someone who is less emotional about fixing this stuff can make it better.

        Show
        Robert Muir added a comment - I agree, I will leave the entire topic alone. I hope someone else can take over the issues of fixing the versioning/back compat/etc. I'm too frustrated with the situation, because i think its too complicated and almost created a broken release, and has the potential to almost do it again. You can back out any of my changes if you want (this issue, or any of the other ones). Seriously. I just hope someone who is less emotional about fixing this stuff can make it better.
        Hide
        Shai Erera added a comment -

        The way I see it, Rob's commit is exactly what Chris Hostetter (Unused) and others seem to agree we should do on this issue. Removing Version from the Analyzers that don't make use of it is just like having the no-Version ctor along with a Versio'd one. So Chris Hostetter (Unused), if I use your example of upgrade FooAnalyzer from 5.2 to 5.3, in 5.2 FooAnalyzer won't take any Version and in 5.3 will have two ctors: no-Version one which calls this(Version.CURRENT) and a Version one which acts accordingly. If you care about the 5.2- behavior, you pass Version.5_2, otherwise you can pass Version.CURRENT or use the no-Version ctor. So at least from a technical standpoint, Rob committed what people have voted +1 for.

        I do think it would not be the end of the world if Rob waited with the commit a couple hours to get at least a +1 on this, or explain why the patch implements the idea that people seem to agree on. We've waited for +1s on much less controversial issues in the past...

        Show
        Shai Erera added a comment - The way I see it, Rob's commit is exactly what Chris Hostetter (Unused) and others seem to agree we should do on this issue. Removing Version from the Analyzers that don't make use of it is just like having the no-Version ctor along with a Versio'd one. So Chris Hostetter (Unused) , if I use your example of upgrade FooAnalyzer from 5.2 to 5.3, in 5.2 FooAnalyzer won't take any Version and in 5.3 will have two ctors: no-Version one which calls this(Version.CURRENT) and a Version one which acts accordingly. If you care about the 5.2- behavior, you pass Version.5_2, otherwise you can pass Version.CURRENT or use the no-Version ctor. So at least from a technical standpoint, Rob committed what people have voted +1 for. I do think it would not be the end of the world if Rob waited with the commit a couple hours to get at least a +1 on this, or explain why the patch implements the idea that people seem to agree on. We've waited for +1s on much less controversial issues in the past...
        Hide
        Uwe Schindler added a comment -

        Robert Muir: Please revert https://svn.apache.org/r1614778 until the discussion is resolved. This issue is not urgent to resolve so we should plan it better. I already asked for not doing heavy commits on this issue, but you ignored it. This is not respectful behaviour.

        I also made a proposal about Version.java: We should solely use it for analyzers and therefore move it to the analyzers module. It has almost no use outside. I 100% agree with removing Version from anywhere else in Lucene. It makes no sense to have Version for example in IndexReaderConfig.

        Show
        Uwe Schindler added a comment - Robert Muir : Please revert https://svn.apache.org/r1614778 until the discussion is resolved. This issue is not urgent to resolve so we should plan it better. I already asked for not doing heavy commits on this issue, but you ignored it. This is not respectful behaviour. I also made a proposal about Version.java: We should solely use it for analyzers and therefore move it to the analyzers module. It has almost no use outside. I 100% agree with removing Version from anywhere else in Lucene. It makes no sense to have Version for example in IndexReaderConfig.
        Hide
        Robert Muir added a comment -

        Why do you keep perpetuating it Uwe?

        Have you even tried, to look anywhere else outside of Lucene for an API like this? Nobody has such a thing. This Version API is horrible.

        Please, at least think about this, think about how much work i put in the analyzers and how frustrating it is to have to see this horrible API.

        As i said before, I am out of the conversation. I'm not working on this issue anymore. I will back out the commit and you can do whatever you want with it.

        But you might want to seriously think about usability.

        Show
        Robert Muir added a comment - Why do you keep perpetuating it Uwe? Have you even tried, to look anywhere else outside of Lucene for an API like this? Nobody has such a thing. This Version API is horrible. Please, at least think about this, think about how much work i put in the analyzers and how frustrating it is to have to see this horrible API. As i said before, I am out of the conversation. I'm not working on this issue anymore. I will back out the commit and you can do whatever you want with it. But you might want to seriously think about usability.
        Hide
        ASF subversion and git services added a comment -

        Commit 1614852 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1614852 ]

        LUCENE-5859: Literally add back dead code to please a bunch of fucking babies

        Show
        ASF subversion and git services added a comment - Commit 1614852 from Robert Muir in branch 'dev/trunk' [ https://svn.apache.org/r1614852 ] LUCENE-5859 : Literally add back dead code to please a bunch of fucking babies
        Hide
        Uwe Schindler added a comment -

        Thanks Robert. One suggestion to think about:

        We can do something like you think. But instead of using LUCENE_CURRENT as default in this(Version.LUCENE_CURRENT), how about this anti-pattern, but maybe useful here.

        We could add a static setter to the Version enum called Version#setDefault(Version). This static setter would then automatically affect all constructors without a Version. So the delegation would be: this(Version.getDefault());

        By default Verseion.getDefault() returns LUCENE_CURRENT, but somebody who upgrades can set a static default. I know static sefaults are horrible in multi-application servers, because one webapp could affect another one. But its better than nothing here.

        Show
        Uwe Schindler added a comment - Thanks Robert. One suggestion to think about: We can do something like you think. But instead of using LUCENE_CURRENT as default in this(Version.LUCENE_CURRENT), how about this anti-pattern, but maybe useful here. We could add a static setter to the Version enum called Version#setDefault(Version). This static setter would then automatically affect all constructors without a Version. So the delegation would be: this(Version.getDefault()); By default Verseion.getDefault() returns LUCENE_CURRENT, but somebody who upgrades can set a static default. I know static sefaults are horrible in multi-application servers, because one webapp could affect another one. But its better than nothing here.
        Hide
        Shai Erera added a comment -

        I don't think that we need to add any get/setDefault to Versoin. The idea of having two ctors, one with and one without Version is perfectly fine. If you want to play it "safe", always pass your desired Version, no matter if that Analyzer currently cares or not. If in the future a FooAnalyzer comes to break backcompat, then your application is still "safe". And if you always use the no-Version ctor, then by definition you throw away any backcompat from your indexes.

        Show
        Shai Erera added a comment - I don't think that we need to add any get/setDefault to Versoin. The idea of having two ctors, one with and one without Version is perfectly fine. If you want to play it "safe", always pass your desired Version, no matter if that Analyzer currently cares or not. If in the future a FooAnalyzer comes to break backcompat, then your application is still "safe". And if you always use the no-Version ctor, then by definition you throw away any backcompat from your indexes.
        Hide
        Jack Krupansky added a comment -

        users don't even understand how this versioning works anyway

        Anybody want to take a shot at a clear description that will make sense to "the rest of us"?

        I mean, don't normal users simply want precisely one thing - back compat with their existing index, like always, plus "auto upgrade" when that is sensible? Should a non-expert user EVER be setting the version explicitly? Some advanced or "expert" users want to create indexes for a specific release, but let's not confuse them with "normal" users.

        I concede that this may be an overly simplistic view, but I think we should start with where normal users should want to be, and at least elaborate in the language of normal users precisely what additional considerations they need to keep in mind and decisions they will have to make and what factors they will need to consider, with specific recommendations.

        And this is just Lucene. Solr... will it stay unchanged at the API level, or is this Lucene change going to ripple out to Solr users as well?

        Show
        Jack Krupansky added a comment - users don't even understand how this versioning works anyway Anybody want to take a shot at a clear description that will make sense to "the rest of us"? I mean, don't normal users simply want precisely one thing - back compat with their existing index, like always, plus "auto upgrade" when that is sensible? Should a non-expert user EVER be setting the version explicitly? Some advanced or "expert" users want to create indexes for a specific release, but let's not confuse them with "normal" users. I concede that this may be an overly simplistic view, but I think we should start with where normal users should want to be, and at least elaborate in the language of normal users precisely what additional considerations they need to keep in mind and decisions they will have to make and what factors they will need to consider, with specific recommendations. And this is just Lucene. Solr... will it stay unchanged at the API level, or is this Lucene change going to ripple out to Solr users as well?
        Hide
        Shai Erera added a comment -

        You don't build an index for a specific version. Rather, much like Codecs, when you create an Analyzer you specify its runtime behavior. With Codecs, that would be Lucene45Codec vs Lucene49Codec. With Analyzers it's FooAnalyzer(Version.45) vs FooAnalyzer(Version.49).

        Now imagine that we didn't have this Version, and you created an index with FooAnalyzer(), which (using Hoss's example) does not use an ApostropheFilter. Then in 4.9 this analyzer decides to add this filter into the chain, because somebody thinks it's so useful. You upgrade your Lucene (or Solr) jar and suddenly your app breaks. If you run tests before, maybe you're lucky to hit the inconsistency, you ask the list and Robert tells you "this is Open Source, grab FooAnalyzer from Lucene 4.5 and use it instead of the one we shipped". Fine, I can bite that bullet. Not sure if others want to bite that bullet, but I'm willing to. Because we did document that backwards break in CHANGES, and user should really read CHANGES...

        But maybe you're not that lucky, and didn't read CHANGES and suddenly months later you spot the inconsistency (i.e. someone complains about degraded search experience). Now copying FooAnalyzer from 4.5 and shipping that to your customer (or deploy in your app) isn't going to solve the problem, because the index is corrupt. Some documents have the apostrophe indexed, others don't. So you have to rebuild your (or your customer's index). If it's a tiny index, maybe you and I both are willing to bite the bullet. But if it's a HUGE collection, think billions of documents, the kind that takes days or weeks to build ... maybe I won't mind biting the bullet (hey, it's not me who needs to re-index), but I'm pretty positive you won't want to do that. Now you email to the list, pissed off of course, and the only thing we can tell you is "hey, you should have read the fucking CHANGES!!".

        Now back to Version, if FooAnalyzer has a ctor which takes a Version then:

        • If your app always uses the Version-ctor and passes Version.45, but still your index breaks ... then you could blame the community. You will still need to re-index though.
        • If your app always uses the no-Version ctor ... you could only blame yourself for being "adventurous". You will still need to re-index though.
        • If your app always uses the Version-ctor, and passes Version.45 and FooAnalyzer behaves well .. then on upgrading to 4.9 you don't hit any inconsistency. Most likely you won't even notice the change, because you don't read the CHANGES (but that's on you, to spot improvements that were added to your precious FooAnalyzer).

        If FooAnalyzer exposes two ctors, w/ and w/o a Version, then you can make a decision in your app if you want to care about such things or not. I.e. if your app indexes 100K docs, and you always rebuild the index upon upgrade, cause you anyway do it once and it's so fast that it finishes during your coffee break, then you don't need to worry about Version. Most likely you have a Constants.STUPID_FUCKING_USELESS_VERSION constant somewhere, and all your code uses it, and on upgrade you just modify it from Version.45 to Version.49. But if you index that HUGE collection, and you've been bitten once by such stupidity of not reading CHANGES, then you decide to play it safe and above that Constants constant you probably have a BIG WARNING about DON'T EVER CHANGE THAT WITHOUT READING CHANGES AND RUNNING TESTS!!!.

        There is another delicate point raised on the issue – should a FooAnalyzer always offer a Version ctor or not. I.e. if it's its first version (we've just released 4.5 and it's a new analyzer), does it need to expose a Version ctor along w/ a default ctor? Since it doesn't have any back-compat logic, Robert claims it shouldn't because no matter what Version you pass to it, it simply ignores it. Hack, if you pass a Version.41, before it even existed, it might be better if it threw an AreYouSeriousException at your face. But then, if your app uses it in 4.5 and in 4.9 it improves to add the ApostropheFilter, you're screwed (read above). Others (like Uwe and me) think that it always needs to have a Version ctor, alongside a no-Version one, so that if you're the "playing it safe" guy with that scary comment above, then you can pass Version.45 to it, then Version.46/47/48 but on upgrading to 4.9, and you pass Version.49, your app still works as it used to.

        About Solr, I don't think it impacts Solr at the moment...

        Show
        Shai Erera added a comment - You don't build an index for a specific version. Rather, much like Codecs, when you create an Analyzer you specify its runtime behavior. With Codecs, that would be Lucene45Codec vs Lucene49Codec. With Analyzers it's FooAnalyzer(Version.45) vs FooAnalyzer(Version.49). Now imagine that we didn't have this Version, and you created an index with FooAnalyzer(), which (using Hoss's example) does not use an ApostropheFilter. Then in 4.9 this analyzer decides to add this filter into the chain, because somebody thinks it's so useful. You upgrade your Lucene (or Solr) jar and suddenly your app breaks. If you run tests before, maybe you're lucky to hit the inconsistency, you ask the list and Robert tells you "this is Open Source, grab FooAnalyzer from Lucene 4.5 and use it instead of the one we shipped". Fine, I can bite that bullet. Not sure if others want to bite that bullet, but I'm willing to. Because we did document that backwards break in CHANGES, and user should really read CHANGES... But maybe you're not that lucky, and didn't read CHANGES and suddenly months later you spot the inconsistency (i.e. someone complains about degraded search experience). Now copying FooAnalyzer from 4.5 and shipping that to your customer (or deploy in your app) isn't going to solve the problem, because the index is corrupt. Some documents have the apostrophe indexed, others don't. So you have to rebuild your (or your customer's index). If it's a tiny index, maybe you and I both are willing to bite the bullet. But if it's a HUGE collection, think billions of documents, the kind that takes days or weeks to build ... maybe I won't mind biting the bullet (hey, it's not me who needs to re-index), but I'm pretty positive you won't want to do that. Now you email to the list, pissed off of course, and the only thing we can tell you is "hey, you should have read the fucking CHANGES!!". Now back to Version, if FooAnalyzer has a ctor which takes a Version then: If your app always uses the Version-ctor and passes Version.45, but still your index breaks ... then you could blame the community. You will still need to re-index though. If your app always uses the no-Version ctor ... you could only blame yourself for being "adventurous". You will still need to re-index though. If your app always uses the Version-ctor, and passes Version.45 and FooAnalyzer behaves well .. then on upgrading to 4.9 you don't hit any inconsistency. Most likely you won't even notice the change, because you don't read the CHANGES (but that's on you, to spot improvements that were added to your precious FooAnalyzer). If FooAnalyzer exposes two ctors, w/ and w/o a Version, then you can make a decision in your app if you want to care about such things or not. I.e. if your app indexes 100K docs, and you always rebuild the index upon upgrade, cause you anyway do it once and it's so fast that it finishes during your coffee break, then you don't need to worry about Version. Most likely you have a Constants.STUPID_FUCKING_USELESS_VERSION constant somewhere, and all your code uses it, and on upgrade you just modify it from Version.45 to Version.49. But if you index that HUGE collection, and you've been bitten once by such stupidity of not reading CHANGES, then you decide to play it safe and above that Constants constant you probably have a BIG WARNING about DON'T EVER CHANGE THAT WITHOUT READING CHANGES AND RUNNING TESTS!!!. There is another delicate point raised on the issue – should a FooAnalyzer always offer a Version ctor or not. I.e. if it's its first version (we've just released 4.5 and it's a new analyzer), does it need to expose a Version ctor along w/ a default ctor? Since it doesn't have any back-compat logic, Robert claims it shouldn't because no matter what Version you pass to it, it simply ignores it. Hack, if you pass a Version.41, before it even existed, it might be better if it threw an AreYouSeriousException at your face. But then, if your app uses it in 4.5 and in 4.9 it improves to add the ApostropheFilter, you're screwed (read above). Others (like Uwe and me) think that it always needs to have a Version ctor, alongside a no-Version one, so that if you're the "playing it safe" guy with that scary comment above, then you can pass Version.45 to it, then Version.46/47/48 but on upgrading to 4.9, and you pass Version.49, your app still works as it used to. About Solr, I don't think it impacts Solr at the moment...
        Hide
        Ryan Ernst added a comment -

        I modified Robert Muir's patch, per my recent proposal for this issue on the dev@ email list. This adds setVersion/getVersion to Analyzer, and removes Version constructors from any concrete Analyzer constructors, as well as from TokeFilters, etc.

        Show
        Ryan Ernst added a comment - I modified Robert Muir 's patch, per my recent proposal for this issue on the dev@ email list. This adds setVersion/getVersion to Analyzer, and removes Version constructors from any concrete Analyzer constructors, as well as from TokeFilters, etc.
        Hide
        Ryan Ernst added a comment -

        Note this is only the Analyzer side of that proposal. I will work on consolidating Version and LUCENE_MAIN_VERSION separately.

        Show
        Ryan Ernst added a comment - Note this is only the Analyzer side of that proposal. I will work on consolidating Version and LUCENE_MAIN_VERSION separately.
        Hide
        Uwe Schindler added a comment -

        Note this is only the Analyzer side of that proposal. I will work on consolidating Version and LUCENE_MAIN_VERSION separately.

        Thanks. There is already a patch going in that direction at LUCENE-5850. You can use it as basis for the refactoring. At least use the parts that cleans up common-build.xml and the testcase that checks that the LUCENE_MAIN_VERSION is identical to the one in common-build.xml.

        Show
        Uwe Schindler added a comment - Note this is only the Analyzer side of that proposal. I will work on consolidating Version and LUCENE_MAIN_VERSION separately. Thanks. There is already a patch going in that direction at LUCENE-5850 . You can use it as basis for the refactoring. At least use the parts that cleans up common-build.xml and the testcase that checks that the LUCENE_MAIN_VERSION is identical to the one in common-build.xml.
        Hide
        Shai Erera added a comment -

        I think we can apply the same approach here to IndexWriterConfig. If it's too much work, we can do it in a separate issue.

        Show
        Shai Erera added a comment - I think we can apply the same approach here to IndexWriterConfig. If it's too much work, we can do it in a separate issue.
        Hide
        Ryan Ernst added a comment -

        Thanks Uwe Schindler, I will indeed use your patch as a base there.

        Shai Erera, I created LUCENE-5871 to separately address IWC issues for Version.

        Show
        Ryan Ernst added a comment - Thanks Uwe Schindler , I will indeed use your patch as a base there. Shai Erera , I created LUCENE-5871 to separately address IWC issues for Version.
        Hide
        Ryan Ernst added a comment -

        I plan to commit this Friday morning PST if there are no objections.

        Show
        Ryan Ernst added a comment - I plan to commit this Friday morning PST if there are no objections.
        Hide
        ASF subversion and git services added a comment -

        Commit 1616901 from Ryan Ernst in branch 'dev/trunk'
        [ https://svn.apache.org/r1616901 ]

        LUCENE-5859: Remove Version from Analyzer constructors

        Show
        ASF subversion and git services added a comment - Commit 1616901 from Ryan Ernst in branch 'dev/trunk' [ https://svn.apache.org/r1616901 ] LUCENE-5859 : Remove Version from Analyzer constructors
        Hide
        Ryan Ernst added a comment -

        I'm going to try and backport this to 4x (leaving the old ctors taking Version as deprecated).

        Show
        Ryan Ernst added a comment - I'm going to try and backport this to 4x (leaving the old ctors taking Version as deprecated).
        Hide
        ASF subversion and git services added a comment -

        Commit 1619283 from Ryan Ernst in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1619283 ]

        LUCENE-5859: Add Analyzer constructors without Version parameter and deprecate those taking Version

        Show
        ASF subversion and git services added a comment - Commit 1619283 from Ryan Ernst in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1619283 ] LUCENE-5859 : Add Analyzer constructors without Version parameter and deprecate those taking Version
        Hide
        ASF subversion and git services added a comment -

        Commit 1619466 from Ryan Ernst in branch 'dev/trunk'
        [ https://svn.apache.org/r1619466 ]

        LUCENE-5859: Update changes entry to reflect backport

        Show
        ASF subversion and git services added a comment - Commit 1619466 from Ryan Ernst in branch 'dev/trunk' [ https://svn.apache.org/r1619466 ] LUCENE-5859 : Update changes entry to reflect backport
        Hide
        ASF subversion and git services added a comment -

        Commit 1619467 from Ryan Ernst in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1619467 ]

        LUCENE-5859: Update changes entry to reflect backport

        Show
        ASF subversion and git services added a comment - Commit 1619467 from Ryan Ernst in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1619467 ] LUCENE-5859 : Update changes entry to reflect backport
        Hide
        ASF subversion and git services added a comment -

        Commit 1619468 from Ryan Ernst in branch 'dev/branches/lucene_solr_4_10'
        [ https://svn.apache.org/r1619468 ]

        LUCENE-5859: Update changes entry to reflect backport

        Show
        ASF subversion and git services added a comment - Commit 1619468 from Ryan Ernst in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1619468 ] LUCENE-5859 : Update changes entry to reflect backport
        Hide
        ASF subversion and git services added a comment -

        Commit 1619623 from Use account "steve_rowe" instead in branch 'dev/trunk'
        [ https://svn.apache.org/r1619623 ]

        LUCENE-5859, LUCENE-5871: Remove Version.LUCENE_CURRENT from javadocs

        Show
        ASF subversion and git services added a comment - Commit 1619623 from Use account "steve_rowe" instead in branch 'dev/trunk' [ https://svn.apache.org/r1619623 ] LUCENE-5859 , LUCENE-5871 : Remove Version.LUCENE_CURRENT from javadocs
        Hide
        ASF subversion and git services added a comment -

        Commit 1619625 from Use account "steve_rowe" instead in branch 'dev/trunk'
        [ https://svn.apache.org/r1619625 ]

        LUCENE-5859: Remove Version.LUCENE_CURRENT from htmlentity.py code generator for HTMLStripCharFilter

        Show
        ASF subversion and git services added a comment - Commit 1619625 from Use account "steve_rowe" instead in branch 'dev/trunk' [ https://svn.apache.org/r1619625 ] LUCENE-5859 : Remove Version.LUCENE_CURRENT from htmlentity.py code generator for HTMLStripCharFilter
        Hide
        ASF subversion and git services added a comment -

        Commit 1619694 from Use account "steve_rowe" instead in branch 'dev/trunk'
        [ https://svn.apache.org/r1619694 ]

        LUCENE-5859: Remove Version.LUCENE_CURRENT from code generated by htmlentity.py

        Show
        ASF subversion and git services added a comment - Commit 1619694 from Use account "steve_rowe" instead in branch 'dev/trunk' [ https://svn.apache.org/r1619694 ] LUCENE-5859 : Remove Version.LUCENE_CURRENT from code generated by htmlentity.py
        Hide
        Jun Ohtani added a comment -

        I think this ticket is Fix Version 4.10 too, right?

        Show
        Jun Ohtani added a comment - I think this ticket is Fix Version 4.10 too, right?
        Hide
        Ryan Ernst added a comment -

        Yes, thanks. I added 4.10 to Fix Versions.

        Show
        Ryan Ernst added a comment - Yes, thanks. I added 4.10 to Fix Versions.

          People

          • Assignee:
            Ryan Ernst
            Reporter:
            Robert Muir
          • Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development