Lucene - Core
  1. Lucene - Core
  2. LUCENE-2979

Simplify configuration API of contrib Query Parser

    Details

    • Lucene Fields:
      New

      Description

      The current configuration API is very complicated and inherit the concept used by Attribute API to store token information in token streams. However, the requirements for both (QP config and token stream) are not the same, so they shouldn't be using the same thing.

      I propose to simplify QP config and make it less scary for people intending to use contrib QP. The task is not difficult, it will just require a lot of code change and figure out the best way to do it. That's why it's a good candidate for a GSoC project.

      I would like to hear good proposals about how to make the API more friendly and less scaring

      1. LUCENE-2979_phillipe_ramalho_2.patch
        162 kB
        Phillipe Ramalho
      2. LUCENE-2979_phillipe_ramalho_3.patch
        167 kB
        Phillipe Ramalho
      3. LUCENE-2979_phillipe_ramalho_3.patch
        167 kB
        Phillipe Ramalho
      4. LUCENE-2979_phillipe_ramalho_4_3x.patch
        159 kB
        Phillipe Ramalho
      5. LUCENE-2979_phillipe_ramalho_4_trunk.patch
        3 kB
        Phillipe Ramalho
      6. LUCENE-2979_phillipe_ramalho_5_3x.patch
        40 kB
        Phillipe Ramalho
      7. LUCENE-2979_phillipe_reamalho.patch
        148 kB
        Phillipe Ramalho

        Issue Links

          Activity

          Hide
          Yonik Seeley added a comment -

          moving out of 3.1 - this is not a release blocker

          Show
          Yonik Seeley added a comment - moving out of 3.1 - this is not a release blocker
          Hide
          Phillipe Ramalho added a comment -

          Hi,

          I am considering doing a gsoc proposal about this, any specific points I should be covering on the proposal?

          I saw Adriano's comment on another LUCENE-1823:

          The map idea is really good and fits well as configuration for the QP, but I would like to restrict the key type, so the user doesn't use a String object as key. String keys may lead to runtime errors, mainly when they are inserted inline. I would prefer to use enums as keys, it would enforce the user to always pass the same object as key when referencing the same configuration. It also avoids duplicated configuration keys, once each enum type has only one instance per JVM.

          If nobody complains about using a Map<Enum<?>, Object> as configuration for QP framework, I will start working on a new patch including these changes soon.

          I will try to initially cover how we can use Map to replace the current config API. Also I would like to cover how/whether we can make the new API compatible with the old one, so users can migrate from old to new slowly, deprecating the old one of course. I will also investigate the best way to enforce the user to always pass the same key object. Also try to suggest an API that will allow the users to retrieve the config values without casting them from Object, maybe Java generic capability will enable it, but I am not sure it will work with Enum.

          Anything else I should be covering on the proposal?

          Show
          Phillipe Ramalho added a comment - Hi, I am considering doing a gsoc proposal about this, any specific points I should be covering on the proposal? I saw Adriano's comment on another LUCENE-1823 : The map idea is really good and fits well as configuration for the QP, but I would like to restrict the key type, so the user doesn't use a String object as key. String keys may lead to runtime errors, mainly when they are inserted inline. I would prefer to use enums as keys, it would enforce the user to always pass the same object as key when referencing the same configuration. It also avoids duplicated configuration keys, once each enum type has only one instance per JVM. If nobody complains about using a Map<Enum<?>, Object> as configuration for QP framework, I will start working on a new patch including these changes soon. I will try to initially cover how we can use Map to replace the current config API. Also I would like to cover how/whether we can make the new API compatible with the old one, so users can migrate from old to new slowly, deprecating the old one of course. I will also investigate the best way to enforce the user to always pass the same key object. Also try to suggest an API that will allow the users to retrieve the config values without casting them from Object, maybe Java generic capability will enable it, but I am not sure it will work with Enum. Anything else I should be covering on the proposal?
          Hide
          Adriano Crestani added a comment -

          Assigning to myself as I am willing to mentor

          Show
          Adriano Crestani added a comment - Assigning to myself as I am willing to mentor
          Hide
          Adriano Crestani added a comment -

          Hi Phillipe,

          It's good to see your interest on this project. I think what you plan to cover on the proposal is good enough. Don't forget to add a timeline and describe how much time you are willing to spend on this project this summer, also include any vacation period.

          Show
          Adriano Crestani added a comment - Hi Phillipe, It's good to see your interest on this project. I think what you plan to cover on the proposal is good enough. Don't forget to add a timeline and describe how much time you are willing to spend on this project this summer, also include any vacation period.
          Hide
          Phillipe Ramalho added a comment -

          Here are my initial thoughts:

          There will be two types of config classes, as there is today, one for global configuration and other for field configuration. The global config class will hold reference to the field configurations. The field configuration will have the field name. They both will extend a common class, which is some kind of map.

          The key will be an special type (ConfigKey). This ConfigKey class will be final and will receive a generic argument when constructed. The user will need to define it and always use the same constants. To make sure the user uses it correctly, we can enforce ConfigKey.equals to only return true when the same instance is passed to equals method, so the map will only return the object for that key when the defined key is used. Example:

          // developer code
          final public static ConfigKey<String> MY_CONFIG_KEY = new ConfigKey<String>();
          ...
          String myConfig = getConfig().get(MY_CONFIG_KEY);
          /////////////////
          // user code
          getConfig().set(MY_CONFIG_KEY, "value1"); // works
          getConfig().set(new ConfigKey<String>(), "value1"); // does not work, as the developer's code will look up for the key the developer has previously defined, any other instance passed as key won't be found when the developer's code is executed

          I couldn't find a way to do the generic capability above with Enum. Actually, I don't see any reason to use Enum in this case.

          Thoughts?

          Show
          Phillipe Ramalho added a comment - Here are my initial thoughts: There will be two types of config classes, as there is today, one for global configuration and other for field configuration. The global config class will hold reference to the field configurations. The field configuration will have the field name. They both will extend a common class, which is some kind of map. The key will be an special type (ConfigKey). This ConfigKey class will be final and will receive a generic argument when constructed. The user will need to define it and always use the same constants. To make sure the user uses it correctly, we can enforce ConfigKey.equals to only return true when the same instance is passed to equals method, so the map will only return the object for that key when the defined key is used. Example: // developer code final public static ConfigKey<String> MY_CONFIG_KEY = new ConfigKey<String>(); ... String myConfig = getConfig().get(MY_CONFIG_KEY); ///////////////// // user code getConfig().set(MY_CONFIG_KEY, "value1"); // works getConfig().set(new ConfigKey<String>(), "value1"); // does not work, as the developer's code will look up for the key the developer has previously defined, any other instance passed as key won't be found when the developer's code is executed I couldn't find a way to do the generic capability above with Enum. Actually, I don't see any reason to use Enum in this case. Thoughts?
          Hide
          Adriano Crestani added a comment -

          Hi Phillip,

          I like your idea, similiar to the one I have, but I was planning to use enum, however, after spent some time thinking, I can't see how I can use generic the way you described only using enum. So go ahead with your idea and create a proposal

          Don't forget to describe how you plan make the old and new API work together.

          Show
          Adriano Crestani added a comment - Hi Phillip, I like your idea, similiar to the one I have, but I was planning to use enum, however, after spent some time thinking, I can't see how I can use generic the way you described only using enum. So go ahead with your idea and create a proposal Don't forget to describe how you plan make the old and new API work together.
          Hide
          Phillipe Ramalho added a comment -

          Don't forget to describe how you plan make the old and new API work together.

          I will do, thanks for reminding me

          Show
          Phillipe Ramalho added a comment - Don't forget to describe how you plan make the old and new API work together. I will do, thanks for reminding me
          Hide
          Phillipe Ramalho added a comment -

          Hi Adriano,

          I finally submitted my proposal. Comments are welcome!

          Note that I used the title "Lucene-2979: Simplify configuration API of contrib Query Parser" in the proposal. I hope you can find it, not sure how I can reference it.

          Show
          Phillipe Ramalho added a comment - Hi Adriano, I finally submitted my proposal. Comments are welcome! Note that I used the title "Lucene-2979: Simplify configuration API of contrib Query Parser" in the proposal. I hope you can find it, not sure how I can reference it.
          Hide
          Adriano Crestani added a comment -

          Hi Phillipe,

          Good proposal, very detailed.

          Looking at the schedule table you created, it sounds like now the project is small for a two and half month project. However, I am probably underestimating the difficulty of the project, mainly because I am already used to the code. So you probably shouldn't worry about it

          Show
          Adriano Crestani added a comment - Hi Phillipe, Good proposal, very detailed. Looking at the schedule table you created, it sounds like now the project is small for a two and half month project. However, I am probably underestimating the difficulty of the project, mainly because I am already used to the code. So you probably shouldn't worry about it
          Hide
          Robert Muir added a comment -

          bulk move 3.2 -> 3.3

          Show
          Robert Muir added a comment - bulk move 3.2 -> 3.3
          Hide
          Phillipe Ramalho added a comment -

          This is finally my first patch. Sorry for taking so long, but I started changing the API and it broke a lot of code, which took forever to fix. Now it's working and all junits are passing.

          So far, I changed the entire configuration API. Next step is to write more junits and update/write javadocs.

          Show
          Phillipe Ramalho added a comment - This is finally my first patch. Sorry for taking so long, but I started changing the API and it broke a lot of code, which took forever to fix. Now it's working and all junits are passing. So far, I changed the entire configuration API. Next step is to write more junits and update/write javadocs.
          Hide
          Phillipe Ramalho added a comment -

          As Adriano asked me, here is the first patch ready to be committed. It includes javadoc and package.html and overview.html updated based on the changes I made to the code.

          I am still working on integrating the new API with the old API.

          Show
          Phillipe Ramalho added a comment - As Adriano asked me, here is the first patch ready to be committed. It includes javadoc and package.html and overview.html updated based on the changes I made to the code. I am still working on integrating the new API with the old API.
          Hide
          Adriano Crestani added a comment -

          Hi Phillipe, thanks for the patch. However, as you did many changes to javadocs, I decided to run "ant javadocs" and it fails. It seems your patch references many times the constants in StandardQueryConfigHandler.ConfigurationKeys using @see tag, unfortunately you forgot to create a javadoc for those constants and it's causing the ant script to fail. Please, add these missing javadocs, run ant javadocs on contrib/queryparser to check if it finishes successfully and then submit a new patch.

          Besides that, great job, tests are running fine even after your big change

          Thanks!

          Show
          Adriano Crestani added a comment - Hi Phillipe, thanks for the patch. However, as you did many changes to javadocs, I decided to run "ant javadocs" and it fails. It seems your patch references many times the constants in StandardQueryConfigHandler.ConfigurationKeys using @see tag, unfortunately you forgot to create a javadoc for those constants and it's causing the ant script to fail. Please, add these missing javadocs, run ant javadocs on contrib/queryparser to check if it finishes successfully and then submit a new patch. Besides that, great job, tests are running fine even after your big change Thanks!
          Hide
          Phillipe Ramalho added a comment -

          Hi Adriano,

          Sorry for that, I forgot to add javadoc to those comments, they are pretty important.

          Anyway, the problem was not really the missing javadoc, but javadoc was not understanding the link to inner classes (Class.InnerClass#Constant), I had to reference the inner class directly (InnerClass#Constant).

          However, there is a still a javadoc warning, I just sent an email to the mailing list, I hope you saw it already, where I report the problem.

          Here is the third patch with javadoc fixes.

          Show
          Phillipe Ramalho added a comment - Hi Adriano, Sorry for that, I forgot to add javadoc to those comments, they are pretty important. Anyway, the problem was not really the missing javadoc, but javadoc was not understanding the link to inner classes (Class.InnerClass#Constant), I had to reference the inner class directly (InnerClass#Constant). However, there is a still a javadoc warning, I just sent an email to the mailing list, I hope you saw it already, where I report the problem. Here is the third patch with javadoc fixes.
          Hide
          Phillipe Ramalho added a comment -

          oops, I had forgotten to check the ASF license.

          Show
          Phillipe Ramalho added a comment - oops, I had forgotten to check the ASF license.
          Hide
          Adriano Crestani added a comment -

          Hi Phillipe, thanks for the quick fix!

          Just committed your last patch (LUCENE-2979_phillipe_ramalho_3.patch) on revision 1142862

          Show
          Adriano Crestani added a comment - Hi Phillipe, thanks for the quick fix! Just committed your last patch ( LUCENE-2979 _phillipe_ramalho_3.patch) on revision 1142862
          Hide
          Uwe Schindler added a comment - - edited

          There was an issue opened today: LUCENE-3352

          Maybe that is related to the config changes here, perhaps already fixed?

          Show
          Uwe Schindler added a comment - - edited There was an issue opened today: LUCENE-3352 Maybe that is related to the config changes here, perhaps already fixed?
          Hide
          Phillipe Ramalho added a comment -

          Here is a patch that backports the new configuration API to 3.x. I did exactly as I described in my proposal and it seems to be working as expected. I changed the documentation as well (I hope I everything, can you double check that Adriano?).

          I also created a simple example of how to use the new API in package.html and added to both 3.x and trunk.

          Please, let me know if everything looks good and if I didn't break any API.

          Show
          Phillipe Ramalho added a comment - Here is a patch that backports the new configuration API to 3.x. I did exactly as I described in my proposal and it seems to be working as expected. I changed the documentation as well (I hope I everything, can you double check that Adriano?). I also created a simple example of how to use the new API in package.html and added to both 3.x and trunk. Please, let me know if everything looks good and if I didn't break any API.
          Hide
          Phillipe Ramalho added a comment -

          Hi Uwe,

          Is there anything to be fixed in 3352? I see it's a "new feature" JIRA. Am I missing something?

          Currently, I am only working on migrating the old to new API and doing no changes on how the configuration is used. So nothing here changes (at least should not) how ParametricQueryNodeProcessor works.

          Show
          Phillipe Ramalho added a comment - Hi Uwe, Is there anything to be fixed in 3352? I see it's a "new feature" JIRA. Am I missing something? Currently, I am only working on migrating the old to new API and doing no changes on how the configuration is used. So nothing here changes (at least should not) how ParametricQueryNodeProcessor works.
          Hide
          Adriano Crestani added a comment -

          Hi Phillipe,

          Thanks for the patch. I just applied your patch for 3x. It looks good.

          As you removed TestAttributes, can you create another junit to test whether configuration is updated when an attribute (like CharTermAttribute) is updated, which is basically the new functionality of the newly deprecated query parser attributes.

          Show
          Adriano Crestani added a comment - Hi Phillipe, Thanks for the patch. I just applied your patch for 3x. It looks good. As you removed TestAttributes, can you create another junit to test whether configuration is updated when an attribute (like CharTermAttribute) is updated, which is basically the new functionality of the newly deprecated query parser attributes.
          Hide
          Phillipe Ramalho added a comment -

          Creating a junit for attributes was a good idea, I was able to find many problems. I spent the last three weeks working on them and reviewing the code. I think this is the last patch for this jira, it should be done now.

          Show
          Phillipe Ramalho added a comment - Creating a junit for attributes was a good idea, I was able to find many problems. I spent the last three weeks working on them and reviewing the code. I think this is the last patch for this jira, it should be done now.
          Hide
          Adriano Crestani added a comment -

          Hi Phillipe,

          Just committed your latest patch. Thanks for the great job.

          I will set this item as completed now.

          Show
          Adriano Crestani added a comment - Hi Phillipe, Just committed your latest patch. Thanks for the great job. I will set this item as completed now.
          Hide
          Michael McCandless added a comment -

          I think TestAttributes was accidentally committed to the wrong place? It's in core now, but should be under contrib/queryparser instead?

          Show
          Michael McCandless added a comment - I think TestAttributes was accidentally committed to the wrong place? It's in core now, but should be under contrib/queryparser instead?
          Hide
          Michael McCandless added a comment -

          OK, I just moved it.

          Show
          Michael McCandless added a comment - OK, I just moved it.

            People

            • Assignee:
              Adriano Crestani
              Reporter:
              Adriano Crestani
            • Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development