Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-301

Index Writer constructor flags unclear - and annoying in certain cases

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4
    • Fix Version/s: 2.1
    • Component/s: core/index
    • Labels:
      None
    • Environment:

      Operating System: other
      Platform: Other

    • Bugzilla Id:
      32088

      Description

      Wouldn't it make more sense if the constructor for the IndexWriter always
      created an index if it doesn't exist - and the boolean parameter should be
      "clear" (instead of "create")

      So instead of this (from javadoc):

      IndexWriter

      public IndexWriter(Directory d,
      Analyzer a,
      boolean create)
      throws IOException

      Constructs an IndexWriter for the index in d. Text will be analyzed with a.
      If create is true, then a new, empty index will be created in d, replacing the
      index already there, if any.

      Parameters:
      d - the index directory
      a - the analyzer to use
      create - true to create the index or overwrite the existing one; false to
      append to the existing index
      Throws:
      IOException - if the directory cannot be read/written to, or if it does not
      exist, and create is false

      We would have this:

      IndexWriter

      public IndexWriter(Directory d,
      Analyzer a,
      boolean clear)
      throws IOException

      Constructs an IndexWriter for the index in d. Text will be analyzed with a.
      If clear is true, and a index exists at location d, then it will be erased, and
      a new, empty index will be created in d.

      Parameters:
      d - the index directory
      a - the analyzer to use
      clear - true to overwrite the existing one; false to append to the existing
      index
      Throws:
      IOException - if the directory cannot be read/written to, or if it does not
      exist.

      Its current behavior is kind of annoying, because I have an app that should
      never clear an existing index, it should always append. So I want create set to
      false. But when I am starting a brand new index, then I have to change the
      create flag to keep it from throwing an exception... I guess for now I will
      have to write code to check if a index actually has content yet, and if it
      doesn't, change the flag on the fly.

        Activity

        Hide
        armbrust Dan Armbrust added a comment -

        Also note that, by changing the parameters in this way, no breakage or behavior
        change occurs for end users.

        When the 3rd parameter flag is set to true - it will erase the existing index,
        just like it does now.

        If it is set to false, it will not erase the existing index - it will append,
        just like it does now.

        The only change (and I consider it an improvment, and unlikely to break anyones
        code) is that if it is set to false, and no index exists yet, then it will
        create a new empty index at the specified location.

        Show
        armbrust Dan Armbrust added a comment - Also note that, by changing the parameters in this way, no breakage or behavior change occurs for end users. When the 3rd parameter flag is set to true - it will erase the existing index, just like it does now. If it is set to false, it will not erase the existing index - it will append, just like it does now. The only change (and I consider it an improvment, and unlikely to break anyones code) is that if it is set to false, and no index exists yet, then it will create a new empty index at the specified location.
        Hide
        armbrust Dan Armbrust added a comment -

        Could this be addresses or WONT FIXED before the 1.9 release?

        Show
        armbrust Dan Armbrust added a comment - Could this be addresses or WONT FIXED before the 1.9 release?
        Hide
        hossman Hoss Man added a comment -

        >The only change (and I consider it an improvment, and unlikely to break anyones
        >code) is that if it is set to false, and no index exists yet, then it will
        >create a new empty index at the specified location.

        ...that could be very bad for tools/apps that wantto update an existing index after reading the directory path from user input or configs when the user makes a typo. In the past "new IndexWriter(d,a,false) would throw an Exception indicating a problem, with the change you describe it would happily make a new index.

        I'd be happy to see a more explict API with more options about what you want IndexWRiter to do if/when the directory does'doesn't exist ... but it should be 100% backwards compatible. deprecating this constructor but leaving it alone and adding a new one with more options seems like a much safer approach.

        Show
        hossman Hoss Man added a comment - >The only change (and I consider it an improvment, and unlikely to break anyones >code) is that if it is set to false, and no index exists yet, then it will >create a new empty index at the specified location. ...that could be very bad for tools/apps that wantto update an existing index after reading the directory path from user input or configs when the user makes a typo. In the past "new IndexWriter(d,a,false) would throw an Exception indicating a problem, with the change you describe it would happily make a new index. I'd be happy to see a more explict API with more options about what you want IndexWRiter to do if/when the directory does'doesn't exist ... but it should be 100% backwards compatible. deprecating this constructor but leaving it alone and adding a new one with more options seems like a much safer approach.
        Hide
        cutting Doug Cutting added a comment -

        I agree with Hoss that we need to be very careful about compatibility.

        Why not add a new constructor, IndexWriter(Directory, Analyzer) which, if no such index exists, creates one? Or we could deprecate the existing constructors and add new ones whose third parameter is a type-safe enumeration of IfExists:OVERWRITE, IfExists:APPEND or somesuch.

        Show
        cutting Doug Cutting added a comment - I agree with Hoss that we need to be very careful about compatibility. Why not add a new constructor, IndexWriter(Directory, Analyzer) which, if no such index exists, creates one? Or we could deprecate the existing constructors and add new ones whose third parameter is a type-safe enumeration of IfExists:OVERWRITE, IfExists:APPEND or somesuch.
        Hide
        armbrust Dan Armbrust added a comment -

        I'm perfectly happy with either new constructor approach - as long as there is a better constructor than what is currently available for my (and I suspect many other folks) "normal" use case of these constructors.

        It doesn't really matter to me if the current one stays or gets deprecated. If it was going to be deprecated, it seems like 1.9 would be a good place to do it.

        Show
        armbrust Dan Armbrust added a comment - I'm perfectly happy with either new constructor approach - as long as there is a better constructor than what is currently available for my (and I suspect many other folks) "normal" use case of these constructors. It doesn't really matter to me if the current one stays or gets deprecated. If it was going to be deprecated, it seems like 1.9 would be a good place to do it.
        Hide
        mikemccand Michael McCandless added a comment -

        I like Doug's solution "add a new constructor, IndexWriter(Directory, Analyzer) which, if no such index exists, creates one" – I think this makses sense. I will commit this.

        Show
        mikemccand Michael McCandless added a comment - I like Doug's solution "add a new constructor, IndexWriter(Directory, Analyzer) which, if no such index exists, creates one" – I think this makses sense. I will commit this.
        Hide
        mikemccand Michael McCandless added a comment -

        OK I've added the new constructors for IndexWriter. Thank you Dan and sorry for the long delay!

        Show
        mikemccand Michael McCandless added a comment - OK I've added the new constructors for IndexWriter. Thank you Dan and sorry for the long delay!
        Hide
        mikemccand Michael McCandless added a comment -

        Closing all issues that were resolved for 2.1.

        Show
        mikemccand Michael McCandless added a comment - Closing all issues that were resolved for 2.1.

          People

          • Assignee:
            mikemccand Michael McCandless
            Reporter:
            armbrust Dan Armbrust
          • Votes:
            1 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development