Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.0, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Spinoff from LUCENE-5952:

      • Fix .si to write Version as 3 ints, not a String that requires parsing at read time.
      • Lucene42TermVectorsFormat should not use the same codecName as Lucene41StoredFieldsFormat

      It would also be nice if we had a "bumpCodecVersion" script so rolling a new codec is not so daunting.
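
The first bullet can be illustrated with a minimal sketch. It assumes a fixed-width layout of three big-endian ints; the class `VersionTriple` and its methods are hypothetical names for illustration only and are not Lucene's actual .si writer or reader.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration only; not Lucene's actual .si format or API. */
final class VersionTriple {
  final int major;
  final int minor;
  final int bugfix;

  VersionTriple(int major, int minor, int bugfix) {
    if (major < 0 || minor < 0 || bugfix < 0) {
      throw new IllegalArgumentException("version components must be non-negative");
    }
    this.major = major;
    this.minor = minor;
    this.bugfix = bugfix;
  }

  /** Encodes the version as three fixed-width ints (12 bytes total). */
  byte[] toBytes() {
    return ByteBuffer.allocate(12).putInt(major).putInt(minor).putInt(bugfix).array();
  }

  /** Decodes the three ints directly; no string splitting or number parsing is needed. */
  static VersionTriple fromBytes(byte[] data) {
    if (data.length != 12) {
      throw new IllegalArgumentException("expected 12 bytes, got " + data.length);
    }
    ByteBuffer in = ByteBuffer.wrap(data);
    // Java evaluates arguments left to right, so the ints are read in written order.
    return new VersionTriple(in.getInt(), in.getInt(), in.getInt());
  }

  @Override
  public String toString() {
    return major + "." + minor + "." + bugfix;
  }

  public static void main(String[] args) {
    VersionTriple written = new VersionTriple(5, 0, 0);
    VersionTriple read = fromBytes(written.toBytes());
    System.out.println(read);
  }
}
```

With this kind of layout the reader only has to validate three integers, rather than tolerate every string form a version might have been written in.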

      1. LUCENE-5969_part2.patch
        869 kB
        Robert Muir
      2. LUCENE-5969_part3.patch
        1.18 MB
        Robert Muir
      3. LUCENE-5969.patch
        199 kB
        Robert Muir
      4. LUCENE-5969.patch
        73 kB
        Robert Muir

          Activity

          Robert Muir added a comment -

          I don't think we should do this with a script. More care is needed. I already volunteered to do the bumping. Honestly if we want this to be easier, the fix is to remove or simplify all the crazy tests that need to manipulate the default codec directly. Why must they be so damn complicated like that?

          Michael McCandless added a comment -

          "Honestly if we want this to be easier, the fix is to remove or simplify all the crazy tests that need to manipulate the default codec directly. Why must they be so damn complicated like that?"

          +1, I don't know why so many places must be changed when we move to a new codec ...

          Robert Muir added a comment -

          I will try to clean this up some this morning before switching over any codec. Easy ones first.

          ASF subversion and git services added a comment -

          Commit 1626753 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1626753 ]

          LUCENE-5969: remove unnecessary code

          ASF subversion and git services added a comment -

          Commit 1626754 from Robert Muir in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1626754 ]

          LUCENE-5969: remove unnecessary code

          ASF subversion and git services added a comment -

          Commit 1626778 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1626778 ]

          LUCENE-5969: add TestUtil.getDefaultCodec()

          ASF subversion and git services added a comment -

          Commit 1626781 from Robert Muir in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1626781 ]

          LUCENE-5969: add TestUtil.getDefaultCodec()

          ASF subversion and git services added a comment -

          Commit 1626794 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1626794 ]

          LUCENE-5969: improve this test, we cant uncomment the assert until we fix 5.0 codec

          ASF subversion and git services added a comment -

          Commit 1626796 from Robert Muir in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1626796 ]

          LUCENE-5969: improve this test, we cant uncomment the assert until we fix 5.0 codec

          ASF subversion and git services added a comment -

          Commit 1626826 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1626826 ]

          LUCENE-5969: let AssertingCodec implement RandomAccessOrds, and test it directly too

          ASF subversion and git services added a comment -

          Commit 1626830 from Robert Muir in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1626830 ]

          LUCENE-5969: let AssertingCodec implement RandomAccessOrds, and test it directly too

          ASF subversion and git services added a comment -

          Commit 1626921 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1626921 ]

          LUCENE-5969: move bumping default codec/dv/pf in tests to TestUtil methods, put blocktreeords in rotation

          ASF subversion and git services added a comment -

          Commit 1626922 from Robert Muir in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1626922 ]

          LUCENE-5969: move bumping default codec/dv/pf in tests to TestUtil methods, put blocktreeords in rotation

          Robert Muir added a comment -

          Here is an initial patch.

          I already refactored the tests so that bumping the default codec/PF/DVF is much easier: you just change methods in TestUtil.

          I also added a check to TestAllFilesHaveCodecHeader to look for duplicate codec names (it's commented out until we fix TVF).

          In this patch, I added new infos formats (.SI and .FNM) that don't have all the confusing backwards version stuff. The fnm reader and writer (and checkindex) hard-check fieldinfos consistency on both read and write.

          Also checkindex got a little cleanup, so that "foreign" readers (TestUtil.checkReader) get fieldinfos and livedocs validation, whereas they did not before.

          I added CodecUtil.checkFooter(input, Throwable) to give better exceptions when things are corrupt (e.g., it adds suppressed exception for checksum status), and cut over .SI/.FNM/.NVM/.DVM to use it. I also added standalone tests for this.
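
The idea of attaching checksum status to an earlier failure can be sketched as follows. This is a minimal illustration under stated assumptions: an 8-byte CRC32 footer and in-memory byte arrays. `FooterCheck` and its methods are hypothetical names and do not reproduce the signature or behavior of Lucene's CodecUtil.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

/** Hypothetical sketch of footer verification; not Lucene's CodecUtil. */
final class FooterCheck {
  private static final int FOOTER_LENGTH = 8;

  /** Returns payload followed by an 8-byte footer holding the payload's CRC32. */
  static byte[] withFooter(byte[] payload) {
    CRC32 crc = new CRC32();
    crc.update(payload, 0, payload.length);
    return ByteBuffer.allocate(payload.length + FOOTER_LENGTH)
        .put(payload)
        .putLong(crc.getValue())
        .array();
  }

  /** Recomputes the CRC32 over the payload and compares it with the stored footer. */
  static boolean checksumMatches(byte[] file) {
    if (file.length < FOOTER_LENGTH) {
      return false;
    }
    int payloadLength = file.length - FOOTER_LENGTH;
    CRC32 crc = new CRC32();
    crc.update(file, 0, payloadLength);
    long stored = ByteBuffer.wrap(file, payloadLength, FOOTER_LENGTH).getLong();
    return stored == crc.getValue();
  }

  /**
   * If parsing already failed, records the checksum status as a suppressed
   * exception on the original failure and rethrows it. Otherwise performs a
   * plain checksum check.
   */
  static void checkFooter(byte[] file, RuntimeException priorException) {
    boolean ok = checksumMatches(file);
    if (priorException == null) {
      if (!ok) {
        throw new IllegalStateException("checksum failed");
      }
      return;
    }
    String status = ok
        ? "checksum passed; the failure is probably not caused by corrupted bytes"
        : "checksum failed; the file content is corrupt";
    priorException.addSuppressed(new IllegalStateException(status));
    throw priorException;
  }

  public static void main(String[] args) {
    byte[] file = withFooter(new byte[] {1, 2, 3});
    file[0] ^= 0x7f; // simulate corruption of the payload
    try {
      checkFooter(file, new IllegalArgumentException("invalid field count"));
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage() + " / " + e.getSuppressed()[0].getMessage());
    }
  }
}
```

The point of the design is that the caller still sees the original parse error, while the suppressed exception indicates whether the bytes themselves were damaged.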

          I want to cut over other parts too (like .FDX, .TVX, ...), but we shouldn't use this method until we remove all the conditional versioning and cut "clean" versions (also without bogus codec IDs); otherwise I think it's confusing and potentially unsafe.

          However, I think we should start with this, to unblock Mike's cleanup of SI version handling and other work. I don't think we have to write 5.0's format in one day.

          After we are happy with the 5.0 format, we can then clean up the back compat (trunk doesn't need all the 4.x back compat, etc.).

          Michael McCandless added a comment -

          +1, this is fabulous!

          Uwe Schindler added a comment - - edited

          I fixed the forbidden issue in the process-webpages ant target in svn and made the comment "generic" (it was just a template). You can remove/resolve the conflict on svn up with "use theirs".

          Robert Muir added a comment -

          Thanks Uwe, one less thing to update when bumping codec version, too.

          ASF subversion and git services added a comment -

          Commit 1627187 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1627187 ]

          LUCENE-5969: Add Lucene50Codec (infos, dv, norms)

          ASF subversion and git services added a comment -

          Commit 1627189 from Robert Muir in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1627189 ]

          LUCENE-5969: Add Lucene50Codec (infos, dv, norms)

          Uwe Schindler added a comment -

          Thanks Robert. Unfortunately I was not able to verify the full patch. But the changes with suppressed exceptions looked fine.

          ASF subversion and git services added a comment -

          Commit 1627517 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1627517 ]

          LUCENE-5969: Lucene410Codec -> backwards

          ASF subversion and git services added a comment -

          Commit 1627522 from Robert Muir in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1627522 ]

          LUCENE-5969: Lucene410Codec -> backwards

          ASF subversion and git services added a comment -

          Commit 1627530 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627530 ]

          LUCENE-5969: make branch for the heavy parts

          ASF subversion and git services added a comment -

          Commit 1627535 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627535 ]

          LUCENE-5969, LUCENE-5412: make .si immutable again, and make ancient writers read-only

          ASF subversion and git services added a comment -

          Commit 1627544 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627544 ]

          LUCENE-5969: file mismatch detection for 5.0 fnm

          ASF subversion and git services added a comment -

          Commit 1627564 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627564 ]

          LUCENE-5969: add AssertingLiveDocsFormat

          ASF subversion and git services added a comment -

          Commit 1627593 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627593 ]

          LUCENE-5969: improve checks for livedocs

          ASF subversion and git services added a comment -

          Commit 1627669 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627669 ]

          LUCENE-5969: beef up AssertingLiveDocs more

          ASF subversion and git services added a comment -

          Commit 1627701 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627701 ]

          LUCENE-5969: take bitvector out back and shoot it

          ASF subversion and git services added a comment -

          Commit 1627714 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627714 ]

          LUCENE-5969, LUCENE-5895: fix sign bit bugs in segment/commit IDs, use byte[] representation

          ASF subversion and git services added a comment -

          Commit 1627805 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627805 ]

          LUCENE-5969: add remaining infos/deletions safety

          Robert Muir added a comment -

          Here is a difference between trunk and branch as a patch.

          • fixed .si to be immutable again, so copySegmentAsIs doesn't rewrite it. Instead we strip segment prefixes just like .CFS (LUCENE-5412)
          • disabled write access to all old .SI writers; it's no longer needed.
          • fixed segment/commit unique ID generation (bugs with sign bits). Also changed this to be a byte[] so it can be efficiently encoded.
          • Add CodecUtil.write/checkSegmentHeader, which is a regular header, plus the ID of the segment. This gives us mismatched files detection.
          • beefed up assertingcodec more, with assertinglivedocs
          • add lots of safety to .si/.fnm/.del
          • moved out cruft to backwards-codecs.
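
The segment header and byte[] ID items above can be sketched roughly as below. The layout (length-prefixed codec name, int version, 16-byte ID), the class `SegmentHeaderSketch`, and the codec name `DemoFieldInfos` are assumptions made for illustration; this is not the on-disk format or API that CodecUtil implements.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;

/** Hypothetical sketch of a header carrying a segment ID; not Lucene's CodecUtil. */
final class SegmentHeaderSketch {
  static final int ID_LENGTH = 16;

  /** Generates the ID as raw bytes, so no signed numeric type is involved. */
  static byte[] randomId() {
    byte[] id = new byte[ID_LENGTH];
    new SecureRandom().nextBytes(id);
    return id;
  }

  /** Writes a regular header (codec name and version) followed by the segment ID. */
  static byte[] writeHeader(String codecName, int version, byte[] segmentId) {
    if (segmentId.length != ID_LENGTH) {
      throw new IllegalArgumentException("segment ID must be " + ID_LENGTH + " bytes");
    }
    byte[] name = codecName.getBytes(StandardCharsets.UTF_8);
    return ByteBuffer.allocate(4 + name.length + 4 + ID_LENGTH)
        .putInt(name.length)
        .put(name)
        .putInt(version)
        .put(segmentId)
        .array();
  }

  /** Validates codec name and segment ID, then returns the stored version. */
  static int checkHeader(byte[] header, String expectedCodec, byte[] expectedId) {
    if (header.length < 4 + 4 + ID_LENGTH) {
      throw new IllegalStateException("header is truncated");
    }
    ByteBuffer in = ByteBuffer.wrap(header);
    int nameLength = in.getInt();
    if (nameLength < 0 || nameLength > in.remaining() - 4 - ID_LENGTH) {
      throw new IllegalStateException("header is malformed");
    }
    byte[] name = new byte[nameLength];
    in.get(name);
    String codec = new String(name, StandardCharsets.UTF_8);
    if (!codec.equals(expectedCodec)) {
      throw new IllegalStateException("codec mismatch: " + codec);
    }
    int version = in.getInt();
    byte[] id = new byte[ID_LENGTH];
    in.get(id);
    if (!Arrays.equals(id, expectedId)) {
      throw new IllegalStateException("file mismatch: header belongs to a different segment");
    }
    return version;
  }

  public static void main(String[] args) {
    byte[] id = randomId();
    byte[] header = writeHeader("DemoFieldInfos", 0, id);
    System.out.println("version = " + checkHeader(header, "DemoFieldInfos", id));
  }
}
```

Because every file of a segment would carry the same ID, a file copied in from another segment fails the check at open time, even when its codec name and version are valid.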

          I think this is a good point to merge, and then I will continue on with the other parts of the index.

          Michael McCandless added a comment -

          +1, these are great improvements.

          ASF subversion and git services added a comment -

          Commit 1627941 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1627941 ]

          LUCENE-5969, LUCENE-5412: add more infos/metadata safety

          ASF subversion and git services added a comment -

          Commit 1627943 from Robert Muir in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1627943 ]

          LUCENE-5969, LUCENE-5412: add more infos/metadata safety

          ASF subversion and git services added a comment -

          Commit 1627944 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627944 ]

          LUCENE-5969: delete branch

          ASF subversion and git services added a comment -

          Commit 1627945 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627945 ]

          LUCENE-5969: recreate branch

          ASF subversion and git services added a comment -

          Commit 1627951 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627951 ]

          LUCENE-5969: assert -> check

          ASF subversion and git services added a comment -

          Commit 1627954 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1627954 ]

          LUCENE-5969: assert -> check

          ASF subversion and git services added a comment -

          Commit 1628019 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628019 ]

          LUCENE-5969: copy over cruft for back compat

          ASF subversion and git services added a comment -

          Commit 1628024 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628024 ]

          LUCENE-5969: remove some cruft, header -> segmentHeader

          ASF subversion and git services added a comment -

          Commit 1628026 from Michael McCandless in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628026 ]

          LUCENE-5969: cutover all merge implementations away from LeafReader[] to individual low-level producers

          ASF subversion and git services added a comment -

          Commit 1628070 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628070 ]

          LUCENE-5969: fix compile/javadocs, tighten up backwards codecs, add more safety to 5.x fields/vectors

          ASF subversion and git services added a comment -

          Commit 1628073 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628073 ]

          LUCENE-5969: add merge api

          ASF subversion and git services added a comment -

          Commit 1628077 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628077 ]

          LUCENE-5969: add missing checkIntegrity() calls for segments that cannot be bulk-merged

          ASF subversion and git services added a comment -

          Commit 1628382 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628382 ]

          LUCENE-5969: current state for dv/norms

          ASF subversion and git services added a comment -

          Commit 1628386 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628386 ]

          LUCENE-5969: add segment suffix safety

          ASF subversion and git services added a comment -

          Commit 1628439 from Michael McCandless in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628439 ]

          LUCENE-5969: fix nocommit

          ASF subversion and git services added a comment -

          Commit 1628679 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628679 ]

          LUCENE-5969: clean up unnecessary back compat and add segment header

          ASF subversion and git services added a comment -

          Commit 1628684 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628684 ]

          LUCENE-5969: clean up unnecessary back compat and add segment header

          ASF subversion and git services added a comment -

          Commit 1628688 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628688 ]

          LUCENE-5969: clean up unnecessary back compat and add segment header

          ASF subversion and git services added a comment -

          Commit 1628692 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628692 ]

          LUCENE-5969: more cleanups

          ASF subversion and git services added a comment -

          Commit 1628697 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628697 ]

          LUCENE-5969: give remaining codecs segment headers and cleanups

          ASF subversion and git services added a comment -

          Commit 1628714 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628714 ]

          LUCENE-5969: fix formats

          ASF subversion and git services added a comment -

          Commit 1628889 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628889 ]

          LUCENE-5969: start improving CFSDir

          ASF subversion and git services added a comment -

          Commit 1628996 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1628996 ]

          LUCENE-5969: move CFS to codec

          ASF subversion and git services added a comment -

          Commit 1629001 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629001 ]

          LUCENE-5969: remove back compat

          ASF subversion and git services added a comment -

          Commit 1629008 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629008 ]

          LUCENE-5969: simplify cfs for 5.0

          ASF subversion and git services added a comment -

          Commit 1629105 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629105 ]

          LUCENE-5969: don't use indexfilenames in these codecs

          ASF subversion and git services added a comment -

          Commit 1629106 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629106 ]

          LUCENE-5969: clear nocommit

          ASF subversion and git services added a comment -

          Commit 1629207 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629207 ]

          LUCENE-5969: add cfs to TestIWExceptions2

          ASF subversion and git services added a comment -

          Commit 1629272 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629272 ]

          LUCENE-5969: start porting over tests to BaseCompoundFormatTestCase

          ASF subversion and git services added a comment -

          Commit 1629288 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629288 ]

          LUCENE-5969: add/port more tests

          ASF subversion and git services added a comment -

          Commit 1629303 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629303 ]

          LUCENE-5969: port two remaining TestCompoundFile tests

          ASF subversion and git services added a comment -

          Commit 1629313 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629313 ]

          LUCENE-5969: port remaining tests

          ASF subversion and git services added a comment -

          Commit 1629380 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629380 ]

          LUCENE-5969: don't open the cfs file twice

          ASF subversion and git services added a comment -

          Commit 1629397 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629397 ]

          LUCENE-5969: add SimpleText cfs

          ASF subversion and git services added a comment -

          Commit 1629400 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629400 ]

          LUCENE-5969: fix false fails from tests that look for exact file names

          ASF subversion and git services added a comment -

          Commit 1629401 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629401 ]

          LUCENE-5969: fix test to not rely upon filename count

          ASF subversion and git services added a comment -

          Commit 1629404 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629404 ]

          LUCENE-5969: improved exceptions for ancient codec

          ASF subversion and git services added a comment -

          Commit 1629405 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629405 ]

          LUCENE-5969: improve memory pf

          ASF subversion and git services added a comment -

          Commit 1629406 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629406 ]

          LUCENE-5969: javadocs

          ASF subversion and git services added a comment -

          Commit 1629408 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629408 ]

          LUCENE-5969: add changes and test

          Robert Muir added a comment -

          I think the branch is currently in a good state to do an intermediate merge. Then we can tackle postings and docvalues.
          This patch can be applied, but it's large because of lots of svn moves.

          • All per-segment files now use write/checkSegmentHeader, and they also verify the segment suffix/generation to fully detect mismatched files. I fixed all of 5.0 (except dv/postings, still TODO) and all of codecs/ to do this.
          • All 5.0 init methods (except dv/postings and a couple of formats in codecs/, still TODO) use the new checkFooter(in, Throwable) to append the checksum status as a suppressed exception if they hit corruption on open.
          • CFS is moved to the codec API, with a write method that handles all files at once, and a read method that returns a read-only directory view. Added a new, simpler impl for 5.0, and a SimpleText impl. Moved all CFS tests to BaseCompoundFormatTestCase, which they all use. SegmentReader no longer opens the CFS file twice.
          • Merging uses codec producer APIs instead of readers. This leads to more optimized merging: checksum computation is per-segment/per-producer, and norms and docvalues don't pile up unused fields in RAM during merge. If a field is already loaded, merging uses it; otherwise it loads the field but doesn't cache it. This is important not just for "abuse" cases, but should really improve use cases like offline indexing. I fixed all codecs (5.0, codecs/, backwards/) to not waste RAM like this.
          • 5.0 norms have a new indirect encoding for sparse fields. Currently this is very conservative, at 1/31, to ensure it's more efficient in terms of both space (maximum possible packedints bloat) and time (v log v < maxdoc).
          • Backwards codecs are more contained: I tried to reduce visibility, make them as read-only as possible, ensure all files are deprecated, etc.
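
The header and footer checks described in the first two bullets can be sketched roughly as follows. This is a JDK-only toy, not the code from the patch: the class SegmentHeaderSketch, its file layout, the MAGIC value, and the error messages are all assumptions made for illustration, and Lucene's real CodecUtil methods have different signatures. The point is only the shape of the check: each file records the codec name, version, segment ID and suffix; the reader compares them against what it expects; and when opening fails, the checksum status is attached as a suppressed exception so the report says whether the bytes themselves were damaged.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.CRC32;

/** Illustration only: a toy file layout, not Lucene's CodecUtil or its on-disk format. */
final class SegmentHeaderSketch {
  static final int MAGIC = 0x5EC0DE01; // arbitrary marker chosen for this sketch

  /** Writes header (magic, codec, version, segment ID, suffix), the payload, then a CRC32 footer. */
  static byte[] write(String codec, int version, byte[] segmentId, String suffix, byte[] payload)
      throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bytes);
    out.writeInt(MAGIC);
    out.writeUTF(codec);
    out.writeInt(version);
    out.writeInt(segmentId.length);
    out.write(segmentId);
    out.writeUTF(suffix);
    out.writeInt(payload.length);
    out.write(payload);
    out.flush();
    // The footer covers every byte written so far.
    out.writeLong(crcOf(bytes.toByteArray(), bytes.size()));
    out.flush();
    return bytes.toByteArray();
  }

  /** Verifies header fields against the caller's expectations and returns the payload. */
  static byte[] read(byte[] file, String codec, int maxVersion, byte[] expectedId, String expectedSuffix)
      throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
      if (in.readInt() != MAGIC) throw new IOException("bad magic");
      String actualCodec = in.readUTF();
      if (!actualCodec.equals(codec)) throw new IOException("codec mismatch: " + actualCodec);
      int version = in.readInt();
      if (version < 0 || version > maxVersion) throw new IOException("unsupported version: " + version);
      if (!Arrays.equals(readBytes(in), expectedId)) throw new IOException("segment ID mismatch");
      String suffix = in.readUTF();
      if (!suffix.equals(expectedSuffix)) throw new IOException("suffix mismatch: " + suffix);
      byte[] payload = readBytes(in);
      if (!checksumStatus(file).equals("checksum passed")) throw new IOException("footer mismatch");
      return payload;
    } catch (IOException e) {
      // Attach the checksum status so the caller can tell corruption from a wrong file.
      e.addSuppressed(new IOException(checksumStatus(file)));
      throw e;
    }
  }

  /** Recomputes the CRC32 over everything before the 8-byte footer and compares it. */
  static String checksumStatus(byte[] file) {
    if (file.length < Long.BYTES) return "checksum unreadable (file truncated)";
    int bodyLength = file.length - Long.BYTES;
    long stored = ByteBuffer.wrap(file, bodyLength, Long.BYTES).getLong();
    return stored == crcOf(file, bodyLength) ? "checksum passed" : "checksum failed";
  }

  private static long crcOf(byte[] data, int length) {
    CRC32 crc = new CRC32();
    crc.update(data, 0, length);
    return crc.getValue();
  }

  private static byte[] readBytes(DataInputStream in) throws IOException {
    int length = in.readInt();
    if (length < 0 || length > in.available()) throw new IOException("invalid length: " + length);
    byte[] bytes = new byte[length];
    in.readFully(bytes);
    return bytes;
  }
}
```

In this sketch, reading an intact file with the wrong segment ID fails with "segment ID mismatch" and a suppressed "checksum passed", which points at a mismatched file rather than damaged bytes. Flipping a byte in the header instead yields a suppressed "checksum failed".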
          Michael McCandless added a comment -

          +1 to merge branch back, patch looks awesome.

          Uwe Schindler added a comment -

          Cool! Very nice! I also checked how the "conventional" merging with a dumb foreign (filter) reader is wrapped and whether it's tested correctly.

          ASF subversion and git services added a comment -

          Commit 1629499 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1629499 ]

          LUCENE-5969: Lucene 5.0 codec, round two

          ASF subversion and git services added a comment -

          Commit 1629501 from Robert Muir in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1629501 ]

          LUCENE-5969: Lucene 5.0 codec, round two

          ASF subversion and git services added a comment -

          Commit 1629503 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1629503 ]

          LUCENE-5969: recreate branch for phase 3

          ASF subversion and git services added a comment -

          Commit 1632200 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1632200 ]

          LUCENE-5969: add 5.10 dv with segment header, CONST optimization, and missingBits ghostbuster

          ASF subversion and git services added a comment -

          Commit 1632273 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1632273 ]

          LUCENE-5969: add tests

          ASF subversion and git services added a comment -

          Commit 1632631 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1632631 ]

          LUCENE-5969: use sparsebitset to expand sparse encoding to cover more absurd cases

          ASF subversion and git services added a comment -

          Commit 1632706 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1632706 ]

          LUCENE-5969: fix tests

          ASF subversion and git services added a comment -

          Commit 1633196 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633196 ]

          LUCENE-5969: move old postings back compat to backward-codecs, cleanup PBF related stuff, add segment headers, etc

          ASF subversion and git services added a comment -

          Commit 1633211 from Michael McCandless in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633211 ]

          LUCENE-5969: correct TODOs

          ASF subversion and git services added a comment -

          Commit 1633212 from Michael McCandless in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633212 ]

          LUCENE-5969: woops, revert

          ASF subversion and git services added a comment -

          Commit 1633213 from Michael McCandless in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633213 ]

          LUCENE-5969: correct TODOs

          ASF subversion and git services added a comment -

          Commit 1633385 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633385 ]

          LUCENE-5969: clean up constants

          ASF subversion and git services added a comment -

          Commit 1633429 from Ryan Ernst in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633429 ]

          LUCENE-5969: Copy block tree to backward codecs for 4.0-4.10, remove conditionals in 50 version, add getStats() TermsEnum API to remove need to expose Stats impl

          ASF subversion and git services added a comment -

          Commit 1633441 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633441 ]

          LUCENE-5969: move blocktree stats -> stats api

          ASF subversion and git services added a comment -

          Commit 1633442 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633442 ]

          LUCENE-5969: add missing @Override

          ASF subversion and git services added a comment -

          Commit 1633451 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633451 ]

          LUCENE-5969: fail precommit on typo'ed TODO

          ASF subversion and git services added a comment -

          Commit 1633453 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633453 ]

          LUCENE-5969: fix TOODs and add missing license header

          ASF subversion and git services added a comment -

          Commit 1633465 from Ryan Ernst in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633465 ]

          LUCENE-5969: Move bwc blocktree writer to test module, update 50 writer to use new header reading/writing

          ASF subversion and git services added a comment -

          Commit 1633471 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633471 ]

          LUCENE-5969: test that default codec uses segmentheader for all files. change .si to write its ID the same way for consistency

          ASF subversion and git services added a comment -

          Commit 1633514 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633514 ]

          LUCENE-5969: segmentHeader -> indexHeader, cut over segments_N, detect mismatched .si, consistency of encoding elsewhere

          ASF subversion and git services added a comment -

          Commit 1633526 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633526 ]

          LUCENE-5969: fix precommit, improve file format docs

          ASF subversion and git services added a comment -

          Commit 1633607 from Robert Muir in branch 'dev/branches/lucene5969'
          [ https://svn.apache.org/r1633607 ]

          LUCENE-5969: tidy up format names to be consistent, add missing file formats doc

          Robert Muir added a comment -

          Here is a patch for part 3. I think it's ready; we should close the issue after this.
          Other improvements can be separate issues from here.
          Also, after resolving this issue and backporting, we can do further cleanups in trunk: remove all the 4.x support in backwards-codecs and clean up SegmentInfos further.

          The patch finishes adding the safety checks everywhere (docvalues, terms, postings, commit points). CodecUtil "segmentHeader" is renamed to "indexHeader", as it's used for all index files (including commit points).

          BlockTree no longer "backdoors" via CheckIndex to return stats; there is a dead simple API for this.

          Norms sparse encoding is further improved with a PATCHED strategy.

          There is an API change for SegmentInfos for safety. Instead of instance methods that read into a "mutable" SIS:

          SegmentInfos.read(Dir);
          SegmentInfos.read(Dir, file);

          these are now static methods that return a clean instance (renamed readLatestCommit and readCommit respectively, so that upgrades are not fragile).
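
The difference between the two shapes can be sketched with a toy example. The classes CommitSketch and MutableCommitSketch and the comma-separated "commit" string are hypothetical stand-ins invented for this illustration; they are not Lucene's SegmentInfos or its file format. The sketch only shows why a static factory is the safer shape: a failed read never leaves a half-populated object behind.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for a commit reader built by a static factory. */
final class CommitSketch {
  private final List<String> segments;

  private CommitSketch(List<String> segments) {
    this.segments = Collections.unmodifiableList(segments);
  }

  /** Parses into a local list and only constructs the instance once parsing succeeded. */
  static CommitSketch readCommit(String encoded) {
    List<String> parsed = new ArrayList<>();
    for (String name : encoded.split(",")) {
      if (name.isEmpty()) throw new IllegalArgumentException("corrupt commit: empty segment name");
      parsed.add(name);
    }
    return new CommitSketch(parsed);
  }

  List<String> segments() {
    return segments;
  }
}

/** The older shape, for contrast: an instance method that fills in mutable state. */
final class MutableCommitSketch {
  final List<String> segments = new ArrayList<>();

  void read(String encoded) {
    for (String name : encoded.split(",")) {
      if (name.isEmpty()) throw new IllegalArgumentException("corrupt commit: empty segment name");
      segments.add(name); // entries added before a later failure stay behind
    }
  }
}
```

With the input "_0,,_2", the mutable variant throws but keeps "_0" in its list, while the factory variant throws without ever exposing an instance.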

          There is more to fix here. IMO SIS "tries to take on too much": mutable state used by IndexWriter, tracking of counters etc. by IndexWriter, reading/writing commits, trying to be a "low level user-friendly" API, and too many publicly exposed dangers. This is all for a heavily versioned, important file with conditional logic. But that's a bigger problem.

          Michael McCandless added a comment -

          +1, looks great. Wonderful to have full ID cross checking, and TOOD-banning.

          ASF subversion and git services added a comment -

          Commit 1633991 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1633991 ]

          LUCENE-5969: finish porting rest of codec to 5.0 features

          ASF subversion and git services added a comment -

          Commit 1633993 from Robert Muir in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1633993 ]

          LUCENE-5969: finish porting rest of codec to 5.0 features

          ASF subversion and git services added a comment -

          Commit 1633998 from Robert Muir in branch 'dev/trunk'
          [ https://svn.apache.org/r1633998 ]

          LUCENE-5969: remove 4.x back compat

          ASF subversion and git services added a comment -

          Commit 1634064 from Robert Muir in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1634064 ]

          use correct constants/javadocs refs (backport from LUCENE-5969)

          Anshum Gupta added a comment -

          Bulk close after 5.0 release.


            People

            • Assignee:
              Unassigned
              Reporter:
              Michael McCandless
            • Votes:
              0
              Watchers:
              3
