Issue Details

Key: LUCENE-510
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Michael McCandless
Reporter: Doug Cutting
Votes: 5
Watchers: 3
Lucene - Java

IndexOutput.writeString() should write length in bytes

Created: 03/Mar/06 03:30 AM   Updated: 11/Oct/08 12:49 PM
Component/s: Store
Affects Version/s: 2.1
Fix Version/s: 2.4

Time Tracking:
Not Specified

File Attachments (all licensed for inclusion in ASF works):
  File                      Date                 Author              Size
  LUCENE-510.patch          2008-03-04 08:55 PM  Michael McCandless  51 kB
  LUCENE-510.take2.patch    2008-03-17 08:04 PM  Michael McCandless  120 kB
  SortExternal.java         2006-06-05 08:54 AM  Marvin Humphrey     15 kB
  strings.diff              2006-05-09 04:04 AM  Marvin Humphrey     27 kB
  TestSortExternal.java     2006-06-05 08:54 AM  Marvin Humphrey     6 kB
Issue Links:
Reference

Resolution Date: 09/May/08 12:05 PM


Description
We should change the format of strings written to indexes so that the length of the string is in bytes, not Java characters. This issue has been discussed at:

http://www.mail-archive.com/java-dev@lucene.apache.org/msg01970.html
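To make the distinction concrete, here is a minimal sketch of a length-prefixed string writer whose prefix counts UTF-8 bytes rather than Java chars. The class name ByteLengthString and the helper methods are hypothetical and written only for illustration; this is not the code from the attached patches, and it uses a plain ByteArrayOutputStream rather than Lucene's IndexOutput.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not Lucene's IndexOutput.
class ByteLengthString {

    // Writes a non-negative int as a variable-length quantity: 7 data bits
    // per byte, with the high bit set on every byte except the last.
    static void writeVInt(ByteArrayOutputStream out, int i) {
        while ((i & ~0x7F) != 0) {
            out.write((i & 0x7F) | 0x80);
            i >>>= 7;
        }
        out.write(i);
    }

    // Encodes the string as standard UTF-8 and prefixes it with the
    // number of encoded BYTES (not s.length(), which counts Java chars).
    static byte[] writeString(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, utf8.length);
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String s = "na\u00EFve";                 // the i-diaeresis needs 2 UTF-8 bytes
        byte[] encoded = writeString(s);
        System.out.println(s.length());          // 5 Java chars
        System.out.println(encoded[0]);          // 6, the byte-length prefix
        System.out.println(encoded.length);      // 7 = 1 prefix byte + 6 payload bytes
    }
}
```

With a byte-count prefix, a reader can skip or bulk-copy a string without decoding it, which a char-count prefix does not allow once any character needs more than one byte.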

We must increment the file format number to indicate this change. At least the format number in the segments file should change.
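The role of the format number can be sketched as follows: a reader inspects it first and then interprets the length prefix accordingly. Everything here is an assumption made for illustration: the class VersionedStringReader, the constants FORMAT_CHAR_LENGTH and FORMAT_BYTE_LENGTH, their values, and the toy file layout (a 4-byte format number, a 1-byte length, then the string data) are not Lucene's actual segments format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of gating string decoding on a format number.
class VersionedStringReader {
    // Made-up format numbers; real index formats define their own.
    static final int FORMAT_CHAR_LENGTH = 1; // length prefix counts Java chars
    static final int FORMAT_BYTE_LENGTH = 2; // length prefix counts UTF-8 bytes

    static String read(byte[] file) {
        ByteBuffer buf = ByteBuffer.wrap(file);
        int format = buf.getInt();
        int length = buf.get() & 0xFF; // 1-byte length keeps the sketch short
        if (format == FORMAT_BYTE_LENGTH) {
            // New layout: the prefix says exactly how many bytes to take.
            byte[] bytes = new byte[length];
            buf.get(bytes);
            return new String(bytes, StandardCharsets.UTF_8);
        }
        if (format != FORMAT_CHAR_LENGTH) {
            throw new IllegalArgumentException("unknown format " + format);
        }
        // Old layout: the prefix counts chars, so every char must be decoded
        // (1-, 2-, or 3-byte sequences) to find where the string ends.
        char[] chars = new char[length];
        for (int i = 0; i < length; i++) {
            int b = buf.get() & 0xFF;
            if ((b & 0x80) == 0) {
                chars[i] = (char) b;
            } else if ((b & 0xE0) == 0xC0) {
                int b2 = buf.get() & 0x3F;
                chars[i] = (char) (((b & 0x1F) << 6) | b2);
            } else {
                int b2 = buf.get() & 0x3F;
                int b3 = buf.get() & 0x3F;
                chars[i] = (char) (((b & 0x0F) << 12) | (b2 << 6) | b3);
            }
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        byte[] oldFile = {0, 0, 0, 1, 5, 'n', 'a', (byte) 0xC3, (byte) 0xAF, 'v', 'e'};
        byte[] newFile = {0, 0, 0, 2, 6, 'n', 'a', (byte) 0xC3, (byte) 0xAF, 'v', 'e'};
        System.out.println(read(oldFile).equals(read(newFile))); // true
    }
}
```

The two sample files carry identical string bytes and differ only in the format number and the length prefix (5 chars versus 6 bytes), which is why the reader cannot decode a string correctly without knowing the format.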

I'm targeting this for 2.1, i.e., we shouldn't commit it to trunk until after 2.0 is released, to minimize incompatible changes between 1.9 and 2.0 (other than removal of deprecated features).



Subversion Commits
Repository: ASF
Revision: #641303
Date: Wed Mar 26 13:39:25 UTC 2008
User: mikemccand
Message: LUCENE-510: change index format to store strings as true UTF8 not modified UTF8
Files Changed
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/TermInfosReader.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/TestBackwardsCompatibility.java
MODIFY /lucene/java/trunk/docs/fileformats.html
MODIFY /lucene/java/trunk/NOTICE.txt
MODIFY /lucene/java/trunk/src/site/src/documentation/content/xdocs/fileformats.xml
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/TermVectorsWriter.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/DocumentsWriterThreadState.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/TestIndexInput.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/DocumentsWriterFieldData.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/index.23.nocfs.zip
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/index.23.cfs.zip
ADD /lucene/java/trunk/src/java/org/apache/lucene/util/UnicodeUtil.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/SegmentMerger.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/document/Document.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/TestStressIndexing2.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/TermInfosWriter.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/DocumentsWriter.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/util/StringHelper.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/TermVectorsReader.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/TermBuffer.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/FieldsWriter.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/TestIndexWriter.java
MODIFY /lucene/java/trunk/LICENSE.txt
MODIFY /lucene/java/trunk/docs/fileformats.pdf
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/store/IndexOutput.java
MODIFY /lucene/java/trunk/CHANGES.txt
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/store/IndexInput.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/SegmentTermEnum.java
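
For readers unfamiliar with the "true UTF8" versus "modified UTF8" distinction in the commit message above, the following sketch compares encoded sizes for two cases where the encodings differ. It uses java.io.DataOutputStream.writeUTF from the JDK as a stand-in for modified UTF-8; it is not the encoder from this commit, and the class name Utf8Flavors is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration using only JDK classes.
class Utf8Flavors {

    // Payload size in Java's modified UTF-8, as produced by writeUTF.
    static int modifiedUtf8Length(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            new DataOutputStream(buf).writeUTF(s);
            return buf.size() - 2; // drop writeUTF's 2-byte length header
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Payload size in standard UTF-8.
    static int standardUtf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String nul = "\0";             // U+0000
        String clef = "\uD834\uDD1E";  // U+1D11E, a surrogate pair in Java

        System.out.println(modifiedUtf8Length(nul));    // 2 (encoded as C0 80)
        System.out.println(standardUtf8Length(nul));    // 1
        System.out.println(modifiedUtf8Length(clef));   // 6 (each surrogate takes 3 bytes)
        System.out.println(standardUtf8Length(clef));   // 4
        System.out.println(modifiedUtf8Length("abc"));  // 3, same as standard for ASCII
    }
}
```

The two encodings agree for most text, so the difference only surfaces for U+0000 and for characters outside the Basic Multilingual Plane.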

Repository: ASF
Revision: #654774
Date: Fri May 09 12:04:46 UTC 2008
User: mikemccand
Message: LUCENE-510: fix backwards compatibility bug when bulk-merging stored fields from pre-UTF8 segments that contain non-ascii stored fields
Files Changed
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/TestBackwardsCompatibility.java
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/FieldsReader.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/util/_TestUtil.java
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/index.23.nocfs.zip
MODIFY /lucene/java/trunk/src/test/org/apache/lucene/index/index.23.cfs.zip
MODIFY /lucene/java/trunk/src/java/org/apache/lucene/index/SegmentMerger.java