Accumulo
  1. Accumulo
  2. ACCUMULO-836

Specify Charset on getBytes() call for String objects.

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 1.5.0
    • Fix Version/s: 1.5.0
    • Component/s: None
    • Labels:
      None

      Description

      The comments on ACCUMULO-241 indicate that the build server might have a different default Charset than computers used by developers. Therefore, some of the tests have different results on different computers.

      Every getBytes call on a String object should specify the UTF8 Charset. Unfortunately the codebase has nearly 1,800 getBytes calls.

      1. UTF8.java
        6 kB
        David Medinets

        Issue Links

          Activity

          Hide
          David Medinets added a comment -

          See the http://www.mail-archive.com/dev@accumulo.apache.org/msg02051.html thread for more details. Essentially there were some suggestions for alternatives but it seems like using UTF8 is a safe change.

          Show
          David Medinets added a comment - See the http://www.mail-archive.com/dev@accumulo.apache.org/msg02051.html thread for more details. Essentially there were some suggestions for alternatives but it seems like using UTF8 is a safe change.
          Hide
          David Medinets added a comment -

          The attached file shows how I performed the search for getBytes() and has a list of files that continue to have getBytes() because it is
          called on a Text or some other kind of object. The code just prints a list of files using getBytes(). Then I manually reviewed the files,
          made changes or added the file name to the ignore list.

          Show
          David Medinets added a comment - The attached file shows how I performed the search for getBytes() and has a list of files that continue to have getBytes() because it is called on a Text or some other kind of object. The code just prints a list of files using getBytes(). Then I manually reviewed the files, made changes or added the file name to the ignore list.
          Hide
          David Medinets added a comment -

          SVN#1403871 - I have changed 82 files to use getBytes(utf8). Each files uses the following Charset declaration.

          private static final Charset utf8 = Charset.forName("UTF8");

          If at some future time the Charset should be changed or another approach is decided upon, simply search for that string and refactoring will be straightforward.

          Show
          David Medinets added a comment - SVN#1403871 - I have changed 82 files to use getBytes(utf8). Each files uses the following Charset declaration. private static final Charset utf8 = Charset.forName("UTF8"); If at some future time the Charset should be changed or another approach is decided upon, simply search for that string and refactoring will be straightforward.
          Hide
          David Medinets added a comment -

          see my last comment.

          Show
          David Medinets added a comment - see my last comment.
          Hide
          Hudson added a comment -

          Integrated in Accumulo-Trunk #540 (See https://builds.apache.org/job/Accumulo-Trunk/540/)
          ACCUMULO-836: Specify Charset on getBytes() call for String objects. (Revision 1403871)

          Result = SUCCESS
          medined :
          Files :

          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/admin/TableOperationsImpl.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mapreduce/AccumuloOutputFormat.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mapreduce/InputFormatBase.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/partition/RangePartitioner.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mock/MockShell.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/data/ArrayByteSequence.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/file/BloomFilterLayer.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/file/rfile/RFile.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/ColumnFamilyCounter.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/LongCombiner.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/aggregation/StringMax.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/aggregation/StringMin.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/aggregation/StringSummation.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/conf/ColumnSet.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/GrepIterator.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/IntersectingIterator.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/LargeRowFilter.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/RegExFilter.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/RowDeletingIterator.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/SummingArrayCombiner.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/Authorizations.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/ColumnVisibility.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/trace/InstanceUserPassword.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/ByteArraySet.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/Encoding.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/Merge.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/MetadataTable.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/TextUtil.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/Shell.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/AddSplitsCommand.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/AuthenticateCommand.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/CreateTableCommand.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/CreateUserCommand.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/HiddenCommand.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/PasswdCommand.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/QuotedStringTokenizer.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/UserCommand.java
          • /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/ZooStore.java
          • /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/zookeeper/DistributedReadWriteLock.java
          • /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/zookeeper/ZooReservation.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/gc/SimpleGarbageCollector.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/Master.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/recovery/SubmitFileForRecovery.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/DeadServerList.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/SetGoalState.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/TServerInstance.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/TabletStateChangeIterator.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/ZooTabletStateStore.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/tables/TableManager.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/BulkImport.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/CompactRange.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/ImportTable.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/RenameTable.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/Utils.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tserverOps/ShutdownTServer.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/metanalysis/IndexMeta.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/monitor/ZooKeeperStatus.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/monitor/servlets/DefaultServlet.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/monitor/servlets/trace/Basic.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/security/SecurityConstants.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/security/ZKAuthenticator.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/Tablet.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/TabletServer.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/UniqueNameAllocator.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/trace/TraceServer.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/AddFilesWithMissingEntries.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/Admin.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/ChangeSecret.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/CheckForMetadataProblems.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/CleanZookeeper.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/Initialize.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/LocalityCheck.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/MetadataTable.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/RandomWriter.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/RemoveEntriesForMissingFiles.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/RestoreZookeeper.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/SendLogToChainsaw.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/SystemPropUtil.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/TablePropUtil.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/VerifyTabletAssignments.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/zookeeper/ZooQueueLock.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/zookeeper/ZooReaderWriter.java
          Show
          Hudson added a comment - Integrated in Accumulo-Trunk #540 (See https://builds.apache.org/job/Accumulo-Trunk/540/ ) ACCUMULO-836 : Specify Charset on getBytes() call for String objects. (Revision 1403871) Result = SUCCESS medined : Files : /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/admin/TableOperationsImpl.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mapreduce/AccumuloOutputFormat.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mapreduce/InputFormatBase.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/partition/RangePartitioner.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mock/MockShell.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/data/ArrayByteSequence.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/file/BloomFilterLayer.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/file/rfile/RFile.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/ColumnFamilyCounter.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/LongCombiner.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/aggregation/StringMax.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/aggregation/StringMin.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/aggregation/StringSummation.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/conf/ColumnSet.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/GrepIterator.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/IntersectingIterator.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/LargeRowFilter.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/RegExFilter.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/RowDeletingIterator.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/SummingArrayCombiner.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/Authorizations.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/ColumnVisibility.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/trace/InstanceUserPassword.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/ByteArraySet.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/Encoding.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/Merge.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/MetadataTable.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/TextUtil.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/Shell.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/AddSplitsCommand.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/AuthenticateCommand.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/CreateTableCommand.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/CreateUserCommand.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/HiddenCommand.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/PasswdCommand.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/QuotedStringTokenizer.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/UserCommand.java /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/ZooStore.java /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/zookeeper/DistributedReadWriteLock.java /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/zookeeper/ZooReservation.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/gc/SimpleGarbageCollector.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/Master.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/recovery/SubmitFileForRecovery.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/DeadServerList.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/SetGoalState.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/TServerInstance.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/TabletStateChangeIterator.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/ZooTabletStateStore.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/tables/TableManager.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/BulkImport.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/CompactRange.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/ImportTable.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/RenameTable.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/Utils.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tserverOps/ShutdownTServer.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/metanalysis/IndexMeta.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/monitor/ZooKeeperStatus.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/monitor/servlets/DefaultServlet.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/monitor/servlets/trace/Basic.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/security/SecurityConstants.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/security/ZKAuthenticator.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/Tablet.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/TabletServer.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/UniqueNameAllocator.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/trace/TraceServer.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/AddFilesWithMissingEntries.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/Admin.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/ChangeSecret.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/CheckForMetadataProblems.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/CleanZookeeper.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/Initialize.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/LocalityCheck.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/MetadataTable.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/RandomWriter.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/RemoveEntriesForMissingFiles.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/RestoreZookeeper.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/SendLogToChainsaw.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/SystemPropUtil.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/TablePropUtil.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/VerifyTabletAssignments.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/zookeeper/ZooQueueLock.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/zookeeper/ZooReaderWriter.java
          Hide
          Christopher Tubbs added a comment -

          I think these changes should be reviewed to see if they are necessary and/or consistent with the discussion of the two appropriate scopes of encoding issues on ACCUMULO-840. I can review, but I think having another reviewer would also be good, if we have a volunteer.

          Show
          Christopher Tubbs added a comment - I think these changes should be reviewed to see if they are necessary and/or consistent with the discussion of the two appropriate scopes of encoding issues on ACCUMULO-840 . I can review, but I think having another reviewer would also be good, if we have a volunteer.
          Hide
          Christopher Tubbs added a comment -

          So, I looked over these changes, and didn't see anything that would be too problematic... but... I did notice that there are places where we are decoding bytes into a String, using new String(byte[]), but not specifying the encoding of the byte[]. This causes a discrepancy in some cases with the corresponding setter that uses .getBytes(utf8). For instance, in InputFormatBase, we have

          conf.set(PASSWORD, new String(Base64.encodeBase64(passwd)));

          Aside from the problematic fact that this Base64 library encodes to a byte[] instead of to a String, it doesn't document the fact that these bytes are ASCII encoded. If the user's system had a default encoding that is incompatible with ASCII, this constructor may behave unexpectedly or throw an exception, as it decodes the ASCII into the Java String type. Reading the password has no such problem... if the password is "Stringified" into the job configuration without error, then calling getBytes(utf8) on the ASCII characters in the Java String should not throw an exception. While this is not likely to cause a problem in the overwhelming majority of cases, it seems inconsistent to be pedantic about encoding with .getBytes() when we aren't equally pedantic about decoding with new String(byte[]) and similar.

          So, to summarize, I think this is generally on the right path, but needs more focus on both sides of serialization/deserialization of transient(M/R) / persistent(zoo) state.

          Show
          Christopher Tubbs added a comment - So, I looked over these changes, and didn't see anything that would be too problematic... but... I did notice that there are places where we are decoding bytes into a String, using new String(byte[]), but not specifying the encoding of the byte[]. This causes a discrepancy in some cases with the corresponding setter that uses .getBytes(utf8). For instance, in InputFormatBase, we have conf.set(PASSWORD, new String (Base64.encodeBase64(passwd))); Aside from the problematic fact that this Base64 library encodes to a byte[] instead of to a String, it doesn't document the fact that these bytes are ASCII encoded. If the user's system had a default encoding that is incompatible with ASCII, this constructor may behave unexpectedly or throw an exception, as it decodes the ASCII into the Java String type. Reading the password has no such problem... if the password is "Stringified" into the job configuration without error, then calling getBytes(utf8) on the ASCII characters in the Java String should not throw an exception. While this is not likely to cause a problem in the overwhelming majority of cases, it seems inconsistent to be pedantic about encoding with .getBytes() when we aren't equally pedantic about decoding with new String(byte[]) and similar. So, to summarize, I think this is generally on the right path, but needs more focus on both sides of serialization/deserialization of transient(M/R) / persistent(zoo) state.
          Hide
          Josh Elser added a comment -

          GrepIterator: It should be noted (javadoc) that the String being converted to bytes will be treated as UTF-8 encoded bytes or not make the UTF-8 assertion at all.

          MetadataTable#encode(), DistributedReadWriteLock#getLockData(): Should note that the byte[] return from the specified method is utf-8 bytes.

          LongCombiner.StringEncoder, StringMax, StringMin, StringSummation, SummingArrayCombiner.StringArrayEncoder, Authorizations, Master#mergeMetadataRecords: These classes are creating bytes that are UTF-8, but when the bytes are initially read into a String (from a Value typically), the default encoding is used (String constructor that takes a byte array). This leads to inconsistency as the data could have been read as something other than UTF-8 but then written back out as UTF-8. A decision needs to make what to do and that decision needs to be documented.

          ZooStore: Some awkwardness pops out at me in #setProperty(long, String, Serializable) manually adding bytes to the data to be written to ZooKeeper. I don't think UTF-8 will cause any problems, but it could definitely use some clarification.

          TraceServer.Receiver, IndexMeta, AddFilesWithMissingEntries, MetadataTable: Writes out a Value in utf-8 bytes, but I'm not confident if there is any case in which a client reading that data would expect something else. Documentation again would be useful. The likelihood of this being an issue is probably small considering that Hadoop's WritableUtils encodes Strings as UTF-8.

          I'm still a little concerned about access points to ZooKeeper and !METADATA, but given that ZooReaderWriter was converting the username and password as UTF-8 bytes I feel slightly better. I should dig into that code more tomorrow.

          One final statement, I still believe that in the ambiguous cases where core classes read arbitrary bytes and write UTF-8 bytes, Accumulo should be agnostic and not make encoding assertions. In other words, I think we should revert those changes and leave it up to the user to decide how they handle their bytes.

          Show
          Josh Elser added a comment - GrepIterator : It should be noted (javadoc) that the String being converted to bytes will be treated as UTF-8 encoded bytes or not make the UTF-8 assertion at all. MetadataTable#encode(), DistributedReadWriteLock#getLockData() : Should note that the byte[] return from the specified method is utf-8 bytes. LongCombiner.StringEncoder, StringMax, StringMin, StringSummation, SummingArrayCombiner.StringArrayEncoder, Authorizations, Master#mergeMetadataRecords : These classes are creating bytes that are UTF-8, but when the bytes are initially read into a String (from a Value typically), the default encoding is used (String constructor that takes a byte array). This leads to inconsistency as the data could have been read as something other than UTF-8 but then written back out as UTF-8. A decision needs to make what to do and that decision needs to be documented. ZooStore : Some awkwardness pops out at me in #setProperty(long, String, Serializable) manually adding bytes to the data to be written to ZooKeeper. I don't think UTF-8 will cause any problems, but it could definitely use some clarification. TraceServer.Receiver, IndexMeta, AddFilesWithMissingEntries, MetadataTable : Writes out a Value in utf-8 bytes, but I'm not confident if there is any case in which a client reading that data would expect something else. Documentation again would be useful. The likelihood of this being an issue is probably small considering that Hadoop's WritableUtils encodes Strings as UTF-8. I'm still a little concerned about access points to ZooKeeper and !METADATA, but given that ZooReaderWriter was converting the username and password as UTF-8 bytes I feel slightly better. I should dig into that code more tomorrow. One final statement, I still believe that in the ambiguous cases where core classes read arbitrary bytes and write UTF-8 bytes, Accumulo should be agnostic and not make encoding assertions. In other words, I think we should revert those changes and leave it up to the user to decide how they handle their bytes.
          Hide
          David Medinets added a comment -

          >One final statement, I still believe that in the ambiguous
          >cases where core classes read arbitrary bytes and write
          >UTF-8 bytes, Accumulo should be agnostic and not make
          >encoding assertions. In other words, I think we should revert
          >those changes and leave it up to the user to decide how they
          >handle their bytes.

          If someone does revert the change, please add a comment about why no encoding is specified so that future code reviews build on current knowledge.

          Show
          David Medinets added a comment - >One final statement, I still believe that in the ambiguous >cases where core classes read arbitrary bytes and write >UTF-8 bytes, Accumulo should be agnostic and not make >encoding assertions. In other words, I think we should revert >those changes and leave it up to the user to decide how they >handle their bytes. If someone does revert the change, please add a comment about why no encoding is specified so that future code reviews build on current knowledge.
          Hide
          John Vines added a comment -

          I've reverted the change due to the discussion http://mail-archives.apache.org/mod_mbox/accumulo-dev/201210.mbox/%3CCAOiJXP4Nenbv7dMkzTGaOeHguPJEHV7hc2CV7BGZWUfovmaPZQ@mail.gmail.com%3E . FTR, encoding is not being specified this way CURRENTLY because we're still trying to determine the best it should be specified as well as in what locations it actually needs to be/should be specified, if at all.

          Show
          John Vines added a comment - I've reverted the change due to the discussion http://mail-archives.apache.org/mod_mbox/accumulo-dev/201210.mbox/%3CCAOiJXP4Nenbv7dMkzTGaOeHguPJEHV7hc2CV7BGZWUfovmaPZQ@mail.gmail.com%3E . FTR, encoding is not being specified this way CURRENTLY because we're still trying to determine the best it should be specified as well as in what locations it actually needs to be/should be specified, if at all.
          Hide
          Hudson added a comment -

          Integrated in Accumulo-Trunk #541 (See https://builds.apache.org/job/Accumulo-Trunk/541/)
          Reverting 1403871 from ACCUMULO-836 until the discussion
          http://mail-archives.apache.org/mod_mbox/accumulo-dev/201210.mbox/%3CCAOiJXP4Nenbv7dMkzTGaOeHguPJEHV7hc2CV7BGZWUfovmaPZQ@mail.gmail.com%3E is resolved (Revision 1405202)

          Result = SUCCESS
          vines :
          Files :

          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/admin/TableOperationsImpl.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mapreduce/AccumuloOutputFormat.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mapreduce/InputFormatBase.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/partition/RangePartitioner.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mock/MockShell.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/data/ArrayByteSequence.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/file/BloomFilterLayer.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/file/rfile/RFile.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/ColumnFamilyCounter.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/LongCombiner.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/aggregation/StringMax.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/aggregation/StringMin.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/aggregation/StringSummation.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/conf/ColumnSet.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/GrepIterator.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/IntersectingIterator.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/LargeRowFilter.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/RegExFilter.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/RowDeletingIterator.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/SummingArrayCombiner.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/Authorizations.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/ColumnVisibility.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/trace/InstanceUserPassword.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/ByteArraySet.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/Encoding.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/Merge.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/MetadataTable.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/TextUtil.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/Shell.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/AddSplitsCommand.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/AuthenticateCommand.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/CreateTableCommand.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/CreateUserCommand.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/HiddenCommand.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/PasswdCommand.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/QuotedStringTokenizer.java
          • /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/UserCommand.java
          • /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/ZooStore.java
          • /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/zookeeper/DistributedReadWriteLock.java
          • /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/zookeeper/ZooReservation.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/gc/SimpleGarbageCollector.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/Master.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/recovery/SubmitFileForRecovery.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/DeadServerList.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/SetGoalState.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/TServerInstance.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/TabletStateChangeIterator.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/ZooTabletStateStore.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/tables/TableManager.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/BulkImport.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/CompactRange.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/ImportTable.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/RenameTable.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/Utils.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tserverOps/ShutdownTServer.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/metanalysis/IndexMeta.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/monitor/ZooKeeperStatus.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/monitor/servlets/DefaultServlet.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/monitor/servlets/trace/Basic.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/security/SecurityConstants.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/security/ZKAuthenticator.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/Tablet.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/TabletServer.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/UniqueNameAllocator.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/trace/TraceServer.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/AddFilesWithMissingEntries.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/Admin.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/ChangeSecret.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/CheckForMetadataProblems.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/CleanZookeeper.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/Initialize.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/LocalityCheck.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/MetadataTable.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/RandomWriter.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/RemoveEntriesForMissingFiles.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/RestoreZookeeper.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/SendLogToChainsaw.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/SystemPropUtil.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/TablePropUtil.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/VerifyTabletAssignments.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/zookeeper/ZooQueueLock.java
          • /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/zookeeper/ZooReaderWriter.java
          Show
          Hudson added a comment - Integrated in Accumulo-Trunk #541 (See https://builds.apache.org/job/Accumulo-Trunk/541/ ) Reverting 1403871 from ACCUMULO-836 until the discussion http://mail-archives.apache.org/mod_mbox/accumulo-dev/201210.mbox/%3CCAOiJXP4Nenbv7dMkzTGaOeHguPJEHV7hc2CV7BGZWUfovmaPZQ@mail.gmail.com%3E is resolved (Revision 1405202) Result = SUCCESS vines : Files : /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/admin/TableOperationsImpl.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mapreduce/AccumuloOutputFormat.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mapreduce/InputFormatBase.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mapreduce/lib/partition/RangePartitioner.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/client/mock/MockShell.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/data/ArrayByteSequence.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/file/BloomFilterLayer.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/file/rfile/RFile.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/ColumnFamilyCounter.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/LongCombiner.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/aggregation/StringMax.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/aggregation/StringMin.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/aggregation/StringSummation.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/conf/ColumnSet.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/GrepIterator.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/IntersectingIterator.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/LargeRowFilter.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/RegExFilter.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/RowDeletingIterator.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/iterators/user/SummingArrayCombiner.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/Authorizations.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/security/ColumnVisibility.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/trace/InstanceUserPassword.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/ByteArraySet.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/Encoding.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/Merge.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/MetadataTable.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/TextUtil.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/Shell.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/AddSplitsCommand.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/AuthenticateCommand.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/CreateTableCommand.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/CreateUserCommand.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/HiddenCommand.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/PasswdCommand.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/QuotedStringTokenizer.java /accumulo/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/UserCommand.java /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/ZooStore.java /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/zookeeper/DistributedReadWriteLock.java /accumulo/trunk/fate/src/main/java/org/apache/accumulo/fate/zookeeper/ZooReservation.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/gc/SimpleGarbageCollector.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/Master.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/recovery/SubmitFileForRecovery.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/DeadServerList.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/SetGoalState.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/TServerInstance.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/TabletStateChangeIterator.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/ZooTabletStateStore.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/state/tables/TableManager.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/BulkImport.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/CompactRange.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/ImportTable.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/RenameTable.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tableOps/Utils.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/master/tserverOps/ShutdownTServer.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/metanalysis/IndexMeta.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/monitor/ZooKeeperStatus.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/monitor/servlets/DefaultServlet.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/monitor/servlets/trace/Basic.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/security/SecurityConstants.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/security/ZKAuthenticator.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/Tablet.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/TabletServer.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/tabletserver/UniqueNameAllocator.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/trace/TraceServer.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/AddFilesWithMissingEntries.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/Admin.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/ChangeSecret.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/CheckForMetadataProblems.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/CleanZookeeper.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/Initialize.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/LocalityCheck.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/MetadataTable.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/RandomWriter.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/RemoveEntriesForMissingFiles.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/RestoreZookeeper.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/SendLogToChainsaw.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/SystemPropUtil.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/TablePropUtil.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/util/VerifyTabletAssignments.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/zookeeper/ZooQueueLock.java /accumulo/trunk/server/src/main/java/org/apache/accumulo/server/zookeeper/ZooReaderWriter.java

            People

            • Assignee:
              David Medinets
              Reporter:
              David Medinets
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development