Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.90.4
    • Component/s: io, regionserver, util
    • Labels:
    • Hadoop Flags:
      Reviewed
    • Release Note:
      Implements a pretty printer for HLogs. It can be accessed through the Java API or through a command-line interface, which replaces the HLog --dump functionality. Output is either a human-readable pretty format or a JSON list for easy parsing in diagnostic scripts. Entries can also be filtered by region, row, and/or sequence id. See the command-line usage (HLog --dump -h) and the javadocs for more detail on these features.

      Description

      We currently have a rudimentary way to print HLog data, but it is limited and prints key-only information. We need to extend this functionality, similar to how we developed HFile's pretty printer. Ideas for functionality:

      • filter by sequence_id
      • filter by row / region
      • option to print values in addition to key info
      • option to print output in JSON format (so scripts can easily parse for analysis)
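
The filtering ideas above can be sketched against already-parsed entries. This is only an illustration in Python, not the Java implementation: the entry layout (region, sequence, actions, row) mirrors the JSON sample posted in the comments of this issue, while the function name filter_entries, its parameters, and the sample data are hypothetical.

```python
def filter_entries(entries, sequence=None, region=None, row=None):
    """Return the entries that match every filter that is not None.

    Hypothetical helper: the entry layout follows the JSON sample in this
    issue, not a documented schema.
    """
    kept = []
    for entry in entries:
        if sequence is not None and entry["sequence"] != sequence:
            continue
        if region is not None and entry["region"] != region:
            continue
        actions = entry["actions"]
        if row is not None:
            # Keep only the actions that touch the requested row.
            actions = [a for a in actions if a["row"] == row]
            if not actions:
                continue
        kept.append({**entry, "actions": actions})
    return kept


# Made-up sample data, for illustration only.
entries = [
    {"region": "r1", "sequence": 33, "table": "people",
     "actions": [{"row": "rpatt", "family": "name", "qualifier": "first", "value": "Riley"}]},
    {"region": "r1", "sequence": 34, "table": "people",
     "actions": [{"row": "other", "family": "name", "qualifier": "last", "value": "X"}]},
]

print([e["sequence"] for e in filter_entries(entries, row="rpatt")])
```

Filters left as None are ignored, so the same function covers filtering by any combination of sequence id, region, and row.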

          Activity

          Lars Hofhansl added a comment -

          Oops... I just removed the patch attachment (wrong issue, sorry!)
          I didn't download the attachment first. Please re-attach. Again, so sorry about this!

          Hudson added a comment -

          Integrated in HBase-TRUNK #2001 (See https://builds.apache.org/job/HBase-TRUNK/2001/)
          HBASE-3968 HLog Pretty Printer
          HBASE-3968 HLog Pretty Printer – REDO
          HBASE-3968 HLog Pretty Printer – REVERT: I committed too much
          HBASE-3968 HLog Pretty Printer

          stack :
          Files :

          • /hbase/trunk/CHANGES.txt

          stack :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogPrettyPrinter.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
          • /hbase/trunk/CHANGES.txt
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java

          stack :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogPrettyPrinter.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
          • /hbase/trunk/CHANGES.txt
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java

          stack :
          Files :

          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogPrettyPrinter.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
          • /hbase/trunk/CHANGES.txt
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogKey.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
          • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java
          stack added a comment -

          Committed to branch and trunk. Thank you for the patch Riley.

          Riley Patterson added a comment -

          Sounds good. A label like " Transaction:" or something might be appropriate to separate multiple actions that occur in the same sequence though.

          stack added a comment -

          Good. Let me commit. I can fix the license. Will just remove 'Put action' from the output for now. That OK?

          Riley Patterson added a comment -

          Note that the post killed the indentations in the first example, but they are actually there in the output.

          Riley Patterson added a comment -

          toStringBinary produces \x(hex value) for non-ASCII characters, and the JSONObject further escapes this: "\" -> "\\". This is the desired behavior because when a JSON parser parses the string, it will decode this back to one backslash and the string can be parsed from there. It looks like the JSON standard doesn't support the \x notation, so this seemed like the best approach.
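
The round trip described above can be illustrated with a small Python sketch. It is not the patch's code: to_string_binary is a rough, hypothetical stand-in for HBase's Bytes.toStringBinary (here it also escapes the backslash byte so that decoding is unambiguous, which is a simplifying assumption), and from_string_binary is an illustrative decoder.

```python
import json
import re


def to_string_binary(data: bytes) -> str:
    # Hypothetical stand-in: printable ASCII (except backslash) is kept,
    # every other byte becomes a backslash, 'x', and two hex digits.
    out = []
    for b in data:
        if 0x20 <= b <= 0x7E and b != 0x5C:
            out.append(chr(b))
        else:
            out.append("\\x%02X" % b)
    return "".join(out)


def from_string_binary(text: str) -> bytes:
    # Turn each escaped hex pair back into one byte; Latin-1 maps
    # code points 0-255 directly onto byte values.
    decoded = re.sub(r"\\x([0-9A-Fa-f]{2})",
                     lambda m: chr(int(m.group(1), 16)), text)
    return decoded.encode("latin-1")


raw = b"row\x00\xff"
escaped = to_string_binary(raw)   # single backslashes in the escaped text
as_json = json.dumps(escaped)     # the JSON encoder doubles each backslash
parsed = json.loads(as_json)      # the JSON parser restores single backslashes

print(escaped)
print(as_json)
print(from_string_binary(parsed) == raw)
```

The JSON layer only ever sees ordinary backslash characters, so any standard JSON parser can read the output; interpreting the \x sequences is left to the consuming script.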

          I suppose the "Put" part of that line is not strictly necessary, but yes, I did mean that the following information represents a single put transaction.

          I apologize about the license; I'll make sure to add that next time.

          It does, in fact, work! Here are two examples, one with pretty printing and the other with JSON output:

          > HLog --dump logfilepath -p -w rpatt
          Sequence 33 from region people,,1308955768047.c2bce8f6ab585127f13f37c59d23fe26. in table people
          Put action:
          row: rpatt
          column: name:first
          at time: Mon Jun 27 11:06:46 PDT 2011
          value: Riley
          Sequence 34 from region people,,1308955768047.c2bce8f6ab585127f13f37c59d23fe26. in table people
          Put action:
          row: rpatt
          column: name:last
          at time: Mon Jun 27 11:06:56 PDT 2011
          value: Patterson

          > HLog --dump logfilepath -p -j -w rpatt
          [{"region":"people,,1308955768047.c2bce8f6ab585127f13f37c59d23fe26.","sequence":33,"table":"people","actions":[
          {"timestamp":1309198006569,"family":"name","qualifier":"first","row":"rpatt","value":"Riley"}
          ]},{"region":"people,,1308955768047.c2bce8f6ab585127f13f37c59d23fe26.","sequence":34,"table":"people","actions":[
          {"timestamp":1309198016702,"family":"name","qualifier":"last","row":"rpatt","value":"Patterson"}
          ]}]

          Thanks for your feedback!
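
As an illustration of how a diagnostic script might consume the JSON output, here is a minimal Python sketch that parses the sample above. The field names come from that sample; the function name summarize and the choice to treat timestamps as milliseconds since the epoch (which agrees with the times in the pretty-printed example) are assumptions.

```python
import json
from datetime import datetime, timezone

# The JSON sample from this comment, embedded verbatim.
dump = """[{"region":"people,,1308955768047.c2bce8f6ab585127f13f37c59d23fe26.","sequence":33,"table":"people","actions":[
{"timestamp":1309198006569,"family":"name","qualifier":"first","row":"rpatt","value":"Riley"}
]},{"region":"people,,1308955768047.c2bce8f6ab585127f13f37c59d23fe26.","sequence":34,"table":"people","actions":[
{"timestamp":1309198016702,"family":"name","qualifier":"last","row":"rpatt","value":"Patterson"}
]}]"""


def summarize(text):
    """Yield (sequence, row, column, value, utc_time) for each action."""
    for entry in json.loads(text):
        for action in entry["actions"]:
            column = action["family"] + ":" + action["qualifier"]
            # Assumption: timestamps are milliseconds since the Unix epoch.
            when = datetime.fromtimestamp(action["timestamp"] / 1000,
                                          tz=timezone.utc)
            yield entry["sequence"], action["row"], column, action["value"], when


rows = list(summarize(dump))
for row in rows:
    print(row)
```

In practice the text would come from the command's standard output rather than an embedded string.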

          stack added a comment -

          Patch looks great. A few questions:

          You do toStringBinary. Is this going to be OK in the middle of JSON output? Does toStringBinary do sufficient escaping that the emitted JSON can still be parsed, or does JSONObject do the necessary escaping?

          What do you mean by Put action here:

          '+ out.println(" Put action:");'

          That everything is a 'put'?

          You are missing a license on HLogPrettyPrinter.java (but that's minor, I can fix on commit).

          It looks great.

          Does it work? Do you have examples you could paste into the issue to demo that it's actually working?

          Good on you Riley.

          Todd Lipcon added a comment -

          seems reasonable

          Nicolas Spiegelberg added a comment -

          @Todd: our internal branches do not have Avro support, so that's a little overkill for us but might be good for a subsequent task. The thought was that JSON encoding will allow us to easily feed the data into a Python/Ruby script to do any in-depth analysis that shell scripting falls short on.

          Todd Lipcon added a comment -

          One possible route would be to write a converter to Avro. Then, you'd have all of the existing Avro tooling for doing things like pretty-print, structured grep, etc, plus the advantage that you could easily write mapreduce/hive queries over the converted logs.

          Nicolas Spiegelberg added a comment -

          We need a basic JSON encoder for this task. Noticed that this has been a pain point in the past. We'll see if we can use Jersey for this task and tread lightly if not.

          Nicolas Spiegelberg added a comment -

          Riley, one of our summer interns, will be working on this task.


            People

            • Assignee:
              Riley Patterson
              Reporter:
              Nicolas Spiegelberg