Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.98.0, 0.95.2
    • Component/s: Client
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      This patch introduces an extensible data types API for HBase. It is inspired by the following systems:

       - PostgreSQL. Postgres has a user-extensible data type API, which has been used to great effect by its user community (i.e., PostGIS). The desire is for HBase to expose an equally extensible data type API. One aspect of the Postgres data type is the ability to provide equivalence functions for index operations. This appears to be of critical performance utility for its execution engine.
       - Orderly. Orderly handles the issue of compound rowkeys by providing convenience classes for handling these kinds of data types. This influence is reflected in the Struct and Union family of classes.
       - Phoenix. The PDataType enum used in Phoenix provides type hints, similar to Postgres's equivalence functions. These appear to be used during query execution for numerical type promotion.

      This patch introduces an interface, DataType, along with a number of data type implementations based on the Bytes encoding. Also included are Struct and Union types, demonstrating simple implementations of compound types. Helper classes around the Struct implementation are also provided.

      This patch does not address the type compatibility concerns expressed by Phoenix's PDataType API (i.e., isComparableTo, isCoercibleTo); these will be addressed in HBASE-8863.

      This patch also provides DataType implementations based on the OrderedBytes encoding from HBASE-8201.
    • Tags:
      0.96notable
    1. 0001-HBASE-8693-Extensible-data-types-API.patch
      50 kB
      Nick Dimiduk
    2. 0001-HBASE-8693-Extensible-data-types-API.patch
      93 kB
      Nick Dimiduk
    3. 0001-HBASE-8693-Extensible-data-types-API.patch
      127 kB
      Nick Dimiduk
    4. 0002-HBASE-8693-example-Use-DataType-API-to-build-regionN.patch
      9 kB
      Nick Dimiduk
    5. KijiFormattedEntityId.java
      2 kB
      Nick Dimiduk
    6. 0001-HBASE-8693-Extensible-data-types-API.patch
      130 kB
      Nick Dimiduk
    7. 0001-HBASE-8693-Extensible-data-types-API.patch
      137 kB
      Nick Dimiduk
    8. 0001-HBASE-8693-Extensible-data-types-API.patch
      100 kB
      Nick Dimiduk
    9. 0001-HBASE-8693-Extensible-data-types-API.patch
      128 kB
      Nick Dimiduk
    10. 0001-HBASE-8693-Extensible-data-types-API.patch
      128 kB
      Nicolas Liochon
    11. 0001-HBASE-8693-Extensible-data-types-API.patch
      127 kB
      Nick Dimiduk
    12. 0001-HBASE-8693-Extensible-data-types-API.patch
      131 kB
      Nick Dimiduk
    13. 0001-HBASE-8693-Extensible-data-types-API.patch
      136 kB
      Nick Dimiduk
    14. 0001-HBASE-8693-Extensible-data-types-API.patch
      136 kB
      Nick Dimiduk

      Issue Links

        Activity

        Nick Dimiduk added a comment -

        WIP. This patch introduces an extensible data types API for HBase. It is inspired by the following systems:

        • PostgreSQL. Postgres has a user-extensible data type API, which has been used to great effect by its user community (i.e., PostGIS). The desire is for HBase to expose an equally extensible data type API. One aspect of the Postgres data type is the ability to provide equivalence functions for index operations. This appears to be of critical performance utility for its execution engine.
        • Orderly. Orderly handles the issue of compound rowkeys by providing convenience classes for handling these kinds of data types. This influence is reflected in the Struct and Union family of classes.
        • Phoenix. The PDataType enum used in Phoenix provides type hints, similar to Postgres's equivalence functions. These appear to be used during query execution for numerical type promotion.

        This initial WIP patch is intended to take a first cut at exercising the OrderedBytes API, particularly around numerical value support. The other intention is to establish the set of data types HBase should provide out of the box. The final intention of this WIP patch is to shop around the data types and API for their definition with a wider audience. Feedback from maintainers of both Phoenix and the Pig, Hive, and Impala HBase interoperability layers is desired.

        Patch TODOs include:

        • proper implementation of isComparableTo and isCoercibleTo
        • test coverage
        • javadocs

        Nick Dimiduk added a comment -

        on rb: https://reviews.apache.org/r/12069/ (cc Doug Meil, James Taylor, Elliott Clark, Owen O'Malley, Ashutosh Chauhan, Alan Gates)

        Nick Dimiduk added a comment -

        A note regarding variable-length encodings. Variable vs fixed-width encodings was a highlighted point during conversations around HBASE-7221 and HBASE-7692. These data type implementations make exclusive use of the OrderedBytes encodings. That is because my thinking around them thus far is focused on use as rowkeys and column qualifiers. However, this requirement isn't strictly necessary for use in values. I noticed a rough analogy in Postgres's data type implementation: the distinction between the encoding used to store data and the encoding used for an index entry.

        Do you think we should have a similar kind of dichotomy for encoding into order-preserving context vs non-order-preserving context? My initial thinking is probably not (due to additional API surface area), but I want to have the conversation.

        Nicolas Liochon added a comment -

        You kept the java name for all types, except BigDecimal, is there a reason?
        Some unit tests could help to see how it should be used. For example, the constructors are private, I was wondering if one would not want to create these objects from core java classes (i.e.: create a hbase.Double from a java double).

        Nick Dimiduk added a comment -

        You kept the java name for all types, except BigDecimal, is there a reason?

        The data types come from the (outdated) spec posted on HBASE-8089. I believe there is value in choosing types that are meaningful in a SQL context, but we shouldn't limit our thinking on this to what was laid down 30 years ago.

        Some unit tests could help to see how it should be used.

        Agreed. Look for those in a followup patch.

        For example, the constructors are private...

        The idea is that instances of HDataType are type definitions, not data values. For instance, the o.a.h.h.t.Decimal#DECIMAL instance is the definition of how to encode/decode values and how values of this type relate to values of other types. It does not represent a numeric value.
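
        As a minimal sketch of that pattern (hypothetical names and signatures, not the API from the attached patch):

        import java.nio.ByteBuffer;

        // Hypothetical stand-in for the HDataType interface.
        interface TypeDefinition<T> {
          void write(ByteBuffer buf, T value);
          T read(ByteBuffer buf);
        }

        final class DoubleType implements TypeDefinition<Double> {
          // The lone instance is the type definition; values stay java.lang.Double.
          public static final DoubleType DOUBLE = new DoubleType();

          private DoubleType() {}  // private constructor: no per-value instances

          @Override
          public void write(ByteBuffer buf, Double value) {
            buf.putDouble(value);  // stand-in for the real codec
          }

          @Override
          public Double read(ByteBuffer buf) {
            return buf.getDouble();
          }
        }

        Client code would call DoubleType.DOUBLE.write(buf, 3.14d) rather than ever constructing a DoubleType around a value; there is no hbase.Double value object at all.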

        stack added a comment -

        Should this work be in hbase-common rather than in hbase-client? They are a client facility at first but one day they might go server-side. Also, it's easier adding to hbase-common than to hbase-client. Unless they have dependencies?

        What is Order here?

        + /**
        + * Write instance <code>v</code> into buffer <code>b</code>.
        + */
        + public abstract void write(ByteBuffer b, T v, Order ord);

        When would I use isCoercibleTo?

        I see a read on Union4 but not a write. That intentional? The union3 will take care of it? Ditto union3...

        How I describe a Struct outside of a Struct (JSON to describe how to make one?)

        What's a Binary?

        Agree that example usage would help.

        Do we need all these types?

        Good stuff N.

        I think you should post today's slides here too; they are good on the high-level.

        Nick Dimiduk added a comment -

        Should this work be in hbase-common rather than in hbase-client?

        Initial conversations required the type stuff not be in common. I agree, it makes more sense there and I think that community opinion is changing. The current implementation doesn't bring in any dependencies, so it should be painless.

        What is Order here?

        Order is a component from the OrderedBytes implementation (see patch on HBASE-8201). It enables users to store data sorted in ascending or descending order. Right now it's mostly a vestigial appendage; I don't know how the data types API wants to expose and consume this functionality. I'm hoping to gain insight from Phoenix, Kiji, &c in future reviews.
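
        For intuition, one common trick for supporting both directions is to complement the encoded bytes for descending, so that HBase's unsigned-lexicographic byte comparison yields the reverse order. A sketch (hypothetical; not the HBASE-8201 Order class itself):

        enum SortOrder {
          ASCENDING, DESCENDING;

          byte[] apply(byte[] encoded) {
            if (this == ASCENDING) return encoded;
            byte[] out = new byte[encoded.length];
            for (int i = 0; i < encoded.length; i++) {
              out[i] = (byte) (encoded[i] ^ 0xff);  // flip every bit
            }
            return out;
          }
        }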

        When would I use isCoercibleTo?

        This comes from examination of Phoenix's PDataType. My understanding is, in the absence of secondary indices, the query planner can use type coercion to its advantage. This is the part of the data type API that I understand the least. I'm hoping for more clarity from James Taylor.

        I see a read on Union4

        Sounds like a bug to me.

        How I describe a Struct outside of a Struct..?

        Examples to follow.

        What's a Binary?

        Equivalent to SQL BLOB. This is how a user can inject good old-fashioned byte[]s into a Struct or Union.

        Do we need all these types?

        Great question. That conversation is happening up on HBASE-8089. My preference is no, but I think the SQL guys want more of these for better interoperability between them.

        Nick Dimiduk added a comment -

        This patch extends the list of provided types to include "legacy" types, intended to ease user transition of existing applications. It also extends the test coverage and addresses some of stack's comments on the earlier draft. Also of interest primarily to James Taylor, it defers the isComparableTo/isCoercibleTo API conversation to HBASE-8863.

        Nick Dimiduk added a comment -

        Marking as patch available so that it's clear the patch is ready for a proper review. Will fail BuildBot because its dependency isn't committed yet.

        Nick Dimiduk added a comment -

        RB is down at the moment. I have some incremental work on GitHub, including an example of using HDataType in HRegionInfo.

        I'm thinking I want to rename HDataType#{read,write} to HDataType#{decode,encode}. Thoughts?

        Nick Dimiduk added a comment -

        This patch introduces the isSkippable() API to HDataType. It provides a StructBuilder helper for dealing with anonymous Structs. It adds Wrappers for making an existing type Terminated or of FixedLength. It also corrects some copy-paste errors in initial implementations of some of the Legacy types.

        Nick Dimiduk added a comment -

        Per Enis Soztutar's comment, this is an example of using the HDataType classes to manage HRegionInfo's regionName. It could probably be taken a step further to manage the MD5Hash as a LegacyBytesFixedLength subclass. That would allow the total size of the allocated buffer to be calculated with HREGIONINFO_NEW_CODEC#encodedLength() instead of by hand.

        Nick Dimiduk added a comment -

        Moving Matteo Bertozzi's comment from the dev list back to JIRA:

        I was looking at the HBASE-8693 patch, and looks good to me for the primitive types.

        Thanks, and I'm glad to hear it. Any comments about redundant or missing types a user would expect out of the box?

        but I can't see how do you plan to evolve stuff like the struct.

        Struct is a programmatic data structure, not a tool for schema management. It has no concept of "upgrade Struct Foo, version 1 to Foo version 2 by adding a new field in the middle here and changing the last one from X to Y." It's a convenience for manipulating complex byte[] structures. Schema management may become of concern for HBase, but that's out of scope. Any chance this topic came up at yesterday's meetup?

        By "evolve" I mean add/remove fields, or just query it with a subset of fields. the fields don't have an id, and on read you must specify all of them in the same order as you've used for write. (but maybe is just an immutable/fixed list of fields, and I'm ok with just adding that info to the comment on top of the class)

        I discovered this missing API while working through the example use patch, above. The update I posted on RB yesterday adds an API for accessing a specific struct member by position. If RB links work, take a look at Struct#read(ByteBuffer, int).
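
        A rough sketch of what positional access involves (illustrative names, not the RB patch): decode and discard the members before the requested index.

        import java.nio.ByteBuffer;
        import java.util.List;
        import java.util.function.Function;

        // Illustrative only: each struct member is modeled as a decoding function.
        final class StructSketch {
          private final List<Function<ByteBuffer, Object>> fields;

          StructSketch(List<Function<ByteBuffer, Object>> fields) {
            this.fields = fields;
          }

          // Decode (and discard) the members before index; return the one at index.
          Object read(ByteBuffer buf, int index) {
            Object value = null;
            for (int i = 0; i <= index; i++) {
              value = fields.get(i).apply(buf);
            }
            return value;
          }
        }

        For example, new StructSketch(List.of(ByteBuffer::getLong, ByteBuffer::getDouble)).read(buf, 1) decodes past the long and returns the double. A real implementation could skip members without materializing them whenever the member type is skippable.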

        Matteo Bertozzi added a comment -

        Struct is a programmatic data structure, not a tool for schema management. It has no concept of "upgrade Struct Foo, version 1 to Foo version 2 by adding a new field in the middle here and changing the last one from X to Y." It's a convenience for manipulating complex byte[] structures. Schema management may become of concern for HBase, but that's out of scope.

        Ok, it makes sense with this limited scope (no schema) to have a fixed list of fields.

        My main concern is: I start use 96 with this struct encoding... is fixed so I can't add fields.. so I work around it adding a version number in front of the struct and then I do the switch for v1, v2, v3 with all the fixed struct that I know...

        ...later I switch to a future release that have the code for table schema that "half" relies on this patch. How can I map my data? since I've done some tricks for my versioning I probably can't do anything... and I must rewrite everything..

        as you said, data evolution is out of the scope. so if you consider this patch just as a "smarter" alternative to the Bytes encoding. feel free to ignore my comments since this stuff already looks good to me as it is.

        Nick Dimiduk added a comment -

        Ok, it makes sense with this limited scope (no schema) to have a fixed list of fields.

        Right. In this implementation Struct is a simple concatenation of fields. No schema information is written into that concatenation because to do so would mess with sort order. Struct is merely API convenience. Now, the field encodings implemented in OrderedBytes include a header byte which is currently used to identify the type of encoded field that follows. The full space of 256 available bit patterns in that header byte is not consumed by the current implementation. I've been thinking about extending that header byte to include some version bits at the very beginning. That would enable evolution of the individual field encodings (say, if you later want to re-implement blob-mid). This doesn't address the user-level logical structure of a Struct data type, only evolution of the OrderedBytes codec.
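
        Purely as an illustration of the bit-budget idea (nothing like this is in the attached patches; the numbers are assumptions):

        // Hypothetical header-byte layout: 2 version bits + 6 bits of type tag.
        final class HeaderByte {
          static final int VERSION_BITS = 2;
          static final int VERSION_SHIFT = 8 - VERSION_BITS;
          static final int TYPE_MASK = (1 << VERSION_SHIFT) - 1;

          static byte pack(int version, int typeTag) {
            return (byte) ((version << VERSION_SHIFT) | (typeTag & TYPE_MASK));
          }

          static int version(byte header) {
            return (header & 0xff) >>> VERSION_SHIFT;
          }

          static int typeTag(byte header) {
            return header & TYPE_MASK;
          }
        }

        One wrinkle with leading version bits: they participate in the byte sort, so fields stamped with different codec versions would no longer interleave. That's part of why this needs more thought.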

        My main concern is: I start use 96 with this struct encoding... is fixed so I can't add fields.. so I work around it adding a version number in front of the struct and then I do the switch for v1, v2, v3 with all the fixed struct that I know...

        Prepending a version number to the Struct's members will impact sort order. Struct definition is fixed in that you can't prepend or interpose a new field in the middle of an existing encoded value. You're free to append fields. Appending a field would look like the following:

        1. application defines Struct v0 with members [A,B,C]
        2. application writes lots of data
        3. application changes, Struct v1 becomes [A,B,C,D,E]
        4. application writes lots more data

        At step 3, the application now needs to become version-aware. Because the fields of v0 are a prefix of v1's, the application can use the definition of Struct v1 with the following safeguards: (1) anywhere v0 was used, it now needs to check for end-of-buffer and skip over the two new elements; (2) anywhere v1 is used, be mindful of truncated records and be prepared to receive only the v0 fields. Maybe the API defined around Struct can be improved to support these needs?
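
        The end-of-buffer safeguard might look like the following sketch (hypothetical API; members modeled as decoding functions):

        import java.nio.ByteBuffer;
        import java.util.ArrayList;
        import java.util.List;
        import java.util.function.Function;

        final class AppendOnlyRead {
          // Read as many of the v1 members [A,B,C,D,E] as the buffer holds;
          // a record written by a v0 application simply stops after C.
          static List<Object> read(List<Function<ByteBuffer, Object>> v1Fields,
              ByteBuffer buf) {
            List<Object> values = new ArrayList<>();
            for (Function<ByteBuffer, Object> field : v1Fields) {
              if (!buf.hasRemaining()) {
                break;  // end-of-buffer: a truncated (v0) record
              }
              values.add(field.apply(buf));
            }
            return values;  // three values for v0 rows, five for v1 rows
          }
        }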

        Records of v0 and v1 can be intermixed, i.e., as rowkeys in the same table. According to the documented sort semantics, they'll sort "left-to-right and depth-first". Meaning, they'll sort first according to v0 values and then, within that group, by v1 values.

        We leave all of this up to user applications today, so this change management isn't mitigated. Changing a compound rowkey today requires rewriting data (or duplication into a new table). A smarter struct encoding, one that's able to preserve the sorted semantics I've described but that can also track more sophisticated schema change would be very useful indeed – I don't think it exists.

        Prepending a version field to a Struct will change the sorting behavior; v0 will sort before v1, &c. IMHO, this is a less flexible migration strategy than the append behavior described above. It's also perfectly valid, and the user of the Struct API is free to do so in their own application. In that case, the application is still version-aware. Instead of being cautious about consuming potentially truncated records, it's executing a scan for each version.

        as you said, data evolution is out of the scope. so if you consider this patch just as a "smarter" alternative to the Bytes encoding.

        HBASE-8201 is a smarter alternative to Bytes and this ticket adds some higher-level APIs for manipulating them. In short, yes, schema definition and evolution is out of scope.

        Matteo Bertozzi added a comment -

        above you talk about the sort order, and I guess just about the key.
        but when I talk about data or schema I refer to the cell value, not the key.
        For the key I think that the fixed or append-only, as you pointed out is good enough.

        again, maybe I'm out of scope.. but do you see those classes used only to encode the key? e.g. the struct mention explicitly the key in the comment. I probably see this as more generic key/value serialization, knowing about the future direction with the schema.

        Nick Dimiduk added a comment -

        To be fair, sort order also is of concern in column names. My choice of the word "schema" was unfortunate in my previous comment. I should have said "no composite structure is written into the concatenation." Because HBase's only native data type is byte[], encodings are necessary for any application value other than byte[], wherever it hits a rowkey, qualifier, or value.

        It's quite out of scope for my purposes, but I'm curious what you think about the future direction with schema. I think the Phoenix and Kiji folk will have some good insights.

        Nick Dimiduk added a comment -

        I had a look over the kiji-schema project. They have a much more developed sense of schema than anything I've proposed, which makes me think I'm still on the right track. I haven't been through it end-to-end, but I took a stab at implementing a piece of their stack using this data types API. It looks to me like the bit they use to implement HBase rowkeys.

        I don't know how it would wire up into their Avro IDL, but it looks like they could open up the restrictions on their RowKeyFormat2.ComponentType to support any HDataType implementation if so desired. Here's the example.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12593082/KijiFormattedEntityId.java
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6404//console

        This message is automatically generated.

        Matteo Bertozzi added a comment -

        Thanks for keeping following up on my out of scope questions.

        again, I think that I'm focusing more on the cell-value side instead of the key part which will be the one that will have the benefit from the ordered byte stuff and will probably have more restriction on the evolution since this stuff is client side only and you've to deal with the raw byte sorting of hbase.

        It's quite out of scope for my purposes, but I'm curious what you think about the future direction with schema. I think the Phoenix and Kiji folk will have some good insights.

        (I'll talk only about cell-values here, so I'm not interested in the ordered stuff in this case)
        I want to write my app today with this library.
        I'll start off using a Struct, and it's ok until I have to add/remove a field.
        so.. I can add a version/schema id.. but now I have the problem that I have to keep all the schemas and then project to the schema that I want to use.

        Example:

        • get row0 -> cell with schema 1
        • get row1 -> cell with schema 2
        • get row2 -> cell with schema 3
        • Now the user/api have to handle this 3 different rows and project to a user provided schema to get out something useful to the user...

        In this case, you have to store all the schemas and you've to provide a mapping for each schema to the one that the user wants.

        The other approach, more protobuf like is each field has an id that must be unique. on read you provide your "read schema" and you load only the field present in the "read schema".
        note that this can also work just with an API similar to what you have, "getField(field_id)", where the id is the unique id and not the index.

        again, I think that your focus at the moment is more on the key side... and my guess is that the struct is fine for that.
        but this jira is "serialization primitives" without a "row-keys" in front... so I assume you plan to use this stuff also for the cell values, and from what I said above... I don't see an easy way to evolve my cell data, without rewrite every time or doing "manual" mappings for each struct version.

        Nick Dimiduk added a comment -

        again, I think that your focus at the moment is more on the key side... and my guess is that the struct is fine for that. but this jira is "serialization primitives" without a "row-keys" in front... so I assume you plan to use this stuff also for the cell values, and from what I said above... I don't see an easy way to evolve my cell data, without rewrite every time or doing "manual" mappings for each struct version.

        You're right, this implementation is too simplistic for storing complex entities in a Cell. You can do it, but you'll be a bit stuck as there's no concept of schema identification or evolution. I can see how the title can be misleading. OrderedBytes and HDataType are no replacement for application use of {protobuf,avro,thrift}, particularly in the "entity-centric modeling" approach with fat key-values.

        stack added a comment -

        IIRC, their avro idl is for all but the description of the rowkey. When they talk about rowkey 'schema', it is allowed that it cannot evolve for reasons discussed above. Adding to the right of a rowkey should be fine though. Ditto when serializing column qualifiers.

        High in this issue you raise: "Do you think we should have a similar kind of dichotomy for encoding into order-preserving context vs non-order-preserving context? My initial thinking is probably not (due to additional API surface area), but I want to have the conversation."

        You allow that there are two contexts (and indeed Matteo asks for clarification on this) – one where there is no way around it but you need to rewrite the data if you want to refer to it using a different struct/'schema'; e.g. a rowkey (caveat adding fields to the right) – and then there are the contexts where you should be able to evolve the content; e.g. cell content and even to a higher level where you might impose a schema made of multiple column content (or full row), and so on.

        This seems like a good split. In the cell context, the area where you would like to be able to evolve, sort order preservation is not required. In the simple case, an int16 type, you probably don't need versioning either? Its serialization is unlikely to change, but you might want to version even these primitive types just in case? If a compound type in a cell, you would like to be able to evolve it; to add fields, etc. So you could add a version to structs here? (but why would a user use this lib over pb in this case?) Now you bleed over into higher level issues; schema and its follow-ons, where to store it and how to evolve, etc. (Matteo's concerns).

        I suppose we are fine given you have 'schema' and 'schema evolution' as out-of-scope in your answer to Matteo. We should be clear that these problems remain as to-be-solved (or solved by others – see kiji) after this patch is done and be sure folks don't get the wrong impression. Just saying.

        On the adding fields to the right of your struct, where you have the application use the right struct version, pity your lib couldn't do that for the app. PB has a lead-off serialized length which saves it reading off the end of the record. You can't do that because you'll mess up your ordering. You can't lead the record with a version since that will also mess your sort order (as you say above). A buffer where you check available would be expensive...

        Nick Dimiduk added a comment -

        This HDataType interface and the two codecs upon which the implementations rely are not schema management for HBase. HDataType can be used to manage encoding values into rowkeys, column qualifiers, or values. Use an instance of Struct, or don't, in any of those contexts. The use of Struct in the order-sensitive context has driven more design thought, but it generates a byte[] wherever it's used. Would an example of an Avro, Thrift, or Protobuf HDataType implementation help to drive this idea home?
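
        For instance, a minimal sketch of such a wrapper (hypothetical class name; it uses the real protobuf Parser API but is not from any attached patch):

        import com.google.protobuf.InvalidProtocolBufferException;
        import com.google.protobuf.Message;
        import com.google.protobuf.Parser;
        import java.nio.ByteBuffer;

        // A value-context codec around any protobuf Message. The length prefix
        // makes it self-delimiting for cell values but would wreck rowkey sort
        // order, so it only makes sense in the non-order-preserving context.
        final class ProtobufValueCodec<M extends Message> {
          private final Parser<M> parser;

          ProtobufValueCodec(Parser<M> parser) {
            this.parser = parser;
          }

          void encode(ByteBuffer buf, M value) {
            byte[] bytes = value.toByteArray();
            buf.putInt(bytes.length);
            buf.put(bytes);
          }

          M decode(ByteBuffer buf) {
            byte[] bytes = new byte[buf.getInt()];
            buf.get(bytes);
            try {
              return parser.parseFrom(bytes);
            } catch (InvalidProtocolBufferException e) {
              throw new IllegalStateException("corrupt cell value", e);
            }
          }
        }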

        My trouble with using the word "schema" for key-values is that context is too narrow a scope. Being able to consistently read a value out of a cell does not tell me what the schema of the database is. HBase provides basic table definition management but not data definition management, the effective meaning of schema. Phoenix and Kiji both provide a layer of schema management on top of HBase. Through them you define the logical layout of data in tables, and you abandon to them how that data is physically arranged and encoded. HDataType provides an API with which its user can control how data is physically arranged and encoded. Its user is still left to manage the logical layout and its meaning to their application for themselves.

        This patch is not schema management. It provides a common set of primitives that other applications can consume – be they user applications developed directly against HBase, or Phoenix or Kiji themselves. The consumers I've had in mind have always been myself and application developers like me, Hive, Pig, and Phoenix. The primary benefit being that all those applications gain some level of interoperability through data in HBase. That I was able to read Kiji's avdl file and in an afternoon understand how HDataType could be used to make its implementation simpler and more extensible is validation of utility.

        Nick Dimiduk added a comment -

        This version of the patch rebases onto the latest patch from HBASE-8201 and exposes additional read/write methods on LegacyBytes.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12593621/0001-HBASE-8693-Extensible-data-types-API.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 28 new or modified tests.

        -1 patch. The patch command could not apply the patch.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6432//console

        This message is automatically generated.

        Nick Dimiduk added a comment -

        Canceling patch until dependency HBASE-8201 is merged.

        Nick Dimiduk added a comment -

        This patch addresses remaining reviewer comments.

        • rename HDataType -> DataType
        • rename DataType#{read,write} -> DataType#{decode,encode}
        • rename types based on OrderedBytes encoding to Ordered*
        • propagate rename of blob{mid,last} -> blob{var,copy}
        • fill in holes in the LegacyString* en/decode API

        Nick Dimiduk added a comment -

        Refactored to remove dependency on HBASE-8201. Now includes dependency on HBASE-9091. Provides DataType interface and implementations based on o.a.h.h.util.Bytes only.

        Nick Dimiduk added a comment -

        Based on new version of OrderedBytes. Does not use ByteRange or ByteBuffer, just a byte[] and offset.

        Nick Dimiduk added a comment -

        On RB.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12595534/0001-HBASE-8693-Extensible-data-types-API.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 32 new or modified tests.

        -1 hadoop1.0. The patch failed to compile against the hadoop 1.0 profile.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6566//console

        This message is automatically generated.

        Matt Corgan added a comment -

        unofficial +1

        Nicolas Liochon added a comment -

        The javadoc still references the ByteRange stuff, but it can be fixed on commit.

        +1 (if it does compile). I'm restarting a build to check. It would be great to get a +1 from Enis Soztutar, as he started the review on RB.

        Nick Dimiduk added a comment -

        The javadoc still references the ByteRange stuff

        Oh bother. I also just found one API remnant that uses ByteBuffer instead of byte[] (or, eventually, ByteRange). I'll go back through one more time and clean it up.

        Nicolas Liochon, I see you uploaded a new patch, but the diff shows no delta. Was there some other change you intended?

        Nick Dimiduk added a comment -

        Updated patch. Purges vestigial remnants of previous API variants. No javadoc warnings or errors produced locally.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12595679/0001-HBASE-8693-Extensible-data-types-API.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 32 new or modified tests.

        -1 hadoop1.0. The patch failed to compile against the hadoop 1.0 profile.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6579//console

        This message is automatically generated.

        Nicolas Liochon added a comment -

        Any objections, or is more time needed for review? If not, I will commit on Monday.

        Nick Dimiduk added a comment -

        Addresses reviewer comments from Stack and James.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12595843/0001-HBASE-8693-Extensible-data-types-API.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 32 new or modified tests.

        -1 hadoop1.0. The patch failed to compile against the hadoop 1.0 profile.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6601//console

        This message is automatically generated.

        Nick Dimiduk added a comment -

        Updated to use ByteRange.

        Nick Dimiduk added a comment -

        s/Legacy/Raw/ as per Stack's RB comment.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12596742/0001-HBASE-8693-Extensible-data-types-API.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 32 new or modified tests.

        -1 hadoop1.0. The patch failed to compile against the hadoop 1.0 profile.

        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6645//console

        This message is automatically generated.

        Nicolas Liochon added a comment -

        @stack, are you ok with this last version?

        stack added a comment -

        +1

        Needs a fat release note (the patch preamble makes for a good start)

        Nicolas Liochon added a comment -

        It's committed. I'll keep the JIRA open with blocker status until Nick updates the release notes (I know he's on vacation this weekend, but I'm not sure about next week; I will write the sum-up if necessary).

        Hudson added a comment -

        FAILURE: Integrated in HBase-TRUNK #4372 (See https://builds.apache.org/job/HBase-TRUNK/4372/)
        HBASE-8693 DataType: provide extensible type API (Nick Dimiduck) (nkeywal: rev 1512929)

        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/DataType.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/FixedLengthWrapper.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedBlob.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedBlobVar.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedBytesBase.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedFloat32.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedFloat64.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedInt32.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedInt64.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedNumeric.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedString.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawBytes.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawBytesFixedLength.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawBytesTerminated.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawDouble.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawFloat.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawInteger.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawLong.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawString.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawStringFixedLength.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawStringTerminated.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Struct.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/StructBuilder.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/StructIterator.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/TerminatedWrapper.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Union2.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Union3.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Union4.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/package-info.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestFixedLengthWrapper.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestOrderedBlob.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestOrderedBlobVar.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestOrderedString.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestRawString.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestStruct.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestTerminatedWrapper.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestUnion2.java
        Hudson added a comment -

        FAILURE: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #668 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/668/)
        HBASE-8693 DataType: provide extensible type API (Nick Dimiduck) (nkeywal: rev 1512929)

        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/DataType.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/FixedLengthWrapper.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedBlob.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedBlobVar.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedBytesBase.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedFloat32.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedFloat64.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedInt32.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedInt64.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedNumeric.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedString.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawBytes.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawBytesFixedLength.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawBytesTerminated.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawDouble.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawFloat.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawInteger.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawLong.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawString.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawStringFixedLength.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawStringTerminated.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Struct.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/StructBuilder.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/StructIterator.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/TerminatedWrapper.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Union2.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Union3.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Union4.java
        • /hbase/trunk/hbase-common/src/main/java/org/apache/hadoop/hbase/types/package-info.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestFixedLengthWrapper.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestOrderedBlob.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestOrderedBlobVar.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestOrderedString.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestRawString.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestStruct.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestTerminatedWrapper.java
        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestUnion2.java
        Hudson added a comment -

        SUCCESS: Integrated in hbase-0.95 #432 (See https://builds.apache.org/job/hbase-0.95/432/)
        HBASE-8693 DataType: provide extensible type API (Nick Dimiduck) (nkeywal: rev 1512927)

        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/DataType.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/FixedLengthWrapper.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedBlob.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedBlobVar.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedBytesBase.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedFloat32.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedFloat64.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedInt32.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedInt64.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedNumeric.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedString.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawBytes.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawBytesFixedLength.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawBytesTerminated.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawDouble.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawFloat.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawInteger.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawLong.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawString.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawStringFixedLength.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawStringTerminated.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Struct.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/StructBuilder.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/StructIterator.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/TerminatedWrapper.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Union2.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Union3.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Union4.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/package-info.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestFixedLengthWrapper.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestOrderedBlob.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestOrderedBlobVar.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestOrderedString.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestRawString.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestStruct.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestTerminatedWrapper.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestUnion2.java
        Hudson added a comment -

        FAILURE: Integrated in hbase-0.95-on-hadoop2 #235 (See https://builds.apache.org/job/hbase-0.95-on-hadoop2/235/)
        HBASE-8693 DataType: provide extensible type API (Nick Dimiduck) (nkeywal: rev 1512927)

        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/DataType.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/FixedLengthWrapper.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedBlob.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedBlobVar.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedBytesBase.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedFloat32.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedFloat64.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedInt32.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedInt64.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedNumeric.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/OrderedString.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawBytes.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawBytesFixedLength.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawBytesTerminated.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawDouble.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawFloat.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawInteger.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawLong.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawString.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawStringFixedLength.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/RawStringTerminated.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Struct.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/StructBuilder.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/StructIterator.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/TerminatedWrapper.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Union2.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Union3.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/Union4.java
        • /hbase/branches/0.95/hbase-common/src/main/java/org/apache/hadoop/hbase/types/package-info.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestFixedLengthWrapper.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestOrderedBlob.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestOrderedBlobVar.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestOrderedString.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestRawString.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestStruct.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestTerminatedWrapper.java
        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestUnion2.java
        stack added a comment -

        Nick added a release note (from the mosh pit at Outside Lands), so resolving. Nice one, Nick.

        Hudson added a comment -

        SUCCESS: Integrated in hbase-0.95 #433 (See https://builds.apache.org/job/hbase-0.95/433/)
        HBASE-8693 DataType: provide extensible type API (Nick Dimiduck) – ADD MISSING LICENSE (stack: rev 1513020)

        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestStruct.java
        Hudson added a comment -

        FAILURE: Integrated in HBase-TRUNK #4373 (See https://builds.apache.org/job/HBase-TRUNK/4373/)
        HBASE-8693 DataType: provide extensible type API (Nick Dimiduck) – ADD MISSING LICENSE (stack: rev 1513019)

        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestStruct.java
        Hudson added a comment -

        SUCCESS: Integrated in hbase-0.95-on-hadoop2 #236 (See https://builds.apache.org/job/hbase-0.95-on-hadoop2/236/)
        HBASE-8693 DataType: provide extensible type API (Nick Dimiduck) – ADD MISSING LICENSE (stack: rev 1513020)

        • /hbase/branches/0.95/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestStruct.java
        Hudson added a comment -

        FAILURE: Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #669 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/669/)
        HBASE-8693 DataType: provide extensible type API (Nick Dimiduck) – ADD MISSING LICENSE (stack: rev 1513019)

        • /hbase/trunk/hbase-common/src/test/java/org/apache/hadoop/hbase/types/TestStruct.java
        James Taylor added a comment -

        Thought of one more important thing for composite keys in the new type system. For a composite row key, Phoenix strips trailing null column values from the row key. This is important because new nullable row key columns can then be added to a schema without requiring any data upgrade for existing rows. Otherwise, adding new row key columns to the end of a schema becomes extremely cumbersome, as you'd need to delete all existing rows and add them back with a row key that includes a null value.

        Not sure how you're handling this now, but I wanted to bring it up before this gets released/frozen, as changing this later would require upgrading/changing existing data.

        Nick Dimiduk added a comment -

        Hi James Taylor,

        I'm assuming Phoenix intends to use the provided Struct and StructIterator implementations.

        In the case of reading a written value, a call to Struct#decode(PositionedByteRange) will return the decoded Object[]. If you've previously stripped off null columns, or you've extended your schema since the data was written, the resulting array will be shorter than your schema demands. In that case, it's a simple matter of checking the length of the decoded Object[] and acting accordingly.
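
        For illustration, here is a minimal sketch of that read-side check under schema evolution. The schema, the values, the ASCENDING singletons, and the SimplePositionedByteRange name are my assumptions, not taken from the patch:

        import java.util.Arrays;
        import org.apache.hadoop.hbase.types.*;
        import org.apache.hadoop.hbase.util.PositionedByteRange;
        import org.apache.hadoop.hbase.util.SimplePositionedByteRange;

        public class SchemaEvolutionSketch {
          public static void main(String[] args) {
            // v1 schema: (id, name). Keys written under v1 have no third member.
            Struct v1 = new StructBuilder()
                .add(OrderedInt64.ASCENDING)    // assumption: ASCENDING singletons on the Ordered* types
                .add(OrderedString.ASCENDING)
                .toStruct();
            PositionedByteRange buf = new SimplePositionedByteRange(64);
            v1.encode(buf, new Object[] { 42L, "alice" });
            byte[] oldKey = Arrays.copyOf(buf.getBytes(), buf.getPosition());

            // v2 schema appends a nullable member.
            Struct v2 = new StructBuilder()
                .add(OrderedInt64.ASCENDING)
                .add(OrderedString.ASCENDING)
                .add(OrderedString.ASCENDING)   // new, nullable
                .toStruct();

            // Decoding a v1 key against the v2 schema yields a shorter array;
            // check the length and treat the missing tail as null.
            Object[] decoded = v2.decode(new SimplePositionedByteRange(oldKey));
            String added = decoded.length > 2 ? (String) decoded[2] : null;
            System.out.println(Arrays.toString(decoded) + " / added=" + added);
          }
        }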

        In the case of writing a value, your scenario is almost supported, except for this silly little assert. To support writing fewer members than the Struct defines, it needs to be changed to

        assert fields.length >= val.length;
        

        Does that sound about right to you?

        James Taylor added a comment -

        Yes, Phoenix would plan to use the Struct and StructIterator. Rather than the client needing to modify the iteration code everywhere, it'd be good if the StructIterator handled this out-of-the-box.

        On the write side of things, it'd be good if whatever writes a Struct stripped off trailing nulls. You can only do this once writing the key is complete, because of course you might have nulls in the middle, which is valid.

        I don't mean to push everything back to your framework, but the important thing is that the framework writes in the expected way already. If Phoenix has to specialize it, then we lose the interop piece which is what we're trying to get in the first place.
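
        Until the framework handles this itself, one conceivable client-side shim for the write path James describes would be a helper like the following; it's hypothetical, not part of the patch:

        import java.util.Arrays;

        public final class StructWriteShim {
          // Drop trailing nulls from a tuple before encoding it with a Struct,
          // so that appending nullable members to a schema stays cheap.
          public static Object[] stripTrailingNulls(Object[] vals) {
            int end = vals.length;
            while (end > 0 && vals[end - 1] == null) {
              end--;
            }
            return Arrays.copyOf(vals, end);
          }
        }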

        Nick Dimiduk added a comment -

        I don't mean to push everything back to your framework, but...

        I generally agree with you.

        Since 0.95.2RC0 has received the necessary votes, this ticket won't see an addendum. Let's move this conversation into a new issue. I've created HBASE-9283 just for you, James Taylor.


          People

          • Assignee: Nick Dimiduk
          • Reporter: Nick Dimiduk
          • Votes: 0
          • Watchers: 14
