HIVE-5871: Use multiple-characters as field delimiter

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.12.0
    • Fix Version/s: 0.14.0
    • Component/s: Contrib
    • Labels:

      Description

      By default, Hive only allows users to use a single character as the field delimiter. Although there's RegexSerDe to specify a multiple-character delimiter, it can be daunting to use, especially for amateurs.
      The patch adds a new SerDe named MultiDelimitSerDe. With MultiDelimitSerDe, users can specify a multiple-character field delimiter when creating tables, in a way most similar to typical table creation. For example:

      create table test (id string,hivearray array<binary>,hivemap map<string,int>) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ("field.delim"="[,]","collection.delim"=":","mapkey.delim"="@");
      

      where field.delim is the field delimiter, and collection.delim and mapkey.delim are the delimiters for collection items and key-value pairs, respectively. Among these delimiters, field.delim is mandatory and can be of multiple characters, while collection.delim and mapkey.delim are optional and support only a single character.

      To use MultiDelimitSerDe, you have to add the hive-contrib jar to the class path, e.g. with the add jar command.

      1. HIVE-5871.patch
        16 kB
        Rui Li
      2. HIVE-5871.6.patch
        17 kB
        Rui Li
      3. HIVE-5871.5.patch
        17 kB
        Rui Li
      4. HIVE-5871.4.patch
        16 kB
        Rui Li
      5. HIVE-5871.3.patch
        16 kB
        Rui Li
      6. HIVE-5871.2.patch
        16 kB
        Rui Li


          Activity

          Rui Li added a comment - - edited

          This implementation mainly relies on LazySimpleSerDe for serialization and deserialization. I added some methods to LazyStruct to parse a row delimited by a multiple-character string. Another difference from LazySimpleSerDe is that MultiDelimitSerDe doesn't use Base64 to encode binary fields in serialization, because the encoded string may interfere with the delimiter. I also modified LazyBinary so that when it deserializes a binary field and is unable to Base64-decode the field, it just keeps the data unchanged. A simple use case is as follows:

          create table test (id string,hivearray array<binary>,hivemap map<string,int>) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe' WITH SERDEPROPERTIES ("field.delimited"="[,]","collection.delimited"=":","mapkey.delimited"="@");

          where field.delimited is the multiple-char field delimiter, collection.delimited is the delimiter for collection items, and mapkey.delimited is the delimiter for keys and values in maps. We currently don't support multiple-char for these two delimiters.

          <Edited 10/Sep/14 on behalf of Rui Li> This comment's example differs from the final version of the patch. See the description above for an accurate example, and note that the SERDEPROPERTIES are *.delim rather than *.delimited.
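The row-parsing idea this comment describes can be sketched in Python (an illustration only; the actual change lives in LazyStruct/LazySimpleSerDe in Java, and the delimiter values here are the ones from the example table):

```python
def parse_row(row, field_delim="[,]", collection_delim=":", mapkey_delim="@"):
    """Split a text row on a multi-character field delimiter, then split
    collection and map fields on their single-character delimiters."""
    fields = row.split(field_delim)          # multi-char split, e.g. on "[,]"
    array_field = fields[1].split(collection_delim)
    map_field = dict(kv.split(mapkey_delim)
                     for kv in fields[2].split(collection_delim))
    return fields[0], array_field, map_field

# A row for the example table: id, array<binary>, map<string,int>
print(parse_row("1[,]a:b[,]k1@1:k2@2"))
# -> ('1', ['a', 'b'], {'k1': '1', 'k2': '2'})
```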

          Rui Li added a comment -

          Fix previous implementation.

          Brock Noland added a comment -

          Rui Li please excuse me for my ignorance. Does this same problem extend to using formats such as Parquet? Do users only use text because they want to use this serde?

          Rui Li added a comment -

          Hi Brock Noland, when users initially required this feature, they were only using text, so I haven't tested it with other formats. I can do the test if you think the implementation makes sense.

          Brock Noland added a comment -

          Thank you! I think any changes to Parquet and other formats would be a completely separate change. However, I think it would be interesting to know if Parquet worked for users of GBK.

          Can you add a RB item for this?

          Rui Li added a comment -

          Update the patch for the latest code base.

          Rui Li added a comment -

          Brock Noland I updated the patch because the old one won't work with the latest Hive code.
          Also created a request on RB and linked it to this JIRA.

          Brock Noland added a comment -

          I think any changes to Parquet and other formats would be a completely separate change. However, I think it would be interesting to know if Parquet worked for users of GBK.

          Actually I think this is more relevant to HIVE-7142...

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12660075/HIVE-5871.3.patch

          ERROR: -1 due to 5 failed/errored test(s), 5868 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
          org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode
          org.apache.hadoop.hive.serde2.lazy.TestLazyPrimitive.testLazyBinary
          org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/195/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/195/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-195/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 5 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12660075

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12660319/HIVE-5871.4.patch

          ERROR: -1 due to 5 failed/errored test(s), 5885 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
          org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
          org.apache.hadoop.hive.ql.TestDDLWithRemoteMetastoreSecondNamenode.testCreateTableWithIndexAndPartitionsNonDefaultNameNode
          org.apache.hive.hcatalog.pig.TestOrcHCatLoader.testReadDataPrimitiveTypes
          org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/212/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/212/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-212/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 5 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12660319

          Brock Noland added a comment -

          +1

          Brock Noland added a comment -

          Rui Li I know I +1'ed the change but I think we should change the properties field.delimited, collection.delimited, mapkey.delimited, etc. to the constants which are available in serdeConstants.

          Rui Li added a comment -

          Hi Brock Noland I updated the patch accordingly.

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12661385/HIVE-5871.5.patch

          ERROR: -1 due to 2 failed/errored test(s), 5875 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_orc_create
          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/289/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/289/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-289/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 2 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12661385

          Rui Li added a comment -

          Leaving the fix for the typo in serdeConstants to HIVE-6404.

          Hive QA added a comment -

          Overall: -1 at least one tests failed

          Here are the results of testing the latest attachment:
          https://issues.apache.org/jira/secure/attachment/12661600/HIVE-5871.6.patch

          ERROR: -1 due to 3 failed/errored test(s), 5894 tests executed
          Failed tests:

          org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynpart_sort_opt_vectorization
          org.apache.hive.jdbc.TestJdbcDriver2.testDatabaseMetaData
          org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
          

          Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/306/testReport
          Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/306/console
          Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-306/

          Messages:

          Executing org.apache.hive.ptest.execution.PrepPhase
          Executing org.apache.hive.ptest.execution.ExecutionPhase
          Executing org.apache.hive.ptest.execution.ReportingPhase
          Tests exited with: TestsFailedException: 3 tests failed
          

          This message is automatically generated.

          ATTACHMENT ID: 12661600

          Brock Noland added a comment -

          One quick question, I noticed the following change to TestLazyPrimitive:

          diff --git serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java
          index 7cd1805..3d7f11e 100644
          --- serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java
          +++ serde/src/test/org/apache/hadoop/hive/serde2/lazy/TestLazyPrimitive.java
          @@ -388,7 +388,7 @@ public void testLazyBinary() {
               initLazyObject(ba, new byte[] {'2', '?', '3'}, 0, 3);
               assertEquals(new BytesWritable(new byte[] {'2', '?', '3'}), ba.getWritableObject());
               initLazyObject(ba, new byte[] {'\n'}, 0, 1);
          -    assertEquals(new BytesWritable(new byte[] {}), ba.getWritableObject());
          +    assertEquals(new BytesWritable(new byte[] {'\n'}), ba.getWritableObject());
             }
          

          which I am concerned about being backwards incompatible. Chinna Rao Lalam actually added this years ago in HIVE-2465. Chinna, what are your thoughts on this change?

          Rui Li added a comment -

          Hi Brock Noland, I made the change because MultiDelimitSerDe won't Base64-encode or -decode binary data, in case the encoded string happens to be the same as the multiple-character delimiter. And this in turn is because I want to reuse LazySimpleSerDe for most of the serialize and deserialize logic. Please let me know if this change is unacceptable and we need a better way to handle it.

          Brock Noland added a comment -

          Hi Rui Li,

          Could you implement MultiDelimitSerDe without making the change to LazyBinary? The way I read the code, this could cause some users of the text serdes to have "new" and unexpected results returned.

          Rui Li added a comment -

          Hi Brock Noland,

          LazyBinary only intends to decode valid Base64 data: byte[] decoded = arrayByteBase64 ? Base64.decodeBase64(recv) : recv;. The original data is returned if it contains a non-Base64 character. Therefore, I think it's natural to also return the original data if decoding fails. Otherwise, why would we bother to check arrayByteBase64 before decoding?
          I've asked Chinna to help see if this is correct and am waiting for his comments.

          But anyway, if you think this is incorrect or we should minimize changes to the code base, I'll find a way to avoid it.

          Thanks!
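The behavior under discussion can be illustrated with a small Python sketch (a hypothetical stand-in: the real LazyBinary relies on commons-codec's Base64 validity check in Java; the regex here merely approximates that check):

```python
import base64
import re

# Rough stand-in for commons-codec's "is this valid Base64?" check.
BASE64_RE = re.compile(rb'\A[A-Za-z0-9+/]*={0,2}\Z')

def lazy_binary_bytes(recv):
    """Decode only data that looks like valid Base64; otherwise keep the
    raw bytes unchanged, as the modified LazyBinary does."""
    if len(recv) % 4 == 0 and BASE64_RE.match(recv):
        return base64.b64decode(recv)
    return recv

print(lazy_binary_bytes(b'\n'))     # b'\n' kept as-is (not b'')
print(lazy_binary_bytes(b'aGk='))   # b'hi' (valid Base64 is still decoded)
```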

          Brock Noland added a comment -

          Hi Rui,

          Yes, I agree that it makes sense to write it like that from the outset. The case I was thinking of is where you have a non-Base64 string, junk, which "appears" to be in Base64 and thus decode is called. Today that would return either null or an empty byte array, but after this change it will appear as-is.

          Thinking about this more, perhaps we can commit the change as-is. Szehon Ho do you have thoughts on this?
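The "junk that appears to be Base64" case can be seen concretely (a Python illustration of the concern, not Hive code):

```python
import base64

# "abcd" is ordinary text, yet it is also syntactically valid Base64,
# so an unconditional decode silently mangles it:
raw = b"abcd"
print(base64.b64decode(raw))  # b'i\xb7\x1d' -- not the original bytes
```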

          Brock Noland added a comment -

          +1

          Brock Noland added a comment -

          Thank you Rui Li!! I have committed this patch to trunk!

          Rui Li added a comment -

          Thanks Brock Noland for the patient review!

          Lefty Leverenz added a comment -

          Doc note: MultiDelimitSerDe needs to be documented in the wiki (with version information and a link to this JIRA ticket).

          It belongs in some existing docs, and a new doc with limitations and usage examples could be a child page to the SerDe doc (or a new section in the SerDe doc):

          • SerDe – Built-in, Third-Party, and Custom SerDes
          • DDL – Create Table – Row Format, Storage Format, and SerDe
          • possibly DDL – Add SerDe Properties
          • Developer Guide – Hive SerDe (add to "Also:" list at end of section)
          • optionally HCatalog Storage Formats – SerDes and Storage Formats (first paragraph)

          A release note could include the example from Rui Li's first comment.

          Rui Li added a comment -

          Hi Lefty Leverenz, I'm not sure if I have permission to update the wiki. Besides, could you point me to a guide on how to use/edit the wiki? Thanks!
          My first comment is outdated due to some changes in later versions of the patch. Since I can't edit my comments, maybe I can update the description to include the correct example?

          Lefty Leverenz added a comment -

          Actually you can edit comments, although that's discouraged because preserving the history is important. (A good workaround is to append "Edit <date>:" to the comment instead of changing the original text.) The edit function is via a pencil icon in the upper right corner of each comment. But updating the description is also good.

          If you have wiki update permission, each Hive wiki page has an Edit link with a pencil icon in the upper right corner, next to Share and Tools. If not, you can request permission as described in AboutThisWiki which you can find on the left side of the Home page.

          A preliminary guide to editing the wiki is in a comment on HIVE-7142. By the way, that's an edited comment.

          Here are the links:

          • AboutThisWiki – How to get permission to edit
          • HIVE-7142 comment about how to edit the wiki

          Rui Li added a comment -

          Thanks Lefty Leverenz, that's very helpful. However, I don't see a pencil icon in the upper right corner of my comments (there is one in the description, though). Wonder if I'm still missing something?

          Brock Noland added a comment - - edited

          I think for Hive, you need "committer" privs to get the ability to edit comments. Let me see if we can relax this.

          Lefty Leverenz added a comment -

          Thanks for that explanation, Brock Noland. Live and learn.

          Rui Li, since I have edit permission you can tell me what to change and I'll do it for you. That will help avoid confusion for JIRA surfers.

          Rui Li added a comment -

          Lefty Leverenz I've updated the description to add the correct usage example. Maybe you can edit my comment to indicate it's outdated. Thanks.

          Lefty Leverenz added a comment -

          How's that, Rui Li?

          Rui Li added a comment -

          Lefty Leverenz That's perfect. Thanks a lot!

          Thejas M Nair added a comment -

          This has been fixed in the 0.14 release. Please open a new JIRA if you see any issues.


  People

    • Assignee: Rui Li
    • Reporter: Rui Li
    • Votes: 0
    • Watchers: 9