HIVE-3682

When writing a Hive table to a file, users should be able to choose their own separator

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.1
    • Fix Version/s: 0.11.0
    • Component/s: CLI
    • Labels:
      None
    • Environment:

      Linux 3.0.0-14-generic #23-Ubuntu SMP Mon Nov 21 20:34:47 UTC 2011 i686 i686 i386 GNU/Linux
      java version "1.6.0_25"
      hadoop-0.20.2-cdh3u0
      hive-0.8.1

      Description

      By default, when a Hive table is written out to a file, its columns are separated by the ^A character (that is, \001).
      Users should be able to set a separator of their own choice.
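
To make the default format concrete, here is a minimal Python sketch (not part of the patch) of how a consumer might split exported rows on the ^A byte. The function name `split_hive_row` and the sample rows are assumptions made for illustration only.

```python
# Default Hive text output: fields separated by ^A (\x01), rows by \n.
DEFAULT_FIELD_SEP = "\x01"


def split_hive_row(line, field_sep=DEFAULT_FIELD_SEP):
    """Split one exported row into its column values."""
    return line.rstrip("\n").split(field_sep)


# Hypothetical sample of what an export of (key, value) rows could contain.
sample = "238\x01val_238\n86\x01val_86\n"

# Split on "\n" only, then drop the empty string after the final newline.
rows = [split_hive_row(line) for line in sample.split("\n") if line]
print(rows)
```

Passing a different `field_sep` (for example `":"`) reads output written with a custom field separator in the same way.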

      Usage Example:
      create table for_test (key string, value string);
      load data local inpath './in1.txt' into table for_test;
      select * from for_test;
      UT-01: default field separator is \001; line separator is \n
      insert overwrite local directory './test-01'
      select * from src ;

      create table array_table (a array<string>, b array<string>)
      ROW FORMAT DELIMITED
      FIELDS TERMINATED BY '\t'
      COLLECTION ITEMS TERMINATED BY ',';

      load data local inpath "../hive/examples/files/arraytest.txt" overwrite into table table2;

      CREATE TABLE map_table (foo STRING , bar MAP<STRING, STRING>)
      ROW FORMAT DELIMITED
      FIELDS TERMINATED BY '\t'
      COLLECTION ITEMS TERMINATED BY ','
      MAP KEYS TERMINATED BY ':'
      STORED AS TEXTFILE;

      UT-02: field separator defined as ':'
      insert overwrite local directory './test-02'
      row format delimited
      FIELDS TERMINATED BY ':'
      select * from src ;

      UT-03: defining the line separator as anything other than \n is not allowed
      insert overwrite local directory './test-03'
      row format delimited
      FIELDS TERMINATED BY ':'
      select * from src ;

      UT-04: define map separators
      insert overwrite local directory './test-04'
      row format delimited
      FIELDS TERMINATED BY '\t'
      COLLECTION ITEMS TERMINATED BY ','
      MAP KEYS TERMINATED BY ':'
      select * from src;
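
As a rough illustration of how the three separators in UT-04 nest, the following Python sketch parses one row shaped like map_table above (a string column followed by a map column). The helper name `parse_map_row` and the sample row are hypothetical; it assumes the values themselves contain none of the separator characters.

```python
def parse_map_row(line, field_sep="\t", item_sep=",", key_sep=":"):
    """Parse one row of (foo STRING, bar MAP<STRING, STRING>)."""
    # Fields are split first, then map entries, then key/value pairs.
    foo, raw_map = line.rstrip("\n").split(field_sep, 1)
    bar = {}
    if raw_map:
        for item in raw_map.split(item_sep):
            key, _, value = item.partition(key_sep)
            bar[key] = value
    return foo, bar


# Hypothetical row: field separator '\t', entries split by ',', keys by ':'.
row = parse_map_row("user1\tcolor:red,size:large\n")
print(row)
```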

      1. HIVE-3682.D10275.4.patch.for.0.11
        40 kB
        Sushanth Sowmyan
      2. HIVE-3682.D10275.4.patch
        40 kB
        Phabricator
      3. HIVE-3682.D10275.3.patch
        40 kB
        Phabricator
      4. HIVE-3682.D10275.2.patch
        40 kB
        Phabricator
      5. HIVE-3682.D10275.1.patch
        42 kB
        Phabricator
      6. HIVE-3682.with.serde.patch
        11 kB
        Sushanth Sowmyan
      7. HIVE-3682-1.patch
        7 kB
        caofangkun

          Activity

          caofangkun added a comment -

          https://reviews.apache.org/r/10115/

          Sushanth Sowmyan added a comment -

          Hi caofangkun, we had a similar need, but a little further in scope than your patch, so I've built on and modified your patch to add on a couple more features:

          a) I've refactored your changes to LoadFileDesc into LocalDirectoryDesc - I thought that keeping it in LoadFileDesc confused things a bit for future readability.
          b) I've added SerDe support for writing out - this allows people to use custom SerDes (e.g., HCat's JsonSerDe) when outputting.
          c) I've added on support to write out to custom output formats as well, enabling the "STORED AS" clause that exists in create table (Note that the inputformat part of the STORED-AS clause is simply ignored as it makes no sense in this case)

          caofangkun added a comment -

          Thanks, Sushanth Sowmyan. The "STORED AS" feature is very useful for me too.

          Phabricator added a comment -

          khorgath requested code review of "HIVE-3682 [jira] when output hive table to file,users should could have a separator of their own choice".

          Reviewers: JIRA

          HIVE-3682 Supporting custom INSERT OVERWRITE LOCAL DIRECTORY syntax with SerDe and Outputformat support

          By default, when a Hive table is written out to a file, its columns are separated by the ^A character (that is, \001).
          Users should be able to set a separator of their own choice.

          In addition, we need to be able to support a custom SerDe specification for output (such as an available JSON SerDe),
          or we need to be able to specify an output format, like a 'stored as rcfile' specification, to allow cases
          where we want to export data that is meant to be copied into DFS elsewhere and directly read as an external table.

          Usage Example:
          create table for_test (key string, value string);
          load data local inpath './in1.txt' into table for_test;
          select * from for_test;
          UT-01: default field separator is \001; line separator is \n
          insert overwrite local directory './test-01'
          select * from src ;

          create table array_table (a array<string>, b array<string>)
          ROW FORMAT DELIMITED
          FIELDS TERMINATED BY '\t'
          COLLECTION ITEMS TERMINATED BY ',';

          load data local inpath "../hive/examples/files/arraytest.txt" overwrite into table table2;

          CREATE TABLE map_table (foo STRING , bar MAP<STRING, STRING>)
          ROW FORMAT DELIMITED
          FIELDS TERMINATED BY '\t'
          COLLECTION ITEMS TERMINATED BY ','
          MAP KEYS TERMINATED BY ':'
          STORED AS TEXTFILE;

          UT-02: field separator defined as ':'
          insert overwrite local directory './test-02'
          row format delimited
          FIELDS TERMINATED BY ':'
          select * from src ;

          UT-03: defining the line separator as anything other than \n is not allowed
          insert overwrite local directory './test-03'
          row format delimited
          FIELDS TERMINATED BY ':'
          select * from src ;

          UT-04: define map separators
          insert overwrite local directory './test-04'
          row format delimited
          FIELDS TERMINATED BY '\t'
          COLLECTION ITEMS TERMINATED BY ','
          MAP KEYS TERMINATED BY ':'
          select * from src;

          UT-05: STORED-AS specification
          insert overwrite local directory './test-05'
          stored as rcfile
          select * from src;

          UT-06: custom SerDe specification for output
          insert overwrite local directory './test-06'
          row format serde 'org.apache.hadoop.hive.serde2.DelimitedJSONSerDe'
          stored as textfile
          select * from src;

          TEST PLAN
          Included .q files

          REVISION DETAIL
          https://reviews.facebook.net/D10275

          AFFECTED FILES
          data/files/array_table.txt
          data/files/map_table.txt
          ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
          ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java
          ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
          ql/src/java/org/apache/hadoop/hive/ql/plan/LocalDirectoryDesc.java
          ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
          ql/src/test/queries/clientpositive/insert_overwrite_local_directory_1.q
          ql/src/test/results/clientpositive/insert_overwrite_local_directory_1.q.out

          To: JIRA, khorgath

          Phabricator added a comment -

          khorgath has added reviewers to the revision "HIVE-3682 [jira] when output hive table to file,users should could have a separator of their own choice".
          Added Reviewers: ashutoshc, omalley

          Updated patch based on caofangkun's initial patch to support STORED-AS and SerDe specification.

          REVISION DETAIL
          https://reviews.facebook.net/D10275

          To: JIRA, ashutoshc, omalley, khorgath

          Gang Tim Liu added a comment -

          Hi caofangkun, thank you for working on it. Would you please assign this issue to yourself? Thanks.

          caofangkun added a comment -

          Hi Gang Tim Liu, I'm not a committer yet, so I cannot assign this issue to myself.
          Please feel free to assign this issue.
          Thanks

          Ashutosh Chauhan added a comment -

          Left some comments on phabricator.

          Phabricator added a comment -

          ashutoshc has requested changes to the revision "HIVE-3682 [jira] when output hive table to file,users should could have a separator of their own choice".

          Some comments.

          INLINE COMMENTS
          ql/src/java/org/apache/hadoop/hive/ql/plan/LocalDirectoryDesc.java:29 Since all of these fields are a subset of the fields defined in CreateTableDesc, I wonder if you can reuse that class instead of creating a new one? If you cannot, at least consider extending it.
          ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java:1208 As I have indicated below, if you can reuse/extend CreateTableDesc instead of LocalDirectoryDesc, you can probably refactor and reuse much of the parsing logic from create table, which would be good to have.
          ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java:102 As I mentioned earlier, if we reuse the CreateTableDesc class, much of this code could probably be avoided.
          ql/src/test/queries/clientpositive/insert_overwrite_local_directory_1.q:8 Can you add a test that results in an MR job, e.g., one doing a join, group-by, etc.?
          ql/src/test/queries/clientpositive/insert_overwrite_local_directory_1.q:72 After HIVE-4369 we need to use dfs ${system:test.dfs.mkdir} for the mkdir command because of Hadoop incompatibility issues.

          REVISION DETAIL
          https://reviews.facebook.net/D10275

          BRANCH
          HIVE-3682

          ARCANIST PROJECT
          hive

          To: JIRA, ashutoshc, omalley, khorgath

          Phabricator added a comment -

          khorgath updated the revision "HIVE-3682 [jira] when output hive table to file,users should could have a separator of their own choice".

          Updated to reflect a couple of review comments:

          • Reused CreateTableDesc instead of creating LocalDirectoryDesc
          • Removed LocalDirectoryDesc
          • Still needs a separate function to set parameters inside the
            CreateTableDesc though, because of NPEs in expectations of fields
            like InputFormat inside CreateTableDesc. I can loosen those checks
            but not without worrying about whether something else will break
            because of that (and it does, with some minimal testing).
          • Have updated tests to do things like projections, which causes an MR job
          • Have not updated to reflect HIVE-4369, because I can't get that without
            merging with trunk, and that means I can't upload using arc to
            reviewboard. I will update the main jira with a svn patch with that changed.

          Reviewers: ashutoshc, JIRA, omalley

          REVISION DETAIL
          https://reviews.facebook.net/D10275

          CHANGE SINCE LAST DIFF
          https://reviews.facebook.net/D10275?vs=32139&id=33039#toc

          AFFECTED FILES
          data/files/array_table.txt
          data/files/map_table.txt
          ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
          ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java
          ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
          ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
          ql/src/test/queries/clientpositive/insert_overwrite_local_directory_1.q
          ql/src/test/results/clientpositive/insert_overwrite_local_directory_1.q.out

          To: JIRA, ashutoshc, omalley, khorgath

          Phabricator added a comment -

          khorgath updated the revision "HIVE-3682 [jira] when output hive table to file,users should could have a separator of their own choice".

          Updated to reflect HIVE-4369 change as well

          Reviewers: ashutoshc, JIRA, omalley

          REVISION DETAIL
          https://reviews.facebook.net/D10275

          CHANGE SINCE LAST DIFF
          https://reviews.facebook.net/D10275?vs=33039&id=33045#toc

          AFFECTED FILES
          data/files/array_table.txt
          data/files/map_table.txt
          ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
          ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java
          ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
          ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
          ql/src/test/queries/clientpositive/insert_overwrite_local_directory_1.q
          ql/src/test/results/clientpositive/insert_overwrite_local_directory_1.q.out

          To: JIRA, ashutoshc, omalley, khorgath

          Phabricator added a comment -

          ashutoshc has accepted the revision "HIVE-3682 [jira] when output hive table to file,users should could have a separator of their own choice".

          +1 will commit if tests pass.

          REVISION DETAIL
          https://reviews.facebook.net/D10275

          BRANCH
          HIVE-3682

          ARCANIST PROJECT
          hive

          To: JIRA, ashutoshc, omalley, khorgath

          Phabricator added a comment -

          khorgath updated the revision "HIVE-3682 [jira] when output hive table to file,users should could have a separator of their own choice".

          Converted !cat to dfs -cat to prevent issues with multiple streams writing to the .out file

          Reviewers: ashutoshc, JIRA, omalley

          REVISION DETAIL
          https://reviews.facebook.net/D10275

          CHANGE SINCE LAST DIFF
          https://reviews.facebook.net/D10275?vs=33045&id=33153#toc

          BRANCH
          HIVE-3682

          ARCANIST PROJECT
          hive

          AFFECTED FILES
          data/files/array_table.txt
          data/files/map_table.txt
          ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
          ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java
          ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
          ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
          ql/src/test/queries/clientpositive/insert_overwrite_local_directory_1.q
          ql/src/test/results/clientpositive/insert_overwrite_local_directory_1.q.out

          To: JIRA, ashutoshc, omalley, khorgath

          Sushanth Sowmyan added a comment -

          Attaching a version of the latest patch for the 0.11 branch.

          Ashutosh Chauhan added a comment -

          Committed to trunk and 0.11 branch. Thanks, Sushanth!

          Hudson added a comment -

          Integrated in Hive-trunk-h0.21 #2086 (See https://builds.apache.org/job/Hive-trunk-h0.21/2086/)
          HIVE-3682 : when output hive table to file,users should could have a separator of their own choice (Sushanth Sowmyan via Ashutosh Chauhan) (Revision 1477368)

          Result = FAILURE
          hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1477368
          Files :

          • /hive/trunk/data/files/array_table.txt
          • /hive/trunk/data/files/map_table.txt
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
          • /hive/trunk/ql/src/test/queries/clientpositive/insert_overwrite_local_directory_1.q
          • /hive/trunk/ql/src/test/results/clientpositive/insert_overwrite_local_directory_1.q.out
          Hudson added a comment -

          Integrated in Hive-trunk-hadoop2 #183 (See https://builds.apache.org/job/Hive-trunk-hadoop2/183/)
          HIVE-3682 : when output hive table to file,users should could have a separator of their own choice (Sushanth Sowmyan via Ashutosh Chauhan) (Revision 1477368)

          Result = FAILURE
          hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1477368
          Files :

          • /hive/trunk/data/files/array_table.txt
          • /hive/trunk/data/files/map_table.txt
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
          • /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
          • /hive/trunk/ql/src/test/queries/clientpositive/insert_overwrite_local_directory_1.q
          • /hive/trunk/ql/src/test/results/clientpositive/insert_overwrite_local_directory_1.q.out
          caofangkun added a comment -

          Hi Navis, could you please put this into the wiki? https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-Writingdataintofilesystemfromqueries

          Navis added a comment -

          caofangkun, I don't know anything about this issue. I think you meant to refer to Sushanth Sowmyan?

          Vijay Ratnagiri added a comment -

          Hey Guys,

          I was really delighted to find that the export finally supported choosing the format, but unfortunately, my delight was short-lived when I discovered that this feature is supported only for 'insert overwrite LOCAL directory' and not when I'm exporting to an HDFS directory.

          I get a syntax/parse error when I try to export to an HDFS directory with a custom row format.

          How come this feature was implemented like this? If this wasn't intentional, does this warrant reopening this ticket?

          Thanks!

          Sushanth Sowmyan added a comment -

          caofangkun : Thanks for bringing that up, apologies for not noticing it till now, I'll add it in to the wiki.

          Vijay Ratnagiri : Well, for writing out to hdfs, there already exists a way to do this, and that is to write out to a new table at that location. What was lacking was the ability to be able to support a write to a local directory with the features that exist for a hdfs write, and therefore, this was added. Basically, you can do a CREATE TABLE with whatever format you want, at an appropriate hdfs location, and then do an INSERT OVERWRITE into that table with the results of whatever SELECT you desire.
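
For automated pipelines, the two-step workaround described above (CREATE TABLE at an HDFS location, then INSERT OVERWRITE into it) could be scripted along these lines. This is only a sketch: the helper `hdfs_export_statements`, the table name, the column list, and the location are all hypothetical, and the generated statements should be checked against the Hive version in use.

```python
def hdfs_export_statements(table, columns, location, select_sql, field_sep=","):
    """Return (create_stmt, insert_stmt) as HiveQL strings.

    Inputs are interpolated without escaping, so pass trusted values only.
    """
    column_defs = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    # Step 1: a table with the desired format at the target HDFS location.
    create_stmt = (
        f"CREATE TABLE {table} ({column_defs}) "
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{field_sep}' "
        f"STORED AS TEXTFILE LOCATION '{location}'"
    )
    # Step 2: write the query results into that table.
    insert_stmt = f"INSERT OVERWRITE TABLE {table} {select_sql}"
    return create_stmt, insert_stmt


# Hypothetical names and paths, for illustration only.
create_stmt, insert_stmt = hdfs_export_statements(
    table="src_export",
    columns=[("key", "STRING"), ("value", "STRING")],
    location="/tmp/src_export",
    select_sql="SELECT key, value FROM src",
    field_sep=":",
)
print(create_stmt + ";")
print(insert_stmt + ";")
```

Note that the caller has to supply the output schema up front, which is the main cost of this approach compared with writing straight to a directory.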

          Amareshwari Sriramadasu added a comment -

          Though the above suggestion of creating a table and then running INSERT OVERWRITE TABLE works, it forces the user to know the schema of the output and create the table ahead of time. When queries are automated, it is difficult to always create the table ahead of time. I have created the issue HIVE-6410 to add the functionality in this issue for INSERT OVERWRITE DIRECTORY as well.

          Lefty Leverenz added a comment -

          Lars Francke added this note to the wiki: "As of Hive 0.11.0 the separator used can be specified, in earlier versions it was always the ^A character (\001)" and Prasad Mujumdar added the ROW FORMAT syntax. More details and some examples would be helpful.

          LanguageManual DML: Writing data into the filesystem from queries

            People

            • Assignee: Sushanth Sowmyan
            • Reporter: caofangkun
            • Votes: 0
            • Watchers: 15
