HIVE-5795

Hive should be able to skip header and footer rows when reading data file for a table

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels:
    • Release Note:
      hive.file.max.footer
        Default Value: 100
        Maximum number of footer lines a user can set for a table file.
      skip.header.line.count
        Default Value: 0
        Number of header lines for the table file.
      skip.footer.line.count
        Default Value: 0
        Number of footer lines for the table file.

      Set "skip.footer.line.count" and "skip.header.line.count" as table properties when creating the table. The following example shows how to use these two properties:

      Create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2");
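The defaults and the footer limit listed above can be illustrated with a small sketch. It is not Hive's code: the function name `read_skip_counts` and the use of a plain dictionary for the table properties are hypothetical, and rejecting a footer count above hive.file.max.footer is an assumption about how the limit is enforced.

```python
# Hypothetical helper; only the property names and defaults come from the release note.
DEFAULT_MAX_FOOTER = 100  # default of hive.file.max.footer


def read_skip_counts(tblproperties, max_footer=DEFAULT_MAX_FOOTER):
    """Return (header_count, footer_count) from a dict of table properties."""
    # Both properties default to 0, meaning no lines are skipped.
    header = int(tblproperties.get("skip.header.line.count", "0"))
    footer = int(tblproperties.get("skip.footer.line.count", "0"))
    # Assumption: a footer count above the configured maximum is rejected.
    if footer > max_footer:
        raise ValueError(
            f"skip.footer.line.count={footer} exceeds hive.file.max.footer={max_footer}"
        )
    return header, footer


props = {"skip.header.line.count": "1", "skip.footer.line.count": "2"}
print(read_skip_counts(props))  # (1, 2)
```

With an empty dictionary the sketch returns (0, 0), which matches the documented defaults.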

      Description

      Hive should be able to skip header and footer lines when reading a table's data file. This way, users do not need to preprocess data that another application generated with a header or footer, and can use the file directly for table operations.
      To implement this, the idea is to add new table properties that define the number of header and footer lines, and to skip those lines when the record reader reads records. A DDL example for creating a table with a header and footer looks like this:

      Create external table testtable (name string, message string) row format delimited fields terminated by '\t' lines terminated by '\n' location '/testtable' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2");
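The idea of skipping lines while reading records can be sketched as follows. This is a minimal illustration, not the code in the attached patches: the function name `skip_header_footer` and the sample rows are hypothetical, and the sketch treats the input as one whole file, ignoring how a file might be divided among readers.

```python
from collections import deque
from itertools import islice


def skip_header_footer(lines, header_count=0, footer_count=0):
    """Yield the data lines of `lines`, dropping leading and trailing lines."""
    it = iter(lines)
    # Consume and discard the header lines.
    for _ in islice(it, header_count):
        pass
    if footer_count == 0:
        yield from it
        return
    # Hold back the most recent `footer_count` lines. A line is released only
    # when a newer line arrives, so the final `footer_count` lines are never emitted.
    pending = deque()
    for line in it:
        pending.append(line)
        if len(pending) > footer_count:
            yield pending.popleft()


# Hypothetical file content: one header line, two data rows, two footer lines.
sample = [
    "name\tmessage",
    "alice\thello",
    "bob\thi",
    "total\t2",
    "generated by exporter",
]
rows = list(skip_header_footer(sample, header_count=1, footer_count=2))
print(rows)  # ['alice\thello', 'bob\thi']
```

Because the end of the input is not known in advance, the footer has to be handled with a small buffer rather than a simple counter. The buffer never holds more than `footer_count` + 1 lines, which is one reason an upper bound on the footer size is useful.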
      
      1. HIVE-5795.5.patch
        42 kB
        Shuaishuai Nie
      2. HIVE-5795.4.patch
        44 kB
        Shuaishuai Nie
      3. HIVE-5795.1.patch
        33 kB
        Shuaishuai Nie
      4. HIVE-5795.3.patch
        42 kB
        Shuaishuai Nie
      5. HIVE-5795.2.patch
        39 kB
        Shuaishuai Nie

        Issue Links

          Activity

          Navis made changes -
          Link This issue is duplicated by HIVE-4776 [ HIVE-4776 ]
          Lefty Leverenz made changes -
          Labels TODOC13
          Shuaishuai Nie made changes -
          Release Note: removed the {code} markup around the DDL example; the text is otherwise unchanged from the Release Note above.
          Shuaishuai Nie made changes -
          Release Note: appended the usage sentence and the DDL example (wrapped in {code} markup) to the three property descriptions.
          Shuaishuai Nie made changes -
          Release Note: set to the descriptions and default values of hive.file.max.footer, skip.header.line.count, and skip.footer.line.count.
          Thejas M Nair made changes -
          Issue Type: Bug [ 1 ] -> New Feature [ 2 ]
          Thejas M Nair made changes -
          Fix Version/s 0.13.0 [ 12324986 ]
          Thejas M Nair made changes -
          Status: Patch Available [ 10002 ] -> Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Shuaishuai Nie made changes -
          Attachment HIVE-5795.5.patch [ 12620989 ]
          Shuaishuai Nie made changes -
          Description: renamed the table properties in the DDL example from "skip.header.number" and "skip.footer.number" to "skip.header.line.count" and "skip.footer.line.count"; the text is otherwise unchanged from the Description above.
          Shuaishuai Nie made changes -
          Attachment HIVE-5795.4.patch [ 12620881 ]
          Shuaishuai Nie made changes -
          Attachment HIVE-5795.1.patch [ 12619655 ]
          Shuaishuai Nie made changes -
          Attachment HIVE-5795.1.patch [ 12614149 ]
          Shuaishuai Nie made changes -
          Attachment HIVE-5795.3.patch [ 12619457 ]
          Shuaishuai Nie made changes -
          Attachment HIVE-5795.3.patch [ 12619445 ]
          Shuaishuai Nie made changes -
          Attachment HIVE-5795.3.patch [ 12619445 ]
          Shuaishuai Nie made changes -
          Attachment HIVE-5795.2.patch [ 12618181 ]
          Shuaishuai Nie made changes -
          Status: Open [ 1 ] -> Patch Available [ 10002 ]
          Shuaishuai Nie made changes -
          Field Original Value New Value
          Attachment HIVE-5795.1.patch [ 12614149 ]
          Shuaishuai Nie created issue -

            People

            • Assignee: Shuaishuai Nie
            • Reporter: Shuaishuai Nie
            • Votes: 0
            • Watchers: 7

              Dates

              • Created:
                Updated:
                Resolved:
