  1. Hive
  2. HIVE-17984

getMaxLength does not return the correct lengths for Char/Varchar types when reading an ORC file from the WebHDFS file system

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Hive, ORC
    • Labels:
      None
    • Environment:

      Tested against hive-exec 2.1.

      Description

      getMaxLength does not return the correct length for char/varchar data types.
      It returns 255 for the CHAR type and 65535 for the VARCHAR type.
      When I inspected the same file with the orcfiledump utility, it showed the correct lengths.
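
      One way to cross-check the values from getMaxLength against what orcfiledump reports is to
      pull the declared lengths out of the schema string itself. The following is only a minimal
      sketch, assuming the schema is printed in the usual ORC type-string form
      (for example struct<code:char(10),note:varchar(200)>). The class name OrcTypeLengths, the
      method declaredLengths, and the sample schema are hypothetical and not part of Hive or ORC.
      Nested structs that reuse a field name and quoted field names are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hive or ORC: extracts declared
// char/varchar lengths from an ORC type string using only the JDK.
public class OrcTypeLengths {
    // Matches "name:char(N)" or "name:varchar(N)" inside a type string.
    private static final Pattern SIZED_FIELD =
        Pattern.compile("(\\w+):(char|varchar)\\((\\d+)\\)");

    /** Returns field name -> declared length for each char/varchar field found. */
    public static Map<String, Integer> declaredLengths(String typeString) {
        Map<String, Integer> lengths = new LinkedHashMap<>();
        Matcher m = SIZED_FIELD.matcher(typeString);
        while (m.find()) {
            lengths.put(m.group(1), Integer.parseInt(m.group(3)));
        }
        return lengths;
    }

    public static void main(String[] args) {
        // Assumed sample schema; substitute the string that orcfiledump prints.
        String schema = "struct<id:int,code:char(10),note:varchar(200)>";
        System.out.println(declaredLengths(schema));
    }
}
```

      Comparing this map with the per-column getMaxLength results would show which columns
      disagree with the lengths recorded in the file.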

      Here is the snippet of the code:

      // Open the ORC file through the supplied (WebHDFS) file system.
      Reader _reader = OrcFile.createReader(new Path(_fileName), OrcFile.readerOptions(conf).filesystem(fs));
      TypeDescription metarec = _reader.getSchema();
      List<TypeDescription> cols = metarec.getChildren();
      List<String> colNames = metarec.getFieldNames();
      for (int i = 0; i < cols.size(); i++)
      {
          // Schema of the i-th top-level column.
          TypeDescription fieldSchema = cols.get(i);

          switch (fieldSchema.getCategory())
          {
              case CHAR:
                  header += "char(" + fieldSchema.getMaxLength() + ")";
                  break;
              // ... remaining cases omitted in the original report ...
          }
      }

      Please share any pointers.

            People

            • Assignee:
              Unassigned
            • Reporter:
              syaamb4u Syam
            • Votes:
              0
            • Watchers:
              1

                Time Tracking

                Original Estimate: 24h
                Remaining Estimate: 24h
                Time Spent: Not Specified