ORC-959 (sub-task of ORC-979 C++ API QA, project ORC)

C++ reader crash in resolving nested List columns for SearchArgument


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.0
    • Fix Version/s: 1.7.0
    • Component/s: C++
    • Labels: None

    Description

      SearchArgument currently provides only interfaces that take column names. Only columns that are struct fields are resolved correctly. Resolving any other column (e.g. one nested inside a LIST or MAP) crashes the reader.

      The following code reproduces the issue:

      #include <orc/OrcFile.hh>
      using namespace std;
      using namespace orc;
      
      int main() {
        // Open the ORC file with default reader options.
        ORC_UNIQUE_PTR<InputStream> inStream = readLocalFile("complextypestbl.orc");
        ReaderOptions options;
        ORC_UNIQUE_PTR<Reader> reader = createReader(move(inStream), options);
      
        RowReaderOptions rowReaderOptions;
      
        // Build the predicate f <= "bbb"; "f" is nested inside array<array<struct<...>>>.
        ORC_UNIQUE_PTR<SearchArgumentBuilder> sarg = SearchArgumentFactory::newBuilder();
        sarg->lessThanEquals("f", PredicateDataType::STRING, Literal("bbb", 3));
        ORC_UNIQUE_PTR<SearchArgument> final_sarg = sarg->build();
        rowReaderOptions.searchArgument(move(final_sarg));
      
        // The crash happens here, while the row reader resolves "f" to a column id.
        ORC_UNIQUE_PTR<RowReader> rowReader = reader->createRowReader(rowReaderOptions);
        ORC_UNIQUE_PTR<ColumnVectorBatch> batch = rowReader->createRowBatch(1024);
        return 0;
      }
      

      complextypestbl.orc is an ORC file of an ACID table with the following schema:

      id bigint
      int_array array<int>
      int_array_array array<array<int>>
      int_map map<string, int> 
      int_map_array array<map<string, int>>
      nested_struct struct<a: int, b: array<int>, c: struct<d: array<array<struct<e: int, f: string>>>>, g: map<string, struct<h: struct<i: array<double>>>>>
      

      The C++ code above pushes down a predicate on the "f" column. GDB stack trace of the crash:

      Program received signal SIGSEGV, Segmentation fault.
      orc::SargsApplier::findColumn (type=..., colName=...) at /home/quanlong/workspace/orc/c++/src/sargs/SargsApplier.cc:28
      28	      if (type.getFieldName(i) == colName) {
      (gdb) bt
      #0  orc::SargsApplier::findColumn (type=..., colName=...) at /home/quanlong/workspace/orc/c++/src/sargs/SargsApplier.cc:28
      #1  0x000000000045a518 in orc::SargsApplier::findColumn (type=..., colName=...) at /home/quanlong/workspace/orc/c++/src/sargs/SargsApplier.cc:31
      #2  0x000000000045a518 in orc::SargsApplier::findColumn (type=..., colName=...) at /home/quanlong/workspace/orc/c++/src/sargs/SargsApplier.cc:31
      #3  0x000000000045a67f in orc::SargsApplier::SargsApplier (this=0x200b9f0, type=..., searchArgument=<optimized out>, rowIndexStride=<optimized out>, writerVersion=<optimized out>)
          at /home/quanlong/workspace/orc/c++/src/sargs/SargsApplier.cc:56
      #4  0x00000000004253f8 in orc::RowReaderImpl::RowReaderImpl (this=0x2009760, _contents=..., opts=...) at /home/quanlong/workspace/orc/c++/src/Reader.cc:244
      #5  0x00000000004257ad in orc::ReaderImpl::createRowReader (this=<optimized out>, opts=...) at /home/quanlong/workspace/orc/c++/src/Reader.cc:765
      #6  0x000000000040b688 in main ()
      (gdb) l
      23	
      24	  // find column id from column name
      25	  uint64_t SargsApplier::findColumn(const Type& type,
      26	                                    const std::string& colName) {
      27	    for (uint64_t i = 0; i != type.getSubtypeCount(); ++i) {
      28	      if (type.getFieldName(i) == colName) {
      29	        return type.getSubtype(i)->getColumnId();
      30	      } else {
      31	        uint64_t ret = findColumn(*type.getSubtype(i), colName);
      32	        if (ret != INVALID_COLUMN_ID) {
      (gdb) p type.getKind()
      $16 = orc::LIST
      

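      Only STRUCT types have valid field names. findColumn recurses into every subtype and calls getFieldName() regardless of kind, so it crashes as soon as it reaches a LIST type.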
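
One possible way to avoid the crash is to compare field names only when the current type is a STRUCT, while still recursing into the children of every type. The sketch below illustrates that idea on a small mock type tree so it can be compiled without the ORC library. It is not necessarily the fix that was applied for this issue. `MockType`, `Kind`, `kInvalidColumnId` and the builder helpers are hypothetical stand-ins for `orc::Type`, `orc::TypeKind` and `INVALID_COLUMN_ID`, and column ids are assumed to be assigned in pre-order.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for orc::TypeKind and orc::Type (illustration only).
enum class Kind { INT, STRING, LIST, STRUCT };

struct MockType {
  Kind kind = Kind::INT;
  uint64_t columnId = 0;
  std::vector<std::string> fieldNames;  // filled in only for STRUCT
  std::vector<std::unique_ptr<MockType>> subtypes;
};

// Placeholder for INVALID_COLUMN_ID; the real value is an assumption here.
constexpr uint64_t kInvalidColumnId = std::numeric_limits<uint64_t>::max();

// Find a column id by name. Field names are consulted only on STRUCT types,
// but the search still descends through LIST (and other) children.
uint64_t findColumn(const MockType& type, const std::string& colName) {
  for (size_t i = 0; i != type.subtypes.size(); ++i) {
    if (type.kind == Kind::STRUCT && type.fieldNames[i] == colName) {
      return type.subtypes[i]->columnId;
    }
    uint64_t ret = findColumn(*type.subtypes[i], colName);
    if (ret != kInvalidColumnId) {
      return ret;
    }
  }
  return kInvalidColumnId;
}

// --- Helpers to build a mock schema -------------------------------------

std::unique_ptr<MockType> makeType(Kind kind) {
  auto t = std::make_unique<MockType>();
  t->kind = kind;
  return t;
}

std::unique_ptr<MockType> listOf(std::unique_ptr<MockType> elem) {
  auto t = makeType(Kind::LIST);
  t->subtypes.push_back(std::move(elem));  // a LIST child has no field name
  return t;
}

void addField(MockType& s, std::string name, std::unique_ptr<MockType> child) {
  assert(s.kind == Kind::STRUCT);
  s.fieldNames.push_back(std::move(name));
  s.subtypes.push_back(std::move(child));
}

// Assign column ids in pre-order; returns the next unused id.
uint64_t assignIds(MockType& t, uint64_t next = 0) {
  t.columnId = next++;
  for (auto& child : t.subtypes) {
    next = assignIds(*child, next);
  }
  return next;
}

// struct<id:int, c:struct<d:array<array<struct<e:int, f:string>>>>>
std::unique_ptr<MockType> buildSampleSchema() {
  auto inner = makeType(Kind::STRUCT);
  addField(*inner, "e", makeType(Kind::INT));
  addField(*inner, "f", makeType(Kind::STRING));
  auto c = makeType(Kind::STRUCT);
  addField(*c, "d", listOf(listOf(std::move(inner))));
  auto root = makeType(Kind::STRUCT);
  addField(*root, "id", makeType(Kind::INT));
  addField(*root, "c", std::move(c));
  assignIds(*root);
  return root;
}
```

With this mock schema, `findColumn(*buildSampleSchema(), "f")` walks through both LIST levels without touching their (non-existent) field names and returns the id of the nested string column. A name that does not exist yields `kInvalidColumnId`. Name-based lookup remains ambiguous when two nested structs share a field name, which is a limitation of resolving columns by name alone.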

          People

            Assignee: Quanlong Huang (stigahuang)
            Reporter: Quanlong Huang (stigahuang)