Apache Avro, AVRO-3194

[C++] Parsing avro file using `GenericDatum` results in segmentation fault

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: c++
    • Labels: None
    • Environment: Ubuntu on AWS EC2

    Description

      Hello,

      What is the correct way to parse an Avro input stream into a vector of `GenericRecord`?

      I am trying to parse an .avro file using `GenericDatum` and store the results in a vector of `GenericRecord`, but I get a segmentation fault (signal 11) when I call `datum.value<avro::GenericRecord>()`. I am following the example here: https://stackoverflow.com/questions/55956222/how-to-read-data-from-avro-file-using-c-interface . Here is the sample code:

      std::unique_ptr<avro::InputStream> avroInputStream  = avro::istreamInputStream(retrievedFile); // `retrievedFile` is a basic_iostream from AWS S3
      
      // get the schema file
      std::stringstream schemaInput(schemaName);
      avro::ValidSchema validSchema;
      avro::compileJsonSchema(schemaInput, validSchema);
      
      // read the data input stream with the given valid schema
      avro::DataFileReader<avro::GenericDatum> fileReader(std::move(avroInputStream));
      avro::GenericDatum datum(fileReader.dataSchema());
      std::vector<avro::GenericRecord> recordArray;
      while (fileReader.read(datum)) {
          if (datum.type() == avro::AVRO_RECORD) {
              std::cout << "[Check 1]" << std::endl;
              const avro::GenericRecord record = datum.value<avro::GenericRecord>(); // segmentation fault occurs here
              std::cout << "[Check 2]" << std::endl;
              recordArray.push_back(record);
          }
      }
      
      // processing the recordArray further
      ...

      Interestingly, if I instead parse the Avro file with the struct generated from the JSON schema, rather than treating everything generically, it works fine (I am following the example here: https://avro.apache.org/docs/current/api/cpp/html/index.html#UsingAvroDataFiles ).

      Attachments

        1. lineorder_d.json
          0.9 kB
          Oscar Zhang

          People

            Assignee: Unassigned
            Reporter: Oscar Zhang (OscarTHZhang)