Apache NiFi / NIFI-5552

CSVReader Can't Derive Schema from Quoted Headers

Details

    • Type: Bug
    • Status: Patch Available
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.7.1
    • Fix Version/s: None
    • Component/s: Extensions
    • Labels: None

    Description

      When deriving the schema from a CSV file header, NiFi is unable to generate a valid schema if the header columns are double-quoted, even though the CSVReader is configured to handle quotes. Using the nile.csv sample file from https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html results in an "Illegal initial character" exception in the Avro schema generator. In this specific case the header did not contain any spaces or special characters, though it was case-sensitive.

      org.apache.avro.SchemaParseException: Illegal initial character: "Flood"
      at org.apache.avro.Schema.validateName(Schema.java:1147)
      at org.apache.avro.Schema.access$200(Schema.java:81)
      at org.apache.avro.Schema$Field.<init>(Schema.java:403)
      at org.apache.avro.Schema$Field.<init>(Schema.java:423)
      at org.apache.avro.Schema$Field.<init>(Schema.java:415)
      at org.apache.nifi.avro.AvroTypeUtil.buildAvroField(AvroTypeUtil.java:123)
      at org.apache.nifi.avro.AvroTypeUtil.buildAvroSchema(AvroTypeUtil.java:114)
      at org.apache.nifi.avro.AvroTypeUtil.extractAvroSchema(AvroTypeUtil.java:94)
      at org.apache.nifi.schema.access.WriteAvroSchemaAttributeStrategy.getAttributes(WriteAvroSchemaAttributeStrategy.java:58)
      at org.apache.nifi.json.WriteJsonResult.writeRecord(WriteJsonResult.java:137)
      at org.apache.nifi.serialization.AbstractRecordSetWriter.write(AbstractRecordSetWriter.java:59)
      at org.apache.nifi.processors.standard.AbstractRecordProcessor$1.process(AbstractRecordProcessor.java:122)
      at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2885)
      at org.apache.nifi.processors.standard.AbstractRecordProcessor.onTrigger(AbstractRecordProcessor.java:109)
      at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
      at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
      at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
      at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
      at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
      at java.lang.Thread.run(Thread.java:748)
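
The exception message above suggests that the field name reached Avro with its double quotes still attached. The following is a minimal sketch, not NiFi's or Avro's actual code: `isValidAvroName` only approximates the naming rule from the Avro specification (first character in [A-Za-z_], remaining characters in [A-Za-z0-9_]), and `unquote` is a hypothetical helper showing one way a header name could be normalized before schema generation.

```java
public class QuotedHeaderSketch {

    // Approximation of the Avro name rule (assumption: ASCII-only, per the spec text).
    static boolean isValidAvroName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        char first = name.charAt(0);
        if (!(isAsciiLetter(first) || first == '_')) {
            return false; // corresponds to the "Illegal initial character" case
        }
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if (!(isAsciiLetter(c) || (c >= '0' && c <= '9') || c == '_')) {
                return false;
            }
        }
        return true;
    }

    static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    // Hypothetical helper: trims whitespace, strips one pair of surrounding
    // double quotes, and collapses doubled quotes inside the value.
    static String unquote(String raw) {
        String s = raw.trim();
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            return s.substring(1, s.length() - 1).replace("\"\"", "\"");
        }
        return s;
    }

    public static void main(String[] args) {
        String quoted = "\"Flood\"";
        System.out.println(isValidAvroName(quoted));          // false
        System.out.println(isValidAvroName(unquote(quoted))); // true
    }
}
```

With this check, the quoted name `"Flood"` is rejected because its first character is a double quote, while the unquoted `Flood` passes.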

          People

            Assignee: Pierre Villard (pvillard)
            Reporter: Shawn Weeks (Absolutesantaja)
            Votes: 0
            Watchers: 4

              Time Tracking

                Estimated: Not Specified
                Remaining: 0h
                Logged: 20m