NIFI-5781: Incorrect schema for provenance events in SiteToSiteProvenanceReportingTask

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.7.0, 1.8.0, 1.7.1
    • Fix Version/s: 1.9.0
    • Component/s: Extensions
    • Labels: None

    Description

      The current schema does not allow null values for fields such as "details", "remoteIdentifier", "alternateIdentifier" and others. This Jira is to make the schema more flexible and allow for null fields.

      When an event has a null value in one of these fields, the reporting task fails with an error like:

      2018-11-01 14:59:30,551 ERROR [Timer-Driven Process Thread-2] o.a.n.r.SiteToSiteProvenanceReportingTask SiteToSiteProvenanceReportingTask[id=0751c46f-0163-1000-7d33-f276e8654728] Error running task SiteToSiteProvenanceReportingTask[id=0751c46f-0163-1000-7d33-f276e8654728] due to org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.NullPointerException: null of string in field details of nifi.provenanceEvent
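
The error above comes down to how Avro types treat null: a plain "string" field rejects a missing value, while a union such as ["null", "string"] accepts it. The following is a minimal sketch of that rule in plain Python. The `accepts` function is a hypothetical helper written for illustration only; it covers a small subset of Avro types and is not the Avro library's own validation logic.

```python
def accepts(avro_type, value):
    """Return True if `value` fits `avro_type` (simplified subset of Avro)."""
    if isinstance(avro_type, list):
        # A union matches when any one of its branches matches.
        return any(accepts(branch, value) for branch in avro_type)
    if avro_type == "null":
        return value is None
    if avro_type == "string":
        return isinstance(value, str)
    if avro_type == "long":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    raise ValueError(f"type not covered by this sketch: {avro_type!r}")


strict = "string"                # how "details" behaves per the error above
nullable = ["null", "string"]    # the more flexible form this issue asks for

print(accepts(strict, None))     # False
print(accepts(nullable, None))   # True
```

An event whose "details" value is null therefore cannot be written with the strict type, which matches the "null of string in field details" message in the log.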

      Workaround: configure the record writer with an explicit custom schema, such as the one below, instead of inheriting the record schema.

      {
        "namespace": "nifi",
        "name": "provenanceEvent",
        "type": "record",
        "fields": [
          { "name": "eventId", "type": "string" },
          { "name": "eventOrdinal", "type": "long" },
          { "name": "eventType", "type": "string" },
          { "name": "timestampMillis", "type": "long" },
          { "name": "durationMillis", "type": "long" },
          { "name": "lineageStart", "type": { "type": "long", "logicalType": "timestamp-millis" } },
          { "name": "details", "type": ["null", "string"] },
          { "name": "componentId", "type": ["null", "string"] },
          { "name": "componentType", "type": ["null", "string"] },
          { "name": "componentName", "type": ["null", "string"] },
          { "name": "processGroupId", "type": ["null", "string"] },
          { "name": "processGroupName", "type": ["null", "string"] },
          { "name": "entityId", "type": ["null", "string"] },
          { "name": "entityType", "type": ["null", "string"] },
          { "name": "entitySize", "type": ["null", "long"] },
          { "name": "previousEntitySize", "type": ["null", "long"] },
          { "name": "updatedAttributes", "type": { "type": "map", "values": "string" } },
          { "name": "previousAttributes", "type": { "type": "map", "values": "string" } },
          { "name": "actorHostname", "type": ["null", "string"] },
          { "name": "contentURI", "type": ["null", "string"] },
          { "name": "previousContentURI", "type": ["null", "string"] },
          { "name": "parentIds", "type": { "type": "array", "items": "string" } },
          { "name": "childIds", "type": { "type": "array", "items": "string" } },
          { "name": "platform", "type": "string" },
          { "name": "application", "type": "string" },
          { "name": "remoteIdentifier", "type": ["null", "string"] },
          { "name": "alternateIdentifier", "type": ["null", "string"] },
          { "name": "transitUri", "type": ["null", "string"] }
        ]
      }
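
A custom schema like the one above could also be derived programmatically by wrapping selected field types in a union with "null". The sketch below shows one way to do that with the Python standard library. The `make_nullable` helper and the abbreviated `strict_schema` are assumptions made for illustration; the strict schema here is not a copy of the schema shipped with NiFi.

```python
import json


def make_nullable(schema, field_names):
    """Return a copy of `schema` where the named fields also accept null."""
    result = json.loads(json.dumps(schema))  # deep copy via JSON round trip
    for field in result["fields"]:
        if field["name"] not in field_names:
            continue
        current = field["type"]
        if isinstance(current, list):
            # Already a union: add "null" only if it is missing.
            if "null" not in current:
                field["type"] = ["null"] + current
        else:
            field["type"] = ["null", current]
    return result


# Hypothetical, abbreviated strict schema used only for this example.
strict_schema = {
    "namespace": "nifi",
    "name": "provenanceEvent",
    "type": "record",
    "fields": [
        {"name": "eventId", "type": "string"},
        {"name": "details", "type": "string"},
        {"name": "remoteIdentifier", "type": "string"},
    ],
}

relaxed = make_nullable(strict_schema, {"details", "remoteIdentifier"})
print(json.dumps(relaxed, indent=2))
```

The printed JSON can then be pasted into the writer's schema text, assuming the writer is configured to use an explicitly supplied schema.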

    People

      Assignee: Pierre Villard (pvillard)
      Reporter: Pierre Villard (pvillard)