GOBBLIN-587 (Apache Gobblin)

Implement partition level lineage for fs based destination


Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels: None

    Description

      Currently, Gobblin lineage is sent at the dataset level. The task is to send partition-level lineage for file system (fs) sinks. An example Kafka-to-HDFS partition lineage event:

      {
        "timestamp": 1536785248451,
        "namespace": {
          "string": "gobblin.event.lineage"
        },
        "name": "LoginEvent",
        "metadata": {
          "destination": "{\"object-type\":\"org.apache.gobblin.dataset.PartitionDescriptor\",\"object-data\":{\"dataset\":{\"object-type\":\"org.apache.gobblin.dataset.DatasetDescriptor\",\"object-data\":{\"platform\":\"hdfs\",\"metadata\":{\"branch\":\"0\"},\"name\":\"/data/tracking/LoginEvent\"}},\"name\":\"hourly/2018/09/12/12\"}}",
          "eventType": "LineageEvent",
          "source": "{\"object-type\":\"org.apache.gobblin.dataset.DatasetDescriptor\",\"object-data\":{\"platform\":\"kafka\",\"metadata\":{},\"name\":\"LoginEvent\"}}",
          "metricContextName": "org.apache.gobblin.runtime.SafeDatasetCommit.1693032310",
          "metricContextID": "1a7895b0-9e93-414e-ac0b-038f9375c82e",
          "class": "org.apache.gobblin.runtime.SafeDatasetCommit"
        }
      }
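
The "destination" and "source" values in the event above are themselves JSON documents serialized as strings, so a consumer has to decode them a second time to reach the partition name. The following is a minimal Python sketch of that decoding, assuming only the field layout shown in the example. The `decode_descriptor` helper and the trimmed `event` dict are hypothetical names for illustration and are not part of Gobblin.

```python
import json

# Trimmed copy of the example event. The "destination" and "source" values
# are JSON documents stored as strings, exactly as in the sample above.
event = {
    "name": "LoginEvent",
    "metadata": {
        "destination": (
            '{"object-type":"org.apache.gobblin.dataset.PartitionDescriptor",'
            '"object-data":{"dataset":{"object-type":'
            '"org.apache.gobblin.dataset.DatasetDescriptor",'
            '"object-data":{"platform":"hdfs","metadata":{"branch":"0"},'
            '"name":"/data/tracking/LoginEvent"}},'
            '"name":"hourly/2018/09/12/12"}}'
        ),
        "source": (
            '{"object-type":"org.apache.gobblin.dataset.DatasetDescriptor",'
            '"object-data":{"platform":"kafka","metadata":{},'
            '"name":"LoginEvent"}}'
        ),
    },
}


def decode_descriptor(raw):
    """Decode one serialized descriptor string into a flat summary dict.

    Hypothetical helper: it relies only on the "object-type" and
    "object-data" keys visible in the example event.
    """
    descriptor = json.loads(raw)
    data = descriptor["object-data"]
    if descriptor["object-type"].endswith("PartitionDescriptor"):
        # Partition-level lineage: the dataset is nested one level deeper.
        dataset = data["dataset"]["object-data"]
        partition = data["name"]
    else:
        # Dataset-level lineage: no partition information is present.
        dataset = data
        partition = None
    return {
        "platform": dataset["platform"],
        "dataset": dataset["name"],
        "partition": partition,
    }


destination = decode_descriptor(event["metadata"]["destination"])
source = decode_descriptor(event["metadata"]["source"])
print(source["platform"], "->", destination["platform"], destination["partition"])
```

In this example the source stays a dataset-level descriptor while the destination carries the partition, so the helper branches on "object-type" instead of assuming both sides have the same shape.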
      

      Note: Lineage is not emitted automatically. You may need to implement support for it in your source-destination pair.

          People

            Assignee: Zhixiong Chen (zxc)
            Reporter: Zhixiong Chen (zxc)
            Votes: 0
            Watchers: 2