Apache Gobblin / GOBBLIN-615

Make LWM==HWM a valid interval in QueryBaseSource

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels: None

    Description

      We have seen many cases where a job using DateWatermark fails intermittently, roughly every other day. The sequence of events is as follows:

      1. On 10-02 at 17:47, the job pulls with logindate >= 2018-10-01 (HWM = 10-02; when the job finishes, Actual_HWM is 10-02).
      2. If the job is rerun on the same day (10-02), we get LWM = 10-03 and HWM = 10-02, so the job fails as expected.
      3. On 10-03 at 17:47, the job fails to generate any workunits because now LWM = Actual_HWM + 1 = 10-03 and HWM = 10-03. According to DateWatermark::getIntervals(), startTime must be strictly less than endTime to generate an interval.
      4. On 10-04 at 17:47, the job recovers because LWM stays at 10-03 and HWM = 10-04, so a valid interval is generated again.

      The fix is to let DateWatermark generate an interval in step 3, where LWM == HWM, so that the job no longer fails intermittently at that step.
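
To make the difference concrete, here is a minimal sketch of the interval check before and after the fix. It is not the actual DateWatermark::getIntervals() code: the function name `get_intervals`, the `allow_equal` flag, and the one-day partition size are assumptions made for illustration only.

```python
from datetime import date, timedelta


def get_intervals(lwm, hwm, allow_equal=False):
    """Hypothetical stand-in for the interval generation described above.

    Returns a list of (start, end) date pairs in one-day steps.
    With allow_equal=False, LWM must be strictly less than HWM (old behavior).
    With allow_equal=True, LWM == HWM yields a single interval (proposed fix).
    """
    if lwm > hwm:
        return []  # step 2: LWM is past HWM, nothing to pull in either mode
    if lwm == hwm:
        return [(lwm, hwm)] if allow_equal else []
    intervals = []
    day = lwm
    while day < hwm:
        nxt = min(day + timedelta(days=1), hwm)
        intervals.append((day, nxt))
        day = nxt
    return intervals


oct3 = date(2018, 10, 3)
oct4 = date(2018, 10, 4)

# Step 3 under the old rule: no interval, so no workunits.
print(get_intervals(oct3, oct3))
# Step 3 with the fix: one interval covering 10-03.
print(get_intervals(oct3, oct3, allow_equal=True))
# Step 4: a valid interval in either mode.
print(get_intervals(oct3, oct4))
```

Only the LWM == HWM case changes; a rerun with LWM > HWM (step 2) still produces no interval.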

      However, this fix introduces another problem. Today we can already miss data in steps 1 and 4, because step 1 pulls data for 10-02 too early and step 4 pulls data for 10-04 too early, but at least the data for 10-03 is pulled in full, because it is not pulled until the run on 10-04. After this fix, the data for 10-03 will be pulled too early as well, in step 3. For that reason, this fix needs to work together with the Cutoff feature, so that on 10-02 we pull only the data for 10-01.
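
A rough sketch of how a cutoff could combine with the LWM == HWM fix is shown below. The function `plan_pull` and the `cutoff_days` parameter are hypothetical and are not the real Gobblin configuration; the sketch only assumes that the cutoff moves HWM back by whole days so that a run never touches the current, still-incomplete day.

```python
from datetime import date, timedelta


def plan_pull(run_date, previous_actual_hwm, cutoff_days=1):
    """Hypothetical planner: returns the (LWM, HWM) range to pull, or None.

    LWM is the day after the previous Actual_HWM, as in the steps above.
    HWM is the run date moved back by cutoff_days (an assumed cutoff rule).
    LWM == HWM is treated as a valid one-day range, matching the fix.
    """
    lwm = previous_actual_hwm + timedelta(days=1)
    hwm = run_date - timedelta(days=cutoff_days)
    if lwm > hwm:
        return None  # nothing complete to pull yet, e.g. a same-day rerun
    return (lwm, hwm)


# Run on 10-02 after a previous run that ended at 09-30:
# only the completed day 10-01 is pulled.
print(plan_pull(date(2018, 10, 2), date(2018, 9, 30)))
# Rerun on 10-02 after that pull finished (Actual_HWM = 10-01): nothing to do.
print(plan_pull(date(2018, 10, 2), date(2018, 10, 1)))
```

With a one-day cutoff, every daily run lands on LWM == HWM, which is why the cutoff only works once that case is accepted as a valid interval.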

      Thanks,

      Kuai

          People

            Assignee: Kuai Yu (yukuai518)
            Reporter: Kuai Yu (yukuai518)