Uploaded image for project: 'Apache Gobblin'
  1. Apache Gobblin
  2. GOBBLIN-121

change TaskDone notification for forks to be not racy

    XMLWordPrintableJSON

Details

    Description

      Github Url : https://github.com/linkedin/gobblin/pull/841
      Github Reporter : Kaiserchen
      Github Assignee : Kaiserchen
      Github Created At : 2016-03-14T13:10:48Z
      Github Updated At : 2017-04-22T18:45:16Z

      Comments


      zliu41 wrote on 2016-03-15T00:43:53Z : It's ok to remove the timeout on getting a record from the queue which is no longer needed, but I think there should still be a timeout on putting a record into the queue. The benefit of having a timeout for put is that, if there are multiple forks in a task, and one fork is slow (hence its queue is often full), the task can timeout the put for this fork and go to the next fork.

      Otherwise the entire task is blocked by the slow fork.

      Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-196587656


      Kaiserchen wrote on 2016-03-15T02:08:50Z : @zliu41 the entire task will be blocked by a slow fork at the moment anyhow. Wich is wanted behaviour i guess. IMO the gain of the whole looping thing in Task,processRecord() is very little, it just shifts the place where they see nor record for long time from the position between n to n+1 to n-1 to n. This only has an impact when its the last record. In this case the fast forks will still be done faster then the problematic fork that caused the blocking. I am actually in favour of throwing this `while (!allPutsSucceeded) {` away, as layed out the benefit is not really there IMO.

      Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-196616393


      zliu41 wrote on 2016-03-15T15:32:49Z : @Kaiserchen that's true. Then can you change the return type of `putRecord` to void.

      Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-196880457


      Kaiserchen wrote on 2016-03-16T13:21:57Z : Changing the queue with statistics to give the wanted behavior is a little more invasive. I also need to touch/ write some unittests then. Will take me a little longer

      Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-197326115


      stakiar wrote on 2016-03-18T18:51:01Z : @Kaiserchen I agree the `while (!allPutsSucceeded)` block in `Task.processRecord`. I believe the main benefit is that for a single record, a single slow `Fork` will not block other `Fork`s from seeing the record. However, the next record will not be processed, until all `Fork`s have accepted the record.

      I'm also ok with removing this logic, it should simplify the code also.

      Removing the timeouts from the `BoundedBlockingRecordQueue` also sounds reasonable.

      Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-198492404


      Kaiserchen wrote on 2016-04-13T09:23:13Z : Did all the changes that seemed to be required, Junittest seem to deadlock. If someone else want to fix this. Highly appreaciated.

      Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-209329559


      abti wrote on 2017-01-12T03:16:26Z : @shirshanka Can you please review this, since this might be a possible change that may impact streaming?

      Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-272065072


      Kaiserchen wrote on 2017-01-12T05:40:51Z : Hi,

      Sorry yall, not sure if i have time to pull it over the finish line. I can
      maybe chop of a few hours next week. If not Ill let you know. My company is
      only now migrating towards gobblin found this back then while evaluating.

      Btw gobblin streaming? Really? :Sad_face:

      Am 12.01.2017 04:16 schrieb Abhishek Tiwari <notifications@github.com>:

      > @shirshanka <https://github.com/shirshanka> Can you please review this,
      > since this might be a possible change that may impact streaming?
      >
      > —
      > You are receiving this because you were assigned.
      > Reply to this email directly, view it on GitHub
      > <https://github.com/linkedin/gobblin/pull/841#issuecomment-272065072>, or mute
      > the thread
      > <https://github.com/notifications/unsubscribe-auth/AAO-YFRTsd0_Ui0BhCKJfLGUDjKevWW_ks5rRZsNgaJpZM4HwCNK>
      > .
      >

      Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-272081003

      Attachments

        Activity

          People

            Unassigned Unassigned
            abti Abhishek Tiwari
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated: