Details
-
Task
-
Status: Open
-
Major
-
Resolution: Unresolved
-
None
-
None
-
None
Description
Github Url : https://github.com/linkedin/gobblin/pull/841
Github Reporter : Kaiserchen
Github Assignee : Kaiserchen
Github Created At : 2016-03-14T13:10:48Z
Github Updated At : 2017-04-22T18:45:16Z
Comments
zliu41 wrote on 2016-03-15T00:43:53Z : It's ok to remove the timeout on getting a record from the queue which is no longer needed, but I think there should still be a timeout on putting a record into the queue. The benefit of having a timeout for put is that, if there are multiple forks in a task, and one fork is slow (hence its queue is often full), the task can timeout the put for this fork and go to the next fork.
Otherwise the entire task is blocked by the slow fork.
Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-196587656
Kaiserchen wrote on 2016-03-15T02:08:50Z : @zliu41 the entire task will be blocked by a slow fork at the moment anyhow. Wich is wanted behaviour i guess. IMO the gain of the whole looping thing in Task,processRecord() is very little, it just shifts the place where they see nor record for long time from the position between n to n+1 to n-1 to n. This only has an impact when its the last record. In this case the fast forks will still be done faster then the problematic fork that caused the blocking. I am actually in favour of throwing this `while (!allPutsSucceeded) {` away, as layed out the benefit is not really there IMO.
Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-196616393
zliu41 wrote on 2016-03-15T15:32:49Z : @Kaiserchen that's true. Then can you change the return type of `putRecord` to void.
Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-196880457
Kaiserchen wrote on 2016-03-16T13:21:57Z : Changing the queue with statistics to give the wanted behavior is a little more invasive. I also need to touch/ write some unittests then. Will take me a little longer
Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-197326115
stakiar wrote on 2016-03-18T18:51:01Z : @Kaiserchen I agree the `while (!allPutsSucceeded)` block in `Task.processRecord`. I believe the main benefit is that for a single record, a single slow `Fork` will not block other `Fork`s from seeing the record. However, the next record will not be processed, until all `Fork`s have accepted the record.
I'm also ok with removing this logic, it should simplify the code also.
Removing the timeouts from the `BoundedBlockingRecordQueue` also sounds reasonable.
Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-198492404
Kaiserchen wrote on 2016-04-13T09:23:13Z : Did all the changes that seemed to be required, Junittest seem to deadlock. If someone else want to fix this. Highly appreaciated.
Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-209329559
abti wrote on 2017-01-12T03:16:26Z : @shirshanka Can you please review this, since this might be a possible change that may impact streaming?
Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-272065072
Kaiserchen wrote on 2017-01-12T05:40:51Z : Hi,
Sorry yall, not sure if i have time to pull it over the finish line. I can
maybe chop of a few hours next week. If not Ill let you know. My company is
only now migrating towards gobblin found this back then while evaluating.
Btw gobblin streaming? Really? :Sad_face:
Am 12.01.2017 04:16 schrieb Abhishek Tiwari <notifications@github.com>:
> @shirshanka <https://github.com/shirshanka> Can you please review this,
> since this might be a possible change that may impact streaming?
>
> —
> You are receiving this because you were assigned.
> Reply to this email directly, view it on GitHub
> <https://github.com/linkedin/gobblin/pull/841#issuecomment-272065072>, or mute
> the thread
> <https://github.com/notifications/unsubscribe-auth/AAO-YFRTsd0_Ui0BhCKJfLGUDjKevWW_ks5rRZsNgaJpZM4HwCNK>
> .
>
Github Url : https://github.com/linkedin/gobblin/pull/841#issuecomment-272081003