Details
Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Description
```
18-11-2016 11:23:04 CST DumpKafka INFO - WARN Start offset for partition cachelog-huge-a:78 is out of range. Start offset = 0, earliest offset = 1702454011, latest offset = 1847203166.This partition will start from the earliest offset: 1702454011
18-11-2016 11:23:04 CST DumpKafka INFO - INFO Created workunit for partition cachelog-huge-a:78: lowWatermark=1702454011, highWatermark=1847203166, range=144749155
```
When Gobblin checks the watermark, I see the warning above. What should I do to fix this warning without losing data?
Thank you
Github Url : https://github.com/linkedin/gobblin/issues/1402
Github Reporter : lurenx
Github Created At : 2016-11-18T03:31:46Z
Github Updated At : 2017-04-14T23:28:36Z
Comments
ibuenros wrote on 2016-11-18T04:01:40Z : Hi,
I'm assuming this is the first time you run gobblin-kafka on that topic?
You will lose no data, it's just saying it found no watermark, so it will
start consuming from the beginning of the partition.
Issac
Github Url : https://github.com/linkedin/gobblin/issues/1402#issuecomment-261443629
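For readers following the thread, the behavior the log describes can be sketched roughly as below. This is not Gobblin's actual code: the function name `resolve_start_offset` and its return shape are hypothetical, and falling back to the earliest offset when the stored offset is above the latest one is an assumption made only to keep the sketch complete. The offsets are the ones from the log in the description.

```python
from typing import Tuple


def resolve_start_offset(previous: int, earliest: int, latest: int) -> Tuple[int, bool]:
    """Return (start_offset, was_out_of_range) for one partition.

    Hypothetical helper, not part of Gobblin. If the stored offset lies
    outside [earliest, latest], fall back to the earliest offset, which is
    what the WARN line in the log reports for a start offset of 0.
    """
    if earliest > latest:
        raise ValueError("earliest offset must not exceed latest offset")
    if previous < earliest or previous > latest:
        # Assumption: both out-of-range cases reset to the earliest offset.
        return earliest, True
    return previous, False


# Values taken from the log for partition cachelog-huge-a:78.
start, out_of_range = resolve_start_offset(0, 1702454011, 1847203166)
high_watermark = 1847203166
record_range = high_watermark - start  # matches range=144749155 in the log
```

With these inputs the work unit starts at the earliest offset still on the broker, so nothing that is currently retained is skipped. Records below the earliest offset have already been deleted by Kafka and cannot be recovered by the job either way.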
lurenx wrote on 2016-11-18T06:10:38Z : This is not the first time. I have been running the Gobblin job for about a month.
Github Url : https://github.com/linkedin/gobblin/issues/1402#issuecomment-261456635
mwol wrote on 2016-11-22T13:37:15Z : Could it be that you cannot keep up with the amount of incoming data?
Github Url : https://github.com/linkedin/gobblin/issues/1402#issuecomment-262242153
lurenx wrote on 2016-11-23T03:30:18Z : It's easy for Gobblin to handle the incoming data. It takes about 15 minutes to process one hour of incoming data.
Github Url : https://github.com/linkedin/gobblin/issues/1402#issuecomment-262427860
ydai1124 wrote on 2017-04-14T23:28:36Z : @lurenx, is this still an issue for you? @mwol is correct that failing to keep up with the incoming data can be one cause. You may want to try tuning some parameters to improve your ingestion speed.
Github Url : https://github.com/linkedin/gobblin/issues/1402#issuecomment-294255299
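As a rough way to check the "cannot keep up" hypothesis raised in the thread, the arithmetic can be sketched as follows. The helper names `catch_up_ratio` and `is_keeping_up` are hypothetical and are not Gobblin APIs. The only input taken from the thread is lurenx's figure of about 15 minutes to process one hour of data.

```python
def catch_up_ratio(data_minutes: float, processing_minutes: float) -> float:
    """Minutes of incoming data consumed per minute of processing time."""
    if processing_minutes <= 0:
        raise ValueError("processing_minutes must be positive")
    return data_minutes / processing_minutes


def is_keeping_up(data_minutes: float, processing_minutes: float) -> bool:
    """True when the job consumes data at least as fast as it arrives."""
    return catch_up_ratio(data_minutes, processing_minutes) >= 1.0


# Figure reported in the thread: one hour of data processed in about 15 minutes.
ratio = catch_up_ratio(60, 15)
```

A ratio well above 1 suggests that steady-state throughput is not the bottleneck, so other causes would be worth checking, for example long gaps between job runs compared with the topic's retention period, or a missing stored watermark.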