[DRILL-4579] Drill Architecture Doc Updates - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Minor
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: Documentation
Labels: None

Description

From Hakim:
One quick note here: we have a known issue when the Foreman Drillbit dies [1],
but a fix is being reviewed and should be merged into the master branch soon.
Once this issue is resolved, the client will fail the query when the
Foreman dies.

[1] https://issues.apache.org/jira/browse/DRILL-3743

Hi Ananda,

I’m somewhat new to Drill and I asked the same questions. Here’s what I understand (and I hope others will offer any needed corrections).

Drill uses a flow-based DAG model with no intermediate caching or checkpoints. That is a fancy way of saying that data streams from scanners to aggregators to your client. There is no way to recover/restart any fragment and preserve query semantics because Drill has no means of knowing which rows have already been sent upstream by that fragment.

As a result, the failure of any fragment fails the entire query; the recovery solution is to rerun the query.
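To illustrate "rerun the query" from the client side, here is a minimal Python sketch. It is not Drill's client API: `QueryFailed`, `run_with_rerun`, and `make_flaky_query` are hypothetical names, and `run_query` stands in for whatever call submits the query and collects its rows. The point is only that a failed attempt is thrown away entirely and the query starts over from scratch.

```python
from typing import Callable, Dict, List, Tuple


class QueryFailed(Exception):
    """Raised when any fragment of a query fails (hypothetical name)."""


def run_with_rerun(run_query: Callable[[], List[dict]],
                   max_attempts: int = 3) -> List[dict]:
    """Rerun the whole query after a failure.

    Rows from a failed attempt are never kept, because nothing records
    which rows a failed fragment had already sent.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return run_query()  # a fresh, complete execution each time
        except QueryFailed as err:
            last_error = err
    raise QueryFailed(f"query failed after {max_attempts} attempts") from last_error


def make_flaky_query(failures: int) -> Tuple[Callable[[], List[dict]], Dict[str, int]]:
    """Build a fake query that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def run_query() -> List[dict]:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise QueryFailed("fragment lost")
        return [{"id": 1}, {"id": 2}]

    return run_query, state
```

Whether an automatic rerun is appropriate depends on the query; the sketch assumes the query is safe to repeat.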

Any Drillbit can act as the Foreman, and each query has exactly one. The Foreman for a query is the Drillbit to which your client happens to connect. Each Drillbit uses ZooKeeper to monitor the status of all other Drillbits. If a Drillbit dies (or stops its ZK heartbeat), it drops out of ZK and is assumed dead. Each Foreman then fails any of its queries that were active on the failed Drillbit. If it is the Foreman that dies, then the client handles the failure (I’m a bit unsure of the details in this particular case).
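The Foreman's bookkeeping described above can be sketched as a toy Python model. This is not Drill's actual code and does not talk to ZooKeeper: the `Foreman` class and its methods are hypothetical, and `live_drillbits` stands in for the set of Drillbits currently registered in ZK.

```python
from typing import Dict, Set


class Foreman:
    """Toy model of one Foreman's bookkeeping (not Drill's real classes)."""

    def __init__(self) -> None:
        # query id -> Drillbits running fragments of that query
        self.active: Dict[str, Set[str]] = {}
        self.failed: Set[str] = set()

    def start_query(self, query_id: str, drillbits: Set[str]) -> None:
        """Record which Drillbits take part in a query."""
        self.active[query_id] = set(drillbits)

    def on_membership_change(self, live_drillbits: Set[str]) -> None:
        """React to a new view of which Drillbits are still registered."""
        for query_id, used in list(self.active.items()):
            if not used <= live_drillbits:
                # A participating Drillbit vanished: fail the whole query.
                del self.active[query_id]
                self.failed.add(query_id)
```

For example, if `q1` runs on `bit-a` and `bit-b`, and `q2` runs only on `bit-a`, then a membership update that no longer contains `bit-b` fails `q1` and leaves `q2` running.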

Thanks,

Paul

> On Apr 5, 2016, at 2:20 AM, Ananda Samal <ananda.samal@gmail.com> wrote:
>
> Hi Team,
>
> I went through the Drill architecture and have a couple of questions.
> Can you please help me with these:
>
> 1- What is the recovery model/process of Drill?
> (If one of the Drillbits goes down while processing data,
> how is it recovered?)
>
> 2- If a minor fragment goes down, can the Foreman recover
> it automatically, or how is that handled?
>
> 3- Does the Foreman keep track of the other Drillbits involved in
> query execution? If yes, how? If not, how does it cope when other
> Drillbits go down?
>
> Can someone help here?

People

Assignee: Bridget Bevens

Reporter: Bridget Bevens

Votes: 0

Watchers: 1

Dates

Created: 05/Apr/16 17:56

Updated: 05/Apr/16 17:58