Apache Drill / DRILL-4579

Drill Architecture Doc Updates

Details

    • Type: Bug
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Documentation
    • Labels: None

    Description

      From Hakim:
      One quick note: we have a known issue when the Foreman Drillbit dies [1],
      but a fix is being reviewed and should be merged into the master branch soon.
      Once this issue is resolved, the client will fail the query when the
      Foreman dies.

      [1] https://issues.apache.org/jira/browse/DRILL-3743

      Hi Ananda,

      I’m somewhat new to Drill, and I asked the same questions. Here’s what I understand (I hope others will offer any needed corrections).

      Drill uses a flow-based DAG model with no intermediate caching or checkpoints. That is a fancy way of saying that data streams from scanners to aggregators to your client. There is no way to recover/restart any fragment and preserve query semantics because Drill has no means of knowing which rows have already been sent upstream by that fragment.

      As a result, the failure of any fragment fails the entire query; the recovery solution is to rerun the query.
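
      Since the only recovery is to rerun the query, a client can wrap submission in a retry loop. The following is a minimal sketch, not Drill's client API: `QueryFailed`, `run_with_retry`, and the `submit` callable are hypothetical names invented for illustration, and the sketch assumes the query is a read-only SELECT that is safe to repeat.

```python
import time


class QueryFailed(Exception):
    """Hypothetical error meaning 'some fragment failed, so the query failed'."""


def run_with_retry(submit, sql, max_attempts=3, backoff_seconds=1.0):
    """Rerun the whole query on failure; there is no partial restart.

    `submit` is an assumed callable that sends `sql` to a cluster and either
    returns the full result or raises QueryFailed.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # Every attempt starts from scratch: no rows from a failed
            # attempt are kept, which preserves query semantics.
            return submit(sql)
        except QueryFailed as err:
            last_error = err
            if attempt < max_attempts:
                # Wait a little longer before each new attempt.
                time.sleep(backoff_seconds * attempt)
    raise last_error
```

      Results from a failed attempt are discarded rather than merged with the next attempt, because the client cannot tell which rows a failed fragment had already delivered.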

      Any Drillbit can act as a Foreman; there is one Foreman per query. The Foreman for a query is the Drillbit to which your client happens to connect. Each Drillbit uses ZooKeeper to monitor the status of all other Drillbits. If a Drillbit dies (or stops its ZK heartbeat), it drops out of ZK and is assumed dead. Each Foreman then fails any queries that were active on the failed Drillbit. If it is the Foreman that dies, then the client handles the failure (I’m a bit unsure of the details in this particular case).
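
      The Foreman's reaction to a lost Drillbit can be pictured with a toy model. This is only a sketch of the behavior described above, not Drill's actual classes: the `Foreman` class, its method names, and the Drillbit names are made up, and the list of live Drillbits is passed in directly where a real cluster would learn it from ZooKeeper.

```python
class Foreman:
    """Toy model of a Foreman tracking which Drillbits run its queries."""

    def __init__(self, name):
        self.name = name
        # query id -> set of Drillbit names running fragments of that query
        self.running = {}
        # query id -> sorted list of the lost Drillbits that caused the failure
        self.failed = {}

    def start_query(self, query_id, drillbits):
        self.running[query_id] = set(drillbits)

    def on_membership_change(self, live_drillbits):
        """Fail every running query that has a fragment on a missing Drillbit."""
        live = set(live_drillbits)
        # Iterate over a copy because failed queries are removed in the loop.
        for query_id, assigned in list(self.running.items()):
            lost = assigned - live
            if lost:
                # No fragment-level recovery: the whole query is failed.
                del self.running[query_id]
                self.failed[query_id] = sorted(lost)


foreman = Foreman("bit-a")
foreman.start_query("q1", ["bit-a", "bit-b"])
foreman.start_query("q2", ["bit-a", "bit-c"])
# "bit-b" has dropped out of the membership list.
foreman.on_membership_change(["bit-a", "bit-c"])
```

      After the membership change, `q1` is in `foreman.failed` because it had a fragment on `bit-b`, while `q2` keeps running. The model does not cover the case where the Foreman itself dies, which is the open question in the thread.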

      Thanks,

      - Paul

      > On Apr 5, 2016, at 2:20 AM, Ananda Samal <ananda.samal@gmail.com> wrote:
      >
      > Hi Team,
      >
      > I went through the architecture of Drill and have a couple of questions.
      > Can you please help me here:
      >
      > 1- What is the recovery model/process of Drill?
      > (If one of the Drillbits in the cluster goes down while processing data,
      > how is it recovered?)
      >
      > 2- If a minor fragment goes down, is the Foreman able to recover
      > it automatically, or how is that handled?
      >
      > 3- Does the Foreman keep track of the other Drillbits involved in
      > query execution? If yes, how? If not, how does it cope when other
      > Drillbits go down?
      >
      > Can someone help here?

          People

            Assignee: Bridget Bevens (bbevens)
            Reporter: Bridget Bevens (bbevens)