  1. Apache Drill
  2. DRILL-6873

Cluster of drillbits on local files expects same set of filenames on all nodes

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.14.0
    • Fix Version/s: None
    • Component/s: Storage - JSON
    • Labels:
      None
    • Environment:

      Drill v1.14.0

      ZooKeeper 3.4.13

      CentOS 7.5

       

      Description

      Drillbits are running on multiple servers coordinated by ZooKeeper, with each node reading from its local filesystem rather than HDFS. When file storage is configured to a common path but not every file is present on every node, queries fail with errors such as:

          Error: DATA_READ ERROR: Failure reading JSON file - File file:/localdata/logs/fileX.json.gz does not exist
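
      The error suggests that a file listed on one node is being read on another node where it does not exist. As a diagnostic aid, the following minimal sketch compares per-node listings of the common path and reports which filenames are absent on which nodes. It is not part of Drill; the node names, the listings, and the helper `missing_files` are hypothetical and would have to be gathered by hand (for example by listing /localdata/logs on each server).

```python
from typing import Dict, Set


def missing_files(listings: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each node, return filenames that exist on some other node but not locally."""
    # Union of every filename seen anywhere in the cluster.
    all_files: Set[str] = set().union(*listings.values())
    report: Dict[str, Set[str]] = {}
    for node, files in listings.items():
        absent = all_files - files
        if absent:  # only report nodes that are actually missing something
            report[node] = absent
    return report


# Hypothetical listings of the common path on each node (assumed data).
listings = {
    "node1": {"fileX.json.gz", "fileY.json.gz"},
    "node2": {"fileY.json.gz"},
}
report = missing_files(listings)
for node, absent in sorted(report.items()):
    print(node, "is missing", sorted(absent))
```

      Any filename that appears in the report is a candidate for the DATA_READ error above if a query over the common path touches it.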

      Example use case: querying log files that live on the local filesystems of multiple machines, joined as a ZooKeeper-coordinated Drill cluster, without first moving them to a distributed file system (which may not be in use at all).

      Is there a (planned) configuration option to simply skip filenames that exist on some but not all nodes?

            People

            • Assignee: Unassigned
            • Reporter: Matt Keranen (mattk)
            • Votes: 0
            • Watchers: 1
