Apache Drill / DRILL-6873

Cluster of drillbits on local files expects same set of filenames on all nodes


Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.14.0
    • Fix Version/s: None
    • Component/s: Storage - JSON
    • Labels: None
    • Environment: Drill v1.14.0, ZooKeeper 3.4.13, CentOS 7.5

    Description

      Drillbits are running on multiple servers, coordinated through ZooKeeper, and read from each node's local filesystem rather than HDFS. When file storage is configured to a common path but not every filename is present on every node, queries fail with:

          Error: DATA_READ ERROR: Failure reading JSON file - File file:/localdata/logs/fileX.json.gz does not exist

      Example use case: querying log files that live on the local filesystems of multiple machines, joined as a ZooKeeper cluster, without first moving them to a distributed file system, which may not be in use.

      Is there a (planned) configuration option to simply skip filenames that exist on some but not all nodes?
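
To see which filenames would trigger this error, the directory listings of the common path can be compared across nodes. The following is only a minimal sketch, not part of Drill: it assumes the listings have already been collected (for example over SSH) into a dictionary, and the function name `partition_filenames` and the node names are hypothetical.

```python
def partition_filenames(listings):
    """Split filenames into those on every node and those missing somewhere.

    listings: dict mapping node name -> iterable of filenames found in the
              common path on that node (hypothetical input format).
    Returns (common, missing), where `common` is the set of filenames present
    on all nodes and `missing` maps each other filename to the sorted list of
    nodes that lack it.
    """
    node_sets = {node: set(names) for node, names in listings.items()}
    if not node_sets:
        return set(), {}

    all_names = set().union(*node_sets.values())
    common = set.intersection(*node_sets.values())

    missing = {
        name: sorted(node for node, names in node_sets.items() if name not in names)
        for name in all_names - common
    }
    return common, missing


# Hypothetical listings of /localdata/logs on three nodes.
listings = {
    "node1": ["fileA.json.gz", "fileX.json.gz"],
    "node2": ["fileA.json.gz"],
    "node3": ["fileA.json.gz", "fileX.json.gz"],
}

common, missing = partition_filenames(listings)
print(sorted(common))  # filenames that are safe to query cluster-wide
print(missing)         # filenames that exist on some but not all nodes
```

In this example `fileX.json.gz` is reported as missing on `node2`, which matches the situation described above: the filename is visible on some nodes but not on all of them.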


People

    • Assignee: Unassigned
    • Reporter: Matt Keranen (mattk)
