  1. Apache Drill
  2. DRILL-6667

Include internal data sets in documentation Sample Datasets

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.13.0
    • Fix Version/s: None
    • Component/s: Documentation
    • Labels:
      None

      Description

      The Drill documentation provides the "Sample Datasets" section, which is very handy. However, this section does not discuss the two datasets provided with Drill itself.

      • Julian Hyde's FoodMart data set, available on the class path.
      • TPC-H data set, available on the class path in tpch

      The "FoodMart" data set is available directly under cp. In fact, the Drill sample query (see below) references a FoodMart table. To see the list of tables (at development time), find the foodmart-data-json-0.4.jar file in the Maven dependencies for drill-java-exec. The table names here are simplified relative to those in the ER diagram in the above link. Perhaps include a simple table with the names and their mapping to the original names, and link to (or just embed) the FoodMart ER image. The data is available in JSON format.

      The TPC-H data is available in `cp`.`tpch/*.parquet`, in Parquet format. The schema is described in the TPC-H specification.
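
      As a rough illustration of what the documentation could show, a query along these lines should read one of the bundled TPC-H tables. It assumes the default cp storage plugin is enabled and that the bundled files use the standard TPC-H table and column names (nation, n_nationkey, and so on); neither is confirmed in this ticket.

```sql
-- Sketch only: assumes the default `cp` plugin is enabled and that
-- tpch/nation.parquet exists with the standard TPC-H NATION columns.
-- Add a trailing semicolon when running from sqlline.
SELECT n_nationkey, n_name, n_regionkey
FROM cp.`tpch/nation.parquet`
LIMIT 5
```

      The same pattern should apply to the other TPC-H tables by swapping the file name under tpch/.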

      These are very handy, but hard to find: I must keep searching the source code to remember file names and directory paths. End users won't have this luxury.

      Suggestion: Describe the files available in the class path data source.

      Along these same lines, in "Connect a Data Source", there is no mention of the class path data source. Yet, we reference that data source in the Web Console where we suggest a sample query to run:

      Sample SQL query: SELECT * FROM cp.`employee.json` LIMIT 20
      

      The above query refers to the FoodMart data set.
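
      The documentation could also show a narrower variant of that query, so readers see some of the FoodMart column names. The sketch below assumes employee.json exposes the columns employee_id, full_name, and position_title; verify the names against the file before publishing.

```sql
-- Sketch only: the column names are assumed from the FoodMart employee table.
-- Add a trailing semicolon when running from sqlline.
SELECT employee_id, full_name, position_title
FROM cp.`employee.json`
LIMIT 5
```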

            People

            • Assignee: Bridget Bevens (bbevens)
            • Reporter: Paul Rogers (paul-rogers)
            • Votes: 0
            • Watchers: 1
