Uploaded image for project: 'Calcite'
  1. Calcite
  2. CALCITE-1896

sqlsh: SQL access to shell commands, as a shell command

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.14.0
    • Component/s: None
    • Labels:
      None

      Description

      Bourne shell has a toolkit of commands that allow you to do relational processing: grep = WHERE, sort = ORDER BY, uniq = SELECT DISTINCT, head = OFFSET/LIMIT, and so forth. Shell hackers are accustomed to writing pipelines. For example, to find the 3 largest files one would type

      $ find . -type f -print0 |xargs -0 ls -l  | sort -nr -k 5 | head -3
      
      -rw-r--r--  1 jhyde  staff  416028 Jul 16 10:06 ./core/src/test/java/org/apache/calcite/test/SqlValidatorTest.java
      -rw-r--r--  1 jhyde  staff  325727 Jun 28 11:48 ./core/src/test/java/org/apache/calcite/test/JdbcTest.java
      -rw-r--r--  1 jhyde  staff  325106 Jul 16 10:06 ./core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml
      

      SQL is not quite as concise but nevertheless:

      $ sqlsh select \* from files where not directory order by size desc limit 3
      

      sqlsh is "SQL shell", a wrapper around Calcite JDBC. files is a view backed by a table function that calls "find .". Other commands:

      • du table function - e.g. select path, size_k from du
      • git_ls_files table function calls git-ls-files - e.g. select * from git_ls_files
      • ps table function calls ps aux - e.g. select * from ps
      • wc function calls wc - e.g. select path, lineCount from git_ls_files cross apply wc(path)

      SQL would run in a lexical mode which is case-sensitive, and identifiers are not upper-cased if not quoted.

      We could consider allowing shell-safe characters such as '-' in unquoted identifiers. (It's difficult to quote identifiers in SQL if the SQL has already passed through bash's quote handling, and had double-quotes and single-quotes removed.)

      It gets really interesting when commands accept arguments, and Calcite pushes down filters to become those arguments. For example, in

      sqlsh select distinct author from git_ls_files join git_commit_files using \(path\) join git_commits using \(commit\)

      Calcite should rewrite to use git log.

      When accessing files, some kinds of files have implicit names (and types) for fields, and some don't. You should be able to access fields by name or position; the following are equivalent:

      sqlsh select gid from /etc/passwd where uid = 100
      sqlsh select $4 from /etc/passwd where $3 = 100
      

        Attachments

          Activity

            People

            • Assignee:
              julianhyde Julian Hyde
              Reporter:
              julianhyde Julian Hyde
            • Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: