Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: Impala 2.9.0
    • Component/s: Frontend
    • Labels:
      None

        Activity

        alex.behm Alexander Behm added a comment -

        The first patch had a small repeatability bug; see IMPALA-5358.

        alex.behm Alexander Behm added a comment -

        commit ee0fc260d1420b34a3d3fb1073fe80b3c63a9ab9
        Author: Alex Behm <alex.behm@cloudera.com>
        Date: Tue May 9 22:02:29 2017 -0700

        IMPALA-5309: Adds TABLESAMPLE clause for HDFS table refs.

        Syntax:
        <tableref> TABLESAMPLE SYSTEM(<number>) [REPEATABLE(<number>)]
        The first number specifies the percent of table bytes to sample.
        The second number specifies the random seed to use.

        The sampling is coarse-grained. Impala keeps randomly adding
        files to the sample until at least the desired percentage of
        file bytes has been reached.

        Examples:
        SELECT * FROM t TABLESAMPLE SYSTEM(10)
        SELECT * FROM t TABLESAMPLE SYSTEM(50) REPEATABLE(1234)

        Testing:

        • Added parser, analyser, planner, and end-to-end tests
        • Private core/hdfs run passed

        Change-Id: Ief112cfb1e4983c5d94c08696dc83da9ccf43f70
        Reviewed-on: http://gerrit.cloudera.org:8080/6868
        Reviewed-by: Alex Behm <alex.behm@cloudera.com>
        Tested-by: Impala Public Jenkins
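
The coarse-grained sampling described in the commit message can be illustrated with a small sketch. This is not Impala's implementation (the actual change is in the Frontend component). The function name `sample_files`, the `file_sizes` mapping, and the shuffle-based selection are all assumptions made for illustration. The sketch only shows the stated idea: keep adding randomly chosen files until at least the requested percentage of bytes is covered, with an optional seed for repeatability.

```python
import random


def sample_files(file_sizes, percent, seed=None):
    """Hypothetical sketch of file-level sampling.

    file_sizes: dict mapping file name -> size in bytes (assumed input shape).
    percent:    desired percentage of total bytes, as in SYSTEM(<number>).
    seed:       optional random seed, as in REPEATABLE(<number>).
    Returns (list of sampled file names, total sampled bytes).
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be in [0, 100]")

    total_bytes = sum(file_sizes.values())
    target_bytes = total_bytes * percent / 100.0

    rng = random.Random(seed)
    # Sort first so that a given seed always starts from the same order.
    candidates = sorted(file_sizes)
    rng.shuffle(candidates)

    sample, sampled_bytes = [], 0
    for name in candidates:
        # Stop once at least the desired share of bytes has been reached.
        if sampled_bytes >= target_bytes:
            break
        sample.append(name)
        sampled_bytes += file_sizes[name]
    return sample, sampled_bytes


if __name__ == "__main__":
    files = {"a": 100, "b": 200, "c": 300, "d": 400}
    print(sample_files(files, 50, seed=1234))
```

Because whole files are added, the sample can overshoot the requested percentage, which is what "coarse-grained" means here. In this sketch, passing the same seed over the same set of files returns the same sample.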


          People

          • Assignee:
            alex.behm Alexander Behm
            Reporter:
            alex.behm Alexander Behm
          • Votes:
            0
            Watchers:
            1

            Dates

            • Created:
              Updated:
              Resolved:
