Details

    • Type: Task
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: Impala 2.8.0
    • Fix Version/s: Impala 2.8.0
    • Component/s: Infrastructure
    • Labels:

      Description

      Tracks work on Kudu stress testing.

      1) Adding support for all DML operations
      2) Running the Kudu stress tests, filing bugs

        Activity

        tarasbob Taras Bobrovytsky added a comment -

        Stress test development for Kudu was completed.

        commit 2159beee89d463be9b69886c95ad73271db49280
        Author: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
        Date:   Mon Nov 14 17:07:06 2016 -0800
        
            IMPALA-4467: Add support for DML statements in stress test
            
            - Add support for insert, upsert, update, and delete statements.
            - Add support for compute stats with mt_dop query options.
            - Update impyla version in order to be able to have access to query
              error text for DML queries.
            - Made flake8 fixes. flake8 on this file is clean.
            
            For every Kudu table in the databases, we make a copy and add a
            '_original' suffix to the table name. The DML queries only modify
            the non-original table; the original table is never modified. The
            original tables can be used to restore the non-original tables to
            their initial state. Two flags were added for doing this:
            --reset-databases-before-binary-search and
            --reset-databases-after-binary-search.
            
            The DML queries are generated based on the mod values passed in with the
            following flag: --dml-mod-values 11 13 17. For each mod value, 4 DML
            queries are generated. The DML operations touch table rows where
            primary_key % mod_value = 0, so the larger the mod value, the fewer
            rows are affected. The data for the insert, upsert, and update
            queries is taken from the table with the _original suffix. The
            stress test generates DML queries only for Kudu databases. For
            example, --tpch-kudu-db=tpch_100_kudu --tpch-db=tpch_100
            --generate-dml-queries would only generate queries for the
            tpch_100_kudu database.
            
            Here's an example of a full call with the new options that runs the
            stress test on the local mini cluster:
            ./concurrent_select.py \
                --tpch-kudu-db=tpch_kudu \
                --generate-dml-queries \
                --dml-mod-values 11 13 17 \
                --generate-compute-stats-queries \
                --select-probability=0.5 \
                --mem-limit-padding-pct=25 \
                --mem-limit-padding-abs=50 \
                --reset-databases-before-binary-search \
                --reset-databases-after-binary-search
            
            Change-Id: Ia2aafdc6851cc0e1677a3c668d3350e47c4bfe40
            Reviewed-on: http://gerrit.cloudera.org:8080/5093
            Reviewed-by: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
            Tested-by: Impala Public Jenkins
        

        There is some work left to do related to enabling the nightly Kudu run and adding compute stats queries to the nightly run.
        Two follow-up JIRAs were created: IMPALA-4881 and IMPALA-4882.
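
The query generation described in the commit message can be illustrated with a rough sketch. This is not the code from concurrent_select.py: the function names `build_dml_queries` and `count_touched`, the `column` parameter, and the exact SQL text are assumptions made for illustration, and the statements have not been run against Impala. It only shows the idea of four DML statements per mod value, with data sourced from the `_original` copy and rows selected by `primary_key % mod_value = 0`.

```python
def build_dml_queries(table, primary_key, column, mod_value):
    """Return four illustrative DML statements for one mod value.

    Hypothetical helper: the SQL is a guess at the general shape, not the
    text the stress test actually emits.
    """
    original = table + "_original"  # unmodified copy described in the commit
    pred = "{0} % {1} = 0".format(primary_key, mod_value)
    return {
        "insert": "INSERT INTO {0} SELECT * FROM {1} WHERE {2}".format(
            table, original, pred),
        "upsert": "UPSERT INTO {0} SELECT * FROM {1} WHERE {2}".format(
            table, original, pred),
        "update": (
            "UPDATE {t} SET {c} = {o}.{c} FROM {t} JOIN {o} "
            "ON {t}.{pk} = {o}.{pk} WHERE {t}.{pk} % {m} = 0"
        ).format(t=table, o=original, c=column, pk=primary_key, m=mod_value),
        "delete": "DELETE FROM {0} WHERE {1}".format(table, pred),
    }


def count_touched(keys, mod_value):
    """Count the keys that satisfy key % mod_value == 0."""
    return sum(1 for key in keys if key % mod_value == 0)


if __name__ == "__main__":
    for mod in (11, 13, 17):
        queries = build_dml_queries("lineitem", "l_orderkey", "l_comment", mod)
        print(mod, count_touched(range(1, 1001), mod), queries["delete"])
```

With keys 1 through 1000, the mod values 11, 13, and 17 select 90, 76, and 58 keys respectively, so a larger mod value touches fewer rows.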


          People

          • Assignee:
            tarasbob Taras Bobrovytsky
            Reporter:
            mjacobs Matthew Jacobs
          • Votes:
            0
            Watchers:
            2
