TAJO-1963: Add more configuration descriptions to document

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0, 0.11.1
    • Component/s: Documentation
    • Labels: None

      Description

      In our document (http://tajo.apache.org/docs/devel/configuration/tajo-site-xml.html), many configuration properties are missing.

        Activity

        githubbot ASF GitHub Bot added a comment -

        GitHub user jihoonson opened a pull request:

        https://github.com/apache/tajo/pull/844

        TAJO-1963: Add more configuration descriptions to document

        I also fixed a wrong configuration name.

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/jihoonson/tajo-2 TAJO-1963

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/844.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #844


        commit 0b9bd167440b5e872f7ef02bae366d24e30e475d
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-11-05T07:43:47Z

        Add a document and fixed a wrong configuration name


        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/844#issuecomment-153982075

        You can see the updated document here.
        http://people.apache.org/~jihoonson/tajo-docs/

        githubbot ASF GitHub Bot added a comment -

        Github user eminency commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/844#discussion_r44098591

        — Diff: tajo-docs/src/main/sphinx/configuration/tajo-site-xml.rst —
        @@ -2,23 +2,455 @@
        The tajo-site.xml File
        **********************

        -To the ``core-site.xml`` file on every host in your cluster, you must add the following information:
        +You can add more configurations in the ``tajo-site.xml`` file. Note that you should replicate this file to the whole hosts in your cluster once you edited.
        +If you are looking for the configurations for the master and the worker, please refer to :doc:`tajo_master_configuration` and :doc:`worker_configuration`.
        +Also, catalog configurations are found here :doc:`catalog_configuration`.
        +
        +=========================
        +Join Query Settings
        +=========================
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.auto-broadcast`
        +""""""""""""""""""""""""""""""""""""""
        +
        +A flag to enable or disable the use of broadcast join.
        +
        + * Property value: Boolean
        — End diff –

        IMO, 'property value type' looks clearer.

        githubbot ASF GitHub Bot added a comment -

        Github user eminency commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/844#discussion_r44100331

        — Diff: tajo-docs/src/main/sphinx/configuration/tajo-site-xml.rst —
        @@ -2,23 +2,455 @@
        The tajo-site.xml File
        **********************

        -To the ``core-site.xml`` file on every host in your cluster, you must add the following information:
        +You can add more configurations in the ``tajo-site.xml`` file. Note that you should replicate this file to the whole hosts in your cluster once you edited.
        +If you are looking for the configurations for the master and the worker, please refer to :doc:`tajo_master_configuration` and :doc:`worker_configuration`.
        +Also, catalog configurations are found here :doc:`catalog_configuration`.
        +
        +=========================
        +Join Query Settings
        +=========================
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.auto-broadcast`
        +""""""""""""""""""""""""""""""""""""""
        +
        +A flag to enable or disable the use of broadcast join.
        +
        + * Property value: Boolean
        + * Default value: true
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.join.auto-broadcast</name>
        + <value>true</value>
        + </property>
        +
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.broadcast.non-cross-join.threshold-kb`
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +A threshold for non-cross joins. When a non-cross join query is executed with the broadcast join, the whole size of broadcasted tables won't exceed this threshold.
        +
        + * Property value: Integer
        + * Unit: KB
        + * Default value: 5120
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.broadcast.non-cross-join.threshold-kb</name>
        + <value>5120</value>
        + </property>
        +
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.broadcast.cross-join.threshold-kb`
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +A threshold for cross joins. When a cross join query is executed, the whole size of broadcasted tables won't exceed this threshold.
        +
        + * Property value: Integer
        + * Unit: KB
        + * Default value: 1024
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.broadcast.cross-join.threshold-kb</name>
        + <value>1024</value>
        + </property>
        +
        +.. warning::
        + In Tajo, the broadcast join is only the way to perform cross joins. Since the cross join is a very expensive operation, this value need to be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.task-volume-mb`
        +""""""""""""""""""""""""""""""""""""""
        +
        +The repartition join is executed in two stages. When a join query is executed with the repartition join, this value indicates the amount of input data processed by each task at the second stage.
        +As a result, it determines the degree of the parallel processing of the join query.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.join.task-volume-mb</name>
        + <value>64</value>
        + </property>
        +
        +"""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.partition-volume-mb`
        +"""""""""""""""""""""""""""""""""""""""""""
        +
        +The repartition join is executed in two stages. When a join query is executed with the repartition join,
        +this value indicates the output size of each task at the first stage, which determines the number of partitions to be shuffled between two stages.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 128
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.join.partition-volume-mb</name>
        + <value>128</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.common.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform a join in a task.
        +If the input data is smaller than this value, join is performed with the in-memory hash join.
        +Otherwise, the sort-merge join is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.common.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that too large threshold can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.inner.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform an inner join in a task.
        +If the input data is smaller than this value, the inner join is performed with the in-memory hash join.
        +Otherwise, the sort-merge join is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.inner.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that too large threshold can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.outer.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform an outer join in a task.
        +If the input data is smaller than this value, the outer join is performed with the in-memory hash join.
        +Otherwise, the sort-merge join is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.outer.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that too large threshold can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +"""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.hash-table.size`
        +"""""""""""""""""""""""""""""""""""""
        +
        +The initial size of hash table for in-memory hash join.
        +
        + * Property value: Integer
        + * Default value: 100000
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.hash-table.size</name>
        + <value>100000</value>
        + </property>

        ======================
        -System Config
        +Sort Query Settings
        ======================

        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.sort.task-volume-mb`
        +""""""""""""""""""""""""""""""""""""""
        +
        +The sort operation is executed in two stages. When a sort query is executed, this value indicates the amount of input data processed by each task at the second stage.
        +As a result, it determines the degree of the parallel processing of the sort query.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.sort.task-volume-mb</name>
        + <value>64</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.external-sort.buffer-mb`
        +""""""""""""""""""""""""""""""""""""""""
        +
        +A threshold to choose the sort algorithm. If the input data is larger than this threshold, the external sort algorithm is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 200
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.external-sort.buffer-mb</name>
        + <value>200</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.sort.list.size`
        +""""""""""""""""""""""""""""""""""""""

        +The initial size of list for in-memory sort.
        +
        + * Property value: Integer
        + * Default value: 100000
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.sort.list.size</name>
        + <value>100000</value>
        + </property>
        +
        +=========================
        +Group by Query Settings
        +=========================
        +
        +""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.groupby.multi-level-aggr`
        +""""""""""""""""""""""""""""""""""""""""""""
        +
        +A flag to enable the multi-level algorithm for distinct aggregation. If this value is set, 3-phase aggregation algorithm is used.
        +Otherwise, 2-phase aggregation algorithm is used.
        +
        + * Property value: Boolean
        + * Default value: true
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.groupby.multi-level-aggr</name>
        + <value>true</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.groupby.partition-volume-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""
        +
        +The aggregation is executed in two stages. When an aggregation query is executed,
        +this value indicates the output size of each task at the first stage, which determines the number of partitions to be shuffled between two stages.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 256
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.groupby.partition-volume-mb</name>
        + <value>256</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.groupby.task-volume-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""
        +
        +The aggregation operation is executed in two stages. When an aggregation query is executed, this value indicates the amount of input data processed by each task at the second stage.
        +As a result, it determines the degree of the parallel processing of the aggregation query.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.groupby.partition-volume-mb</name>
        — End diff –

        The title says 'task', but the example uses 'partition'.
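
        The mismatch is between the section title, `tajo.dist-query.groupby.task-volume-mb`, and the `<name>` element in that section's example. Below is a sketch of the example as the title and the listed default (64 MB) suggest it was meant to read. It is inferred from the quoted diff and is an assumption, not the committed fix:

        ```xml
        <!-- Assumed correction: name taken from the section title,
             value taken from the "Default value: 64" bullet -->
        <property>
          <name>tajo.dist-query.groupby.task-volume-mb</name>
          <value>64</value>
        </property>
        ```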

        githubbot ASF GitHub Bot added a comment -

        Github user eminency commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/844#discussion_r44100456

        — Diff: tajo-docs/src/main/sphinx/configuration/tajo-site-xml.rst —
        @@ -2,23 +2,455 @@
        The tajo-site.xml File
        **********************

        -To the ``core-site.xml`` file on every host in your cluster, you must add the following information:
        +You can add more configurations in the ``tajo-site.xml`` file. Note that you should replicate this file to the whole hosts in your cluster once you edited.
        +If you are looking for the configurations for the master and the worker, please refer to :doc:`tajo_master_configuration` and :doc:`worker_configuration`.
        +Also, catalog configurations are found here :doc:`catalog_configuration`.
        +
        +=========================
        +Join Query Settings
        +=========================
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.auto-broadcast`
        +""""""""""""""""""""""""""""""""""""""
        +
        +A flag to enable or disable the use of broadcast join.
        +
        + * Property value: Boolean
        + * Default value: true
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.join.auto-broadcast</name>
        + <value>true</value>
        + </property>
        +
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.broadcast.non-cross-join.threshold-kb`
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +A threshold for non-cross joins. When a non-cross join query is executed with the broadcast join, the whole size of broadcasted tables won't exceed this threshold.
        +
        + * Property value: Integer
        + * Unit: KB
        + * Default value: 5120
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.broadcast.non-cross-join.threshold-kb</name>
        + <value>5120</value>
        + </property>
        +
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.broadcast.cross-join.threshold-kb`
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +A threshold for cross joins. When a cross join query is executed, the whole size of broadcasted tables won't exceed this threshold.
        +
        + * Property value: Integer
        + * Unit: KB
        + * Default value: 1024
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.broadcast.cross-join.threshold-kb</name>
        + <value>1024</value>
        + </property>
        +
        +.. warning::
        + In Tajo, the broadcast join is only the way to perform cross joins. Since the cross join is a very expensive operation, this value need to be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.task-volume-mb`
        +""""""""""""""""""""""""""""""""""""""
        +
        +The repartition join is executed in two stages. When a join query is executed with the repartition join, this value indicates the amount of input data processed by each task at the second stage.
        +As a result, it determines the degree of the parallel processing of the join query.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.join.task-volume-mb</name>
        + <value>64</value>
        + </property>
        +
        +"""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.partition-volume-mb`
        +"""""""""""""""""""""""""""""""""""""""""""
        +
        +The repartition join is executed in two stages. When a join query is executed with the repartition join,
        +this value indicates the output size of each task at the first stage, which determines the number of partitions to be shuffled between two stages.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 128
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.join.partition-volume-mb</name>
        + <value>128</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.common.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform a join in a task.
        +If the input data is smaller than this value, join is performed with the in-memory hash join.
        +Otherwise, the sort-merge join is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.common.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that a threshold that is too large can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.inner.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform an inner join in a task.
        +If the input data is smaller than this value, the inner join is performed with the in-memory hash join.
        +Otherwise, the sort-merge join is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.inner.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that a threshold that is too large can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.outer.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform an outer join in a task.
        +If the input data is smaller than this value, the outer join is performed with the in-memory hash join.
        +Otherwise, the sort-merge join is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.outer.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that a threshold that is too large can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +"""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.hash-table.size`
        +"""""""""""""""""""""""""""""""""""""
        +
        +The initial size of hash table for in-memory hash join.
        +
        + * Property value: Integer
        + * Default value: 100000
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.hash-table.size</name>
        + <value>100000</value>
        + </property>

        ======================
        -System Config
        +Sort Query Settings
        ======================

        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.sort.task-volume-mb`
        +""""""""""""""""""""""""""""""""""""""
        +
        +The sort operation is executed in two stages. When a sort query is executed, this value indicates the amount of input data processed by each task at the second stage.
        +As a result, it determines the degree of the parallel processing of the sort query.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.sort.task-volume-mb</name>
        + <value>64</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.external-sort.buffer-mb`
        +""""""""""""""""""""""""""""""""""""""""
        +
        +A threshold to choose the sort algorithm. If the input data is larger than this threshold, the external sort algorithm is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 200
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.external-sort.buffer-mb</name>
        + <value>200</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.sort.list.size`
        +""""""""""""""""""""""""""""""""""""""

        +The initial size of list for in-memory sort.
        +
        + * Property value: Integer
        + * Default value: 100000
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.sort.list.size</name>
        + <value>100000</value>
        + </property>
        +
        +=========================
        +Group by Query Settings
        +=========================
        +
        +""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.groupby.multi-level-aggr`
        +""""""""""""""""""""""""""""""""""""""""""""
        +
        +A flag to enable the multi-level algorithm for distinct aggregation. If this value is set to true, the 3-phase aggregation algorithm is used.
        +Otherwise, the 2-phase aggregation algorithm is used.
        +
        + * Property value: Boolean
        + * Default value: true
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.groupby.multi-level-aggr</name>
        + <value>true</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.groupby.partition-volume-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""
        +
        +The aggregation is executed in two stages. When an aggregation query is executed,
        +this value indicates the output size of each task at the first stage, which determines the number of partitions to be shuffled between two stages.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 256
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.groupby.partition-volume-mb</name>
        + <value>256</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.groupby.task-volume-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""
        +
        +The aggregation operation is executed in two stages. When an aggregation query is executed, this value indicates the amount of input data processed by each task at the second stage.
        +As a result, it determines the degree of the parallel processing of the aggregation query.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.groupby.task-volume-mb</name>
        + <value>64</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.groupby.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform an aggregation in a task.
        +If the input data is smaller than this value, the aggregation is performed with the in-memory hash aggregation.
        +Otherwise, the sort-based aggregation is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.groupby.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that a threshold that is too large can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.aggregate.hash-table.size`
        +""""""""""""""""""""""""""""""""""""""""""
        +
        +The initial size of list for in-memory sort.
        — End diff –

        The description refers to a list size, but the property name suggests that it means the hash table size.
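
        One possible corrected wording is sketched below. It assumes the property sets the initial size of the hash table used for in-memory hash aggregation; that reading is inferred from the property name and from the analogous `tajo.executor.join.hash-table.size` entry in the same patch, and is not confirmed in this thread.

        ```rst
        The initial size of hash table for in-memory aggregation.
        ```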

        githubbot ASF GitHub Bot added a comment -

        Github user eminency commented on the pull request:

        https://github.com/apache/tajo/pull/844#issuecomment-154274813

        I left some comments.

        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/844#discussion_r44103172

        — Diff: tajo-docs/src/main/sphinx/configuration/tajo-site-xml.rst —
        @@ -2,23 +2,455 @@
        The tajo-site.xml File
        **********************

        -To the ``core-site.xml`` file on every host in your cluster, you must add the following information:
        +You can add more configurations in the ``tajo-site.xml`` file. Note that you should replicate this file to all hosts in your cluster once you have edited it.
        +If you are looking for the configurations for the master and the worker, please refer to :doc:`tajo_master_configuration` and :doc:`worker_configuration`.
        +Also, catalog configurations are found here :doc:`catalog_configuration`.
        +
        +=========================
        +Join Query Settings
        +=========================
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.auto-broadcast`
        +""""""""""""""""""""""""""""""""""""""
        +
        +A flag to enable or disable the use of broadcast join.
        +
        + * Property value: Boolean
        — End diff –

        Thank you for the good comment.

        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/844#discussion_r44103177

        — Diff: tajo-docs/src/main/sphinx/configuration/tajo-site-xml.rst —
        @@ -2,23 +2,455 @@
        The tajo-site.xml File
        **********************

        -To the ``core-site.xml`` file on every host in your cluster, you must add the following information:
        +You can add more configurations in the ``tajo-site.xml`` file. Note that you should replicate this file to all hosts in your cluster once you have edited it.
        +If you are looking for the configurations for the master and the worker, please refer to :doc:`tajo_master_configuration` and :doc:`worker_configuration`.
        +Also, catalog configurations are found here :doc:`catalog_configuration`.
        +
        +=========================
        +Join Query Settings
        +=========================
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.auto-broadcast`
        +""""""""""""""""""""""""""""""""""""""
        +
        +A flag to enable or disable the use of broadcast join.
        +
        + * Property value: Boolean
        + * Default value: true
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.join.auto-broadcast</name>
        + <value>true</value>
        + </property>
        +
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.broadcast.non-cross-join.threshold-kb`
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +A threshold for non-cross joins. When a non-cross join query is executed with the broadcast join, the total size of the broadcast tables will not exceed this threshold.
        +
        + * Property value: Integer
        + * Unit: KB
        + * Default value: 5120
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.broadcast.non-cross-join.threshold-kb</name>
        + <value>5120</value>
        + </property>
        +
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.broadcast.cross-join.threshold-kb`
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +A threshold for cross joins. When a cross join query is executed, the total size of the broadcast tables will not exceed this threshold.
        +
        + * Property value: Integer
        + * Unit: KB
        + * Default value: 1024
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.broadcast.cross-join.threshold-kb</name>
        + <value>1024</value>
        + </property>
        +
        +.. warning::
        + In Tajo, the broadcast join is the only way to perform cross joins. Since the cross join is a very expensive operation, this value needs to be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.task-volume-mb`
        +""""""""""""""""""""""""""""""""""""""
        +
        +The repartition join is executed in two stages. When a join query is executed with the repartition join, this value indicates the amount of input data processed by each task at the second stage.
        +As a result, it determines the degree of the parallel processing of the join query.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.join.task-volume-mb</name>
        + <value>64</value>
        + </property>
        +
        +"""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.partition-volume-mb`
        +"""""""""""""""""""""""""""""""""""""""""""
        +
        +The repartition join is executed in two stages. When a join query is executed with the repartition join,
        +this value indicates the output size of each task at the first stage, which determines the number of partitions to be shuffled between two stages.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 128
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.join.partition-volume-mb</name>
        + <value>128</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.common.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform a join in a task.
        +If the input data is smaller than this value, join is performed with the in-memory hash join.
        +Otherwise, the sort-merge join is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.common.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that a threshold that is too large can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.inner.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform an inner join in a task.
        +If the input data is smaller than this value, the inner join is performed with the in-memory hash join.
        +Otherwise, the sort-merge join is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.inner.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that a threshold that is too large can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.outer.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform an outer join in a task.
        +If the input data is smaller than this value, the outer join is performed with the in-memory hash join.
        +Otherwise, the sort-merge join is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.outer.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that too large threshold can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +"""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.hash-table.size`
        +"""""""""""""""""""""""""""""""""""""
        +
        +The initial size of hash table for in-memory hash join.
        +
        + * Property value: Integer
        + * Default value: 100000
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.hash-table.size</name>
        + <value>100000</value>
        + </property>

        ======================
        -System Config
        +Sort Query Settings
        ======================

        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.sort.task-volume-mb`
        +""""""""""""""""""""""""""""""""""""""
        +
        +The sort operation is executed in two stages. When a sort query is executed, this value indicates the amount of input data processed by each task at the second stage.
        +As a result, it determines the degree of the parallel processing of the sort query.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.sort.task-volume-mb</name>
        + <value>64</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.external-sort.buffer-mb`
        +""""""""""""""""""""""""""""""""""""""""
        +
        +A threshold to choose the sort algorithm. If the input data is larger than this threshold, the external sort algorithm is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 200
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.external-sort.buffer-mb</name>
        + <value>200</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.sort.list.size`
        +""""""""""""""""""""""""""""""""""""""

        +The initial size of list for in-memory sort.
        +
        + * Property value: Integer
        + * Default value: 100000
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.sort.list.size</name>
        + <value>100000</value>
        + </property>
        +
        +=========================
        +Group by Query Settings
        +=========================
        +
        +""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.groupby.multi-level-aggr`
        +""""""""""""""""""""""""""""""""""""""""""""
        +
        +A flag to enable the multi-level algorithm for distinct aggregation. If this value is set, 3-phase aggregation algorithm is used.
        +Otherwise, 2-phase aggregation algorithm is used.
        +
        + * Property value: Boolean
        + * Default value: true
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.groupby.multi-level-aggr</name>
        + <value>true</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.groupby.partition-volume-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""
        +
        +The aggregation is executed in two stages. When an aggregation query is executed,
        +this value indicates the output size of each task at the first stage, which determines the number of partitions to be shuffled between two stages.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 256
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.groupby.partition-volume-mb</name>
        + <value>256</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.groupby.task-volume-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""
        +
        +The aggregation operation is executed in two stages. When an aggregation query is executed, this value indicates the amount of input data processed by each task at the second stage.
        +As a result, it determines the degree of the parallel processing of the aggregation query.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.groupby.partition-volume-mb</name>
        — End diff –

        My mistake. Thanks.
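
        For context, the diff excerpt above stops at the example for `tajo.dist-query.groupby.task-volume-mb`, where the `<name>` element repeats `tajo.dist-query.groupby.partition-volume-mb` instead. The fragment below is only a sketch of what that example would presumably look like, assuming it is meant to match its heading and the default value of 64 listed just above it; it is not taken from the merged patch.

        ```xml
        <!-- Assumed correction: the property name matches the section heading. -->
        <property>
          <name>tajo.dist-query.groupby.task-volume-mb</name>
          <!-- 64 MB is the default stated in the quoted diff. -->
          <value>64</value>
        </property>
        ```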

        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on a diff in the pull request:

        https://github.com/apache/tajo/pull/844#discussion_r44103181

        — Diff: tajo-docs/src/main/sphinx/configuration/tajo-site-xml.rst —
        @@ -2,23 +2,455 @@
        The tajo-site.xml File
        **********************

        -To the ``core-site.xml`` file on every host in your cluster, you must add the following information:
        +You can add more configurations in the ``tajo-site.xml`` file. Note that you should replicate this file to the whole hosts in your cluster once you edited.
        +If you are looking for the configurations for the master and the worker, please refer to :doc:`tajo_master_configuration` and :doc:`worker_configuration`.
        +Also, catalog configurations are found here :doc:`catalog_configuration`.
        +
        +=========================
        +Join Query Settings
        +=========================
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.auto-broadcast`
        +""""""""""""""""""""""""""""""""""""""
        +
        +A flag to enable or disable the use of broadcast join.
        +
        + * Property value: Boolean
        + * Default value: true
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.join.auto-broadcast</name>
        + <value>true</value>
        + </property>
        +
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.broadcast.non-cross-join.threshold-kb`
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +A threshold for non-cross joins. When a non-cross join query is executed with the broadcast join, the whole size of broadcasted tables won't exceed this threshold.
        +
        + * Property value: Integer
        + * Unit: KB
        + * Default value: 5120
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.broadcast.non-cross-join.threshold-kb</name>
        + <value>5120</value>
        + </property>
        +
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.broadcast.cross-join.threshold-kb`
        +"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +A threshold for cross joins. When a cross join query is executed, the whole size of broadcasted tables won't exceed this threshold.
        +
        + * Property value: Integer
        + * Unit: KB
        + * Default value: 1024
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.broadcast.cross-join.threshold-kb</name>
        + <value>1024</value>
        + </property>
        +
        +.. warning::
        + In Tajo, the broadcast join is only the way to perform cross joins. Since the cross join is a very expensive operation, this value need to be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.task-volume-mb`
        +""""""""""""""""""""""""""""""""""""""
        +
        +The repartition join is executed in two stages. When a join query is executed with the repartition join, this value indicates the amount of input data processed by each task at the second stage.
        +As a result, it determines the degree of the parallel processing of the join query.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.join.task-volume-mb</name>
        + <value>64</value>
        + </property>
        +
        +"""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.join.partition-volume-mb`
        +"""""""""""""""""""""""""""""""""""""""""""
        +
        +The repartition join is executed in two stages. When a join query is executed with the repartition join,
        +this value indicates the output size of each task at the first stage, which determines the number of partitions to be shuffled between two stages.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 128
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.join.partition-volume-mb</name>
        + <value>128</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.common.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform a join in a task.
        +If the input data is smaller than this value, join is performed with the in-memory hash join.
        +Otherwise, the sort-merge join is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.common.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that too large threshold can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.inner.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform an inner join in a task.
        +If the input data is smaller than this value, the inner join is performed with the in-memory hash join.
        +Otherwise, the sort-merge join is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.inner.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that too large threshold can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.outer.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform an outer join in a task.
        +If the input data is smaller than this value, the outer join is performed with the in-memory hash join.
        +Otherwise, the sort-merge join is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.outer.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that too large threshold can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +"""""""""""""""""""""""""""""""""""""
        +`tajo.executor.join.hash-table.size`
        +"""""""""""""""""""""""""""""""""""""
        +
        +The initial size of hash table for in-memory hash join.
        +
        + * Property value: Integer
        + * Default value: 100000
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.join.hash-table.size</name>
        + <value>100000</value>
        + </property>

        ======================
        -System Config
        +Sort Query Settings
        ======================

        +""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.sort.task-volume-mb`
        +""""""""""""""""""""""""""""""""""""""
        +
        +The sort operation is executed in two stages. When a sort query is executed, this value indicates the amount of input data processed by each task at the second stage.
        +As a result, it determines the degree of the parallel processing of the sort query.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.sort.task-volume-mb</name>
        + <value>64</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.external-sort.buffer-mb`
        +""""""""""""""""""""""""""""""""""""""""
        +
        +A threshold to choose the sort algorithm. If the input data is larger than this threshold, the external sort algorithm is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 200
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.external-sort.buffer-mb</name>
        + <value>200</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.sort.list.size`
        +""""""""""""""""""""""""""""""""""""""

        +The initial size of list for in-memory sort.
        +
        + * Property value: Integer
        + * Default value: 100000
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.sort.list.size</name>
        + <value>100000</value>
        + </property>
        +
        +=========================
        +Group by Query Settings
        +=========================
        +
        +""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.groupby.multi-level-aggr`
        +""""""""""""""""""""""""""""""""""""""""""""
        +
        +A flag to enable the multi-level algorithm for distinct aggregation. If this value is set, 3-phase aggregation algorithm is used.
        +Otherwise, 2-phase aggregation algorithm is used.
        +
        + * Property value: Boolean
        + * Default value: true
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.groupby.multi-level-aggr</name>
        + <value>true</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.groupby.partition-volume-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""
        +
        +The aggregation is executed in two stages. When an aggregation query is executed,
        +this value indicates the output size of each task at the first stage, which determines the number of partitions to be shuffled between two stages.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 256
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.groupby.partition-volume-mb</name>
        + <value>256</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.dist-query.groupby.task-volume-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""
        +
        +The aggregation operation is executed in two stages. When an aggregation query is executed, this value indicates the amount of input data processed by each task at the second stage.
        +As a result, it determines the degree of the parallel processing of the aggregation query.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.dist-query.groupby.partition-volume-mb</name>
        + <value>64</value>
        + </property>
        +
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.groupby.in-memory-hash-threshold-mb`
        +""""""""""""""""""""""""""""""""""""""""""""""""""""""""
        +
        +This value provides the criterion to decide the algorithm to perform an aggregation in a task.
        +If the input data is smaller than this value, the aggregation is performed with the in-memory hash aggregation.
        +Otherwise, the sort-based aggregation is used.
        +
        + * Property value: Integer
        + * Unit: MB
        + * Default value: 64
        + * Example
        +
        +.. code-block:: xml
        +
        + <property>
        + <name>tajo.executor.groupby.in-memory-hash-threshold-mb</name>
        + <value>64</value>
        + </property>
        +
        +.. warning::
        + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap,
        + its actual size is usually much larger than the configured value, which means that too large threshold can cause unexpected OutOfMemory errors.
        + This value should be tuned carefully.
        +
        +""""""""""""""""""""""""""""""""""""""""""
        +`tajo.executor.aggregate.hash-table.size`
        +""""""""""""""""""""""""""""""""""""""""""
        +
        +The initial size of list for in-memory sort.
        — End diff –

        My mistake. Thanks.
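
        For context, the diff excerpt above stops at the description of `tajo.executor.aggregate.hash-table.size`, which repeats the sentence used for `tajo.executor.sort.list.size` ("The initial size of list for in-memory sort."). By analogy with `tajo.executor.join.hash-table.size`, the property presumably sets the initial size of the hash table for in-memory aggregation. The fragment below is a hypothetical sketch of how it might be set in ``tajo-site.xml``; the value is illustrative only, since the default is not shown in this excerpt.

        ```xml
        <!-- Hypothetical example: the value below is a placeholder, not the documented default. -->
        <property>
          <name>tajo.executor.aggregate.hash-table.size</name>
          <value>100000</value>
        </property>
        ```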

code-block:: xml + + <property> + <name>tajo.dist-query.sort.task-volume-mb</name> + <value>64</value> + </property> + +"""""""""""""""""""""""""""""""""""""""" +`tajo.executor.external-sort.buffer-mb` +"""""""""""""""""""""""""""""""""""""""" + +A threshold to choose the sort algorithm. If the input data is larger than this threshold, the external sort algorithm is used. + + * Property value: Integer + * Unit: MB + * Default value: 200 + * Example + +.. code-block:: xml + + <property> + <name>tajo.executor.external-sort.buffer-mb</name> + <value>200</value> + </property> + +"""""""""""""""""""""""""""""""""""""" +`tajo.executor.sort.list.size` +"""""""""""""""""""""""""""""""""""""" +The initial size of list for in-memory sort. + + * Property value: Integer + * Default value: 100000 + * Example + +.. code-block:: xml + + <property> + <name>tajo.executor.sort.list.size</name> + <value>100000</value> + </property> + +========================= +Group by Query Settings +========================= + +"""""""""""""""""""""""""""""""""""""""""""" +`tajo.dist-query.groupby.multi-level-aggr` +"""""""""""""""""""""""""""""""""""""""""""" + +A flag to enable the multi-level algorithm for distinct aggregation. If this value is set, 3-phase aggregation algorithm is used. +Otherwise, 2-phase aggregation algorithm is used. + + * Property value: Boolean + * Default value: true + * Example + +.. code-block:: xml + + <property> + <name>tajo.dist-query.groupby.multi-level-aggr</name> + <value>true</value> + </property> + +"""""""""""""""""""""""""""""""""""""""""""""" +`tajo.dist-query.groupby.partition-volume-mb` +"""""""""""""""""""""""""""""""""""""""""""""" + +The aggregation is executed in two stages. When an aggregation query is executed, +this value indicates the output size of each task at the first stage, which determines the number of partitions to be shuffled between two stages. + + * Property value: Integer + * Unit: MB + * Default value: 256 + * Example + +.. 
code-block:: xml + + <property> + <name>tajo.dist-query.groupby.partition-volume-mb</name> + <value>256</value> + </property> + +"""""""""""""""""""""""""""""""""""""""""""""" +`tajo.dist-query.groupby.task-volume-mb` +"""""""""""""""""""""""""""""""""""""""""""""" + +The aggregation operation is executed in two stages. When an aggregation query is executed, this value indicates the amount of input data processed by each task at the second stage. +As a result, it determines the degree of the parallel processing of the aggregation query. + + * Property value: Integer + * Unit: MB + * Default value: 64 + * Example + +.. code-block:: xml + + <property> + <name>tajo.dist-query.groupby.partition-volume-mb</name> + <value>64</value> + </property> + +"""""""""""""""""""""""""""""""""""""""""""""""""""""""" +`tajo.executor.groupby.in-memory-hash-threshold-mb` +"""""""""""""""""""""""""""""""""""""""""""""""""""""""" + +This value provides the criterion to decide the algorithm to perform an aggregation in a task. +If the input data is smaller than this value, the aggregation is performed with the in-memory hash aggregation. +Otherwise, the sort-based aggregation is used. + + * Property value: Integer + * Unit: MB + * Default value: 64 + * Example + +.. code-block:: xml + + <property> + <name>tajo.executor.groupby.in-memory-hash-threshold-mb</name> + <value>64</value> + </property> + +.. warning:: + This value is the size of the input stored on file systems. So, when the input data is loaded into JVM heap, + its actual size is usually much larger than the configured value, which means that too large threshold can cause unexpected OutOfMemory errors. + This value should be tuned carefully. + +"""""""""""""""""""""""""""""""""""""""""" +`tajo.executor.aggregate.hash-table.size` +"""""""""""""""""""""""""""""""""""""""""" + +The initial size of list for in-memory sort. — End diff – My mistake. Thanks.
        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/844#issuecomment-154281531

        Thanks for your comments. I have addressed them.

        githubbot ASF GitHub Bot added a comment -

        Github user eminency commented on the pull request:

        https://github.com/apache/tajo/pull/844#issuecomment-154697044

        Thanks, it looks good. +1

        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/844#issuecomment-154932737

        Could you check the unit test?
        ```
        Failed tests:
        TestTajoCli.testHelpSessionVars:410->assertOutputResult:103->assertOutputResult:107->assertOutputResult:125 expected:<...able size
        \set SORT_[HASH_TABLE_SIZE [int value] - Sort hash table size
        \set JOIN_HASH_TABLE_SIZE [int value] - Join hash table size
        \set INDEX_ENABLED [true or false] - index scan enabled
        \set INDEX_SELECTIVITY_THRESHOLD [real value] - the selectivity threshold for index scan
        \set PARTITION_NO_RESULT_OVERWRITE_ENABLED [true or false] - If T]rue, a partitioned t...> but was:<...able size
        \set SORT_[LIST_SIZE [int value] - List size for in-memory sort
        \set JOIN_HASH_TABLE_SIZE [int value] - Join hash table size
        \set INDEX_ENABLED [true or false] - index scan enabled
        \set INDEX_SELECTIVITY_THRESHOLD [real value] - the selectivity threshold for index scan
        \set PARTITION_NO_RESULT_OVERWRITE_ENABLED [true or false] - If t]rue, a partitioned t...>
        Tests run: 1593, Failures: 1, Errors: 0, Skipped: 0
        ```
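
        The bracketed regions in the assertion message mark where the expected and actual help text diverge: the expected result file still lists `SORT_HASH_TABLE_SIZE`, while the CLI now prints `SORT_LIST_SIZE`, and the capitalization of "If true" differs. Assuming `SORT_LIST_SIZE` is the session-variable counterpart of the `tajo.executor.sort.list.size` property documented in this pull request (an assumption; the thread does not state the mapping), the matching `tajo-site.xml` entry with the documented default would look roughly like this:

        ```xml
        <!-- Sketch only: property name and default value are taken from the
             documentation diff in this pull request; the link to the
             SORT_LIST_SIZE session variable is assumed, not confirmed here. -->
        <property>
          <name>tajo.executor.sort.list.size</name>
          <value>100000</value>
        </property>
        ```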

        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/844#issuecomment-155017340

        @eminency and @hyunsik, thank you guys for your review!
        I fixed the test failure.

        githubbot ASF GitHub Bot added a comment -

        Github user hyunsik commented on the pull request:

        https://github.com/apache/tajo/pull/844#issuecomment-155031450

        +1
        The patch looks good to me.

        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/844

        jihoonson Jihoon Son added a comment -

        Committed to master and 0.11.1

        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-CODEGEN-build #588 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/588/)
        TAJO-1963: Add more configuration descriptions to document. (jihoonson: rev 254c923944a4859ca1f5180c122420c750c8154c)

        • tajo-docs/src/main/sphinx/configuration/tajo-site-xml.rst
        • tajo-common/src/main/java/org/apache/tajo/SessionVars.java
        • CHANGES
        • tajo-core-tests/src/test/resources/results/TestTajoCli/testHelpSessionVars.result
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestSortQuery.java
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/physical/TestSortExec.java
        • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
        • tajo-docs/src/main/sphinx/time_zone.rst
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #970 (See https://builds.apache.org/job/Tajo-master-build/970/)
        TAJO-1963: Add more configuration descriptions to document. (jihoonson: rev 254c923944a4859ca1f5180c122420c750c8154c)

        • CHANGES
        • tajo-core/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java
        • tajo-core-tests/src/test/resources/results/TestTajoCli/testHelpSessionVars.result
        • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
        • tajo-common/src/main/java/org/apache/tajo/SessionVars.java
        • tajo-docs/src/main/sphinx/time_zone.rst
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/planner/physical/TestSortExec.java
        • tajo-docs/src/main/sphinx/configuration/tajo-site-xml.rst
        • tajo-core-tests/src/test/java/org/apache/tajo/engine/query/TestSortQuery.java

          People

          • Assignee:
            jihoonson Jihoon Son
            Reporter:
            jihoonson Jihoon Son
          • Votes:
            0
            Watchers:
            3
