TAJO-1872: Increase the minimum split size and add a classpath to hadoop tools

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: conf and scripts
    • Labels:
      None

      Description

      The current minimum split size is 1 byte. This causes incorrect behaviour when the underlying file system reports a wrong block size.
      For example, the s3a file system sometimes returns a block size of 0, so Tajo creates one split per byte, as the following output shows.

      default> \d customer
      
      table name: default.customer
      table uri: s3a://id:key@tajo-data-sa-east-1/tpch-1g/customer/customer.tbl
      store type: TEXT
      number of rows: unknown
      volume: 24.3 MB
      
      ...
      2015-09-21 12:42:37,806 INFO org.apache.tajo.storage.FileTablespace: Total # of splits: 24346144
      ...
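
The arithmetic behind that split count can be sketched as follows. This is a minimal illustration, not Tajo's actual code: it assumes the split size is the larger of the minimum split size and the reported block size, and that the file is 24346144 bytes long (inferred from the one-split-per-byte count in the log). The class name, method names, and the 1 MiB minimum are hypothetical; the issue does not state the new minimum value.

```java
// Minimal sketch: how a zero block size combined with a 1-byte minimum
// split size yields one split per byte. All names here are hypothetical.
class SplitCountSketch {

    // Assumed rule: a split is never smaller than minSplitSize and
    // otherwise follows the block size reported by the file system.
    static long splitSize(long blockSize, long minSplitSize) {
        return Math.max(minSplitSize, blockSize);
    }

    // Number of splits needed to cover fileLength bytes (ceiling division).
    static long countSplits(long fileLength, long blockSize, long minSplitSize) {
        long size = splitSize(blockSize, minSplitSize);
        return (fileLength + size - 1) / size;
    }

    public static void main(String[] args) {
        long fileLength = 24_346_144L; // assumed file size, taken from the split count in the log
        long reportedBlockSize = 0L;   // the bad value s3a sometimes returns

        // With a 1-byte minimum, every byte becomes its own split.
        System.out.println(countSplits(fileLength, reportedBlockSize, 1L));

        // With a larger minimum (1 MiB chosen only for illustration),
        // the zero block size no longer matters.
        long hypotheticalMin = 1024L * 1024L;
        System.out.println(countSplits(fileLength, reportedBlockSize, hypotheticalMin));
    }
}
```

Under these assumptions the first call reproduces the 24346144 splits seen in the log, while the second yields 24, which is why raising the minimum guards against a bad block size.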
      

        Activity

        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-0.11.0-build #61 (See https://builds.apache.org/job/Tajo-0.11.0-build/61/)
        TAJO-1872: Increase the minimum split size and add a classpath to hadoop tools. (jihoonson: rev 8f26208710af2c045d4eff51ebd468325134bbbd)

        • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
        • tajo-dist/src/main/bin/tajo
        • CHANGES
        hudson Hudson added a comment -

        SUCCESS: Integrated in Tajo-master-build #886 (See https://builds.apache.org/job/Tajo-master-build/886/)
        TAJO-1872: Increase the minimum split size and add a classpath to hadoop tools. (jihoonson: rev 21b4797e2eabb08ce99af7e206a80f2641b0502a)

        • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
        • CHANGES
        • tajo-dist/src/main/bin/tajo
        jihoonson Jihoon Son added a comment -

        Committed to master and 0.11.

        hudson Hudson added a comment -

        FAILURE: Integrated in Tajo-master-CODEGEN-build #525 (See https://builds.apache.org/job/Tajo-master-CODEGEN-build/525/)
        TAJO-1872: Increase the minimum split size and add a classpath to hadoop tools. (jihoonson: rev 21b4797e2eabb08ce99af7e206a80f2641b0502a)

        • tajo-common/src/main/java/org/apache/tajo/conf/TajoConf.java
        • CHANGES
        • tajo-dist/src/main/bin/tajo
        githubbot ASF GitHub Bot added a comment -

        Github user asfgit closed the pull request at:

        https://github.com/apache/tajo/pull/773

        githubbot ASF GitHub Bot added a comment -

        Github user jihoonson commented on the pull request:

        https://github.com/apache/tajo/pull/773#issuecomment-141883788

        Thanks for the quick review.

        githubbot ASF GitHub Bot added a comment -

        Github user jinossy commented on the pull request:

        https://github.com/apache/tajo/pull/773#issuecomment-141881151

        +1 LGTM

        githubbot ASF GitHub Bot added a comment -

        GitHub user jihoonson opened a pull request:

        https://github.com/apache/tajo/pull/773

        TAJO-1872: Increase the minimum split size and add a classpath to hadoop tools

        You can merge this pull request into a Git repository by running:

        $ git pull https://github.com/jihoonson/tajo-2 TAJO-1872

        Alternatively you can review and apply these changes as the patch at:

        https://github.com/apache/tajo/pull/773.patch

        To close this pull request, make a commit to your master/trunk branch
        with (at least) the following in the commit message:

        This closes #773


        commit ca735d69fc23ef3f8132d5fedd2aa718c1f46dc6
        Author: Jihoon Son <jihoonson@apache.org>
        Date: 2015-09-21T05:03:05Z

        TAJO-1872



          People

          • Assignee: jihoonson Jihoon Son
          • Reporter: jihoonson Jihoon Son