Tajo / TAJO-1872

Increase the minimum split size and add a classpath to hadoop tools

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: conf and scripts
    • Labels:
      None

      Description

      The current minimum split size is 1 byte. This can cause wrong behaviour when the underlying file system reports an incorrect block size.
      For example, the s3a file system sometimes returns 0 as the block size, so Tajo creates one split per byte, as shown below.

      default> \d customer
      
      table name: default.customer
      table uri: s3a://id:key@tajo-data-sa-east-1/tpch-1g/customer/customer.tbl
      store type: TEXT
      number of rows: unknown
      volume: 24.3 MB
      
      ...
      2015-09-21 12:42:37,806 INFO org.apache.tajo.storage.FileTablespace: Total # of splits: 24346144
      ...
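
The arithmetic behind that split count can be sketched as follows. This is a minimal illustration, not Tajo's actual code: it assumes the split size is clamped the way Hadoop's FileInputFormat does it, max(minSize, min(maxSize, blockSize)), and that the file is 24346144 bytes long, which is inferred from the split count in the log above. The function names and the 128 MiB minimum are hypothetical; the issue does not state the new minimum value.

```python
def compute_split_size(block_size: int, min_size: int, max_size: int) -> int:
    """Clamp the reported block size into [min_size, max_size].

    Assumption: this mirrors the rule used by Hadoop's FileInputFormat;
    Tajo's own computation may differ.
    """
    return max(min_size, min(max_size, block_size))


def count_splits(file_length: int, split_size: int) -> int:
    """Number of splits needed to cover the file (ceiling division)."""
    return -(-file_length // split_size)


FILE_LENGTH = 24_346_144   # bytes, inferred from the log line above
MAX_SIZE = 2**63 - 1       # effectively unbounded
REPORTED_BLOCK_SIZE = 0    # what s3a sometimes returns, per the description

# Minimum split size of 1 byte: the split size collapses to 1 byte.
old_split = compute_split_size(REPORTED_BLOCK_SIZE, 1, MAX_SIZE)
print(old_split, count_splits(FILE_LENGTH, old_split))

# Hypothetical larger minimum (128 MiB): the bogus block size is ignored.
new_split = compute_split_size(REPORTED_BLOCK_SIZE, 128 * 1024 * 1024, MAX_SIZE)
print(new_split, count_splits(FILE_LENGTH, new_split))
```

With a 1-byte minimum, a reported block size of 0 yields one split per byte. Any reasonably large minimum bounds the number of splits regardless of what the file system reports.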
      

            People

            • Assignee: Jihoon Son (jihoonson)
            • Reporter: Jihoon Son (jihoonson)
            • Votes: 0
            • Watchers: 3
