Tajo (Retired) / TAJO-1872

Increase the minimum split size and add a classpath to hadoop tools

Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.11.0
    • Component/s: conf and scripts
    • Labels: None

    Description

      The current minimum split size is 1 byte. This can cause incorrect behavior when the underlying file system reports a wrong block size.
      For example, the s3a file system sometimes returns 0 as the block size, so Tajo creates one split per byte, as follows.

      default> \d customer
      
      table name: default.customer
      table uri: s3a://id:key@tajo-data-sa-east-1/tpch-1g/customer/customer.tbl
      store type: TEXT
      number of rows: unknown
      volume: 24.3 MB
      
      ...
      2015-09-21 12:42:37,806 INFO org.apache.tajo.storage.FileTablespace: Total # of splits: 24346144
      ...
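
The split count in the log above matches the table volume in bytes, which is consistent with one split per byte. A minimal sketch of how that can happen, assuming Tajo sizes splits with the same formula as Hadoop's `FileInputFormat.computeSplitSize` (an assumption; the actual code in `FileTablespace` may differ). The class name, the `countSplits` helper, the file length inferred from the log, and the 128 MiB replacement minimum are all hypothetical and are not taken from the issue.

```java
public class SplitSizeSketch {

    // Same shape as Hadoop's FileInputFormat.computeSplitSize:
    // the minimum split size is the only lower bound on the result.
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Simplified split count (ceiling division). Hadoop additionally applies
    // a slop factor to the last split, which is ignored here.
    static long countSplits(long fileLength, long splitSize) {
        return (fileLength + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long fileLength = 24_346_144L;   // assumed: inferred from the split count in the log
        long reportedBlockSize = 0L;     // what s3a sometimes returns, per the description
        long maxSize = Long.MAX_VALUE;   // assumed: no upper bound configured

        long oldMinSize = 1L;                    // current minimum from the description
        long newMinSize = 128L * 1024 * 1024;    // hypothetical larger minimum

        long oldSplitSize = computeSplitSize(reportedBlockSize, oldMinSize, maxSize);
        long newSplitSize = computeSplitSize(reportedBlockSize, newMinSize, maxSize);

        System.out.println("min=1 byte:  " + countSplits(fileLength, oldSplitSize) + " splits");
        System.out.println("min=128 MiB: " + countSplits(fileLength, newSplitSize) + " splits");
    }
}
```

Under these assumptions, a reported block size of 0 makes the minimum split size the effective split size, so raising the minimum bounds the number of splits even when the file system misreports its block size.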
      

          People

            Assignee: Jihoon Son (jihoonson)
            Reporter: Jihoon Son (jihoonson)