CarbonData / CARBONDATA-4042

Insert into select and CTAS launch fewer tasks (task count limited to the number of nodes in the cluster) even when the target table is no_sort

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • None
    • 2.1.0
    • None

    Description

      Issue:

      At present, when we run insert into table select from or create table as select from, we launch a single task per node. In contrast, a simple select * from table query launches as many tasks as there are carbondata files (CARBON_TASK_DISTRIBUTION defaults to CARBON_TASK_DISTRIBUTION_BLOCK).

      This slows down the load performance of the insert into select and CTAS cases.
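
To make the difference concrete, here is a toy model in Python. It is not CarbonData code: the file-to-node layout and the helper names are hypothetical, and it only counts how many tasks each of the two strategies described above would produce.

```python
from collections import defaultdict

# Hypothetical layout: each carbondata file mapped to the node hosting it.
files_on_nodes = {
    "part-0.carbondata": "node1",
    "part-1.carbondata": "node1",
    "part-2.carbondata": "node1",
    "part-3.carbondata": "node2",
    "part-4.carbondata": "node2",
    "part-5.carbondata": "node2",
}


def tasks_per_node(layout):
    """One task per node: each task reads every file located on its node."""
    grouped = defaultdict(list)
    for file_name, node in layout.items():
        grouped[node].append(file_name)
    return [sorted(files) for _, files in sorted(grouped.items())]


def tasks_per_block(layout):
    """One task per file, mirroring block-level task distribution."""
    return [[file_name] for file_name in sorted(layout)]


node_tasks = tasks_per_node(files_on_nodes)
block_tasks = tasks_per_block(files_on_nodes)
print(len(node_tasks), len(block_tasks))
```

In this layout the per-node strategy yields 2 tasks while the per-file strategy yields 6, so the insert path has a third of the parallelism of the plain select.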

      Refer to the community discussion regarding task launch.

       

      Suggestion:

      For the insert into select and CTAS cases, launch the same number of tasks as a select query would when the target table is no_sort.
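
The suggested rule can be sketched as a small decision function. This is only an illustration under stated assumptions: `planned_task_count` and its parameters are hypothetical names, and keeping one task per node for every other sort scope is an assumption, since the issue only discusses the no-sort case.

```python
def planned_task_count(sort_scope, num_nodes, num_files):
    """Toy rule for the number of tasks launched by insert into select / CTAS.

    Assumption: only a no_sort target switches to one task per carbondata
    file; any other sort scope keeps the existing one-task-per-node behaviour.
    """
    if sort_scope.strip().lower() == "no_sort":
        return num_files
    return num_nodes


# Example: a 3-node cluster reading a source table with 12 carbondata files.
no_sort_tasks = planned_task_count("NO_SORT", num_nodes=3, num_files=12)
local_sort_tasks = planned_task_count("LOCAL_SORT", num_nodes=3, num_files=12)
print(no_sort_tasks, local_sort_tasks)
```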

          People

            Assignee: Unassigned
            Reporter: Venugopal Reddy K (VenuReddy)

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 1h 20m