Tajo
  1. Tajo
  2. TAJO-287

Improve Fragment to be more generic

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.8.0
    • Component/s: physical operator
    • Labels:
      None

      Description

      The current Fragment is only for a file. This patch improves Fragment to be more generic.

      First of all, I've changed Fragment to an interface and the original Fragment to FileFragment respectively. FragmentProto is changed to contain a table name and a bytestring which contains an storage-dependent contents. Then, the added FragmentConvertor transforms FragmentProto to a specified Fragment instance. It would be very useful to represent various fragment types like a row range of Hbase and database tables.

      1. TAJO-287_3.patch
        222 kB
        Hyunsik Choi
      2. TAJO-287_2.patch
        222 kB
        Hyunsik Choi
      3. TAJO-287.patch
        223 kB
        Hyunsik Choi

        Activity

        Hide
        Hudson added a comment -

        SUCCESS: Integrated in Tajo-trunk-postcommit #541 (See https://builds.apache.org/job/Tajo-trunk-postcommit/541/)
        TAJO-287: Improve Fragment to be more generic. (hyunsik) (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=3c22d3eba437e9130ee74de8f935c7fc330b1217)

        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/v2/CSVFileScanner.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestHashJoinExec.java
        • tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/rcfile/TestRCFile.java
        • tajo-catalog/tajo-catalog-common/src/main/proto/CatalogProtos.proto
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/fragment/FileFragment.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/worker/TaskAttemptContext.java
        • tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/v2/TestStorages.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/RawFile.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/RowFile.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestFullOuterMergeJoinExec.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/trevni/TrevniAppender.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestRightOuterHashJoinExec.java
        • tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/TestCompressionStorages.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/storage/TestFragment.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryUnit.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/query/QueryUnitRequestImpl.java
        • tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/v2/TestCSVScanner.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestPhysicalPlanner.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/fragment/Fragment.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestSortExec.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/fragment/FragmentConvertor.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestLeftOuterNLJoinExec.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/Storage.java
        • tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/json/TableMetaAdapter.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/query/QueryUnitRequest.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/worker/TestRangeRetrieverHandler.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/storage/TestRowFile.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/Fragment.java
        • tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/index/TestSingleCSVFileBSTIndex.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/BSTIndexScanExec.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/AbstractStorageManager.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/client/ResultSetImpl.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/storage/TestFileFragment.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/StorageManager.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestBSTIndexExec.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestHashSemiJoinExec.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestFullOuterHashJoinExec.java
        • tajo-core/tajo-core-storage/src/main/resources/storage-default.xml
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/v2/StorageManagerV2.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestMergeJoinExec.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestRightOuterMergeJoinExec.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/StorageManagerFactory.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/SubQuerySucceeEvent.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestLeftOuterHashJoinExec.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/MergeScanner.java
        • tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/index/TestBSTIndex.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java
        • CHANGES.txt
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/FileAppender.java
        • tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/v2/TestCSVCompression.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/worker/Task.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/v2/RCFileScanner.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/util/IndexUtil.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestNLJoinExec.java
        • tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/TestStorageManager.java
        • tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/TestStorages.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/FileScanner.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestExternalSortExec.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/v2/FileScannerV2.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestBNLJoinExec.java
        • tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestHashAntiJoinExec.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/trevni/TrevniScanner.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/SeqScanExec.java
        • tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/TaskSchedulerImpl.java
        • tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/TestMergeScanner.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/rcfile/RCFileWrapper.java
        • tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/CSVFile.java
        • tajo-core/tajo-core-storage/src/test/resources/storage-default.xml
        Show
        Hudson added a comment - SUCCESS: Integrated in Tajo-trunk-postcommit #541 (See https://builds.apache.org/job/Tajo-trunk-postcommit/541/ ) TAJO-287 : Improve Fragment to be more generic. (hyunsik) (hyunsik: https://git-wip-us.apache.org/repos/asf?p=incubator-tajo.git&a=commit&h=3c22d3eba437e9130ee74de8f935c7fc330b1217 ) tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/v2/CSVFileScanner.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestHashJoinExec.java tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/rcfile/TestRCFile.java tajo-catalog/tajo-catalog-common/src/main/proto/CatalogProtos.proto tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/fragment/FileFragment.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/worker/TaskAttemptContext.java tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/v2/TestStorages.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/RawFile.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/RowFile.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestFullOuterMergeJoinExec.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/trevni/TrevniAppender.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestRightOuterHashJoinExec.java tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/TestCompressionStorages.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/storage/TestFragment.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/QueryUnit.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/query/QueryUnitRequestImpl.java tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/v2/TestCSVScanner.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestPhysicalPlanner.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/fragment/Fragment.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestSortExec.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/fragment/FragmentConvertor.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestLeftOuterNLJoinExec.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/Storage.java tajo-catalog/tajo-catalog-common/src/main/java/org/apache/tajo/catalog/json/TableMetaAdapter.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/query/QueryUnitRequest.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/worker/TestRangeRetrieverHandler.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/storage/TestRowFile.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/Fragment.java tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/index/TestSingleCSVFileBSTIndex.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/BSTIndexScanExec.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/AbstractStorageManager.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/client/ResultSetImpl.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/storage/TestFileFragment.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/StorageManager.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestBSTIndexExec.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestHashSemiJoinExec.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestFullOuterHashJoinExec.java tajo-core/tajo-core-storage/src/main/resources/storage-default.xml tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/v2/StorageManagerV2.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestMergeJoinExec.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestRightOuterMergeJoinExec.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/PhysicalPlannerImpl.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/StorageManagerFactory.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/event/SubQuerySucceeEvent.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestLeftOuterHashJoinExec.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/MergeScanner.java tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/index/TestBSTIndex.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/SubQuery.java CHANGES.txt tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/FileAppender.java tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/v2/TestCSVCompression.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/worker/Task.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/v2/RCFileScanner.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/util/IndexUtil.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestNLJoinExec.java tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/TestStorageManager.java tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/TestStorages.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/ExternalSortExec.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/FileScanner.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestExternalSortExec.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/v2/FileScannerV2.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestBNLJoinExec.java tajo-core/tajo-core-backend/src/test/java/org/apache/tajo/engine/planner/physical/TestHashAntiJoinExec.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/trevni/TrevniScanner.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/querymaster/Repartitioner.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/engine/planner/physical/SeqScanExec.java tajo-core/tajo-core-backend/src/main/java/org/apache/tajo/master/TaskSchedulerImpl.java tajo-core/tajo-core-storage/src/test/java/org/apache/tajo/storage/TestMergeScanner.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/rcfile/RCFileWrapper.java tajo-core/tajo-core-storage/src/main/java/org/apache/tajo/storage/CSVFile.java tajo-core/tajo-core-storage/src/test/resources/storage-default.xml
        Hide
        Hyunsik Choi added a comment -

        Thanks for the review. I've just committed it to master.

        Show
        Hyunsik Choi added a comment - Thanks for the review. I've just committed it to master.
        Hide
        Jihoon Son added a comment -

        +1.
        This patch looks good to me.
        It also passes 'mvn clean verify'.

        Show
        Jihoon Son added a comment - +1. This patch looks good to me. It also passes 'mvn clean verify'.
        Hide
        Jihoon Son added a comment -

        I'll look at the patch this afternoon.

        Show
        Jihoon Son added a comment - I'll look at the patch this afternoon.
        Hide
        Hyunsik Choi added a comment -

        I've uploaded the second patch that reflects Jihoon's suggestion.

        Show
        Hyunsik Choi added a comment - I've uploaded the second patch that reflects Jihoon's suggestion.
        Hide
        Hyunsik Choi added a comment -

        I will handle the TODO that I missed.

        Because getStartOffset and getLength will be called in file scanners, your suggestion is reasonable. I expected that the generic start key and end key may be used in QueryMaster without the fragment specific code. However, I can't imagine any usage cases now. Now, I'm going to accept your suggestion.

        Thank you for the nice review.

        Show
        Hyunsik Choi added a comment - I will handle the TODO that I missed. Because getStartOffset and getLength will be called in file scanners, your suggestion is reasonable. I expected that the generic start key and end key may be used in QueryMaster without the fragment specific code. However, I can't imagine any usage cases now. Now, I'm going to accept your suggestion. Thank you for the nice review.
        Hide
        Jihoon Son added a comment - - edited

        ++1 for this issue. It has a great extensibility.
        I have a couple of things that we need to discuss.

        • FileFragment has getStartKey() and getEndKey() which are inherited from Fragment. These functions return a start offset and a length, respectively. But, their names are not intuitive to suppose what their values mean. How about add getStartOffset() and getLength() function to only FileFragment?
        • In Task, there are some commented out codes and you left a comment as "TODO - to be fixed". What does the comment mean?
        Show
        Jihoon Son added a comment - - edited ++1 for this issue. It has a great extensibility. I have a couple of things that we need to discuss. FileFragment has getStartKey() and getEndKey() which are inherited from Fragment. These functions return a start offset and a length, respectively. But, their names are not intuitive to suppose what their values mean. How about add getStartOffset() and getLength() function to only FileFragment? In Task, there are some commented out codes and you left a comment as "TODO - to be fixed". What does the comment mean?

          People

          • Assignee:
            Hyunsik Choi
            Reporter:
            Hyunsik Choi
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development