  1. Sqoop
  2. SQOOP-638

Add an optional, simple and extensible validation framework for sqoop

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.2
    • Fix Version/s: 1.4.3
    • Component/s: None
    • Labels: None

      Description

      This patch adds an extensible validation framework to Sqoop, enabled through an optional CLI option: --validate
      The framework is built on three basic interfaces:

      ValidationThreshold - Determines whether the error margin between the source and the target is acceptable: absolute, percentage tolerant, etc.
      The default implementation is AbsoluteValidationThreshold, which requires the row counts of the source and the target to be identical.

      ValidationFailureHandler - Responsible for handling failures: log an error/warning, abort, etc. Default implementation logs a warning message to the configured logger.

      Validator - Drives the validation logic by delegating the decision to ValidationThreshold and the failure handling to ValidationFailureHandler. The default implementation is RowCountValidator, which compares the row counts of the source and the target.
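
      The division of labour between the three roles can be sketched roughly as follows. This is only an illustration: the type and method names (Threshold, FailureHandler, RowCountCheck, isAcceptable, onFailure) are hypothetical and are not taken from the patch, and the handler collects messages in memory instead of writing to a logger.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: all names and signatures below are hypothetical
// and are not copied from the Sqoop source.

/** Decides whether the gap between source and target row counts is acceptable. */
interface Threshold {
    boolean isAcceptable(long sourceRows, long targetRows);
}

/** Reacts to a failed validation (log a warning, abort, ...). */
interface FailureHandler {
    void onFailure(String message);
}

/** Absolute threshold: the two row counts must match exactly. */
final class ExactMatchThreshold implements Threshold {
    @Override
    public boolean isAcceptable(long sourceRows, long targetRows) {
        return sourceRows == targetRows;
    }
}

/** Collects warnings in memory; a real handler would write to a logger. */
final class CollectingHandler implements FailureHandler {
    final List<String> warnings = new ArrayList<>();

    @Override
    public void onFailure(String message) {
        warnings.add(message);
    }
}

/** Drives the check: asks the threshold, then hands failures to the handler. */
final class RowCountCheck {
    private final Threshold threshold;
    private final FailureHandler handler;

    RowCountCheck(Threshold threshold, FailureHandler handler) {
        this.threshold = threshold;
        this.handler = handler;
    }

    boolean validate(long sourceRows, long targetRows) {
        if (threshold.isAcceptable(sourceRows, targetRows)) {
            return true;
        }
        handler.onFailure("Row count mismatch: source=" + sourceRows
                + ", target=" + targetRows);
        return false;
    }
}

class ValidationSketchDemo {
    public static void main(String[] args) {
        CollectingHandler handler = new CollectingHandler();
        RowCountCheck check = new RowCountCheck(new ExactMatchThreshold(), handler);
        System.out.println(check.validate(1000, 1000)); // counts match
        System.out.println(check.validate(1000, 998));  // counts differ
        System.out.println(handler.warnings);
    }
}
```

      Keeping the decision and the reaction behind separate interfaces means either one can be swapped without touching the driver.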

      You can write more specific implementations of these interfaces and configure Sqoop to use them in place of the defaults.
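
      As an example of such an extension, a percentage-tolerant threshold might look like the sketch below. It is an assumption-laden illustration, not code from the patch: the Threshold interface, its isAcceptable method and the PercentageTolerantThreshold class are hypothetical names, and the rule for an empty source table is one possible choice.

```java
// Illustrative sketch only: the interface and class names are hypothetical
// and are not part of the Sqoop source.

/** Decides whether the gap between source and target row counts is acceptable. */
interface Threshold {
    boolean isAcceptable(long sourceRows, long targetRows);
}

/** Accepts a mismatch as long as it stays within a percentage of the source count. */
final class PercentageTolerantThreshold implements Threshold {
    private final double tolerancePercent;

    PercentageTolerantThreshold(double tolerancePercent) {
        if (tolerancePercent < 0) {
            throw new IllegalArgumentException("tolerance must not be negative");
        }
        this.tolerancePercent = tolerancePercent;
    }

    @Override
    public boolean isAcceptable(long sourceRows, long targetRows) {
        if (sourceRows == 0) {
            // Assumption: with an empty source, only an empty target is acceptable.
            return targetRows == 0;
        }
        // Deviation of the target from the source, as a percentage of the source.
        double deviation = Math.abs(sourceRows - targetRows) * 100.0 / sourceRows;
        return deviation <= tolerancePercent;
    }
}

class PercentageThresholdDemo {
    public static void main(String[] args) {
        Threshold threshold = new PercentageTolerantThreshold(5.0);
        System.out.println(threshold.isAcceptable(100, 96)); // 4% off
        System.out.println(threshold.isAcceptable(100, 94)); // 6% off
    }
}
```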

      1. SQOOP-638.patch
        28 kB
        Venkatesh Seetharam
      2. SQOOP-638-r6.patch
        44 kB
        Venkatesh Seetharam


          Activity

          svenkat Venkatesh Seetharam added a comment -

          Attaching an implementation that is described above.

          jarcec Jarek Jarcec Cecho added a comment -

          Hi Venkatesh,
          it's quite a big patch; would you mind uploading it to the Apache Review Board (https://reviews.apache.org/) for easier review?

          Jarcec

          svenkat Venkatesh Seetharam added a comment -

          Sorry that I forgot to put it on RB. Please find the link: https://reviews.apache.org/r/7693/

          svenkat Venkatesh Seetharam added a comment -

          I'm having issues with git: after rebasing onto trunk, the newly added classes do not show up in the diff output, even after I explicitly call git add. I must be missing something. Will upload a patch tomorrow.

          svenkat Venkatesh Seetharam added a comment -

          Sorry for the trouble. Uploaded the new patch.
          https://reviews.apache.org/r/7693/diff/2/

          jarcec Jarek Jarcec Cecho added a comment -

          Patch is in: https://git-wip-us.apache.org/repos/asf?p=sqoop.git;a=commit;h=0b465594d24827c5a8d28e81ed3487e82937a72b

          Thank you very much Venkatesh for your contribution.

          Jarcec

          hudson Hudson added a comment -

          Integrated in Sqoop-ant-jdk-1.6-hadoop200 #353 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop200/353/)
          SQOOP-638: Add an optional, simple and extensible validation framework for sqoop (Revision 0b465594d24827c5a8d28e81ed3487e82937a72b)

          Result = FAILURE
          jarcec : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=0b465594d24827c5a8d28e81ed3487e82937a72b
          Files :

          • src/java/org/apache/sqoop/tool/BaseSqoopTool.java
          • src/java/org/apache/sqoop/validation/ValidationFailureHandler.java
          • src/docs/user/validation-args.txt
          • src/java/org/apache/sqoop/validation/ValidationException.java
          • src/docs/user/import.txt
          • src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
          • src/java/org/apache/sqoop/validation/LogOnFailureHandler.java
          • src/java/org/apache/sqoop/validation/Validator.java
          • src/docs/user/SqoopUserGuide.txt
          • src/docs/user/validation.txt
          • src/java/com/cloudera/sqoop/mapreduce/JobBase.java
          • src/java/org/apache/sqoop/mapreduce/ImportJobBase.java
          • src/java/org/apache/sqoop/tool/ImportTool.java
          • src/java/org/apache/sqoop/validation/ValidationContext.java
          • src/java/org/apache/sqoop/validation/RowCountValidator.java
          • src/java/org/apache/sqoop/validation/ValidationThreshold.java
          • src/docs/user/export.txt
          • src/java/org/apache/sqoop/SqoopOptions.java
          • src/java/org/apache/sqoop/validation/AbsoluteValidationThreshold.java
          • src/docs/user/common-args.txt
          • src/test/org/apache/sqoop/validation/RowCountValidatorImportTest.java
          • src/java/org/apache/sqoop/tool/ExportTool.java
          jarcec Jarek Jarcec Cecho added a comment -

          Failure in profile hadoop200 is expected due to SQOOP-731.

          Jarcec

          hudson Hudson added a comment -

          Integrated in Sqoop-ant-jdk-1.6-hadoop100 #347 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop100/347/)
          SQOOP-638: Add an optional, simple and extensible validation framework for sqoop (Revision 0b465594d24827c5a8d28e81ed3487e82937a72b)

          Result = SUCCESS
          jarcec : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=0b465594d24827c5a8d28e81ed3487e82937a72b
          Files :

          • src/java/org/apache/sqoop/tool/BaseSqoopTool.java
          • src/docs/user/import.txt
          • src/java/org/apache/sqoop/tool/ImportTool.java
          • src/java/org/apache/sqoop/validation/Validator.java
          • src/docs/user/validation-args.txt
          • src/java/com/cloudera/sqoop/mapreduce/JobBase.java
          • src/java/org/apache/sqoop/validation/RowCountValidator.java
          • src/docs/user/SqoopUserGuide.txt
          • src/java/org/apache/sqoop/validation/AbsoluteValidationThreshold.java
          • src/test/org/apache/sqoop/validation/RowCountValidatorImportTest.java
          • src/java/org/apache/sqoop/validation/ValidationFailureHandler.java
          • src/java/org/apache/sqoop/mapreduce/ImportJobBase.java
          • src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
          • src/docs/user/common-args.txt
          • src/java/org/apache/sqoop/validation/ValidationContext.java
          • src/java/org/apache/sqoop/SqoopOptions.java
          • src/java/org/apache/sqoop/validation/ValidationException.java
          • src/java/org/apache/sqoop/validation/LogOnFailureHandler.java
          • src/java/org/apache/sqoop/tool/ExportTool.java
          • src/java/org/apache/sqoop/validation/ValidationThreshold.java
          • src/docs/user/export.txt
          • src/docs/user/validation.txt
          hudson Hudson added a comment -

          Integrated in Sqoop-ant-jdk-1.6-hadoop20 #344 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop20/344/)
          SQOOP-638: Add an optional, simple and extensible validation framework for sqoop (Revision 0b465594d24827c5a8d28e81ed3487e82937a72b)

          Result = FAILURE
          jarcec : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=0b465594d24827c5a8d28e81ed3487e82937a72b
          Files :

          • src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
          • src/java/org/apache/sqoop/tool/ImportTool.java
          • src/java/org/apache/sqoop/validation/LogOnFailureHandler.java
          • src/java/org/apache/sqoop/validation/ValidationContext.java
          • src/docs/user/common-args.txt
          • src/java/org/apache/sqoop/tool/BaseSqoopTool.java
          • src/java/org/apache/sqoop/validation/Validator.java
          • src/docs/user/validation.txt
          • src/docs/user/import.txt
          • src/docs/user/export.txt
          • src/java/org/apache/sqoop/validation/ValidationFailureHandler.java
          • src/java/org/apache/sqoop/mapreduce/ImportJobBase.java
          • src/docs/user/SqoopUserGuide.txt
          • src/java/org/apache/sqoop/tool/ExportTool.java
          • src/docs/user/validation-args.txt
          • src/java/org/apache/sqoop/validation/RowCountValidator.java
          • src/java/com/cloudera/sqoop/mapreduce/JobBase.java
          • src/java/org/apache/sqoop/SqoopOptions.java
          • src/java/org/apache/sqoop/validation/ValidationException.java
          • src/java/org/apache/sqoop/validation/AbsoluteValidationThreshold.java
          • src/java/org/apache/sqoop/validation/ValidationThreshold.java
          • src/test/org/apache/sqoop/validation/RowCountValidatorImportTest.java
          hudson Hudson added a comment -

          Integrated in Sqoop-ant-jdk-1.6-hadoop23 #512 (See https://builds.apache.org/job/Sqoop-ant-jdk-1.6-hadoop23/512/)
          SQOOP-638: Add an optional, simple and extensible validation framework for sqoop (Revision 0b465594d24827c5a8d28e81ed3487e82937a72b)

          Result = SUCCESS
          jarcec : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=0b465594d24827c5a8d28e81ed3487e82937a72b
          Files :

          • src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
          • src/java/org/apache/sqoop/validation/ValidationException.java
          • src/docs/user/SqoopUserGuide.txt
          • src/java/com/cloudera/sqoop/mapreduce/JobBase.java
          • src/docs/user/export.txt
          • src/java/org/apache/sqoop/validation/ValidationFailureHandler.java
          • src/java/org/apache/sqoop/tool/ExportTool.java
          • src/test/org/apache/sqoop/validation/RowCountValidatorImportTest.java
          • src/java/org/apache/sqoop/validation/LogOnFailureHandler.java
          • src/java/org/apache/sqoop/validation/AbsoluteValidationThreshold.java
          • src/java/org/apache/sqoop/validation/RowCountValidator.java
          • src/java/org/apache/sqoop/mapreduce/ImportJobBase.java
          • src/docs/user/validation-args.txt
          • src/java/org/apache/sqoop/SqoopOptions.java
          • src/docs/user/import.txt
          • src/java/org/apache/sqoop/validation/ValidationThreshold.java
          • src/java/org/apache/sqoop/tool/BaseSqoopTool.java
          • src/java/org/apache/sqoop/validation/ValidationContext.java
          • src/java/org/apache/sqoop/validation/Validator.java
          • src/java/org/apache/sqoop/tool/ImportTool.java
          • src/docs/user/common-args.txt
          • src/docs/user/validation.txt

            People

            • Assignee:
              svenkat Venkatesh Seetharam
              Reporter:
              svenkat Venkatesh Seetharam
            • Votes:
              0
              Watchers:
              4
