  1. Sqoop (Retired)
  2. SQOOP-638

Add an optional, simple and extensible validation framework for sqoop

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.2
    • Fix Version/s: 1.4.3
    • Component/s: None
    • Labels: None

    Description

      Add an extensible validation framework to Sqoop. Validation is optional and is enabled with a new CLI option: --validate
      There are three basic interfaces:

      ValidationThreshold - Determines whether the error margin between the source and the target is acceptable: Absolute, Percentage Tolerant, etc.
      The default implementation is AbsoluteValidationThreshold, which ensures that the row counts from the source and the target are the same.
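
A minimal sketch of the two threshold styles named above (absolute and percentage tolerant). The type and method names here are hypothetical and chosen for illustration; they are not the signatures from the attached patch.

```java
// Hypothetical sketch: names and signatures are illustrative, not Sqoop's actual API.
interface ThresholdSketch {
    // Returns true when the difference between the two row counts is acceptable.
    boolean isAcceptable(long sourceCount, long targetCount);
}

// Absolute: the counts must match exactly.
final class AbsoluteThresholdSketch implements ThresholdSketch {
    @Override
    public boolean isAcceptable(long sourceCount, long targetCount) {
        return sourceCount == targetCount;
    }
}

// Percentage tolerant: accept a difference of up to maxPercent of the source count.
final class PercentageThresholdSketch implements ThresholdSketch {
    private final double maxPercent;

    PercentageThresholdSketch(double maxPercent) {
        if (maxPercent < 0) {
            throw new IllegalArgumentException("maxPercent must be non-negative");
        }
        this.maxPercent = maxPercent;
    }

    @Override
    public boolean isAcceptable(long sourceCount, long targetCount) {
        if (sourceCount == 0) {
            // Avoid dividing by zero: an empty source only matches an empty target.
            return targetCount == 0;
        }
        double diffPercent = 100.0 * Math.abs(sourceCount - targetCount) / sourceCount;
        return diffPercent <= maxPercent;
    }
}
```

With these definitions, a 5 percent tolerance accepts 96 target rows against 100 source rows, while the absolute threshold rejects the same pair.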

      ValidationFailureHandler - Responsible for handling failures: log an error/warning, abort, etc. Default implementation logs a warning message to the configured logger.
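
The two failure-handling styles mentioned above (log a warning, or abort) could look roughly like the following. This is a sketch under assumptions: the interface shape is hypothetical, and java.util.logging stands in for whatever logger Sqoop is configured with.

```java
import java.util.logging.Logger;

// Hypothetical sketch: names and signatures are illustrative, not Sqoop's actual API.
interface FailureHandlerSketch {
    // Returns true if the job may continue after the failure has been handled.
    boolean handle(String message);
}

// Logs a warning and lets the job continue.
final class LogOnFailureSketch implements FailureHandlerSketch {
    private static final Logger LOG =
        Logger.getLogger(LogOnFailureSketch.class.getName());

    @Override
    public boolean handle(String message) {
        LOG.warning("Validation failed: " + message);
        return true;
    }
}

// Aborts by throwing, so the caller cannot ignore the failure.
final class AbortOnFailureSketch implements FailureHandlerSketch {
    @Override
    public boolean handle(String message) {
        throw new IllegalStateException("Validation failed: " + message);
    }
}
```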

      Validator - Drives the validation logic by delegating the decision to ValidationThreshold and the failure handling to ValidationFailureHandler. The default implementation is RowCountValidator, which validates the row counts from the source and the target.
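
The delegation described above can be sketched as follows. The class name and constructor are hypothetical, and the standard functional interfaces BiPredicate and Consumer stand in for the threshold and failure-handler interfaces so that the example is self-contained.

```java
import java.util.function.BiPredicate;
import java.util.function.Consumer;

// Hypothetical sketch: not the RowCountValidator from the patch.
final class RowCountValidatorSketch {
    private final BiPredicate<Long, Long> threshold;   // stands in for ValidationThreshold
    private final Consumer<String> failureHandler;     // stands in for ValidationFailureHandler

    RowCountValidatorSketch(BiPredicate<Long, Long> threshold,
                            Consumer<String> failureHandler) {
        this.threshold = threshold;
        this.failureHandler = failureHandler;
    }

    // Returns true when the threshold accepts the counts; otherwise
    // hands a description of the mismatch to the failure handler.
    boolean validate(long sourceRows, long targetRows) {
        if (threshold.test(sourceRows, targetRows)) {
            return true;
        }
        failureHandler.accept(
            "Row count mismatch: source=" + sourceRows + ", target=" + targetRows);
        return false;
    }
}
```

The validator itself holds no policy: swapping the threshold or the handler changes the behavior without touching the validation flow.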

      You can extend these interfaces with more specific implementations and override the defaults in the Sqoop configuration, from which they are picked up.
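
One common way to make implementations overridable through configuration is to name the class in a property and instantiate it reflectively. The sketch below assumes that approach; the property key, the interface, and the loader are all hypothetical and are not taken from the patch.

```java
import java.util.Properties;

// Hypothetical sketch: names and the property key are illustrative only.
interface PluggableThresholdSketch {
    boolean isAcceptable(long sourceCount, long targetCount);
}

// Default used when the configuration does not name another class.
final class ExactMatchSketch implements PluggableThresholdSketch {
    @Override
    public boolean isAcceptable(long sourceCount, long targetCount) {
        return sourceCount == targetCount;
    }
}

final class ThresholdLoaderSketch {
    // Hypothetical configuration key, not a real Sqoop property.
    static final String KEY = "validation.threshold.class";

    // Loads the configured implementation, falling back to the default.
    static PluggableThresholdSketch load(Properties conf)
            throws ReflectiveOperationException {
        String className = conf.getProperty(KEY, ExactMatchSketch.class.getName());
        return Class.forName(className)
            .asSubclass(PluggableThresholdSketch.class)
            .getDeclaredConstructor()
            .newInstance();
    }
}
```

A custom implementation only needs a no-argument constructor and an entry under the chosen key to replace the default.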

      Attachments

        1. SQOOP-638-r6.patch
          44 kB
          Venkatesh Seetharam
        2. SQOOP-638.patch
          28 kB
          Venkatesh Seetharam

            People

              Assignee: Venkatesh Seetharam (svenkat)
              Reporter: Venkatesh Seetharam (svenkat)