  1. Sqoop
  2. SQOOP-638

Add an optional, simple and extensible validation framework for sqoop

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.2
    • Fix Version/s: 1.4.3
    • Component/s: None
    • Labels:
      None

      Description

      Adds an optional, extensible validation framework to Sqoop, enabled with a new CLI option: --validate
      There are three basic interfaces:

      ValidationThreshold - Determines whether the error margin between the source and the target is acceptable: Absolute, Percentage Tolerant, etc.
      The default implementation is AbsoluteValidationThreshold, which ensures the row counts from the source and the target are the same.

      ValidationFailureHandler - Responsible for handling failures: log an error/warning, abort, etc. Default implementation logs a warning message to the configured logger.

      Validator - Drives the validation logic by delegating the decision to ValidationThreshold and delegating failure handling to ValidationFailureHandler. The default implementation is RowCountValidator, which validates the row counts from the source and the target.

      You can extend these interfaces with more specific implementations and override the defaults in the Sqoop configuration.
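
      To make the division of responsibilities concrete, here is a minimal, self-contained sketch of how the three roles could fit together. The interface names come from the description above, but every method signature and every concrete class name below (AbsoluteThreshold, PercentageTolerantThreshold, WarnOnFailureHandler, RowCountCheck) is a hypothetical stand-in chosen for illustration; it is not taken from the attached patch and may differ from what Sqoop actually ships.

```java
import java.util.ArrayList;
import java.util.List;

// NOTE: all signatures here are assumptions for illustration only.

/** Decides whether the gap between source and target counts is acceptable. */
interface ValidationThreshold {
    boolean isAcceptable(long sourceRows, long targetRows);
}

/** Reacts to a failed validation (log, abort, ...). */
interface ValidationFailureHandler {
    void onFailure(long sourceRows, long targetRows);
}

/** Drives validation by delegating to a threshold and a failure handler. */
interface Validator {
    boolean validate(long sourceRows, long targetRows);
}

/** Hypothetical absolute threshold: counts must match exactly. */
class AbsoluteThreshold implements ValidationThreshold {
    public boolean isAcceptable(long sourceRows, long targetRows) {
        return sourceRows == targetRows;
    }
}

/** Hypothetical percentage-tolerant threshold. */
class PercentageTolerantThreshold implements ValidationThreshold {
    private final double tolerancePercent;

    PercentageTolerantThreshold(double tolerancePercent) {
        this.tolerancePercent = tolerancePercent;
    }

    public boolean isAcceptable(long sourceRows, long targetRows) {
        if (sourceRows == 0) {
            return targetRows == 0; // avoid dividing by zero
        }
        double diffPercent = Math.abs(sourceRows - targetRows) * 100.0 / sourceRows;
        return diffPercent <= tolerancePercent;
    }
}

/** Hypothetical handler that records and prints a warning instead of aborting. */
class WarnOnFailureHandler implements ValidationFailureHandler {
    final List<String> warnings = new ArrayList<>();

    public void onFailure(long sourceRows, long targetRows) {
        String msg = "Validation failed: source=" + sourceRows + ", target=" + targetRows;
        warnings.add(msg);
        System.err.println("WARN " + msg);
    }
}

/** Hypothetical row-count validator wiring the two collaborators together. */
class RowCountCheck implements Validator {
    private final ValidationThreshold threshold;
    private final ValidationFailureHandler handler;

    RowCountCheck(ValidationThreshold threshold, ValidationFailureHandler handler) {
        this.threshold = threshold;
        this.handler = handler;
    }

    public boolean validate(long sourceRows, long targetRows) {
        boolean ok = threshold.isAcceptable(sourceRows, targetRows);
        if (!ok) {
            handler.onFailure(sourceRows, targetRows); // delegate failure handling
        }
        return ok;
    }
}
```

      In this sketch, swapping AbsoluteThreshold for PercentageTolerantThreshold changes the pass/fail decision without touching the validator or the handler, which is the kind of extension point the description refers to.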

        Attachments

        1. SQOOP-638-r6.patch
          44 kB
          Venkatesh Seetharam
        2. SQOOP-638.patch
          28 kB
          Venkatesh Seetharam

              People

              • Assignee:
                svenkat Venkatesh Seetharam
                Reporter:
                svenkat Venkatesh Seetharam
              • Votes:
                0
                Watchers:
                4
