  1. Sqoop (Retired)
  2. SQOOP-1856

Sqoop2: Handling failures (Row and Field level) in Sqoop

Details

    • Type: Task
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • None
    • 2.0.0
    • None
    • None

    Description

      Skipping corrupted rows in Sqoop

      What is the proposed strategy for handling corrupted rows in a batch transfer?
      Probably one of the following:
      1. Skip the bad record and continue with the good records.
      2. Bail out as soon as a bad record is encountered.
      3. Tolerate bad rows up to a configurable threshold.
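
The three options above can be sketched as a single configurable policy. This is only an illustration in Python, not Sqoop code: the names `BadRowPolicy`, `TransferAborted`, `TransferResult`, `transfer`, `parse_row` and the `max_bad_rows` parameter are all hypothetical, and a "bad record" is assumed to be any row whose parser raises `ValueError`.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Iterable, List, Tuple


class BadRowPolicy(Enum):
    """Hypothetical policy names for the three options in this issue."""
    SKIP = "skip"            # option 1: drop bad rows, keep the good ones
    FAIL_FAST = "fail_fast"  # option 2: abort on the first bad row
    THRESHOLD = "threshold"  # option 3: tolerate up to max_bad_rows


class TransferAborted(Exception):
    """Raised when the chosen policy says the transfer must stop."""


@dataclass
class TransferResult:
    good: List[dict] = field(default_factory=list)
    bad: List[Tuple[int, str]] = field(default_factory=list)  # (row index, reason)


def transfer(rows: Iterable[str],
             parse: Callable[[str], dict],
             policy: BadRowPolicy,
             max_bad_rows: int = 0) -> TransferResult:
    """Parse each row and apply the bad-row policy when parsing fails."""
    result = TransferResult()
    for index, raw in enumerate(rows):
        try:
            result.good.append(parse(raw))
        except ValueError as err:
            result.bad.append((index, str(err)))
            if policy is BadRowPolicy.FAIL_FAST:
                raise TransferAborted(f"bad row {index}: {err}") from err
            if (policy is BadRowPolicy.THRESHOLD
                    and len(result.bad) > max_bad_rows):
                raise TransferAborted(
                    f"more than {max_bad_rows} bad rows") from err
    return result


def parse_row(raw: str) -> dict:
    """Toy parser: expects 'id,name' with an integer id."""
    ident, name = raw.split(",")  # wrong field count raises ValueError
    return {"id": int(ident), "name": name}  # non-integer id raises ValueError


if __name__ == "__main__":
    rows = ["1,alice", "oops", "x,bob", "4,dave"]
    outcome = transfer(rows, parse_row, BadRowPolicy.THRESHOLD, max_bad_rows=2)
    print(len(outcome.good), "good rows,", len(outcome.bad), "bad rows")
```

Recording the index and reason for every rejected row, even under the skip policy, keeps the door open for the validation and metrics ideas raised later in this issue.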

      From Anand Iyer

      Sqoop is the most obvious place for the functionality discussed in this thread. But at some point, we should start thinking about adding ... functionality such as (Policy Driven SLAs and Data Validation) ....

      This means we want to be able to define not just failure handling, but also more elaborate strategies for Sqoop data validation, metrics that expose the state of the transfer, and so on.

          People

            Assignee: Unassigned
            Reporter: Veena Basavaraj (vybs)
            Votes: 0
            Watchers: 2
