Pig / PIG-869

Pig scripts should be able to handle scenarios where input datasets are missing or empty before running

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.2.0
    • Fix Version/s: None
    • Component/s: impl
    • Labels: None
    • Environment: grid environment testing of pig 2.2

    Description

      Pig 2.2 does not handle situations where a dataset is not present (the file is missing) or the file is empty.

      It would be great if Pig could enforce some data checks within scripts.
      This could be a simple command, like the one below, that can easily be wrapped around all input sources:

      if (datapath_valid && data_present && file_not_empty) {
          run the rest of the script
      }
      else {
          throw an exception/error code
          -- this should be an easily trappable value/code in the logs
      }

      This improvement would benefit our data quality (DQ) checks.
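
Until Pig offers such a check itself, the requested behaviour could be approximated by a small pre-flight script that runs before the Pig job. The following is only a minimal sketch under stated assumptions: it checks paths on the local filesystem as a stand-in for the grid filesystem, and the function names (`check_input`, `check_all`), the exit codes, and the `INPUT_CHECK_FAILED` log line are all hypothetical, invented here for illustration rather than taken from Pig.

```python
import os
import sys

# Hypothetical exit codes, chosen for illustration so that a log scraper
# or a calling shell script can tell the two failure modes apart.
EXIT_OK = 0
EXIT_PATH_MISSING = 2
EXIT_NO_DATA = 3


def check_input(path):
    """Return an exit code describing whether `path` is usable as input."""
    if not os.path.exists(path):
        return EXIT_PATH_MISSING
    if os.path.isdir(path):
        # A directory counts as data only if at least one regular file
        # directly inside it is non-empty.
        sizes = [
            os.path.getsize(os.path.join(path, name))
            for name in os.listdir(path)
            if os.path.isfile(os.path.join(path, name))
        ]
        return EXIT_OK if any(size > 0 for size in sizes) else EXIT_NO_DATA
    return EXIT_OK if os.path.getsize(path) > 0 else EXIT_NO_DATA


def check_all(paths):
    """Check every input; return the first failing code, or EXIT_OK."""
    for path in paths:
        code = check_input(path)
        if code != EXIT_OK:
            # One line per failure, in a fixed format that is easy to grep.
            print("INPUT_CHECK_FAILED code=%d path=%s" % (code, path),
                  file=sys.stderr)
            return code
    return EXIT_OK


if __name__ == "__main__":
    sys.exit(check_all(sys.argv[1:]))
```

A wrapper would call this with every input path and launch the Pig script only when the exit code is 0. On a Hadoop cluster the same idea would need the `os.path` calls replaced with checks against HDFS, for example via the `hadoop fs -test` shell command; that substitution is not shown here.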

          People

            Assignee: Unassigned
            Reporter: Rekha (rekhajos)
              Time Tracking

                Original Estimate: 168h
                Remaining Estimate: 168h
                Time Spent: Not Specified