Apache Gobblin / GOBBLIN-1025

Add retry for PK-Chunking iterator


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.15.0
    • Component/s: None
    • Labels: None

    Description

      The SFDC connector has a class called `ResultIterator` (I will rename it to `SalesforceRecordIterator`).
      It is currently used only by PK-Chunking. It encapsulates fetching a list of result files and exposes them as a single record iterator.

      However, `csvReader.nextRecord()` may throw a network I/O exception. We should retry in this case.

      When a network I/O exception occurs after a result file has been partly fetched, we are in a special situation: the first part of the file has already been fetched locally, but the rest is still on the data source.
      We need to:
      1. Reopen the file stream.
      2. Skip all the records that we have already fetched, moving the cursor to the first record that we have not fetched yet.
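
The reopen-and-skip steps above could be sketched roughly as follows. This is a minimal illustration, not the actual Gobblin change: `RecordReader`, `RecordReaderFactory`, and `RetryingRecordIterator` are hypothetical names, and the sketch assumes that reopening a result file always yields the same records in the same order. It ignores CSV header handling and back-off between attempts.

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a CSV reader over one result file; returns null at end of file. */
interface RecordReader {
  String nextRecord() throws IOException;
}

/** Hypothetical factory that (re)opens the result file stream from its beginning. */
interface RecordReaderFactory {
  RecordReader open() throws IOException;
}

/** Iterates over one result file; on an I/O error it reopens the stream and skips consumed records. */
class RetryingRecordIterator implements Iterator<String> {
  private final RecordReaderFactory factory;
  private final int maxRetries;
  private RecordReader reader;       // null means "must (re)open before reading"
  private long recordsReturned = 0;  // records already handed to the caller
  private String buffered;           // record fetched by hasNext() but not yet returned
  private boolean done = false;

  RetryingRecordIterator(RecordReaderFactory factory, int maxRetries) {
    this.factory = factory;
    this.maxRetries = maxRetries;
  }

  @Override
  public boolean hasNext() {
    if (buffered == null && !done) {
      buffered = fetchWithRetry();
      done = (buffered == null);
    }
    return buffered != null;
  }

  @Override
  public String next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    String record = buffered;
    buffered = null;
    recordsReturned++;
    return record;
  }

  /** Makes one initial attempt plus up to maxRetries retries for a single record. */
  private String fetchWithRetry() {
    IOException last = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        if (reader == null) {
          reader = reopenAndSkip();
        }
        return reader.nextRecord();
      } catch (IOException e) {
        last = e;
        reader = null; // step 1: force a reopen on the next attempt
      }
    }
    throw new RuntimeException("Giving up after " + maxRetries + " retries", last);
  }

  /** Step 2: reopen the stream and advance past every record already returned. */
  private RecordReader reopenAndSkip() throws IOException {
    RecordReader fresh = factory.open();
    for (long i = 0; i < recordsReturned; i++) {
      if (fresh.nextRecord() == null) {
        throw new IOException("Reopened file has fewer records than already fetched");
      }
    }
    return fresh;
  }
}
```

Counting returned records, rather than tracking a byte offset, keeps the skip logic independent of how the underlying stream buffers data, at the cost of re-reading the already-fetched part of the file on every retry.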

            People

              Assignee: Unassigned
              Reporter: Alex Li (arekusuri)
              Votes: 0
              Watchers: 2

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated: Not Specified
                  Remaining: 0h
                  Logged: 4h