  1. Sqoop
  2. SQOOP-1986 Sqoop2: Schema matching improvements
  3. SQOOP-2010

Matching is invoked on every record (row) we write; isn't this very expensive?

    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: 2.0.0
    • Component/s: None
    • Labels:
      None

      Description

      The following code in the SqoopMapper writer invokes the matching for every record it receives from the "Extractor". Can we have a say in whether to invoke it when we are sure the from/to schemas match?

          @Override
          public void writeArrayRecord(Object[] array) {
            fromIDF.setObjectData(array);
            writeContent();
          }
      
          @Override
          public void writeStringRecord(String text) {
            fromIDF.setCSVTextData(text);
            writeContent();
          }
      
          @Override
          public void writeRecord(Object obj) {
            fromIDF.setData(obj);
            writeContent();
          }
      
          private void writeContent() {
            try {
              if (LOG.isDebugEnabled()) {
                LOG.debug("Extracted data: " + fromIDF.getCSVTextData());
              }
              // NOTE: The fromIDF and the corresponding fromSchema is used only for the matching process
              // The output of the mappers is finally written to the toIDF object after the matching process
              // since the writable encapsulates the toIDF ==> new SqoopWritable(toIDF)
              toIDF.setObjectData(matcher.getMatchingData(fromIDF.getObjectData()));
              // NOTE: We do not use the reducer to do the writing (a.k.a LOAD in ETL). Hence the mapper sets up the writable
              context.write(writable, NullWritable.get());
            } catch (Exception e) {
              throw new SqoopException(MRExecutionError.MAPRED_EXEC_0013, e);
            }
          }
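
      One way to picture the request is a writer that compares the two sides once at setup and bypasses the matcher per record when they are identical. The sketch below is only an illustration under stated assumptions: Matcher, CountingMatcher, and Writer are hypothetical stand-ins, not Sqoop classes, and comparing column names alone is a simplification of a real schema comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these types are Sqoop classes.
class MatchingSketch {

  /** Stand-in for the matcher called from writeContent(). */
  interface Matcher {
    Object[] getMatchingData(Object[] fromData);
  }

  /** Matcher that counts its invocations so the per-record cost is visible. */
  static final class CountingMatcher implements Matcher {
    int calls = 0;

    @Override
    public Object[] getMatchingData(Object[] fromData) {
      calls++;
      // A real matcher would reorder or fill fields; here we only copy.
      return Arrays.copyOf(fromData, fromData.length);
    }
  }

  /** Stand-in for the writer: decides once whether matching is needed. */
  static final class Writer {
    private final Matcher matcher;
    // Assumption: schema equality is reduced to equal column-name lists.
    private final boolean schemasIdentical;
    final List<Object[]> output = new ArrayList<>();

    Writer(Matcher matcher, List<String> fromColumns, List<String> toColumns) {
      this.matcher = matcher;
      this.schemasIdentical = fromColumns.equals(toColumns);
    }

    void writeArrayRecord(Object[] array) {
      // Pass the record through untouched when both sides are identical.
      Object[] matched = schemasIdentical ? array : matcher.getMatchingData(array);
      output.add(matched);
    }
  }

  public static void main(String[] args) {
    CountingMatcher matcher = new CountingMatcher();
    Writer writer = new Writer(matcher,
        Arrays.asList("id", "name"), Arrays.asList("id", "name"));
    writer.writeArrayRecord(new Object[] {1, "a"});
    writer.writeArrayRecord(new Object[] {2, "b"});
    System.out.println("matcher calls: " + matcher.calls);
  }
}
```

      The comparison happens once in the constructor, so the per-record path costs a single boolean check when the two sides are the same. Whether this is worth doing depends on how expensive the real matcher is per record.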
      
      

            People

            • Assignee:
              Unassigned
              Reporter:
              vybs Veena Basavaraj
            • Votes:
              0
              Watchers:
              3