Hadoop Common
  1. Hadoop Common
  2. HADOOP-4955

Make DBOutputFormat us column names from setOutput(...)

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.19.0
    • Fix Version/s: 0.19.1
    • Component/s: None
    • Labels:
      None
    • Environment:

      Ubuntu 8.04, Java 6, MySQL 5.0 (most likely not relevant)

    • Hadoop Flags:
      Reviewed

      Description

      In org.apache.hadoop.mapred.lib.db.DBOutputFormat, the supplied names of the columns are not used for inserting values. The column names supplied to DBOutputFormat.setOutput(JobConf, String, String...) are used to determine the number of values to insert, but the order is dictated by the table definition of the underlying database. This affects the correct indices for DBWritable.write(PreparedStatement).

      I will attach a patch that correctly maps these values.

      I am characterizing this as a bug rather than an improvement because there may be existing code which implicitly relied on DBOutputFormat ignoring the supplied table names.

      1. hadoop-4955.txt
        3 kB
        Kevin Peterson
      2. patch.txt
        0.7 kB
        Kevin Peterson

        Activity

        Kevin Peterson created issue -
        Kevin Peterson made changes -
        Field Original Value New Value
        Status Open [ 1 ] Patch Available [ 10002 ]
        Kevin Peterson made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Hide
        Kevin Peterson added a comment -

        I'm not able to connect to svn.apache.org to get latest source. This patch is relative to the 0.19.0 release.

        Show
        Kevin Peterson added a comment - I'm not able to connect to svn.apache.org to get latest source. This patch is relative to the 0.19.0 release.
        Kevin Peterson made changes -
        Attachment patch.txt [ 12396884 ]
        Hide
        Kevin Peterson added a comment -

        Let me know if I need to regenerate the patch in a different format.

        Show
        Kevin Peterson added a comment - Let me know if I need to regenerate the patch in a different format.
        Kevin Peterson made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hide
        Enis Soztutar added a comment -

        Somwtimes the names of the fields are not available, in which case we send a new String[] with null values in it. Could you change the patch to first check this :

        if(fieldNames == null || fieldNames.length < 0 )
          throw new IllegalArgumentException("Fields names cannot be empty");
        
        if(fieldNames[0] != null) {
            query.append(" (");
             .....
        }
        
        Show
        Enis Soztutar added a comment - Somwtimes the names of the fields are not available, in which case we send a new String[] with null values in it. Could you change the patch to first check this : if (fieldNames == null || fieldNames.length < 0 ) throw new IllegalArgumentException( "Fields names cannot be empty" ); if (fieldNames[0] != null ) { query.append( " (" ); ..... }
        Hide
        Chris Douglas added a comment -

        Canceling patch while Enis's comments are addressed.

        The patch should apply with patch -p0 < patch

        Show
        Chris Douglas added a comment - Canceling patch while Enis's comments are addressed. The patch should apply with patch -p0 < patch
        Chris Douglas made changes -
        Status Patch Available [ 10002 ] Open [ 1 ]
        Hide
        Kevin Peterson added a comment -

        I will incorporate Enis's comments into patch this weekend.

        Show
        Kevin Peterson added a comment - I will incorporate Enis's comments into patch this weekend.
        Hide
        Kevin Peterson added a comment -

        Updated patch to allow supplying an array of nulls to indicate the number of fields without knowing their names. Added unit test covering this behavior.

        I allowed zero length array. I'm not sure if there's a valid use for this, or if most databases would even allow it. But it's harmless to allow it.

        Show
        Kevin Peterson added a comment - Updated patch to allow supplying an array of nulls to indicate the number of fields without knowing their names. Added unit test covering this behavior. I allowed zero length array. I'm not sure if there's a valid use for this, or if most databases would even allow it. But it's harmless to allow it.
        Kevin Peterson made changes -
        Attachment hadoop-4955.txt [ 12398713 ]
        Hide
        Enis Soztutar added a comment -

        [exec] +1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] +1 tests included. The patch appears to include 3 new or modified tests.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

        I will commit this one once the tests finish.

        Show
        Enis Soztutar added a comment - [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity. I will commit this one once the tests finish.
        Hide
        Enis Soztutar added a comment -

        I committed this to trunk, 0.19 and 0.20 branches. Thanks Kevin!

        Show
        Enis Soztutar added a comment - I committed this to trunk, 0.19 and 0.20 branches. Thanks Kevin!
        Enis Soztutar made changes -
        Status Open [ 1 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Fix Version/s 0.19.1 [ 12313473 ]
        Hadoop Flags [Reviewed]
        Hide
        Hudson added a comment -
        Show
        Hudson added a comment - Integrated in Hadoop-trunk #756 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/756/ )
        Nigel Daley made changes -
        Status Resolved [ 5 ] Closed [ 6 ]
        Chris Douglas made changes -
        Assignee Kevin Peterson [ kevinpet ]
        Owen O'Malley made changes -
        Component/s mapred [ 12310690 ]

          People

          • Assignee:
            Kevin Peterson
            Reporter:
            Kevin Peterson
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development