Issue Details (XML | Word | Printable)

Key: HADOOP-4955
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Major Major
Assignee: Kevin Peterson
Reporter: Kevin Peterson
Votes: 0
Watchers: 3
Operations

If you were logged in you would be able to see more operations.
Hadoop Common

Make DBOutputFormat us column names from setOutput(...)

Created: 30/Dec/08 01:51 AM   Updated: 08/Jul/09 04:53 PM
Return to search
Component/s: None
Affects Version/s: 0.19.0
Fix Version/s: 0.19.1

Time Tracking:
Not Specified

File Attachments:
  Size
Text File Licensed for inclusion in ASF works hadoop-4955.txt 2009-01-26 04:37 AM Kevin Peterson 3 kB
Text File Licensed for inclusion in ASF works patch.txt 2008-12-30 01:53 AM Kevin Peterson 0.7 kB
Environment: Ubuntu 8.04, Java 6, MySQL 5.0 (most likely not relevant)

Hadoop Flags: Reviewed
Resolution Date: 28/Jan/09 01:50 PM


 Description  « Hide
In org.apache.hadoop.mapred.lib.db.DBOutputFormat, the supplied names of the columns are not used for inserting values. The column names supplied to DBOutputFormat.setOutput(JobConf, String, String...) are used to determine the number of values to insert, but the order is dictated by the table definition of the underlying database. This affects the correct indices for DBWritable.write(PreparedStatement).

I will attach a patch that correctly maps these values.

I am characterizing this as a bug rather than an improvement because there may be existing code which implicitly relied on DBOutputFormat ignoring the supplied table names.



 All   Comments   Work Log   Change History   Subversion Commits      Sort Order: Ascending order - Click to sort in descending order
Kevin Peterson added a comment - 30/Dec/08 01:53 AM
I'm not able to connect to svn.apache.org to get latest source. This patch is relative to the 0.19.0 release.

Kevin Peterson added a comment - 30/Dec/08 02:11 AM
Let me know if I need to regenerate the patch in a different format.

Enis Soztutar added a comment - 20/Jan/09 02:07 PM
Somwtimes the names of the fields are not available, in which case we send a new String[] with null values in it. Could you change the patch to first check this :
if(fieldNames == null || fieldNames.length < 0 )
  throw new IllegalArgumentException("Fields names cannot be empty");

if(fieldNames[0] != null) {
    query.append(" (");
     .....
}

Chris Douglas added a comment - 24/Jan/09 12:00 AM
Canceling patch while Enis's comments are addressed.

The patch should apply with patch -p0 < patch


Kevin Peterson added a comment - 24/Jan/09 12:14 AM
I will incorporate Enis's comments into patch this weekend.

Kevin Peterson added a comment - 26/Jan/09 04:37 AM
Updated patch to allow supplying an array of nulls to indicate the number of fields without knowing their names. Added unit test covering this behavior.

I allowed zero length array. I'm not sure if there's a valid use for this, or if most databases would even allow it. But it's harmless to allow it.


Enis Soztutar added a comment - 27/Jan/09 02:52 PM
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 3 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

I will commit this one once the tests finish.


Enis Soztutar added a comment - 28/Jan/09 01:50 PM
I committed this to trunk, 0.19 and 0.20 branches. Thanks Kevin!

Hudson added a comment - 16/Feb/09 05:00 PM