Index: contrib/src/java/org/apache/hadoop/hive/contrib/serde2/RegexSerDe.java
===================================================================
--- contrib/src/java/org/apache/hadoop/hive/contrib/serde2/RegexSerDe.java	(revision 1102648)
+++ contrib/src/java/org/apache/hadoop/hive/contrib/serde2/RegexSerDe.java	(working copy)
@@ -45,25 +45,25 @@
 /**
  * RegexSerDe uses regular expression (regex) to serialize/deserialize.
- *
+ *
  * It can deserialize the data using regex and extracts groups as columns. It
  * can also serialize the row object using a format string.
- *
+ *
  * In deserialization stage, if a row does not match the regex, then all columns
  * in the row will be NULL. If a row matches the regex but has less than
  * expected groups, the missing groups will be NULL. If a row matches the regex
  * but has more than expected groups, the additional groups are just ignored.
- *
+ *
  * In serialization stage, it uses java string formatter to format the columns
  * into a row. If the output type of the column in a query is not a string, it
  * will be automatically converted to String by Hive.
- *
+ *
  * For the format of the format String, please refer to {@link http
  * ://java.sun.com/j2se/1.5.0/docs/api/java/util/Formatter.html#syntax}
- *
+ *
  * NOTE: Obviously, all columns have to be strings. Users can use
  * "CAST(a AS INT)" to convert columns to other types.
- *
+ *
  * NOTE: This implementation is using String, and javaStringObjectInspector. A
  * more efficient implementation should use UTF-8 encoded Text and
  * writableStringObjectInspector. We should switch to that when we have a UTF-8
@@ -179,7 +179,12 @@
         // Report the row
         LOG.warn("" + unmatchedRows + " unmatched rows are found: " + rowText);
       }
-      return null;
+
+      for(int c = 0; c < numColumns; c++) {
+        row.set(c, null);
+      }
+      return row;
+
     }
 
     // Otherwise, return the row.