Cassandra / CASSANDRA-3826

Pig cannot use output formats other than CFOF

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 1.1.0
    • Component/s: Hadoop
    • Labels:
      None

      Description

      Pig has ColumnFamilyOutputFormat hard coded.

      1. 3826.txt
        4 kB
        Brandon Williams

        Activity

        Brandon Williams added a comment -

        Patch to keep CFIF/OF as defaults, but allow overriding both with environment variables.

        Pavel Yaskevich added a comment -

        I guess we'd better do the `if (format.contains("."))` check at the time when {input,output}format is set instead of in the getter methods? I'd also suggest making "org.apache.cassandra.hadoop.ColumnFamilyInputFormat" and "org.apache.cassandra.hadoop.ColumnFamilyOutputFormat" the DEFAULT_{INPUT,OUTPUT}_FORMAT constants and just assigning them to the {input,output}format variables when the user didn't give any via System.getenv(...). What do you think?
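
        A minimal sketch of the suggestion above, not the committed patch. It assumes the overrides arrive as a map of environment variables; the class name FormatSettings, the resolve helper, and the variable names PIG_INPUT_FORMAT and PIG_OUTPUT_FORMAT are illustrative assumptions, since the thread does not name them. The formats are resolved once, when the fields are set, so any getter can return them unchanged.

```java
import java.util.Map;

// Sketch only: everything except the two Cassandra class names is hypothetical.
class FormatSettings
{
    static final String DEFAULT_INPUT_FORMAT = "org.apache.cassandra.hadoop.ColumnFamilyInputFormat";
    static final String DEFAULT_OUTPUT_FORMAT = "org.apache.cassandra.hadoop.ColumnFamilyOutputFormat";

    // Assumed environment variable names; the thread does not specify them.
    static final String INPUT_FORMAT_VAR = "PIG_INPUT_FORMAT";
    static final String OUTPUT_FORMAT_VAR = "PIG_OUTPUT_FORMAT";

    final String inputFormat;
    final String outputFormat;

    // Resolve at set time: fall back to the defaults, qualify short names.
    FormatSettings(Map<String, String> env)
    {
        inputFormat = resolve(env.get(INPUT_FORMAT_VAR), DEFAULT_INPUT_FORMAT);
        outputFormat = resolve(env.get(OUTPUT_FORMAT_VAR), DEFAULT_OUTPUT_FORMAT);
    }

    private static String resolve(String configured, String defaultFormat)
    {
        // No override given: keep the CFIF/CFOF default.
        if (configured == null || configured.isEmpty())
            return defaultFormat;
        // A name without a dot is treated as living in org.apache.cassandra.hadoop.
        return configured.contains(".") ? configured : "org.apache.cassandra.hadoop." + configured;
    }

    public static void main(String[] args)
    {
        FormatSettings settings = new FormatSettings(System.getenv());
        System.out.println(settings.inputFormat);
        System.out.println(settings.outputFormat);
    }
}
```

        Passing the environment in as a map, rather than calling System.getenv(...) inside the class, is a choice made here only so the fallback and qualification rules can be exercised without touching the real process environment.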

        Brandon Williams added a comment -

        Updated patch. I wanted to do some of this in ConfigHelper, but it doesn't really make sense there. We can't cleanly return an instance the way getOutputPartitioner does, since we have both the old and new hadoop interfaces to comply with, and Configuration is not how hadoop determines the input/output formats anyway: those are set on the Job directly. So it's probably best to keep this pig-specific, since other M/R jobs can already control these classes that way.

        Pavel Yaskevich added a comment -

        +1 with the following nit:

        private String getFullyQualifiedClassName(String classname)
        {
            String fqcn = classname.contains(".") ? classname : "org.apache.cassandra.hadoop." + classname;
            return fqcn;
        }
        

        can be changed to

        private String getFullyQualifiedClassName(String classname)
        {
            return classname.contains(".") ? classname : "org.apache.cassandra.hadoop." + classname;
        }
        
        Brandon Williams added a comment -

        Committed w/ternary change.


          People

          • Assignee:
            Brandon Williams
            Reporter:
            Brandon Williams
            Reviewer:
            Pavel Yaskevich
          • Votes:
            0
            Watchers:
            0