PIG-5197 (sub-task of PIG-4059 Pig on Spark; project: Pig)

Replace IndexedKey with PigNullableWritable in spark branch

Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: spark-branch
    • Component/s: spark
    • Labels: None

    Description

      IndexedKey and PigNullableWritable serve a similar purpose.
      The difference is that IndexedKey contains an index and a key, while PigNullableWritable contains an index, a key, and a value.
      In addition, the comparators for PigNullableWritable handle many conditions for the different data types, and IndexedKey may miss some of them. We can try to replace IndexedKey with PigNullableWritable.

      Attachments

        1. PIG-5197.patch
          32 kB
          liyunzhang

        Activity

          kellyzly liyunzhang added a comment -

          rohini: I tried to replace IndexedKey with PigNullableWritable, but it failed because PigNullableWritable is not serializable. So I will keep IndexedKey in the spark package. Can you give me a suggestion?

          exception info

          Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0 in stage 32.0 mpl.io.NullableTuple
          Serialization stack:
          	- object not serializable (class: org.apache.pig.impl.io.NullableTuple, value: Null: false in
          [
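
The "object not serializable" failure above matches what plain Java serialization does when it is handed an object whose class does not implement java.io.Serializable. The following is a minimal sketch of that behaviour only. PlainKey and SerializableKey are hypothetical stand-ins, not Pig's IndexedKey or NullableTuple, and the helper firstNonSerializable is invented for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializationCheck {

    // Hypothetical stand-in for a key class that does NOT implement Serializable.
    static class PlainKey {
        final byte index;
        final Object key;

        PlainKey(byte index, Object key) {
            this.index = index;
            this.key = key;
        }
    }

    // Hypothetical stand-in for a key class that does implement Serializable.
    static class SerializableKey implements Serializable {
        private static final long serialVersionUID = 1L;
        final byte index;
        final Serializable key;

        SerializableKey(byte index, Serializable key) {
            this.index = index;
            this.key = key;
        }
    }

    // Returns null if the object serializes, otherwise the name of the
    // class that Java serialization rejected.
    static String firstNonSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage(); // the message is the offending class name
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstNonSerializable(new PlainKey((byte) 0, "k")));
        System.out.println(firstNonSerializable(new SerializableKey((byte) 0, "k")));
    }
}
```

Running a check like this against a candidate key type before handing it to a shuffle is one way to surface the problem outside of a full Spark job.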
          
          

          rohini Rohini Palaniswamy added a comment -

          Why not make PigNullableWritable implement Serializable?

          kellyzly liyunzhang added a comment - - edited

          rohini: we cannot replace IndexedKey with PigNullableWritable. I replaced IndexedKey with PigNullableWritable in PIG-5197.patch. Just run TestSecondarySortSpark to verify: TestSecondarySortSpark#testNestedSortMultiQueryEndToEnd3 fails and throws an exception like

          had a not serializable result: org.apache.hadoop.io.Text$
          

          This is because of the following call chain:

           HDataType.getWritableComparableTypes -> org.apache.pig.impl.io.NullableText#NullableText(java.lang.String)->org.apache.hadoop.io.Text

          When the key type is chararray, this exception is thrown because org.apache.hadoop.io.Text is not serializable. Can you suggest a way to solve this, or should we keep IndexedKey in the spark package?
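
The point above can be illustrated with plain Java serialization: declaring a wrapper class Serializable is not sufficient when one of its non-transient fields holds an object that is not. The sketch below is a minimal illustration under stated assumptions. TextLike is a hypothetical stand-in for a Writable-style payload such as org.apache.hadoop.io.Text, NullableWrapper is a hypothetical stand-in for a PigNullableWritable-style holder, and neither reproduces the real classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class NestedFieldCheck {

    // Hypothetical stand-in for a payload type that is not Serializable.
    static class TextLike {
        final String s;

        TextLike(String s) {
            this.s = s;
        }
    }

    // Hypothetical wrapper that IS declared Serializable but stores its
    // payload in a plain Object field.
    static class NullableWrapper implements Serializable {
        private static final long serialVersionUID = 1L;
        final byte index;
        final Object value;

        NullableWrapper(byte index, Object value) {
            this.index = index;
            this.value = value;
        }
    }

    // Returns null if the object graph serializes, otherwise the name of
    // the first class that Java serialization rejected.
    static String firstNonSerializable(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage(); // the message is the offending class name
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The wrapper is Serializable, but the nested payload is not.
        System.out.println(firstNonSerializable(new NullableWrapper((byte) 0, new TextLike("a"))));
        // With a Serializable payload (String), the same wrapper works.
        System.out.println(firstNonSerializable(new NullableWrapper((byte) 0, "a")));
    }
}
```

In this sketch the rejected class is the nested payload rather than the wrapper, which mirrors the shape of the error reported for the chararray case.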


          rohini Rohini Palaniswamy added a comment -

          I don't have a suggestion to work around that. Closing it as Won't Fix for now. If there are performance and/or comparator issues, we will revisit.

          People

            Assignee: Unassigned
            Reporter: liyunzhang (kellyzly)
            Votes: 0
            Watchers: 3
