Description
Running Pig on a cluster, I got an instantiation exception for my custom StoreFunc:
08/02/13 22:58:42 ERROR mapreduceExec.MapReduceLauncher: Error message from task (map) tip_200802110401_0072_m_000000
java.lang.RuntimeException: java.io.IOException: null
    at org.apache.pig.impl.PigContext.instantiateFunc(PigContext.java:427)
    at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:435)
    at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigOutputFormat.getRecordWriter(PigOutputFormat.java:58)
    at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigOutputFormat.getRecordWriter(PigOutputFormat.java:47)
    at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.setupMapPipe(PigMapReduce.java:205)
    at org.apache.pig.backend.hadoop.executionengine.mapreduceExec.PigMapReduce.run(PigMapReduce.java:103)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:192)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1760)
It is easy to see that there is a problem with my StoreFunc, but hard to figure out what exactly, since the trace does not show the underlying cause.
Looking into the Pig code upwards from PigContext#instantiateFunc(), there is a kind of exception handling that seems unnecessarily complicated.
Any exception that can occur while instantiating the store func (like InstantiationException or InvocationTargetException) is caught and wrapped in an IOException.
Later on, the cause of the IOException is either inspected (LOLoad, around line 60) or the exception is wrapped in a RuntimeException without passing the cause along (PigSplit, around line 101).
Since every exception that can be raised in PigContext#instantiateFunc() is a user error rather than a temporary environment problem, I think this method can simply throw an unchecked exception and no longer needs to declare IOException. This should save a lot of trouble in the calling methods.
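
To illustrate the idea, here is a minimal standalone sketch of reflective instantiation that throws an unchecked exception and keeps the original cause. It is not Pig's actual code: the class InstantiateFuncSketch, the method instantiate() and the BrokenStoreFunc stand-in are hypothetical names made up for this example, and it assumes the function class has a public no-argument constructor.

```java
import java.lang.reflect.InvocationTargetException;

public class InstantiateFuncSketch {

    /** Hypothetical stand-in for a custom StoreFunc whose constructor fails. */
    public static class BrokenStoreFunc {
        public BrokenStoreFunc() {
            throw new IllegalStateException("missing required setting");
        }
    }

    /**
     * Sketch of the proposed behavior (an assumption, not the real PigContext code):
     * no checked IOException is declared, and the original failure is kept as the cause.
     */
    public static Object instantiate(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (InvocationTargetException e) {
            // The constructor itself threw; unwrap so the user's own exception becomes the cause.
            throw new RuntimeException("could not instantiate '" + className + "'", e.getCause());
        } catch (ReflectiveOperationException e) {
            // Class not found, no matching constructor, abstract class, access problem, ...
            throw new RuntimeException("could not instantiate '" + className + "'", e);
        }
    }

    public static void main(String[] args) {
        try {
            instantiate(BrokenStoreFunc.class.getName());
        } catch (RuntimeException e) {
            // The cause is still attached, so the real problem shows up in the log.
            System.out.println(e.getMessage() + ", caused by: " + e.getCause());
        }
    }
}
```

With something along these lines, callers such as the ones mentioned above would not need their own try/catch for IOException, and the stack trace reported by the task would include the exception thrown by the StoreFunc constructor.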