SPARK-17572

write.df is failing on Spark cluster


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: 2.0.0
    • Fix Version/s: None
    • Component/s: SparkR
    • Labels: None

    Description

      Hi,

      We have a Spark cluster with four nodes; all four nodes share an NFS partition (there is no HDFS), and we have the same UID on all servers. When we try to write data we get the exceptions below. I am not sure whether this is an actual error, and I am not sure whether data will be lost in the output.

      The command I am using to save the data:

      saveDF(banking_l1_1,"banking_l1_v2.csv",source="csv",mode="append",schema="true")
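
      For comparison, a minimal sketch of the same write using write.df (saveDF is an alias for write.df): extra named arguments are passed through to the data source as options, and the csv source expects a header option for writing column names. As far as I can tell, schema is not a csv write option, so it is most likely ignored above; header="true" was presumably intended.

      library(SparkR)
      # Sketch only: banking_l1_1 is the SparkDataFrame from the report above.
      # For the csv source, header = "true" writes the column names as the
      # first row; mode = "append" adds to any existing output directory.
      write.df(banking_l1_1, path = "banking_l1_v2.csv",
               source = "csv", mode = "append", header = "true")

      The full output of the failed save follows.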
      
      16/09/17 08:03:28 ERROR InsertIntoHadoopFsRelationCommand: Aborting job.
      java.io.IOException: Failed to rename DeprecatedRawLocalFileStatus{path=file:/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv; isDirectory=false; length=436486316; replication=1; blocksize=33554432; modification_time=1474099400000; access_time=0; owner=; group=; permission=rw-rw-rw-; isSymlink=false} to file:/nfspartition/sankar/banking_l1_v2.csv/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv
          at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:371)
          at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:384)
          at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJob(FileOutputCommitter.java:326)
          at org.apache.spark.sql.execution.datasources.BaseWriterContainer.commitJob(WriterContainer.scala:222)
          at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:144)
          at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
          at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
          at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
          at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:115)
          at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
          at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
          at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
          at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
          at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:115)
          at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:136)
          at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
          at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:133)
          at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:114)
          at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:86)
          at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:86)
          at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:487)
          at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:211)
          at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:194)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:498)
          at org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:141)
          at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:86)
          at org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:38)
          at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
          at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
          at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
          at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
          at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
          at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
          at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
          at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
          at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
          at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
          at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
          at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
          at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
          at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
          at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
          at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
          at java.lang.Thread.run(Thread.java:745)
      16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir [/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/.part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv.crc]: it still exists.
      16/09/17 08:03:28 WARN FileUtil: Failed to delete file or dir [/nfspartition/sankar/banking_l1_v2.csv/_temporary/0/task_201609170802_0013_m_000000/part-r-00000-46a7f178-2490-444e-9110-510978eaaecb.csv]: it still exists.
      16/09/17 08:03:28 ERROR DefaultWriterContainer: Job job_201609170803_0000 aborted.
      16/09/17 08:03:28 ERROR RBackendHandler: save on 625 failed
      Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) : 
        org.apache.spark.SparkException: Job aborted.
          at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply$mcV$sp(InsertIntoHadoopFsRelationCommand.scala:149)
          at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
          at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand$$anonfun$run$1.apply(InsertIntoHadoopFsRelationCommand.scala:115)
          at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:57)
          at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:115)
          at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:60)
          at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:58)
          at org.apache.spark.sql.execution.command.ExecutedCommandExec.doE
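
      For context on the failure: with the default (v1) file output committer, commitJob renames each task's files from the _temporary directory into the destination, and it is that rename which failed; the WARN lines show the temporary files could not be deleted afterwards either. Both symptoms point at the NFS mount rejecting the operations (for example, permission mapping or attribute caching behavior) rather than at Spark itself, consistent with the "Not A Problem" resolution. One thing to try, as a sketch only and assuming Hadoop 2.7+: the v2 commit algorithm has each task rename its output directly into the destination, skipping the job-level merge step that failed here. It is not a confirmed fix for this issue, and it will not help if rename on the mount is itself broken.

      library(SparkR)
      # Sketch only, not the resolution of this issue: forward the Hadoop
      # committer property through Spark's spark.hadoop.* prefix at session start.
      sparkR.session(sparkConfig = list(
        "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version" = "2"
      ))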
      

      Thanks
      Sankar


People

    • Assignee: Unassigned
    • Reporter: Sankar Mittapally (sankar.mittapally@creditvidya.com)
    • Votes: 0
    • Watchers: 4
