Uploaded image for project: 'CarbonData'
  1. CarbonData
  2. CARBONDATA-3925

flink-integration write carbon file to hdfs error

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 2.0.0
    • 2.1.0
    • flink-integration
    • None

    Description

      in CarbonWriter.java code ,you can find this;
      public abstract class CarbonWriter extends ProxyFileWriter<Object[]> {
      private static final Logger LOGGER =
      LogServiceFactory.getLogService(CarbonS3Writer.class.getName());}
      always wo can find logfile print like ;
      2020-07-27 14:19:25,107 DEBUG org.apache.carbon.flink.CarbonS3Writer
      this is puzzledï¼›

      2020-08-19 13:14:13,329 INFO org.apache.carbondata.hadoop.api.CarbonTableOutputFormat - Closed writer task attempt_f229b922-1f77-426f-a4bc-42e49aa53df7_0000_m_1968253873_-1049302646
      2020-08-19 13:14:13,329 DEBUG org.apache.carbon.flink.CarbonLocalWriter - Commit write. org.apache.carbon.flink.CarbonLocalWriter@41f5c4a9
      2020-08-19 13:14:13,329 DEBUG org.apache.carbon.flink.CarbonS3Writer - Upload file[/home/hadoop/yutest/d963e9836ccb4318aa8fc953af983d07/part-0-a132f98547584dcabae6c43090626baf_batchno0-0-null-1597814047953.snappy.carbondata] to [hdfs://beh/user/dc_cbss/warehouse/yutest/tf_b_trade/stage_data] start.
      2020-08-19 13:14:13,329 INFO org.apache.carbondata.core.util.CarbonUtil - Copying /home/hadoop/yutest/d963e9836ccb4318aa8fc953af983d07/part-0-a132f98547584dcabae6c43090626baf_batchno0-0-null-1597814047953.snappy.carbondata to hdfs://beh/user/dc_cbss/warehouse/yutest/tf_b_trade/stage_data, operation id 1597814053329
      2020-08-19 13:14:13,331 DEBUG org.apache.carbondata.core.util.CarbonUtil - The configured block size is 1 KB, the actual carbon file size is 277 KB, choose the max value 277 KB as the block size on HDFS
      2020-08-19 13:14:13,331 DEBUG org.apache.carbondata.core.util.CarbonUtil - HDFS file block size for file: hdfs://beh/user/dc_cbss/warehouse/yutest/tf_b_trade/stage_data/part-0-a132f98547584dcabae6c43090626baf_batchno0-0-null-1597814047953.snappy.carbondata is 284160 (bytes
      2020-08-19 13:14:13,332 INFO org.apache.carbondata.processing.util.CarbonLoaderUtil - Deleted the local store location: /tmp/f97548ae6efc43d2ba269c9d35295bb9_attempt_f229b922-1f77-426f-a4bc-42e49aa53df7_0000_m_1968253873_-1049302646 : Time taken: 2
      2020-08-19 13:14:13,358 ERROR org.apache.carbon.flink.CarbonS3Writer - Problem while copying file from local store to carbon store
      org.apache.carbondata.core.datastore.exception.CarbonDataWriterException: Problem while copying file from local store to carbon store
      at org.apache.carbondata.core.util.CarbonUtil.copyCarbonDataFileToCarbonStorePath(CarbonUtil.java:2694)
      at org.apache.carbon.flink.CarbonWriter.uploadSegmentDataFiles(CarbonWriter.java:90)
      at org.apache.carbon.flink.CarbonLocalWriter.commit(CarbonLocalWriter.java:155)
      at org.apache.carbon.flink.CarbonLocalWriter.flush(CarbonLocalWriter.java:129)
      at org.apache.flink.streaming.api.functions.sink.filesystem.BulkPartWriter.closeForCommit(BulkPartWriter.java:61)
      at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.closePartFile(Bucket.java:239)
      at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.prepareBucketForCheckpointing(Bucket.java:280)
      at org.apache.flink.streaming.api.functions.sink.filesystem.Bucket.onReceptionOfCheckpoint(Bucket.java:253)
      at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.snapshotActiveBuckets(Buckets.java:250)
      at org.apache.flink.streaming.api.functions.sink.filesystem.Buckets.snapshotState(Buckets.java:241)
      at org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink.snapshotState(StreamingFileSink.java:447)
      at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.trySnapshotFunctionState(StreamingFunctionUtils.java:118)
      at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.snapshotFunctionState(StreamingFunctionUtils.java:99)
      at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.snapshotState(AbstractUdfStreamOperator.java:90)
      at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:402)
      at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.checkpointStreamOperator(StreamTask.java:1403)
      at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1337)
      at org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:974)
      at org.apache.flink.streaming.runtime.tasks.StreamTask.lambda$performCheckpoint$5(StreamTask.java:870)
      at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$SynchronizedStreamTaskActionExecutor.runThrowing(StreamTaskActionExecutor.java:94)
      at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:843)
      at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:803)
      at org.apache.flink.streaming.runtime.io.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:86)
      at org.apache.flink.streaming.runtime.io.CheckpointBarrierAligner.processBarrier(CheckpointBarrierAligner.java:177)
      at org.apache.flink.streaming.runtime.io.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:155)
      at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:133)
      at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:69)
      at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:310)
      at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:187)
      at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:485)
      at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:469)
      at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:708)
      at org.apache.flink.runtime.taskmanager.Task.run(Task.java:533)
      at java.lang.Thread.run(Thread.java:748)
      Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): Specified block size is less than configured minimum value (dfs.namenode.fs-limits.min-block-size): 284160 < 1048576
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2821)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2774)
      at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:610)
      at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.create(AuthorizationProviderProxyClientProtocol.java:117)
      at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:413)
      at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
      at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2278)
      at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2274)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:422)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
      at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2272)

      at org.apache.hadoop.ipc.Client.call(Client.java:1476)
      at org.apache.hadoop.ipc.Client.call(Client.java:1413)
      at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
      at com.sun.proxy.$Proxy23.create(Unknown Source)
      at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:498)
      at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
      at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
      at com.sun.proxy.$Proxy24.create(Unknown Source)
      at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1652)
      at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1689)
      at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1624)
      at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
      at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
      at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
      at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459)
      at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
      at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:910)
      at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:891)
      at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.getDataOutputStream(AbstractDFSCarbonFile.java:363)
      at org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.getDataOutputStream(AbstractDFSCarbonFile.java:357)
      at org.apache.carbondata.core.datastore.impl.FileFactory.getDataOutputStream(FileFactory.java:252)
      at org.apache.carbondata.core.util.CarbonUtil.copyLocalFileToCarbonStore(CarbonUtil.java:2721)
      at org.apache.carbondata.core.util.CarbonUtil.copyCarbonDataFileToCarbonStorePath(CarbonUtil.java:2678)
      ... 33 more

      Attachments

        Activity

          People

            Unassigned Unassigned
            yutaochina yutao
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated:
                Original Estimate - Not Specified
                Not Specified
                Remaining:
                Remaining Estimate - 0h
                0h
                Logged:
                Time Spent - 1.5h
                1.5h