Apache Hop (Retired) / HOP-2952

"Could not determine the type of file" error in the Parquet File Output transform when running on Ubuntu

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Not A Problem
    • Affects Version/s: pre-apache
    • Fix Version/s: Not applicable
    • Component/s: GUI
    • Environment: Linux Ubuntu 20.04 with Oracle JDK 1.8.0_281 and hop-client-1.0-20210610.151152-1

    Description

      Report: In this environment, running the pipeline through the GUI fails with the following error:

      2021/06/11 09:19:27 - Hop - Pipeline opened.
      2021/06/11 09:19:27 - Hop - Launching pipeline [filename]...
      2021/06/11 09:19:27 - Hop - Started the pipeline execution.
      2021/06/11 09:19:27 - filename - Executing this pipeline using the Local Pipeline Engine with run configuration 'local'
      2021/06/11 09:19:27 - filename - Execution started for pipeline [filename]
      2021/06/11 09:19:27 - Text file input.0 - Opening file: file:///opt/eng/hop/config/projects/samples/datasets/Zipssortedbycitystate.csv
      2021/06/11 09:19:48 - Parquet File Output .0 - ERROR: Unexpected error
      2021/06/11 09:19:48 - Parquet File Output .0 - ERROR: org.apache.hop.core.exception.HopException:
      2021/06/11 09:19:48 - Parquet File Output .0 - Unable to create output file 's3://openin-hop/test/test-linux-00-0001.parquet'
      2021/06/11 09:19:48 - Parquet File Output .0 -
      2021/06/11 09:19:48 - Parquet File Output .0 -
      2021/06/11 09:19:48 - Parquet File Output .0 - org.apache.commons.vfs2.FileSystemException: Could not determine the type of file "s3:///openin-hop/test".
      2021/06/11 09:19:48 - Parquet File Output .0 - Could not determine the type of file "s3:///openin-hop/test".
      2021/06/11 09:19:48 - Parquet File Output .0 -
      2021/06/11 09:19:48 - Parquet File Output .0 - Could not determine the type of file "s3:///openin-hop/test".
      2021/06/11 09:19:48 - Parquet File Output .0 -
      2021/06/11 09:19:48 - Parquet File Output .0 -
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.hop.parquet.transforms.output.ParquetOutput.openNewFile(ParquetOutput.java:197)
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.hop.parquet.transforms.output.ParquetOutput.processRow(ParquetOutput.java:96)
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.hop.pipeline.transform.RunThread.run(RunThread.java:60)
      2021/06/11 09:19:48 - Parquet File Output .0 - at java.lang.Thread.run(Thread.java:748)
      2021/06/11 09:19:48 - Parquet File Output .0 - Caused by: org.apache.hop.core.exception.HopFileException:
      2021/06/11 09:19:48 - Parquet File Output .0 -
      2021/06/11 09:19:48 - Parquet File Output .0 - org.apache.commons.vfs2.FileSystemException: Could not determine the type of file "s3:///openin-hop/test".
      2021/06/11 09:19:48 - Parquet File Output .0 - Could not determine the type of file "s3:///openin-hop/test".
      2021/06/11 09:19:48 - Parquet File Output .0 -
      2021/06/11 09:19:48 - Parquet File Output .0 - Could not determine the type of file "s3:///openin-hop/test".
      2021/06/11 09:19:48 - Parquet File Output .0 -
      2021/06/11 09:19:48 - Parquet File Output .0 -
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.hop.core.vfs.HopVfs.getOutputStream(HopVfs.java:324)
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.hop.parquet.transforms.output.ParquetOutput.openNewFile(ParquetOutput.java:181)
      2021/06/11 09:19:48 - Parquet File Output .0 - ... 3 more
      2021/06/11 09:19:48 - Parquet File Output .0 - Caused by: org.apache.commons.vfs2.FileSystemException: Could not determine the type of file "s3:///openin-hop/test".
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.commons.vfs2.provider.AbstractFileObject.attach(AbstractFileObject.java:169)
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.commons.vfs2.provider.AbstractFileObject.getType(AbstractFileObject.java:1336)
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.commons.vfs2.provider.AbstractFileObject.exists(AbstractFileObject.java:945)
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.hop.core.vfs.HopVfs.getOutputStream(HopVfs.java:293)
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.hop.core.vfs.HopVfs.getOutputStream(HopVfs.java:322)
      2021/06/11 09:19:48 - Parquet File Output .0 - ... 4 more
      2021/06/11 09:19:48 - Parquet File Output .0 - Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: WFMKRG8SRQR75593; S3 Extended Request ID: J6bvWNGsTBQBMXZtXIpm9VrlA8hhoFmkm6s7sy3u3/PhBiSjiLOI7rZdMrO814ky1p5OfYHO8Y4=), S3 Extended Request ID: J6bvWNGsTBQBMXZtXIpm9VrlA8hhoFmkm6s7sy3u3/PhBiSjiLOI7rZdMrO814ky1p5OfYHO8Y4=
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1712)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1367)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4914)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4860)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4854)
      2021/06/11 09:19:48 - Parquet File Output .0 - at com.amazonaws.services.s3.AmazonS3Client.listObjects(AmazonS3Client.java:880)
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.hop.vfs.s3.s3.vfs.S3FileObject.handleAttachException(S3FileObject.java:149)
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.hop.vfs.s3.s3common.S3CommonFileObject.doAttach(S3CommonFileObject.java:206)
      2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.commons.vfs2.provider.AbstractFileObject.attach(AbstractFileObject.java:160)
      2021/06/11 09:19:48 - Parquet File Output .0 - ... 8 more
      2021/06/11 09:19:48 - Parquet File Output .0 - Finished processing (I=0, O=0, R=1, W=0, U=0, E=1)
      2021/06/11 09:19:48 - filename - Pipeline detected one or more transforms with errors.
      2021/06/11 09:19:48 - filename - Pipeline is killing the other transforms!
      2021/06/11 09:19:48 - Text file input.0 - Finished processing (I=10003, O=0, R=0, W=10002, U=1, E=0)
      2021/06/11 09:19:48 - filename - Pipeline duration : 20.903 seconds [ 20.902" ]
      2021/06/11 09:19:48 - filename - ERROR: Errors detected!
      2021/06/11 09:19:48 - filename - Execution finished on a local pipeline engine with run configuration 'local'
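
The innermost "Caused by" entry in the trace above is an Amazon S3 "Access Denied" (403) raised from a listObjects call, so the VFS "Could not determine the type of file" message looks like a symptom of that denial. This reading is an inference from the log, not a confirmed diagnosis. As a rough aid for reading such logs, here is a minimal Python sketch that pulls out the last "Caused by" message. The function name `innermost_cause` and the abbreviated sample excerpt are hypothetical and not part of Hop.

```python
import re

# Hop log lines look like "<date> <time> - <component> - <message>".
LINE_PREFIX = re.compile(r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} - .*? - ")


def innermost_cause(log_text):
    """Return the message of the last 'Caused by:' entry, or None if absent."""
    cause = None
    for line in log_text.splitlines():
        # Drop the timestamp/component prefix so only the message remains.
        message = LINE_PREFIX.sub("", line.strip(), count=1)
        if message.startswith("Caused by:"):
            cause = message[len("Caused by:"):].strip()
    return cause


# Abbreviated excerpt of the log in this report (request IDs omitted).
sample = """\
2021/06/11 09:19:48 - Parquet File Output .0 - Caused by: org.apache.commons.vfs2.FileSystemException: Could not determine the type of file "s3:///openin-hop/test".
2021/06/11 09:19:48 - Parquet File Output .0 - at org.apache.commons.vfs2.provider.AbstractFileObject.attach(AbstractFileObject.java:169)
2021/06/11 09:19:48 - Parquet File Output .0 - Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied)
"""

print(innermost_cause(sample))
```

The sketch assumes one log entry per line. If the innermost cause turns out to be an access error like this one, the bucket permissions and credentials used on the failing host are a reasonable first thing to check.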

      See the error screenshot in the attached image parquet_error_ubuntu.png.
      Locally, the .parquet file is generated without errors.

      Attachments

        1. filename.hpl
          6 kB
          Ricardo Gouvea
        2. Zipssortedbycitystate.csv
          385 kB
          Ricardo Gouvea
        3. parquet_error_ubuntu.png
          324 kB
          Ricardo Gouvea

          People

            Assignee: Unassigned
            Reporter: Ricardo Gouvea (rgouvea)