Apache Hop (Retired) / HOP-4452

Allow Hop to talk to S3-compatible object stores

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Labels: Migrated to GHI

    Description

      MinIO is a free and open-source S3-compatible object store. It is easy to get running in Docker containers and is very performant. It would be wonderful if Hop could read from and write to such an object store. In theory, this should be easy: most of the other software I see with S3 connectivity allows two connection variables to be set that point at MinIO instead of Amazon S3, plus another setting that controls the path access style. Please see below:

      https://docs.dremio.com/software/data-sources/s3/#configuring-s3-for-minio

      Dremio has a cool way of allowing access to S3-compatible object stores like MinIO by using two connection flags:
      * fs.s3a.path.style.access = true
      * fs.s3a.endpoint = minio_server:9000
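
      For illustration only (this is not Hop's API), the same two ideas, an endpoint override and path-style addressing, look like this in Python with boto3 against a MinIO server; the endpoint, credentials and bucket names are placeholders, not values from this issue:

        import boto3
        from botocore.client import Config

        # Point a plain S3 client at MinIO instead of Amazon S3.
        # Endpoint and credentials are placeholders for a local MinIO instance.
        s3 = boto3.client(
            "s3",
            endpoint_url="http://minio_server:9000",         # the fs.s3a.endpoint idea
            aws_access_key_id="minioadmin",
            aws_secret_access_key="minioadmin",
            config=Config(s3={"addressing_style": "path"}),  # the fs.s3a.path.style.access idea
        )

        # Ordinary S3 calls then run against the MinIO endpoint.
        print(s3.list_buckets())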

      These appear to be settings that the Hadoop jars are familiar with. Are they supported by VFS in Hop in some way that would allow it to read from and write to MinIO while essentially speaking "S3" to it?

      This is not just a Dremio thing. Searching turns up all kinds of hits, such as Spark, with similar settings under "hadoopConfiguration.set(...)":
      https://www.jitsejan.com/setting-up-spark-with-minio-as-object-storage
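
      As a hedged sketch of what the linked post describes (again, the endpoint, credentials and bucket are placeholders), the same s3a properties can be set on Spark's Hadoop configuration from PySpark:

        from pyspark.sql import SparkSession

        spark = SparkSession.builder.appName("minio-s3a-example").getOrCreate()

        # Set the same s3a properties Dremio exposes on Spark's Hadoop configuration.
        hconf = spark.sparkContext._jsc.hadoopConfiguration()
        hconf.set("fs.s3a.endpoint", "http://minio_server:9000")   # placeholder MinIO endpoint
        hconf.set("fs.s3a.path.style.access", "true")
        hconf.set("fs.s3a.access.key", "minioadmin")               # placeholder credentials
        hconf.set("fs.s3a.secret.key", "minioadmin")

        # Read from and write to a MinIO bucket through the s3a:// scheme.
        df = spark.read.csv("s3a://some-bucket/input.csv", header=True)
        df.write.mode("overwrite").parquet("s3a://some-bucket/output/")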

      Pivotal Greenplum does the same here:
      https://gpdb.docs.pivotal.io/6-3/pxf/s3_objstore_cfg.html

      Attachments

        1. minio_docker.7z (0.9 kB), attached by Brandon Jackson


          People

            Assignee: Unassigned
            Reporter: Brandon Jackson (usbrandon)
            Votes: 0
            Watchers: 2

