Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
MinIO is a free, open-source, S3-compatible object store. It is easy to run in Docker containers and is very performant. It would be wonderful if Hop could read from and write to it. In theory this should be easy: most other software I have seen with S3 connectivity exposes two connection variables that point at MinIO instead of Amazon S3, plus a setting controlling the path access style. Please see below:
https://docs.dremio.com/software/data-sources/s3/#configuring-s3-for-minio
Dremio has a neat way of allowing access to S3-compatible object stores like MinIO by setting two connection flags:
*fs.s3a.path.style.access = true*
*fs.s3a.endpoint = minio_server:9000*
These appear to be settings that the Hadoop jars understand. Are they supported in some way by VFS in Hop, so that Hop can read from and write to MinIO while essentially speaking "S3" to it?
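For reference, both flags are standard Hadoop S3A properties. A minimal `core-site.xml` fragment pointing S3A at a MinIO server might look like the sketch below; the host/port and the disabled-SSL setting are assumptions for a typical local MinIO setup:

```xml
<configuration>
  <!-- Point the S3A filesystem at a MinIO endpoint instead of AWS (assumed local address) -->
  <property>
    <name>fs.s3a.endpoint</name>
    <value>minio_server:9000</value>
  </property>
  <!-- MinIO buckets are not DNS subdomains, so use path-style access -->
  <property>
    <name>fs.s3a.path.style.access</name>
    <value>true</value>
  </property>
  <!-- A local MinIO instance typically serves plain HTTP (assumption) -->
  <property>
    <name>fs.s3a.connection.ssl.enabled</name>
    <value>false</value>
  </property>
</configuration>
```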
This is not just a Dremio thing. All kinds of search hits, Spark among them, show similar settings applied via `hadoopConfiguration.set(...)`:
https://www.jitsejan.com/setting-up-spark-with-minio-as-object-storage
Pivotal Greenplum does the same here:
https://gpdb.docs.pivotal.io/6-3/pxf/s3_objstore_cfg.html