Details
-
Test
-
Status: Resolved
-
P2
-
Resolution: Done
-
None
-
None
Description
We should add a large scale performance test for HadoopInputFormatIO. We should use PerfKitBenchmarker based performance testing framework [1] to manage Kubernetes based muti-node data store and to publish benchmark results.
Example input format implementation to use: DBInputFormat to connect to a Postgres instance.
https://github.com/hanborq/hadoop/blob/master/src/mapred/org/apache/hadoop/mapreduce/lib/db/DBInputFormat.java
Example docker image to use: https://hub.docker.com/_/postgres/