Description
Processing geo-spatial data is compute-intensive. At large scale, for example when joining millions of GPS traces with huge spatial data sets such as OpenStreetMap, spatial processing can benefit from massively parallel processing platforms such as Apache Flink.
We propose a project to draft and bootstrap a geo-spatial processing library for Apache Flink.
The focus of this project is the design of the library. Key features should be good integration with other Flink APIs, good extensibility, and compatibility with existing data formats. The library could include a variety of processing functionality, such as:
- import of common data formats
- fuzzy spatial join for different applications
  - containment in area
  - minimum distance to object
  - intersection of polygons
  - ...
- spatial partitioning techniques
- ...
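
To make the listed functionality more concrete, here is a minimal sketch of three of the building blocks: containment in an area, minimum distance to an object, and a simple spatial partitioning scheme. It is plain Python and does not use any Flink API. All function names (point_in_polygon, distance_to_segment, grid_cell, partition_by_cell) are hypothetical and chosen for illustration only, and the code assumes planar (x, y) coordinates, so real GPS latitude/longitude data would first need a suitable projection or a geodesic distance. The uniform grid is just one possible partitioning technique, not a proposed design.

```python
import math
from collections import defaultdict


def point_in_polygon(point, polygon):
    """Containment test via ray casting.

    `polygon` is a list of (x, y) vertices; the last vertex is implicitly
    connected back to the first. Points exactly on the boundary are not
    handled specially in this sketch.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_segment(point, a, b):
    """Minimum Euclidean distance from `point` to the line segment a-b."""
    px, py = point
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def grid_cell(point, cell_size):
    """Map a point to the (column, row) index of a uniform grid cell."""
    x, y = point
    return (math.floor(x / cell_size), math.floor(y / cell_size))


def partition_by_cell(points, cell_size):
    """Group points by grid cell, as a stand-in for a spatial partitioner."""
    cells = defaultdict(list)
    for p in points:
        cells[grid_cell(p, cell_size)].append(p)
    return dict(cells)


if __name__ == "__main__":
    square = [(0, 0), (4, 0), (4, 4), (0, 4)]
    print(point_in_polygon((2, 2), square))
    print(distance_to_segment((5, 2), (4, 0), (4, 4)))
    print(partition_by_cell([(0.5, 0.5), (0.7, 0.2), (3.1, 0.0)], 1.0))
```

In a parallel setting, a cell index like the one returned by grid_cell could serve as the grouping key, so that points and the areas overlapping the same cell are processed together. Areas that span several cells would have to be replicated to each of them, which is one of the design questions such a library would need to address.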