Beam / BEAM-10159

Support Reading data from Databricks Delta

Details

    • Type: New Feature
    • Status: Open
    • Priority: P3
    • Resolution: Unresolved
    • Component/s: io-ideas

    Description

      Databricks Delta is an open-source storage layer that sits on top of different filesystems. The current implementation of Delta is tightly coupled to Spark, so Beam cannot depend on it without breaking Beam's portability.

      However, there is now an open specification of the Delta protocol:
      https://github.com/delta-io/delta/blob/master/PROTOCOL.md
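
A minimal sketch of what reading through the open protocol could involve, independent of Spark: replaying the JSON commit files of the transaction log to find which data files are currently part of the table. It assumes each commit is a newline-delimited list of JSON actions and only handles `add` and `remove`; checkpoints, `metaData`, `protocol` versions and partition values are ignored. The function name `active_files` and the sample commits are hypothetical, for illustration only.

```python
import json
from typing import Iterable, List


def active_files(commits: Iterable[str]) -> List[str]:
    """Replay Delta commit files (in version order) and return live data file paths.

    Each item in `commits` is the text of one commit file: one JSON action
    per line. Only `add` and `remove` actions are interpreted here.
    """
    live = set()
    for commit in commits:
        for line in commit.splitlines():
            if not line.strip():
                continue  # skip blank lines
            action = json.loads(line)
            if "add" in action:
                live.add(action["add"]["path"])
            elif "remove" in action:
                live.discard(action["remove"]["path"])
    return sorted(live)


# Hypothetical two-commit log, for illustration only.
commit_0 = "\n".join([
    json.dumps({"add": {"path": "part-0000.parquet"}}),
    json.dumps({"add": {"path": "part-0001.parquet"}}),
])
commit_1 = "\n".join([
    json.dumps({"remove": {"path": "part-0000.parquet"}}),
    json.dumps({"add": {"path": "part-0002.parquet"}}),
])

print(active_files([commit_0, commit_1]))
# ['part-0001.parquet', 'part-0002.parquet']
```

The resulting list of Parquet paths is the part a Beam source would need; how it would be wired into a Beam transform is left open here.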

      Another possible approach would be to investigate whether Beam could use a manifest-based approach, as Presto does:
      https://docs.databricks.com/delta/presto-integration.html
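
A rough sketch of the manifest-based idea, assuming the manifest is a plain text file listing one data file path per line. The helper name `parse_manifest` and the sample paths are hypothetical; the exact manifest location and layout should be checked against the linked integration page.

```python
from typing import List


def parse_manifest(text: str) -> List[str]:
    """Return the data file paths listed in a manifest file.

    Assumes one path per line; blank lines are skipped.
    """
    return [line.strip() for line in text.splitlines() if line.strip()]


# Hypothetical manifest content, for illustration only.
manifest_text = """\
s3://bucket/table/part-0001.parquet

s3://bucket/table/part-0002.parquet
"""

files = parse_manifest(manifest_text)
print(files)
# ['s3://bucket/table/part-0001.parquet', 's3://bucket/table/part-0002.parquet']
```

With this approach Beam would not interpret the transaction log at all. It would read only the files named in the manifest, for example by passing them to an existing Parquet source such as `ReadAllFromParquet` in the Beam Python SDK. The trade-off is that the manifest has to be regenerated whenever the table changes.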

          People

            Assignee: Unassigned
            Reporter: iemejia (Ismaël Mejía)
            Votes: 0
            Watchers: 4
