Apache Arrow / ARROW-7905

[Go][Parquet] Port the C++ Parquet implementation to Go

Details

    Description

      I’m currently in the process of porting the C++ version of Parquet in the Apache Arrow project to Go. Many projects and companies have built, and continue to build, their data lakes and persistence layers on Parquet. Apache Spark uses it heavily for persistence (including Databricks Delta Lake).

      To me, this is the missing component that people need before they can truly begin using the Go implementation of Arrow with existing data architectures.

      If you are interested in this project, watch this issue; it will keep me motivated to finish the port. Also, if you have specific use cases, feel free to post them here so I can keep them in mind as I continue with the port.

      The code base is rather in flux at the moment as I figure out how to reconcile various differences between the features of C++ and Go. As soon as I have a solid chunk of the port working, I’ll create a PR in the Apache Arrow project on GitHub and let everyone know here.

          People

            zeroshade Matthew Topol
            nickpoorman Nick Poorman
            Votes: 3
            Watchers: 10

              Time Tracking

                Estimated: Not Specified
                Remaining: 25.55h
                Logged: 232h 56m
