TinkerPop / TINKERPOP-319

BulkLoaderVertexProgram for generalized batch loading across graphs

    Description

      After working on BulkLoaderVertexProgram for Titan, it would be trivial to add this to TinkerPop in general, as the equivalent of BlueprintsOutputFormat (or whatever the Blueprints-specific bulk loader was called). Moreover, given that Titan and TinkerPop have the same data model, there is no longer a data model alignment issue, so Titan does not need its own BulkLoaderVertexProgram. The difference would be that instead of:

      g.V.compute().program(BulkLoaderVertexProgram.build().titan(propertiesFile).create()).submit()
      

      It would simply be:

      g.V.compute().program(BulkLoaderVertexProgram.build().factory(propertiesFile).create()).submit()
      

      ...and BulkLoaderVertexProgram would use GraphFactory.open() to instantiate the connection to the graph. Moreover (and spmallette will need to clear my head here), if the factory opened up a Gremlin Server connection, then we would get parallel writing to embedded graph databases like Neo4j.
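
      To make the factory idea concrete, here is a minimal sketch in plain Java with no TinkerPop dependency. As far as I recall, GraphFactory resolves the implementation class from the `gremlin.graph` key and calls a static `open` method on it; the sketch imitates that lookup by reflection. `ToyGraph`, `FactorySketch`, and the `toy.name` key are hypothetical names made up for illustration, and `java.util.Properties` stands in for the real configuration type.

```java
import java.lang.reflect.Method;
import java.util.Properties;

// Hypothetical stand-in for a graph implementation; NOT a TinkerPop class.
final class ToyGraph {
    final String name;

    private ToyGraph(String name) {
        this.name = name;
    }

    // Static factory method that the generic opener finds by reflection.
    public static ToyGraph open(Properties conf) {
        return new ToyGraph(conf.getProperty("toy.name", "unnamed"));
    }
}

final class FactorySketch {
    // Reads the implementation class name from "gremlin.graph" and invokes
    // its static open(Properties) method. Assumption: this mirrors the shape
    // of GraphFactory.open(), it is not the actual implementation.
    static Object open(Properties conf) {
        String className = conf.getProperty("gremlin.graph");
        if (className == null) {
            throw new IllegalArgumentException("missing gremlin.graph");
        }
        try {
            Method opener = Class.forName(className).getMethod("open", Properties.class);
            return opener.invoke(null, conf);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot open " + className, e);
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty("gremlin.graph", ToyGraph.class.getName());
        conf.setProperty("toy.name", "target");
        ToyGraph graph = (ToyGraph) open(conf);
        System.out.println(graph.name);
    }
}
```

      The point of the indirection is that the loader never names Titan, Neo4j, or any other implementation; swapping the target graph is only a change to the properties file.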

      BulkLoaderVertexProgram is simply a vertex program that loads a graph in parallel (using a graph computer) into any other graph that can be accessed via GraphFactory (which is every TP3 graph).
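
      A rough sketch of what "loads a graph in parallel" could look like, again in plain Java rather than against the TinkerPop API. The two-pass scheme (create all vertices first and record an id mapping, then create edges) is an assumption for illustration and is not spelled out in this issue. `ParallelLoadSketch` and `TargetGraph` are hypothetical names, and parallel streams stand in for a graph computer's workers.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class ParallelLoadSketch {
    // Hypothetical target graph: thread-safe vertex and edge sets.
    static final class TargetGraph {
        final Set<Long> vertices = ConcurrentHashMap.newKeySet();
        final Set<String> edges = ConcurrentHashMap.newKeySet();
        private final AtomicLong nextId = new AtomicLong();

        long addVertex() {
            long id = nextId.incrementAndGet();
            vertices.add(id);
            return id;
        }

        void addEdge(long out, long in) {
            edges.add(out + "->" + in);
        }
    }

    // source maps each vertex id to the ids of its outgoing neighbours.
    // Assumption: every neighbour id also appears as a key in the map.
    static TargetGraph load(Map<Integer, List<Integer>> source) {
        TargetGraph target = new TargetGraph();
        Map<Integer, Long> idMap = new ConcurrentHashMap<>();

        // Pass 1: each worker creates its vertex and records source id -> target id.
        source.keySet().parallelStream()
              .forEach(v -> idMap.put(v, target.addVertex()));

        // Pass 2: with all target ids known, each worker writes its outgoing edges.
        source.entrySet().parallelStream()
              .forEach(e -> e.getValue()
                             .forEach(in -> target.addEdge(idMap.get(e.getKey()), idMap.get(in))));
        return target;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> source = Map.of(
                1, List.of(2, 3),
                2, List.of(3),
                3, List.<Integer>of());
        TargetGraph target = load(source);
        System.out.println(target.vertices.size() + " vertices, " + target.edges.size() + " edges");
    }
}
```

      Target ids are handed out in whatever order the workers run, so only the counts are stable between runs, not the specific ids.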

      dalaro @mbroecheler dkuppitz

      EXTENDED NOTES:

      • SchemaInference would be a MapReduce job executed prior to BulkLoaderVertexProgram
      • Titan and Neo4j can each have their own SchemaInference implementations.
      • Incremental loading .... I forget how this worked.
      • Bulk mutations ... this is possible at the TP3 level with hidden properties and smart add/remove/etc.
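
      The SchemaInference note above could be sketched as a map/reduce over property keys. This is plain Java with streams standing in for a MapReduce job; `SchemaInferenceSketch` and the rule of widening conflicting types to "Object" are hypothetical choices for illustration, not anything Titan or Neo4j is known to do.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class SchemaInferenceSketch {
    // Map step: emit (property key, value type name) for every property.
    // Reduce step: keep the type if all values agree, otherwise widen to "Object".
    static Map<String, String> infer(List<Map<String, Object>> vertices) {
        return vertices.parallelStream()
                .flatMap(props -> props.entrySet().stream())
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        e -> e.getValue().getClass().getSimpleName(),
                        (a, b) -> a.equals(b) ? a : "Object",
                        TreeMap::new));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> vertices = List.of(
                Map.<String, Object>of("name", "marko", "age", 29),
                Map.<String, Object>of("name", "vadas", "age", "27"));
        System.out.println(infer(vertices));
    }
}
```

      Running something like this before the load would let the target graph define its keys and types up front instead of discovering them during the bulk write.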

          People

            dkuppitz Daniel Kuppitz
            okram Marko A. Rodriguez