Uploaded image for project: 'Beam'
  1. Beam
  2. BEAM-4567

Can't use mongo connector with Atlas MongoDB

Details

    • Bug
    • Status: Resolved
    • P2
    • Resolution: Fixed
    • 2.4.0
    • 2.12.0
    • io-java-mongodb
    • Google Cloud Dataflow

    Description

      I can't use the MongoDB connector with a managed Atlas instance. The current implementations makes use of splitVector which is a high-privilege function that cannot be assigned to any user in Atlas.

      An open Jira issue for MongoDB suggests using $sample and $bucketAuto to circunvent this necessity.

      Following is the exception thrown (removed some identifiable information):
      Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: com.mongodb.MongoCommandException: Command failed with error 13: 'not authorized on <collection> to execute command { splitVector: "<collection>.<table>", keyPattern:

      { _id: 1 }

      , force: false, maxChunkSize: 1 }' on server <server>. The full response is { "ok" : 0.0, "errmsg" : "not authorized on <collection> to execute command { splitVector: \"<collection>.<table>\", keyPattern:

      { _id: 1 }

      , force: false, maxChunkSize: 1 }", "code" : 13, "codeName" : "Unauthorized" }
       
      at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
       
      at org.apache.beam.sdk.Pipeline.run(Pipeline.java:297)
       
      at br.dotz.datalake.ingest.mongodb.MongoDBCollectorPipeline.main(MongoDBCollectorPipeline.java:27)
       
      Caused by: com.mongodb.MongoCommandException: Command failed with error 13: 'not authorized on <collection> to execute command { splitVector: "<collection>.<table>", keyPattern:

      { _id: 1 }

      , force: false, maxChunkSize: 1 }' on server <server>. The full response is { "ok" : 0.0, "errmsg" : "not authorized on <collection> to execute command { splitVector: \"<collection>.<table>\", keyPattern:

      { _id: 1 }

      , force: false, maxChunkSize: 1 }", "code" : 13, "codeName" : "Unauthorized" }
       
      at com.mongodb.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:115)
       
      at com.mongodb.connection.CommandProtocol.execute(CommandProtocol.java:114)
       
      at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:159)
       
      at com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:286)
       
      at com.mongodb.connection.DefaultServerConnection.command(DefaultServerConnection.java:173)
       
      at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:215)
       
      at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:186)
       
      at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:178)
       
      at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:91)
       
      at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:84)
       
      at com.mongodb.operation.CommandReadOperation.execute(CommandReadOperation.java:55)
       
      at com.mongodb.Mongo.execute(Mongo.java:772)
       
      at com.mongodb.Mongo$2.execute(Mongo.java:759)
       
      at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:130)
       
      at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:124)
       
      at com.mongodb.MongoDatabaseImpl.runCommand(MongoDatabaseImpl.java:114)
       
      at org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.split(MongoDbIO.java:332)
       
      at org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$InputProvider.getInitialInputs(BoundedReadEvaluatorFactory.java:210)
       
      at org.apache.beam.runners.direct.ReadEvaluatorFactory$InputProvider.getInitialInputs(ReadEvaluatorFactory.java:87)
       
      at org.apache.beam.runners.direct.RootProviderRegistry.getInitialInputs(RootProviderRegistry.java:62)

      Attachments

        1. mongodbio.py
          17 kB
          Susumu Asaga
        2. mongodbio_test.py
          11 kB
          Susumu Asaga
        3. mongodbio_it_test.py
          3 kB
          Susumu Asaga

        Issue Links

          Activity

            People

              sandboxws Ahmed El.Hussaini
              lucas-rosa Lucas de Sio Rosa
              Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - 168h
                  168h
                  Remaining:
                  Remaining Estimate - 168h
                  168h
                  Logged:
                  Time Spent - Not Specified
                  Not Specified