A document can be either a Clojure map (in the majority of cases, it is) or an instance of com.mongodb.DBObject (referred to later as DBObject). In case your application obtains DBObjects from other libraries, you can insert those as well.
If you insert a document without the :_id key, the MongoDB Java driver that Monger uses under the hood will generate one for you. Unfortunately, it does so by mutating the document you pass it. With Clojure's immutable data structures, that won't work the way the MongoDB Java driver authors expected.
Sometimes you need to insert a batch of documents all at once, and you need it done efficiently. MongoDB supports batch inserts. To do this with Monger, use the monger.collection/insert-batch function:
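Monger's own call takes a database, a collection name, and a vector of maps, e.g. `(monger.collection/insert-batch db "documents" [{:name "Alice"} {:name "Bob"}])`. Since Monger delegates to the MongoDB Java driver, here is a hedged sketch of the driver-level equivalent, `insertMany`; the connection string, database, and collection names are placeholders:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;
import java.util.List;

public class BatchInsertExample {
    public static void main(String[] args) {
        // Placeholder connection string; point this at your own deployment.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> coll =
                client.getDatabase("monger-test").getCollection("documents");
            // insertMany sends all documents to the server as a single batch.
            coll.insertMany(List.of(
                new Document("name", "Alice"),
                new Document("name", "Bob")));
        }
    }
}
```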
As an example, the following code inserts two rows into a table that contains an INTEGER column and a VARCHAR column. The example binds values to the parameters in the INSERT statement and calls addBatch() and executeBatch() to perform a batch insert.
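The original listing is not included here; the following is a sketch of the standard JDBC pattern the text describes. The table name, column names, URL, and credentials are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class JdbcBatchInsert {
    public static void main(String[] args) throws Exception {
        // Placeholder JDBC URL and credentials for a Snowflake account.
        String url = "jdbc:snowflake://<account>.snowflakecomputing.com/";
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO t (int_col, varchar_col) VALUES (?, ?)")) {
            ps.setInt(1, 1);
            ps.setString(2, "one");
            ps.addBatch();          // queue the first row

            ps.setInt(1, 2);
            ps.setString(2, "two");
            ps.addBatch();          // queue the second row

            // Send both rows to the server in a single batched statement.
            int[] counts = ps.executeBatch();
            System.out.println("Batched rows: " + counts.length);
        }
    }
}
```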
When you use this technique to insert a large number of values, the driver can improve performance by streaming the data (without creating files on the local machine) to a temporary stage for ingestion. The driver automatically does this when the number of values exceeds a threshold.
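The threshold and staging mechanism are internal to the driver, but the idea of threshold-triggered batching can be made concrete with a toy accumulator. All names and the threshold value below are invented for illustration; this is not Snowflake code:

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of threshold-triggered batching: rows accumulate in
// memory and are flushed as one bulk operation once a threshold is crossed.
public class ThresholdBatcher {
    private final int threshold;
    private final List<String> pending = new ArrayList<>();
    private int flushes = 0;

    public ThresholdBatcher(int threshold) { this.threshold = threshold; }

    public void add(String row) {
        pending.add(row);
        if (pending.size() >= threshold) {
            flush();
        }
    }

    public void flush() {
        if (pending.isEmpty()) return;
        // In the real driver this step would stream the batch to a
        // temporary stage instead of just counting it.
        flushes++;
        pending.clear();
    }

    public int flushCount() { return flushes; }

    public static void main(String[] args) {
        ThresholdBatcher b = new ThresholdBatcher(3);
        for (int i = 0; i < 7; i++) b.add("row-" + i);
        b.flush(); // flush the remaining partial batch of 1
        System.out.println(b.flushCount()); // prints 3 (batches of 3 + 3 + 1)
    }
}
```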
When handling errors and exceptions for a JDBC application, you can use the ErrorCode.java file that Snowflake provides to determine the cause of the problem. Error codes specific to the JDBC driver start with 2, in the form 2NNNNN.
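Since the codes have the form 2NNNNN (six digits starting with 2), they fall in the range 200000 to 299999, and a SQLException's vendor code can be checked against that range. The helper name below is ours, not Snowflake's, and the sample codes are invented for illustration:

```java
import java.sql.SQLException;

// Sketch: classify a SQLException by its vendor error code. Per the text,
// codes specific to the Snowflake JDBC driver have the form 2NNNNN.
public class SnowflakeErrors {
    public static boolean isJdbcDriverError(SQLException e) {
        int code = e.getErrorCode();
        return code >= 200000 && code <= 299999;
    }

    public static void main(String[] args) {
        // Both error codes here are made up for demonstration purposes.
        SQLException driverSide = new SQLException("driver-side failure", "XX000", 200001);
        SQLException serverSide = new SQLException("syntax error", "42000", 1003);
        System.out.println(isJdbcDriverError(driverSide)); // prints true
        System.out.println(isJdbcDriverError(serverSide)); // prints false
    }
}
```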
The link to the ErrorCode.java in the public snowflake-jdbc git repository points to the latest version of the file, which might differ from the version of the JDBC driver you currently use.
Depending on the application, either the mongodb-driver-sync or the mongodb-driver-reactivestreams artifact is required in addition to the mandatory mongodb-driver-core. It is possible to combine the sync and reactive drivers in one application if needed.
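For example, a Maven project using the synchronous driver might declare the following dependency (the version number is illustrative; check for the current release):

```xml
<!-- mongodb-driver-sync pulls in mongodb-driver-core and bson transitively -->
<dependency>
  <groupId>org.mongodb</groupId>
  <artifactId>mongodb-driver-sync</artifactId>
  <version>4.11.0</version> <!-- illustrative; use the current release -->
</dependency>
```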
For most tasks, you should use MongoTemplate or the Repository support, which both leverage the rich mapping functionality. MongoTemplate is the place to look for accessing functionality such as incrementing counters or ad-hoc CRUD operations. MongoTemplate also provides callback methods so that it is easy for you to get the low-level API artifacts, such as com.mongodb.client.MongoDatabase, to communicate directly with MongoDB. The goal with naming conventions on various API artifacts is to copy those in the base MongoDB Java driver so you can easily map your existing knowledge onto the Spring APIs.
While com.mongodb.client.MongoClient is the entry point to the MongoDB driver API, connecting to a specific MongoDB database instance requires additional information, such as the database name and an optional username and password. With that information, you can obtain a com.mongodb.client.MongoDatabase object and access all the functionality of a specific MongoDB database instance. Spring provides the org.springframework.data.mongodb.core.MongoDatabaseFactory interface, shown in the following listing, to bootstrap connectivity to the database:
The MongoTemplate class implements the interface MongoOperations. As much as possible, the methods on MongoOperations are named after methods available on the MongoDB driver Collection object, to make the API familiar to existing MongoDB developers who are used to the driver API. For example, you can find methods such as find, findAndModify, findAndReplace, findOne, insert, remove, save, update, and updateMulti. The design goal was to make it as easy as possible to transition between the use of the base MongoDB driver and MongoOperations. A major difference between the two APIs is that MongoOperations can be passed domain objects instead of Document. Also, MongoOperations has fluent APIs for Query, Criteria, and Update operations instead of populating a Document to specify the parameters for those operations.
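To illustrate the fluent API, here is a small sketch. It assumes a hypothetical Person domain class with "name" and "visits" fields and an already-configured MongoTemplate; nothing below is copied from the Spring reference:

```java
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.core.query.Update;

public class FluentApiExample {
    // Person is an assumed domain class; MongoTemplate is assumed configured.
    void incrementVisits(MongoTemplate template) {
        // Fluent Criteria/Query instead of hand-building a Document filter.
        Query query = new Query(Criteria.where("name").is("Alice"));
        // Fluent Update instead of a raw {"$inc": {"visits": 1}} Document.
        Update update = new Update().inc("visits", 1);
        // A domain class, not a Document, selects the target collection.
        template.updateFirst(query, update, Person.class);
    }
}
```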
While com.mongodb.reactivestreams.client.MongoClient is the entry point to the reactive MongoDB driver API, connecting to a specific MongoDB database instance requires additional information, such as the database name. With that information, you can obtain a com.mongodb.reactivestreams.client.MongoDatabase object and access all the functionality of a specific MongoDB database instance. Spring provides the org.springframework.data.mongodb.core.ReactiveMongoDatabaseFactory interface to bootstrap connectivity to the database. The following listing shows the ReactiveMongoDatabaseFactory interface:
The ReactiveMongoTemplate class implements the ReactiveMongoOperations interface. As much as possible, the methods on ReactiveMongoOperations mirror methods available on the MongoDB driver Collection object, to make the API familiar to existing MongoDB developers who are used to the driver API. For example, you can find methods such as find, findAndModify, findOne, insert, remove, save, update, and updateMulti. The design goal is to make it as easy as possible to transition between the use of the base MongoDB driver and ReactiveMongoOperations. A major difference between the two APIs is that ReactiveMongoOperations can be passed domain objects instead of Document, and there are fluent APIs for Query, Criteria, and Update operations instead of populating a Document to specify the parameters for those operations.
There are many convenience methods on ReactiveMongoTemplate to help you easily perform common tasks. However, if you need to access the MongoDB driver API directly to access functionality not explicitly exposed by ReactiveMongoTemplate, you can use one of several execute callback methods to access underlying driver APIs. The execute callbacks give you a reference to either a com.mongodb.reactivestreams.client.MongoCollection or a com.mongodb.reactivestreams.client.MongoDatabase object. See Execution Callbacks for more information.
Now I've set "ContinueOnLastError" to true, hoping this will prevent exceptions in the bulk inserts (I'm using the Java driver). However, I would prefer that, in case of a collision, the documents are overwritten instead of keeping the one that's already in the collection.
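One way to get overwrite-on-collision semantics with the modern Java driver is a bulkWrite of ReplaceOneModel operations with upsert enabled; unordered execution plays a role similar to ContinueOnLastError. A hedged sketch, with connection details and document contents as placeholders:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.BulkWriteOptions;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.ReplaceOneModel;
import com.mongodb.client.model.ReplaceOptions;
import org.bson.Document;
import java.util.List;

public class UpsertBatch {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> coll =
                client.getDatabase("test").getCollection("docs");
            Document d1 = new Document("_id", 1).append("v", "new-1");
            Document d2 = new Document("_id", 2).append("v", "new-2");
            // ReplaceOneModel with upsert(true): inserts when no match exists,
            // overwrites the stored document on an _id collision.
            List<ReplaceOneModel<Document>> ops = List.of(
                new ReplaceOneModel<>(Filters.eq("_id", 1), d1,
                        new ReplaceOptions().upsert(true)),
                new ReplaceOneModel<>(Filters.eq("_id", 2), d2,
                        new ReplaceOptions().upsert(true)));
            // ordered(false) keeps processing the remaining operations after
            // an error, similar in spirit to ContinueOnLastError.
            coll.bulkWrite(ops, new BulkWriteOptions().ordered(false));
        }
    }
}
```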
Even with BulkMode.ORDERED, there can be other failures during this batch insertion, for example network blips or server crashes. Thus, this approach is fine if we are bulk inserting thousands of rows (which would probably take a second or two). However, for inserting millions of records, it is best to batch the process using Spring Batch or custom batching logic. The idea is that you want to be able to resume the insertion in the event of a failure. How the failure is handled depends entirely on the type of failure that occurred.
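A sketch of what such custom batching logic might look like: split the records into fixed-size batches and persist a checkpoint so the job can resume after a failure. All names here are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Partition a large list of records into fixed-size batches. A resume
// checkpoint can be as simple as persisting the index of the last batch
// that completed and restarting from index + 1 after a failure.
public class BatchRunner {
    public static <T> List<List<T>> partition(List<T> records, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < records.size(); i += batchSize) {
            batches.add(records.subList(i, Math.min(i + batchSize, records.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 10; i++) records.add(i);
        List<List<Integer>> batches = partition(records, 4);
        System.out.println(batches.size()); // prints 3 (batches of 4 + 4 + 2)
    }
}
```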
batchSize() helps us reduce network overhead transparently in the MongoDB driver. But sometimes the only way to optimize your network round trips is to tweak your application logic. For instance, consider this logic:
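The original snippet is not shown here; a representative version of the pattern is a loop of one-document queries that can be collapsed into a single $in-style query. A toy illustration with an in-memory stand-in for the database that counts round trips (no real driver calls are made, and all names are invented):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Stand-in for a remote collection that counts "round trips", to show why
// one batched $in-style query beats N single-document queries.
public class RoundTrips {
    public static int roundTrips = 0;
    static final Map<Integer, String> store = Map.of(1, "a", 2, "b", 3, "c");

    public static String findOne(int id) {   // one network round trip per call
        roundTrips++;
        return store.get(id);
    }

    public static List<String> findIn(List<Integer> ids) { // one trip total
        roundTrips++;
        List<String> out = new ArrayList<>();
        for (int id : ids) out.add(store.get(id));
        return out;
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(1, 2, 3);
        for (int id : ids) findOne(id);  // 3 round trips, one per document
        findIn(ids);                     // 1 more round trip for all three
        System.out.println(roundTrips);  // prints 4
    }
}
```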
Spring Initializr creates a simple class for the application. The following listing shows the class that Initializr created for this example (in src/main/java/com/example/accessingdatamongodb/AccessingDataMongodbApplication.java):
Now you need to modify the simple class that the Initializr created for you. You need to set up some data and use it to generate output. The following listing shows the finished AccessingDataMongodbApplication class (in src/main/java/com/example/accessingdatamongodb/AccessingDataMongodbApplication.java):