MongoDB's script has limited computational ability for complicated operations, so it is difficult to solve problems of this kind using it alone. In many cases, you can only perform further computations after retrieving the desired data. It is no less difficult to implement this kind of set operation in a high-level programming language like Java. In this case, you can use esProc to help with the computation in MongoDB. The following example explains how esProc works.
There is a collection – test – in MongoDB, as shown below:
> db.test.find({},{"_id":0})
{ "value" : NumberLong(112937552) }
{ "value" : NumberLong(715634640) }
{ "value" : NumberLong(487229712) }
{ "value" : NumberLong(79198330) }
{ "value" : NumberLong(440998943) }
{ "value" : NumberLong(93148782) }
{ "value" : NumberLong(553008873) }
{ "value" : NumberLong(336369168) }
{ "value" : NumberLong(369669461) }
…
Specifically, test contains multiple values, each of which is a string of digits. Each value must be compared with all the other values to find the largest number of digits it shares with another value and the largest number of digits it does not share. If the digit 1 appears both in the first row and in the nth row, it counts as one same digit. If a digit appears only in the first row and not in the nth row, it counts as one different digit.
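The counting rule for a single pair of values can be illustrated with a small Python sketch. This is not the esProc script; the function name digit_overlap is hypothetical, and the sketch assumes that a digit repeated within one value is counted only once.

```python
def digit_overlap(a: int, b: int) -> tuple[int, int]:
    """Compare value a against value b.

    Returns (same, diff): how many distinct digits of a also appear in b,
    and how many distinct digits of a do not appear in b.
    """
    digits_a = set(str(a))  # distinct digits of the first value
    digits_b = set(str(b))  # distinct digits of the second value
    same = len(digits_a & digits_b)
    diff = len(digits_a - digits_b)
    return same, diff


# Compare the first two sample values from the test collection.
print(digit_overlap(112937552, 715634640))
```

Note that the comparison is not symmetric: diff counts only the digits of the first value that are missing from the second.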
esProc code:
The final result after the code is executed is as follows:
A1: Connect to MongoDB. The host and port are localhost:27017, and the database name, user name, and password are all test.
A2: The find function is used to fetch data from MongoDB and create a cursor. test is the collection, the filtering condition is null, and the specified key _id won't be fetched. As you can see, the find function in esProc uses the same parameter format as the find statement in MongoDB. esProc's cursor supports fetching and processing data in batches, which avoids the memory overflow caused by importing a large amount of data at once. As the data size here is not big, the fetch function is used to get all the records from the cursor at once.
A3: Add two new columns to A2 for storing the largest same count and the largest different count, and at the same time convert the values into strings.
A4: Loop over the collection in A3. The loop body covers the area B4-D10.
B4: Get the value for the current iteration.
C4: Use array@s to split the column value into a sequence of single characters and remove the duplicate values.
B5: Run an inner loop over the collection in A3. The loop body is C6-D10.
C5: If the position of the inner loop is the same as the current position of the outer loop, that is, both point to the same value, skip the current inner iteration and move on to the next.
C6: Get the value for the current inner iteration.
C7: Define two variables, same and diff, for storing the counts of same and different digits found in the current comparison. Both are initialized to zero.
C8: The loop function checks, one by one, whether each digit in the sequence split from the outer-loop value appears in the inner-loop value. If it does, same increases by one; otherwise, diff increases by one.
C9, C10: Compare same and diff with those in A4, and assign the larger values back to same and diff in A4.
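The steps described for A3 through C10 can be approximated in Python as a rough sketch. It is an illustration of the described logic rather than the esProc script itself: the function name max_same_and_diff and the dictionary layout are hypothetical, and the values are passed in as a plain list instead of being fetched from MongoDB.

```python
def max_same_and_diff(values):
    """For each value, find the largest counts of shared and unshared
    distinct digits when it is compared with every other value."""
    # As in A3: store values as strings and add two result fields.
    rows = [{"value": str(v), "same": 0, "diff": 0} for v in values]

    for i, outer in enumerate(rows):          # outer loop, as in A4
        outer_digits = set(outer["value"])    # distinct characters, as in C4
        for j, inner in enumerate(rows):      # inner loop, as in B5
            if i == j:                        # skip comparing a row with itself (C5)
                continue
            inner_value = inner["value"]      # as in C6
            same = diff = 0                   # as in C7
            for digit in outer_digits:        # as in C8
                if digit in inner_value:
                    same += 1
                else:
                    diff += 1
            # As in C9 and C10: keep the larger counts on the outer row.
            outer["same"] = max(outer["same"], same)
            outer["diff"] = max(outer["diff"], diff)
    return rows


# Usage with the first three sample values from the test collection.
for row in max_same_and_diff([112937552, 715634640, 487229712]):
    print(row["value"], row["same"], row["diff"])
```

Every value is compared with every other value, so the work grows with the square of the number of records. That is acceptable for a small collection like this one.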
Note: esProc does not ship with MongoDB's Java driver. To access MongoDB from esProc, you must first put MongoDB's Java driver (esProc requires version 2.12.2 or above, e.g. mongo-java-driver-2.12.2.jar) into [esProc installation directory]\common\jdbc.
The esProc script used to help MongoDB with the computation is easy to integrate into a Java program. You just need to add one more line of code, A11, that is, result A3, to output the result to the Java program as a resultset. For the detailed code, please refer to the esProc Tutorial. Likewise, MongoDB's Java driver must be put on the classpath of the Java program before it accesses MongoDB by calling an esProc program.