MongoDB doesn’t support the complex
subquery which can only be realized by retrieving the data out first and then
performing further computation. The operation is the same complex even if Java
or other programming languages are used to write the program. In view of this,
we can consider using esProc to help MongoDB with the computation, the method
of which will be illustrated in detail through an example.
In MongoDB, there are two docments: orders, which stores the orders data,
and employee, which stores the
employee data. They are as follows:
> db.orders.find();
{ "_id" : ObjectId("5434f88dd00ab5276493e270"),
"ORDERID" : 1, "CLIENT" : "UJRNP
", "SELLERID" : 17,
"AMOUNT" : 392, "ORDERDATE" : "2008/11/2 15:28" }
{ "_id" :
ObjectId("5434f88dd00ab5276493e271"), "ORDERID" : 2,
"CLIENT" : "SJCH"
, "SELLERID" : 6,
"AMOUNT" : 4802, "ORDERDATE" : "2008/11/9 15:28"
}
{ "_id" :
ObjectId("5434f88dd00ab5276493e272"), "ORDERID" : 3,
"CLIENT" : "UJRNP
", "SELLERID" : 16,
"AMOUNT" : 13500, "ORDERDATE" : "2008/11/5 15:28"
}
{ "_id" :
ObjectId("5434f88dd00ab5276493e273"), "ORDERID" : 4,
"CLIENT" : "PWQ",
"SELLERID" : 9, "AMOUNT" :
26100, "ORDERDATE" : "2008/11/8 15:28" }
…
> db.employee.find();
{ "_id" :
ObjectId("5437413513bdf2a4048f3480"), "EID" : 1,
"NAME" : "Rebecca", "
SURNAME" : "Moore",
"GENDER" : "F", "STATE" : "California",
"BIRTHDAY" : "1974-1
1-20", "HIREDATE" :
"2005-03-11", "DEPT" : "R&D",
"SALARY" : 7000 }
{ "_id" :
ObjectId("5437413513bdf2a4048f3481"), "EID" : 2,
"NAME" : "Ashley", "S
URNAME" : "Wilson",
"GENDER" : "F", "STATE" : "New York",
"BIRTHDAY" : "1980-07-
19", "HIREDATE" :
"2008-03-16", "DEPT" : "Finance",
"SALARY" : 11000 }
{ "_id" :
ObjectId("5437413513bdf2a4048f3482"), "EID" : 3,
"NAME" : "Rachel", "S
URNAME" : "Johnson",
"GENDER" : "F", "STATE" : "New Mexico",
"BIRTHDAY" : "1970-
12-17", "HIREDATE" :
"2010-12-01", "DEPT" : "Sales",
"SALARY" : 9000 }
…
It is required to select the orders data in
which SELLERID is the EID of records of employees where STATE is California in employee. This can be expressed in SQL
like this:
Select * from
orders where orders.sellerid in (select eid from employee where employee.state=’
California’)。
The data of orders cannot be loaded entirely because their size is big, while
the data size of both employee and
the final result are not big.
The esProc script used to help MongoDB
realize this subquery is as follows:
A1: Connect to MongoDB. localhost:27017 is the IP and the port number. test is the database name, user name as
well as the password.
A2: Use find
function to fetch the data from MongoDB and create a cursor. orders is the collection, with a
filtering condition being null and _id
, the specified key, not being fetched. Here esProc uses the same parameter
format in find function as that in find statement of MongoDB. As esProc’s cursor
supports fetching and processing data in batches, the memory overflow caused by
importing big data all at once can thus be avoided.
A3: Fetch the desired data from employee according to the condition that
STATE is California.
A4: Sort the EID field of A3’s employee
data.
A5: Select the desired data from orders according to the condition
requiring that SELLERID must exist in the table sequence in A4. That is, the
condition can be expressed as SELLERID in
A4. The resulting data can be loaded all at once. But if the data are too
big, they can be fetched in batches. To fetch the top 1,000 rows, for example,
can be expressed as fetch(1000).
The result is as follows:
Note:esProc isn’t equipped
with a Java driver included in MongoDB. So to access MongoDB using esProc, you
must put MongoDB’s Java driver (a version of 2.12.2 or above is required for
esProc, e.g. mongo-java-driver-2.12.2.jar) into [esProc installation
directory]\common\jdbc beforehand.
The esProc script used to help MongoDB with
the computation is easy to be integrated into the Java program. You just need
to add another line of code - result A6
to output a result in the form of resultset
to Java program. For the detailed code, please refer to esProc Tutorial. In the same way, MongoDB’s Java driver must be put
into the classpath of a Java program before the latter accesses MongoDB by
calling an esProc program.
No comments:
Post a Comment