Data are often stored across several data tables, for example, product sales records split by category, or employee profiles split by department. In such cases, the data from multiple tables must be merged for combined use. For several ordinary homogeneous TSeqs, you can use A.conj() or A.merge(x) to merge the records of each TSeq into an RSeq. If the tables hold big data, you can instead use CS.conj@x() and CS.merge@x(x) to combine the cursors in the cursor sequence CS; the data are then merged as they are read out during retrieval.
The data in a cursor can be traversed only once, so it is impossible to sort the data again after merging and retrieving everything from the cursors. For this reason, the data in each cursor must already be ordered before the data from multiple cursors are merged.
Next, let's look at how CS.conj@x() and CS.merge@x(x) are used and how they differ. First, consider the simple union.
Four text files record the order information for wines, electrical appliances, foods, and books respectively. In A6, the cursors over the four text files are united. To find out the order in which the data are retrieved, the code retrieves 300 records at a time and stops once the retrieved data contain records of more than one category. The TSeq retrieved at that point can be seen in B7 as follows:
As the result shows, the union cursor starts to retrieve the electrical appliance data from the 2nd text file only after all wine order data have been retrieved from the 1st. In other words, after a simple union with the CS.conj@x() function, the records in the resulting cursor are retrieved in the same order as the cursors are arranged in the cursor sequence CS.
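The retrieval order of a simple union can be illustrated outside esProc. The following is a minimal Python sketch, not esProc code: generators stand in for forward-only cursors, and the helper names `cursor` and `conj` as well as the sample records are assumptions made for illustration only.

```python
from itertools import chain


def cursor(rows):
    """Yield rows one at a time, like a forward-only cursor."""
    for row in rows:
        yield row


def conj(cursors):
    """Concatenate cursors: drain the 1st completely, then the 2nd, and so on."""
    return chain.from_iterable(cursors)


# Hypothetical sample orders as (date, category) pairs; not the article's data.
wine = [("2013-01-01", "wine"), ("2013-01-02", "wine")]
electrical = [("2013-01-01", "electrical"), ("2013-01-03", "electrical")]
food = [("2013-01-02", "food")]
books = [("2013-01-01", "books")]

united = list(conj([cursor(t) for t in (wine, electrical, food, books)]))
categories = [category for _, category in united]
print(categories)
```

The categories come out grouped by source, in the order the cursors were listed, regardless of the dates inside each one.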
Next, consider merging in proper order with CS.merge@x(). To get a clear view of the order in which records are retrieved from the cursor after an ordered merge, only the first 300 records are retrieved. The TSeq in B7 is shown below:
As can be seen, the data are retrieved in the specified order of Date. Once all wine order data of January 1st have been retrieved, retrieval of the electrical appliance order data of January 1st starts. Because retrieving data with a cursor is a forward-only operation that can only run from the first record to the last, the order data in each cursor must already be ordered by date. After the ordered merge with the CS.merge@x() function, the result cursor compares the current value of the computation expression in each cursor of the sequence CS and chooses which cursor to retrieve the next record from. In this way, we ultimately get the result arranged in the specified order. During retrieval, each cursor still traverses the records in its data table only once.
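The comparison step described above is the classic k-way merge. Here is a minimal Python sketch of the idea, assuming each input is already sorted by date; it uses the standard library's `heapq.merge` rather than esProc, and the name `merge_cursors` and the sample records are hypothetical.

```python
import heapq


def merge_cursors(cursors, key):
    """Ordered merge of cursors that are each already sorted by `key`.

    Only the current record of each cursor is compared, so every cursor
    is traversed exactly once, front to back.
    """
    return heapq.merge(*cursors, key=key)


# Hypothetical (date, category) orders, each list already sorted by date.
wine = [("2013-01-01", "wine"), ("2013-01-01", "wine"), ("2013-01-02", "wine")]
electrical = [("2013-01-01", "electrical"), ("2013-01-02", "electrical")]

merged = list(merge_cursors([iter(wine), iter(electrical)], key=lambda r: r[0]))
for row in merged:
    print(row)
```

In this sketch, records with the same date come out in the order of their source cursors, so all wine orders of a given day precede the electrical appliance orders of that day. If an input were not sorted by date, the merged output would not be sorted either, because the merge never looks back.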
When the data in multiple cursors are merged in proper order, the cursors are simply combined into one and the order in which records are retrieved from each cursor is adjusted; no records are added or removed.
Before the data in the cursors are merged in proper order by product sequence number, you must ensure that the data in each cursor are ordered by product sequence number. To do so, A5 uses the cs.sortx() function to complete the sorting.
Please note that a cursor and a TSeq are sorted differently. Because a cursor usually holds a large amount of data, the data cannot be loaded into memory all at once for sorting. Therefore, data retrieval is performed along with the sorting: each time the retrieved data accumulate to a certain amount, they are saved as a temporary data file. Once all data have been retrieved and sorted, the temporary data files are merged in proper order and returned as the result cursor.
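The procedure described above is an external merge sort. The Python sketch below shows the general technique under stated assumptions; it is not how cs.sortx() is implemented internally. The function names, the JSON-lines temporary file format, the `chunk_size` parameter, and the sample rows are all hypothetical.

```python
import heapq
import json
import os
import tempfile
from itertools import islice


def _write_run(sorted_rows):
    """Save one sorted chunk as a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".jsonl")
    with os.fdopen(fd, "w") as f:
        for row in sorted_rows:
            f.write(json.dumps(row) + "\n")
    return path


def _read_run(path):
    """Stream rows back from a temporary file, deleting it when finished."""
    try:
        with open(path) as f:
            for line in f:
                yield json.loads(line)
    finally:
        os.remove(path)


def sort_cursor(cursor, key, chunk_size=1000):
    """Sort a forward-only cursor without holding all rows in memory."""
    cursor = iter(cursor)
    paths = []
    while True:
        # Retrieve a bounded number of rows, sort them, spill them to disk.
        chunk = list(islice(cursor, chunk_size))
        if not chunk:
            break
        chunk.sort(key=key)
        paths.append(_write_run(chunk))
    # Merge the sorted temporary files in order; the result is again a cursor.
    return heapq.merge(*(_read_run(p) for p in paths), key=key)


# Hypothetical rows keyed by a product sequence number.
rows = [{"product_id": n} for n in [5, 3, 9, 1, 4, 8, 2]]
result = [r["product_id"] for r in
          sort_cursor(iter(rows), key=lambda r: r["product_id"], chunk_size=3)]
print(result)
```

With `chunk_size=3`, the seven rows are spilled as three sorted temporary files, which are then merged while being read back. Only one chunk is held in memory at a time during the first phase, and only one row per temporary file during the merge.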
In B7, the retrieved records are shown below:
As can be seen, the ordered merging can be accomplished once the data in each cursor have been sorted.