There are a lot of occasions when set operations are needed to do
structured data summarizing and analysis. For example, to list all students who
have published papers, find the employees who have participated in all trainings
and select students who passed the examination for the re-examination. esProc uses
sets everywhere. Its most commonly used data types, such as sequences and table
sequences, are all sets. Therefore, a better understanding and manipulation of sets
helps to perform data computing in a more reasonable and faster way.
For example, the table below contains sales data:
We are to find the clients whose monthly sales amount is always in
the top 20 in 2013. To do this we need to first retrieve the 2013 sales data, group
it by the month, then select the top 20 clients of each month by loops and
finally get the result by performing intersection. It is too difficult to
handle such a complex problem in either SQL or the stored procedure.
esProc performs veritable data grouping by dividing data into multiple sets as required. This is different from SQL "group by" command, which returns only the aggregate results. The grouping result of A3 is as follows:
Data has been automatically sorted before it is grouped. Each group is a set of sales records. Below are the records of March:
To know the total sales amount of each client in each month, we need to group the data by clients. In esProc, we just loop through data of every month and group it by customers. We can use A.(x) to loop through set members without the need to write loop code.
After the second grouping, in A4 every month’s data is a set of sets:
Now, the data of March is as follows:
We can see that each subgroup in March’s data is a set of
transaction records of a certain client.
A5 loops through each month’s data to get records of the top 20 clients of each month. A6 lists their names and monthly sales amount. The result of A6 is as follows:
Now it’s time to get the final result:
List names of the top 20 clients of every month in A7 and, finally, get the intersection of the 12 groups of top 20 names in A8:
From this example we can see that esProc ordered sets make the
problem-solving more intuitive. With sets, we can easily perform grouping,
sorting and other computations, making each step of data processing focused and
easy to understand. Moreover, the use of set concept can reduce the complexity
and coding workload of loop operation as well as the set operations like intersection.
No comments:
Post a Comment