August 19, 2014

Application of Index Sequences in esProc

In databases, creating appropriate indexes for some tables can greatly increase query efficiency. Similarly, index sequences can be created for record sequences or table sequences in esProc to increase efficiency in querying data repeatedly.

For example, suppose we need to query the food order file Order_Foods.txt repeatedly.

Records of food orders queried by A1 are as follows (altogether 50,000 records):

A2 gets 1,000 records of food orders arbitrarily and records their product names and purchase quantities as query conditions for use in the later test query (here repetition is allowed). Data in A2 are as follows:

In the following, in order to test the role of index sequences, we'll query data of food orders among A1's data according to 1,000 names in A2 with and without an index sequence respectively.

First let’s look at the situation without an index sequence. Since records in Order_Foods.txt are sorted by Date, i.e. the order date, binary search cannot be used when searching by product names, otherwise errors will occur.

Expressions in B2 and B3 get the current time through the now() function to roughly estimate the query time (in milliseconds). Query results are stored in A3 as follows:

Estimated time for B3 is as follows:

Then let's move on to the situation where an index sequence is used:

First create an index sequence corresponding to PName and Quantity, so that binary search can be used for queries by way of the index. In order to compare the efficiency of the two situations fairly, the time for creating the index sequence is also included. The index sequence in A4 is as follows:

As binary search is used in A5 to query data, the query condition should be modified to mode x==0. Results are the same as those in A3:

Estimated time for B5 is as follows:

By comparing the results in B3 and B5, it can be seen that the second method is much more efficient. That is to say, query speed can be significantly increased by using binary search, provided that an index sequence has been created. Note that creating an index sequence itself requires computation. The more often an index sequence is used for queries, the more that cost pays off. So it is unnecessary to create an index sequence if queries are infrequent.
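The trade-off can be illustrated outside esProc. The following is a Python sketch, not esProc code: the sample `orders` data and the helper names `linear_lookup`, `build_index` and `indexed_lookup` are hypothetical, and the standard `bisect` module stands in for binary search over an index sequence.

```python
from bisect import bisect_left

# Hypothetical sample data: (date, product name, quantity), sorted by date only.
orders = [
    ("2014-01-01", "Rice", 3),
    ("2014-01-02", "Apple", 5),
    ("2014-01-03", "Milk", 2),
    ("2014-01-04", "Apple", 1),
    ("2014-01-05", "Bread", 4),
]

def linear_lookup(records, name, qty):
    """Without an index: scan every record in order."""
    return [r for r in records if r[1] == name and r[2] == qty]

def build_index(records):
    """Build a sorted list of ((name, qty), position) pairs, the 'index'."""
    return sorted(((r[1], r[2]), i) for i, r in enumerate(records))

def indexed_lookup(records, index, name, qty):
    """With an index: binary-search the sorted keys, then collect all matches."""
    key = (name, qty)
    pos = bisect_left(index, (key, -1))  # first entry whose key is >= the target
    hits = []
    while pos < len(index) and index[pos][0] == key:
        hits.append(records[index[pos][1]])
        pos += 1
    return hits

index = build_index(orders)
found = indexed_lookup(orders, index, "Apple", 1)
```

Building the index costs one sort, so the sketch only pays off when the same index serves many lookups, which matches the observation above.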

For specific databases and query modes, it is not necessary to create an index sequence each time query is executed. The index sequence can be stored after it is created. For example:

Thus there is no need to recreate the index sequence for the next query; simply importing the index file will do.

In this way, the query is faster than one that creates a new index sequence each time.
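Storing and reloading an index can be sketched in the same spirit. This is a Python illustration rather than the esProc file API; the file name `order_index.bin` and the helpers `save_index` and `load_index` are made up for the example, and `pickle` is simply one convenient serialization choice.

```python
import os
import pickle
import tempfile

# Hypothetical index: sorted (key, record position) pairs built in an earlier run.
index = [(("Apple", 1), 3), (("Apple", 5), 1), (("Milk", 2), 2)]

def save_index(idx, path):
    """Write the index to disk so that later runs can skip rebuilding it."""
    with open(path, "wb") as f:
        pickle.dump(idx, f)

def load_index(path):
    """Read a previously stored index back into memory."""
    with open(path, "rb") as f:
        return pickle.load(f)

path = os.path.join(tempfile.mkdtemp(), "order_index.bin")
save_index(index, path)
restored = load_index(path)
```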

August 18, 2014

Alignment Grouping and Enumeration Grouping in esProc

Grouping records is often required when presenting and analyzing data in databases. Although records can be grouped by designated fields with the GROUP BY clause in SQL, this type of grouping, which serves the purpose of summary, is too simple to deal with some complex situations, like grouping in a designated order or grouping with repeated records. With esProc, we can use the alignment grouping function P.align@a() or the enumeration grouping function P.enum() to handle various complicated grouping requirements.

We'll look at how to use alignment grouping and enumeration grouping in esProc through the following example.

1. Alignment grouping

Database table LIQUORS has information of some wines:
But how can we group these wines by varieties in the order of Vodka, Gin, Rum, Whisky, Brandy, Tequila and Cordial?
We can manage it easily in esProc with P.align@a() function:
We just need to execute alignment grouping on records in A1 by using the field TYPE according to the designated sequence A2. The computed results in A3 are as follows:
Among these results, each group is made up of records in A1. Double-click to see more.

When P.align@a() is at work, it examines every member of the record sequence P and puts it into the group whose condition it meets. A member cannot appear more than once during this process. Suppose we modify A2 and add another Gin as a grouping value, as follows:

As the computed results of A3 show, alignment grouping puts each record only into the first eligible group, so the last group is empty:
esProc's P.align() function provides many options that can manage various situations in alignment operations.
If the @n option is used with P.align@a(), an extra group is appended to hold all records that fall into none of the designated groups. Because the @n option is only used in alignment grouping operations, @a can be omitted here:

Computed results of A3 are as follows:
If @a option is not used by P.align() function, only the first value of each group will be retained:

Computed results of A3 are:

If the @p option is used with P.align@a(), the groups store only the sequence numbers of the records instead of the records themselves:

Computed results of A3 are:
It can be seen that only sequence number of each record is stored in the group.

If @s option is used by P.align() function, result returned will be a record sequence, that is, the re-sorted records in the table according to designated sequences. This is similar to the result obtained by using order by clause in SQL statements, only with a different sorting principle:

Now the computed results of A3 are as follows:
Note that when @s option is at work, it is invalid to introduce @a option at the same time.
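The default, repeated-value and ungrouped-record behaviours described above can be approximated with a short Python sketch. It is not esProc code: the `liquors` rows are invented stand-ins for the LIQUORS table, and `align_all` with its `keep_rest` flag is a hypothetical helper that only loosely mimics P.align@a() and the @n option.

```python
# Hypothetical rows standing in for a few records of the LIQUORS table.
liquors = [
    {"NAME": "Absolut", "TYPE": "Vodka"},
    {"NAME": "Bacardi", "TYPE": "Rum"},
    {"NAME": "Beefeater", "TYPE": "Gin"},
    {"NAME": "Smirnoff", "TYPE": "Vodka"},
    {"NAME": "Sake", "TYPE": "Rice wine"},
]

def align_all(records, order, key, keep_rest=False):
    """Group records by key(record), with groups laid out in the given order.

    Each record goes into the first group whose value matches, so a value
    repeated in `order` yields an empty group the second time. With
    keep_rest=True, unmatched records are collected in one extra final group.
    """
    groups = [[] for _ in order]
    rest = []
    first_pos = {}
    for i, value in enumerate(order):
        first_pos.setdefault(value, i)  # remember only the first occurrence
    for rec in records:
        pos = first_pos.get(key(rec))
        if pos is None:
            rest.append(rec)
        else:
            groups[pos].append(rec)
    return groups + [rest] if keep_rest else groups

order = ["Vodka", "Gin", "Rum", "Gin"]
grouped = align_all(liquors, order, key=lambda r: r["TYPE"])
names = [[r["NAME"] for r in g] for g in grouped]
with_rest = align_all(liquors, order, key=lambda r: r["TYPE"], keep_rest=True)
```

In this sketch the second "Gin" group stays empty, and the record whose type is not listed only appears when `keep_rest` is set.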

If @b option is used by P.align@a() function, binary search will be used for alignment grouping. Binary search is more efficient, but the alignment sequence used for grouping must be ordered:
A3 sorts A2 first, and the computed results are as follows when the @a and @b options are used together:

A4, by contrast, does not sort A2, so the results of alignment grouping are incorrect when the @a and @b options are used together:
If the grouping expression directly computes each record's position in the alignment grouping, the P.align@a(n,y) function can be used:

Because the places have been directly set, the specific value of each member in grouping value sequence of align function won't affect the result, and we can use the simplest sequence to(n) to complete alignment grouping according to the number of groups. Here to(n) can be abbreviated to n. Expressions in A3 and A4 have the same computed results:

If @r option is used by P.align@a(n,y) function, each record will correspond to a sequence of group numbers and can be put into more than one group. For example:

Computed results of A3 are:
P.align() function can manage more complicated jobs through combinations of its options.

2. Enumeration grouping

Enumeration grouping is, in fact, a type of alignment grouping. Its alignment basis is the computed results of designated expressions.
For example, divide the wine information in table LIQUORS into three groups by names: ?<"D",?<"K", and ?>="K". First create a sequence according to the conditions, then execute enumeration grouping on all information of wines by using P.enum() function according to the condition sequences: 

Computed results of A3 are:
By default, records of P only appear once when P.enum() function is used for grouping. It can be noticed that the grouping condition of the second group is ?<"K", and though the records in the first group satisfy this condition, they won't appear again in the second group.

Various options can be used by enum function to realize its different functions.

If @r option is used by P.enum() function, the same record is allowed to appear in more than one group:

Computed results of A3 are as follows:
It can be seen that, this time, the second group contains all eligible records, including those in the first group.
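The difference between the default behaviour and the @r option can be mirrored in a small Python sketch. It is an analogy only: the names in `names` are invented sample data, the lambdas stand in for the conditions ?<"D", ?<"K" and ?>="K", and `enum_groups` with its `repeat` flag is a hypothetical helper rather than the esProc function.

```python
# Hypothetical liquor names; the conditions mirror ?<"D", ?<"K" and ?>="K".
names = ["Absolut", "Bacardi", "Gordon's", "Jameson", "Kahlua", "Smirnoff"]
conditions = [
    lambda n: n < "D",
    lambda n: n < "K",
    lambda n: n >= "K",
]

def enum_groups(items, conds, repeat=False):
    """Put each item into the first group whose condition holds.

    With repeat=True an item is placed into every group whose condition
    holds, which loosely corresponds to the @r option.
    """
    groups = [[] for _ in conds]
    for item in items:
        for pos, cond in enumerate(conds):
            if cond(item):
                groups[pos].append(item)
                if not repeat:
                    break  # default: stop at the first matching group
    return groups

once = enum_groups(names, conditions)
repeated = enum_groups(names, conditions, repeat=True)
```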

If @p option is used by P.enum() function, the groups will store only the corresponding record numbers:

Now the computed results of A3 are as follows:
It can be seen that each group stores only the corresponding record numbers.

If @n is used by P.enum() function, a new group will be appended at the end to hold the ungrouped records. For example, modify the third grouping condition and use the @n option in enumeration grouping:

Results of A3 are as follows:

August 17, 2014

Comparison of esProc and R Language in Processing Text Files

As languages for data computations, both esProc and R language have rich functions to process text files. They have many similarities in basic usage, as well as obvious differences, such as in the aspect of processing files with fixed column width and big text files, reading and writing designated columns, computational performance, etc. The article aims to compare their similarities and differences.

1. Comparison of basic functions

Description:

There are six columns in sales.txt, separated from each other by tabs (\t). Lines are separated from each other by line breaks (\n). The first row contains column names. Read the file into memory and write it out again. The first rows of the file are as follows:
esProc:
data=file("e:\\sales.txt").import@t()
file("e:\\salesResult.txt").export@t(data)

R language:
data<-read.table("e:\\sales.txt",sep="\t", header=TRUE)
write.table(data, file="e:\\salesResult.txt",sep="\t",quote=FALSE,row.names=FALSE)

Comparison:
1. Both esProc and R language can do this job conveniently. esProc uses the function option "@t" to indicate that the first row contains column names, while R language uses "header=TRUE" to do the same thing.
2. Line breaks are the most common line separators, and both esProc and R language support them by default. Tabs are the most common column separators. esProc uses tabs by default; if another separator such as a comma is to be used, the code should be import@t(;","). In R language, the default column separators are “blanks and tabs”, which would mistakenly split a Client column containing blanks into two columns, so sep="\t" is needed to define the separator as a tab. In addition, "quote=FALSE,row.names=FALSE" in the code means that elements are not put in quotes and row numbers are not output.
3. Usually, files read into memory are stored as structured two-dimensional data objects, which are called table sequences (TSeq) in esProc and data frames (data.frame) in R language. Both TSeq and data.frame have rich computational functions. For example, group by Client and SellerID, then sum up Amount and find the maximum OrderID. The esProc code for these computations is:
data.groups(Client,SellerId;sum(Amount),max(OrderID))
As data.frame doesn't directly support simultaneous use of multiple aggregation methods, two steps are needed to sum up and find maximum. Finally, cbind will be used to combine the results. See below:
result1<-aggregate(data[,4],data[c(2,3)],sum) 
result2<-aggregate(data[,1],data[c(2,3)],max)
result<-cbind(result1,result2[,3])
4. Besides storing files as structured two-dimensional data objects in memory, esProc can access files through cursor objects, while R language can access files through matrix objects.
Conclusion: For basic file reading and writing, both esProc and R language provide rich functions to meet users’ needs.

2. Reading files with fixed column width

In some files, fixed width, instead of separators, is used to differentiate one column from another. For example, read file static.txt which contains three columns of data into the memory and modify column names respectively to col1, col2 and col3, among which the width of col1 is 1, that of col2 is 4 and that of col3 is 3.
A1.501.2
A1.551.3
B1.601.4
B1.651.5
C1.701.6
C1.751.7
esProc:
data=file("e:\\static.txt").import()
data.new(mid(_1,1,1):col1, mid(_1,2,4):col2, mid(_1,6,3):col3)

R language:
data<-read.fwf("e:\\static.txt", widths=c(1, 4, 3),col.names=c("col1","col2","col3"))

Comparison:
R language does this job directly, while esProc does it indirectly by first reading the file into memory and then splitting each line into multiple columns. Note that in the code mid(_1,1,1), “_1” is the default column name; if the file read into memory has more than one column, the default column names are _1, _2, _3 and so on, in order.

Conclusion: R language is more convenient than esProc here because it can read files with fixed column width directly.
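The splitting step itself is simple slicing, as the following Python sketch suggests. It is neither esProc nor R; `split_fixed` and `read_fixed` are hypothetical helper names, and the sample lines reuse the layout of static.txt with widths 1, 4 and 3.

```python
# Sample lines in the same layout as static.txt: widths 1, 4 and 3.
lines = ["A1.501.2", "B1.601.4", "C1.751.7"]

def split_fixed(line, widths):
    """Cut one line into fields of the given widths, left to right."""
    fields, start = [], 0
    for w in widths:
        fields.append(line[start:start + w])
        start += w
    return fields

def read_fixed(rows, widths, names):
    """Turn fixed-width lines into a list of dicts keyed by column name."""
    return [dict(zip(names, split_fixed(r, widths))) for r in rows]

table = read_fixed(lines, [1, 4, 3], ["col1", "col2", "col3"])
```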

3. Reading and writing designated columns

Sometimes only some of the data columns are needed, in order to save memory and enhance performance. In this example, read columns ORDERID, CLIENT and AMOUNT into the memory and write ORDERID and AMOUNT to a new file.


esProc:
data=file("e:\\sales.txt").import@t(ORDERID,CLIENT,AMOUNT)
file("e:\\salesResult.txt").export@t(data,ORDERID,AMOUNT)

R language:
data<-read.table("e:\\sales.txt",sep="\t", header=TRUE)
col3<-data[,c("ORDERID","CLIENT","AMOUNT")]
col2<-col3[,c("ORDERID","AMOUNT")]
write.table(col2, file="e:\\salesResult.txt", sep="\t",quote=FALSE,row.names=FALSE)

Comparison:
esProc does the job directly, while R language does it indirectly by reading all columns into the memory and saving designated columns in a new variable.

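Conclusion: In this example R language reads all columns into memory before selecting the ones needed, which occupies relatively more memory. (read.table can skip a column when its entry in colClasses is "NULL", but this is less direct than naming the wanted columns.)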

4. Processing big text files

Big text files are files whose sizes are bigger than memory size. Usually they are processed by reading and computing in batches. For example, in big text file sales.txt, filter data according to the condition Amount>2000 and sum up Amount of each SellerID.
esProc:
A1: As reading the big text file into the memory at a time will result in memory overflow, it will be read in batches with cursor.
A2: Read by loop with 100,000 rows of data each time and store them in TSeq A2.
B3: From each batch of data, select the records whose order amount is greater than 2,000.
B4: Group and summarize the filtered data to get each seller’s sales amount in this batch.
B5: Append the computed results of this batch to a variable (B1), and begin the computation of the next batch.
B6: After all batches have been computed, B1 holds each seller’s sales amount for each batch; execute one last grouping and summarizing to get the total sales amount of each seller.
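The batch-by-batch flow in the steps above can be sketched in Python. This is an illustration of the general technique, not the esProc grid code: the in-memory `SAMPLE` text stands in for a large sales.txt, the batch size of 2 replaces the 100,000 rows used in the article, and `read_batches` and `total_by_seller` are hypothetical helper names.

```python
import io
from collections import defaultdict
from itertools import islice

# Hypothetical tab-separated data standing in for a large sales.txt.
SAMPLE = (
    "OrderID\tSellerID\tAmount\n"
    "1\t7\t2500\n"
    "2\t7\t1500\n"
    "3\t9\t3000\n"
    "4\t7\t2100\n"
    "5\t9\t100\n"
)

def read_batches(handle, batch_size):
    """Yield lists of parsed rows, at most batch_size rows per list."""
    header = handle.readline().rstrip("\n").split("\t")
    while True:
        chunk = list(islice(handle, batch_size))
        if not chunk:
            return
        yield [dict(zip(header, ln.rstrip("\n").split("\t"))) for ln in chunk]

def total_by_seller(handle, batch_size=2, threshold=2000):
    """Filter and group each batch, then combine the per-batch subtotals."""
    partials = []  # one subtotal table per batch
    for batch in read_batches(handle, batch_size):
        subtotal = defaultdict(float)
        for row in batch:
            amount = float(row["Amount"])
            if amount > threshold:
                subtotal[row["SellerID"]] += amount
        partials.append(subtotal)
    totals = defaultdict(float)  # final grouping over all batches
    for subtotal in partials:
        for seller, amount in subtotal.items():
            totals[seller] += amount
    return dict(totals)

result = total_by_seller(io.StringIO(SAMPLE))
```

Only one batch of rows is held in memory at a time; the per-batch subtotals are small, so the final grouping is cheap.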


R language:

1-4: Create an empty data frame data to generate each batch's data frame databatch.
5-9: Create an empty data frame agg to which the grouped and summarized results of each batch are appended.
11-13: Read in the file by rows, 100,000 lines at a time, skipping the column names in the first row.
15-21: From each batch of data, select the records whose order amount is greater than 2,000.
22: Group and summarize the filtered data to get each seller’s sales amount in this batch.
23: Append the computed results of this batch to a variable (agg), and begin the computation of the next batch.
24: After all batches have been computed, agg holds each seller’s sales amount for each batch; execute one last grouping and summarizing to get the total sales amount of each seller.
Comparison:
1. Both follow the same approach. The difference is that esProc does the job with library functions and its code is concise and easy to understand, while R language needs to handle many details manually and its code is lengthy, complicated and error-prone.
2. With an esProc cursor, the above computations can be performed even more easily, that is:
In this piece of code, the esProc engine processes data in batches automatically, and programmers do not need to control the batches manually with loop statements.
Conclusion: In processing big text files, esProc code is more concise, more flexible and easier to understand than that of R language.

5. Processing big text files in parallel

Parallel computing can make full use of the resource of multi-core CPU and significantly improve computational performance. 

The example in the above part is still used here, but parallel computing is used. That is, divide sales.txt into four segments to give four CPU cores to perform computations, then filter data according to the condition Amount>2000 and compute the total sales amount of each seller.


esProc:
Main program pro5.dfx
         A1: Set the number of parallel tasks to four, meaning the file will be divided into four segments.
         A2: Call the subprogram to perform multithreaded parallel computing with two task parameters: to(A1) and A1. The value of to(A1) is [1,2,3,4], representing the segment number assigned to each task; A1 is the total number of segments. When all the tasks are completed, all computed results will be stored in the current cell.
         A3: Merge the computed results of every task in A2 according to SellerID.
         A4: Group and summarize the merge results and seek each seller’s sales amount.


Subprogram sub.dfx
         A1: Read the file with cursor, and decide which segment of the file the current task should process according to the parameter sent by the main program. Take the third task as an example, value of the parameter segment is 3 and that of parameter total is always 4.
         A2: Select records whose order amount is greater than 2,000.
         A3: Group and summarize the filtered data.
         A4: Return the computed results of current task to main program.
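The idea of segmenting a file by bytes can be sketched in Python. This is an illustration under stated assumptions, not how esProc is implemented: `process_segment` and `total_by_seller` are hypothetical names, the small generated file stands in for sales.txt, and a rule of "each task handles the lines that start inside its byte range" is one common way to deal with lines cut in half at a segment boundary.

```python
import os
import tempfile
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def process_segment(path, segment, total, threshold=2000):
    """Sum Amount per SellerID for the lines that start inside one byte range."""
    size = os.path.getsize(path)
    start = size * (segment - 1) // total
    end = size * segment // total
    sums = Counter()
    with open(path, "rb") as f:
        if start == 0:
            f.readline()          # skip the header row
        else:
            f.seek(start - 1)
            f.readline()          # finish the line that began in the previous segment
        while f.tell() < end:
            line = f.readline()
            if not line:
                break
            _order_id, seller, amount = line.decode().rstrip("\n").split("\t")
            if float(amount) > threshold:
                sums[seller] += float(amount)
    return sums

def total_by_seller(path, total=4):
    """Run one task per segment and merge the per-segment results."""
    with ThreadPoolExecutor(max_workers=total) as pool:
        parts = pool.map(lambda s: process_segment(path, s, total), range(1, total + 1))
        merged = Counter()
        for part in parts:
            merged.update(part)
    return dict(merged)

# Build a small hypothetical file so the sketch is runnable end to end.
rows = ["OrderID\tSellerID\tAmount"] + [f"{i}\t{i % 3}\t{1000 * (i % 4)}" for i in range(1, 41)]
path = os.path.join(tempfile.mkdtemp(), "sales.txt")
with open(path, "w", newline="\n") as f:
    f.write("\n".join(rows) + "\n")

parallel = total_by_seller(path, total=4)
serial = total_by_seller(path, total=1)
```

Threads are used here only for brevity; in CPython a process pool would be needed for CPU-bound work to actually run on several cores.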
R language:
R language cannot do this job with parallel computing.
Comparison: esProc can read big text files segmentally by bytes, reading only the designated part and skipping the rest, and it supports multithreaded parallel computing at the low level.
Though R language can perform parallel computing on in-memory data, it cannot read files on disk segmentally by bytes. It can read data by skipping multiple rows, but this method has to traverse all the skipped data, resulting in poor performance and the inability to perform low-level parallel computing on big text files.
In addition, esProc automatically handles the case where segmenting by bytes leaves only part of a line, as shown in the above code, so programmers do not need to handle it manually.
Summary:
esProc can process big text files in parallel and has high computational performance. R language cannot perform low-level parallel computing on big text files and performs much worse.

6. Computational performance

Under the same test environment, use esProc and R language to read a file of 1 GB and sum up one of its fields.
esProc:
=file("d:/T21.txt").cursor@p(#1:long)
=A1.groups(;sum(#1))
R language:
   con<- file("d:/T21.txt", "r")
   lines=readLines(con,n=1024)
   value=0
   while( length(lines) != 0) {
         for(line in lines){
                   data<-strsplit(line,'\t')
                   value=value+as.numeric(data[[1]][1])
         }
         lines=readLines(con,n=1024)
   }
   print(value)
   close(con)

Comparison:
1. It takes esProc 26 seconds and R language 9 minutes and 47 seconds respectively to finish the task. Their gap exceeds an order of magnitude.
2. In processing big files, R language cannot use data frame objects and library functions. Loop statements must be written manually so that computation happens while the file is being read, so performance is poor. esProc can use cursor objects and library functions directly and performs better. There is no big difference between the two when processing small files.

Summary:
         esProc's performance is far beyond that of R language in processing big text files.


August 14, 2014

Comparison of Loop Function in esProc and R Language

Loop function can traverse every member of an array or a set, express complicated loop statements with simple functions, as well as reduce the amount of code and increase readability. Both esProc and R language support the loop function. The following will compare their similarities and differences in usage.

1. Generating data

Generate odd numbers between 1 and 10.
esProc:
    x=to(1,10).step(2)            
In the code, to(1,10) generates consecutive integers from 1 to 10; the step function then takes every second member from that result, and the final result is [1,3,5,7,9]. This type of data in esProc is called a sequence.
The code has a simpler version: x=10.step(2).
R language:
         x<-seq(from=1,to=10,by=2)    
This piece of code directly generates the non-consecutive integers from 1 to 10. The computed result is c(1,3,5,7,9). This type of data in R language is called a vector.

A simpler version of this piece of code is x<-seq(1,10,2).
Comparison:
1. Both can solve the problem in this example. esProc needs two steps to solve it, which in theory means poorer performance, while R language needs only one step and so performs better.
2. The esProc approach is to get members from a set according to their sequence numbers, which is a general method. For example, given a string sequence A1=["a", "bc", "def"……], get the strings in odd-numbered positions. The style of code does not need to change; the code is x=A1.step(2).

R language generates data directly, thus it has a better performance. It can write general expressions, too. For example, to get the strings in odd-numbered positions from the string vector A1=c("a", "bc", "def"……), the expression in R language can be x=A1[seq(1,length(A1),2)].
3.esProc loop function has characteristics that R language hasn’t, that is, built-in loop variables and operators. “~” represents the loop variable, “#” represents the loop count, “[]” represents relative position and “{}” represents relative interval. By using these variables and operators, esProc can produce common concise expressions. For example, seek square of each member of the set A2=[2,3,4,5,6]:
         A2.(~*~)                              /Result is [4,9,16,25,36], which can also be written as A2**A2, though the latter is less intuitive and less general. R language can only use A2*A2 to express the result.
         Get the first three members:
         A2.select(#<=3)                 / Result is [2,3,4]
         Get each member’s previous member and create a new set:
         A2.(~[-1])                             / Result is [null,2,3,4,5]
         Growth rate:
         A2.((~ - ~[-1])/ ~[-1])         /Result is [null,0.5,0.33333333333,0.25,0.2]
         Moving average:
         A2.(~{-1,1}.avg())               /Result is [2.5, 3.0, 4.0, 5.0, 5.5]
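For readers unfamiliar with the relative-position notation, the last three results can be reproduced with explicit indexing. The following Python sketch is only an analogy to the esProc expressions; `window_avg` is a hypothetical helper, and None plays the role of null.

```python
a2 = [2, 3, 4, 5, 6]

# Previous member of each element (None where there is no previous member).
previous = [a2[i - 1] if i > 0 else None for i in range(len(a2))]

# Growth rate relative to the previous member.
growth = [None if p is None else (x - p) / p for x, p in zip(a2, previous)]

def window_avg(seq, before, after):
    """Average over the members from `before` positions back to `after`
    positions ahead, clipped at both ends of the sequence."""
    out = []
    for i in range(len(seq)):
        window = seq[max(0, i - before): i + after + 1]
        out.append(sum(window) / len(window))
    return out

moving = window_avg(a2, 1, 1)
```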
Summary:
         In this example, R language can generate data directly and can also write general expressions, which shows that it is more flexible and uses less memory than esProc.

2. Filtering records

Computational objects of a loop function can be an array or a set whose members are single value, or two-dimensional structured data objects whose members are records. In fact, loop function is mainly used in processing the latter. For example, select orders of 2010 whose amount is greater than 2,000 from sales, the order records.
Note: sales originates from a text file, some of its data are as follows: 
esProc:
sales.select(ORDERDATE>=date("2010-01-01") && AMOUNT>2000)
Some of the results are:
R language:
sales[as.POSIXlt(sales$ORDERDATE)>=as.POSIXlt("2010-01-01") & sales$AMOUNT>2000,]
Some of the results are:
Comparison:
1. Both esProc and R language can realize this function. The difference is that esProc uses the select loop function while R language uses indexing directly, but there is no essential distinction between them. In addition, R language can further simplify the expression by using the attach function:
sales[as.POSIXlt(ORDERDATE)>=as.POSIXlt("2010-01-01") & AMOUNT>2000,]
Thus, there are more similarities between them.
2. Besides querying, loop functions can be used to seek sequence numbers, sort, rank, seek Top N, group and summarize, etc. For example, seek the sequence numbers of records.
    sales.pselect@a(ORDERDATE>=date("2010-01-01") && AMOUNT>2000)   /esProc
    which(as.POSIXlt(sales$ORDERDATE)>=as.POSIXlt("2010-01-01") & sales$AMOUNT>2000) #R language

For example, sort records by SELLERID in ascending order and by AMOUNT in descending order.
    sales.sort(SELLERID,AMOUNT:-1)                              /esProc
    sales[order(sales$SELLERID,-sales$AMOUNT),]    #R language
For example, seek the top three records by AMOUNT.
    sales.top(-AMOUNT;3)                                      /esProc
    head(sales[order(-sales$AMOUNT),],n=3)               #R language

3. Sometimes R language computes with an index, as in filtering; sometimes it computes with functions, as in seeking sequence numbers of records; sometimes it programs in the form of “data set + function + data set”, as in sorting; and other times it works in the way of “function + data set + function”, as in seeking Top N. This programming style seems flexible but can easily confuse programmers. By comparison, esProc always adopts the object-style form “data set + function + function …”. This form has a simple and uniform structure and is easy for programmers to grasp.
Here is an example of performing continuous computations: filter records and then seek Top N. esProc computes like this:
    sales.select(ORDERDATE>=date("2010-01-01") && AMOUNT>2000).top(-AMOUNT;3)
And R language computes in this way:
    Mid<-sales[as.POSIXlt(sales$ORDERDATE)>=as.POSIXlt("2010-01-01") & sales$AMOUNT>2000,]
    head(Mid[order(-Mid$AMOUNT),],n=3)

As you can see, esProc is better at programming multi-step continuous computations.
Summary: In this example, esProc gains the upper hand in syntax consistency and continuous computations, and is more beginner-friendly.

3. Grouping and summarizing

The loop function is often employed in grouping and summarizing records. For example, group by CLIENT and SELLERID, and then sum up AMOUNT and seek the maximum value.
esProc:
    sales.groups(CLIENT,SELLERID;sum(AMOUNT),max(AMOUNT))
Some of the results are as follows:
R language:
    result1<-aggregate(sales[,4],sales[c(3,2)],sum) 
    result2<-aggregate(sales[,4],sales[c(3,2)],max)
    result<-cbind(result1,result2[,3])
Some of the results are as follows:
Comparison:
1. In this case, more than one summarizing method is required. esProc can complete the task in one step. R language has to go through two steps to sum up and seek the maximum value, and finally combine the results with cbind, because its built-in library function cannot directly use multiple summarizing methods simultaneously. Besides, R language uses more memory in completing the task.
2. Another issue is an inconsistency in the R language design. For sales[c(3,2)], the grouping order in the code puts SELLERID ahead of CLIENT, but in the business requirement the order is the opposite. In the result, the order changes again and becomes the same as that in the code. In a word, there is no unified standard across the business logic, the code and the computed result.
Summary: In this example, esProc has the advantages of high efficiency, small memory usage and a unified standard.
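Computing several aggregates per group in one pass can be sketched as follows. This is a Python illustration of the general idea, not esProc or R code; the `sales` rows are invented and `group_sum_max` is a hypothetical helper.

```python
# Hypothetical order rows: (client, seller_id, amount).
sales = [
    ("ARO", 1, 1200.0),
    ("ARO", 1, 800.0),
    ("ARO", 2, 500.0),
    ("BDR", 1, 2500.0),
    ("BDR", 1, 300.0),
]

def group_sum_max(rows):
    """Return {(client, seller_id): (sum, max)} in one pass over the rows."""
    result = {}
    for client, seller, amount in rows:
        key = (client, seller)
        if key in result:
            total, largest = result[key]
            result[key] = (total + amount, max(largest, amount))
        else:
            result[key] = (amount, amount)
    return result

summary = group_sum_max(sales)
```

The grouping key keeps CLIENT ahead of SELLERID, matching the order in which the requirement is stated.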

4. Seeking quadratic sum

Use a loop function to seek the quadratic sum (sum of squares) of the set v=[2,3,4,5].
Please note that both esProc and R language have functions to seek quadratic sum, but a loop function will be used here to perform this task.
esProc:
v.loops(~~+~*~;0)
R language:
Reduce(function(x,y) x+y*y, c(0,v))
Comparison:
1. Both esProc and R language can realize this function easily.
2. The use of the loops function by esProc means that it sets zero as the initial value, computes every member of v in order and returns the final result. In the code, "~" represents the member being computed and "~~" represents the computed result of the previous step. For example, the arithmetic in the first step is 0+2*2, that in the second step is 4+3*3, and so forth. The final result is 54.
The use of the Reduce function by R language means that it computes the members of [0,2,3,4,5] in order, and passes the computed result of the current step into the next one to continue the computation. As in esProc, the arithmetic in the first step is 0+2*2, that in the second step is 4+3*3, and so forth.
3. R language employs a lambda expression to perform the operation. This is a way of programming with anonymous functions, which can be executed directly without specifying a function name. In this example, function(x,y), the specification, defines two parameters; x+y*y, the body, performs the operation; c(0,v) combines 0 and v into [0,2,3,4,5], in which every member takes part in the operation in order. Because a complete function can be passed in, this programming method is quite flexible and is able to perform operations containing complicated functions.
The esProc programming method can be regarded as an implicit lambda expression, which is essentially the same as the explicit expression in R language. But it is a bare expression without function name, specification or variables, and its structure is simpler. In this example, "~" represents the built-in loop variable, which does not need to be defined; ~~+~*~ is the expression that performs the operation; v is a fixed parameter in which every member takes part in the operation in order. Being unable to accept a function as input, it is theoretically not as good as R language in flexibility and expressive power.
4. Despite being less flexible in theory, the esProc programming method boasts convenient built-in variables and operators, like ~, ~~, #, [], {}, etc., and is more expressive in practical use. For example, esProc uses “~~” to directly represent the computed result of the previous step, while R language needs the Reduce function and extra variables to do this. esProc can use “#” to directly represent the current loop number, which is difficult to do in R language. Also, esProc can use “[]” to represent relative position. For example, ~[1] represents the value of the next member and Close[-1] represents the value of the field Close in the previous record.
In addition, esProc can use “{}” to represent a relative interval. For example, {-1,1} represents the three members from the previous member to the next member. Therefore, the general expression v.(~{-1,1}.avg()) can be used to compute a moving average, while R language needs specific functions to do this. For example, the expression filter(v/3, rep(1, 3),sides = 1) contains no function for “seeking average” at all, which makes it difficult for beginners to understand.
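The step-by-step accumulation described in this section can be traced with a fold in Python. This is only an analogy to the esProc and R expressions; `functools.reduce` is the standard-library fold, and the explicit loop is added just to expose the intermediate values.

```python
from functools import reduce

v = [2, 3, 4, 5]

# Start from 0 and fold each member in: accumulated value + member * member.
sum_of_squares = reduce(lambda acc, x: acc + x * x, v, 0)

# Keep every intermediate value to see the individual steps.
steps = []
acc = 0
for x in v:
    acc = acc + x * x
    steps.append(acc)
```

Here `acc` plays the role of "~~" and `x` the role of "~".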

Summary: In this case, the lambda expression in R language is more powerful in theory but is a little difficult to understand. By comparison, esProc programming method is easier to understand.

5. Inter-row and inter-group operations

Here is a table stock containing daily trade data of multiple stocks. Please compute the daily increase in the closing price of each stock.
Some of the original data are as follows:
esProc:
   A10=stock.group(Code)
   A11=A10.(~.sort(Date))
   A12=A11.(~.derive((Close-Close[-1]):INC))
R language:
A10<-split(stock, stock$Code)
for(i in 1:length(A10)){
    A10[[i]]<-A10[[i]][order(as.numeric(A10[[i]]$Date)),] #sort by Date in each group
    A10[[i]]$INC<-with(A10[[i]], Close-c(0,Close[-length(Close)])) #add a column, increased price
}
Comparison:
1. Both esProc and R language can achieve the task. esProc uses only loop functions in the computation, achieving high performance and concise code. R language requires writing the code manually with for statements, which brings poor performance and readability.
2. To complete the task, two layers of loop are required: loop over each stock, and then loop over each record of that stock. Although it is good at expressing the innermost loop, the loop function of R language (including lambda syntax) has no built-in loop variables and finds it hard to express multi-layer loops. Even if the code can be worked out, it is hard to understand.
The loop function of esProc can not only use “~” to represent the loop variable, but can also be nested, so it is well suited to expressing multi-layer loops. For example, A10.(~.sort(Date)) in the code is in fact an abbreviation of A10.(~.sort(~.Date)). The first “~” represents the current stock, and the second "~" represents the current record of this stock.
3. This is a typical ordered operation: the closing price of the previous day must be subtracted from the current closing price. With useful built-in variables and operators such as #, [] and {}, esProc can express this type of ordered operation easily. For example, Close-Close[-1] represents the amount of increase. R language can also perform the ordered operation, but its syntax is much more complicated due to the lack of facilities like loop number, relative position, relative interval and so on. For example, the expression for the amount of increase is Close-c(0,Close[-length(Close)]).
It is hard enough for the loop function in R language to perform the relatively simple ordered operation in this example, let alone more complicated operations. In those cases, multi-layer for loops are usually needed. For example, find out for how many consecutive days a stock has been rising:
A10<-split(stock, stock$Code)
for(i in 1:length(A10)){
    A10[[i]]<-A10[[i]][order(as.numeric(A10[[i]]$Date)),] #sort by Date in each group
    A10[[i]]$INC<-with(A10[[i]], Close-c(0,Close[-length(Close)])) #add a column, increased price
    if(nrow(A10[[i]])>0){ #add a column, continuous increased days
        A10[[i]]$CID[[1]]<-1
        for(j in 2:nrow(A10[[i]])){
            if(A10[[i]]$INC[[j]]>0){
                A10[[i]]$CID[[j]]<-A10[[i]]$CID[[j-1]]+1
            }else{
                A10[[i]]$CID[[j]]<-0
            }
        }
    }
}

The code in esProc is still concise and easy to understand:
    A10=stock.group(Code)
    A11=A10.(~.sort(Date))
    A12=A11.(~.derive((Close-Close[-1]):INC, if(INC>0,CID=CID[-1]+1, 0):CID))

Summary: In performing multi-layer loops or inter-row and inter-group operations, the esProc loop function has higher computational performance and more concise code.
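The two-layer computation (per stock, then per record in date order) can also be written out in Python for comparison. This is a sketch, not a translation of either listing: the `stock` rows are invented, `rising_days` is a hypothetical helper, and it assumes that the first day of each stock has no increase (INC is None and CID is 0), whereas the R listing above treats the previous close of the first day as 0.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical trade data: (code, date, close).
stock = [
    ("A", "2014-01-02", 10.0),
    ("B", "2014-01-01", 5.0),
    ("A", "2014-01-01", 9.0),
    ("A", "2014-01-03", 9.5),
    ("B", "2014-01-02", 6.0),
    ("A", "2014-01-04", 9.8),
]

def rising_days(rows):
    """For each stock, add the daily increase (INC) and the number of
    consecutive rising days (CID) to every record, in date order."""
    out = {}
    ordered = sorted(rows, key=itemgetter(0, 1))  # by code, then date
    for code, group in groupby(ordered, key=itemgetter(0)):
        prev_close, streak, records = None, 0, []
        for _code, date, close in group:
            inc = None if prev_close is None else close - prev_close
            streak = streak + 1 if inc is not None and inc > 0 else 0
            records.append({"Date": date, "Close": close, "INC": inc, "CID": streak})
            prev_close = close
        out[code] = records
    return out

result = rising_days(stock)
```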