Inter-row computation is one of those complicated SQL-style computations we often need to handle during database application development. For instance, to compute the link relative ratio by dividing each month’s sales amount by that of the last month, or to compute the year-on-year comparison by dividing each month’s sales amount by that of the same month in last year. As some databases don’t provide the window functions of SQL2003-standard (or don’t support them completely), we need to use JOIN operation to realize the inter-row computation. This method produces difficult code and has poor performance. Even if the window functions are used, we will still be faced with problems like the use of nested subquery. The SQL statements in this case are always quite complicated. However, esProc provides an easy and intuitive way to perform the inter-row computation, which will be illustrated through an example.

Table

esProc code:

A1：Select data from the database according to a specified time period. begin and end are external parameters. For instance, begin=”2011-01-01 00:00:00″, end=”2014-07-08 00:00:00″. Some of the selected data are as follows:

A2=A1.groups(year(OrderDate):y,month(OrderDate):m; sum(Amount):mAmount)

This line of code groups the selected data by the year and the month and sums up the sales amount of each month. groups function is used to perform data grouping and summarizing. Its parameters have two parts, the one before the semicolon is a grouping expression, which is year(OrderDate):y,month(OrderDate):m, and the other one after it is a summarizing expression, which is sum(Amount):mAmount. The summarized field is named

A3=A2.derive(mAmount/mAmount[-1]:lrr)

This line of code adds a new field

Please note that the link relative ratio of the initial month (January of the year 2011) is empty.

A4=A3.sort(m)

This line of code sorts the data in A3 by the month and the year so as to compute the year-on-year comparison. The complete code is =A3.sort(m,y). Because the data in A3 have been sorted by the year, we just need to sort the data here by the month, that is, A3.sort(m), which gets a higher performance. Part of the result is as follows:

A5=A4.derive(if(m==m[-1],mAmount/mAmount[-1],null):yoy)

This line of code adds a new field yoy – the year-on-year comparison of monthly sale amount – to the result of A4. The expression for the field is if (m==m[-1],mAmount/mAmount[-1],null), which means the year-on-year comparison has value only when same months are compared. if function has three parameters, among which m==m[1-] is a boolean expression. When the value of this expression is true, return mAmount/mAmount[-1]; when it is false, return null. Please note that the value of year-on-year comparison of each month of the initial year (the year of 2011) is always null. Part of the result is as follows:

For the convenience of viewing, a line of code – A6=A5.sort(y:-1,m) – can be added, which means sorting the result of A5 by the year in descending order and by the month in ascending order. Please note the data are valid by the end of July of the 2014. Result is as follows:

*sales*stores years of information of orders. Please compute the link relative ratio and the year-on-year comparison of each month in a specified time period based on this table. Some of the original data are as follows:esProc code:

A1：Select data from the database according to a specified time period. begin and end are external parameters. For instance, begin=”2011-01-01 00:00:00″, end=”2014-07-08 00:00:00″. Some of the selected data are as follows:

A2=A1.groups(year(OrderDate):y,month(OrderDate):m; sum(Amount):mAmount)

This line of code groups the selected data by the year and the month and sums up the sales amount of each month. groups function is used to perform data grouping and summarizing. Its parameters have two parts, the one before the semicolon is a grouping expression, which is year(OrderDate):y,month(OrderDate):m, and the other one after it is a summarizing expression, which is sum(Amount):mAmount. The summarized field is named

*mAmount*. Part of the computed result is as follows:A3=A2.derive(mAmount/mAmount[-1]:lrr)

This line of code adds a new field

*Irr*– the monthly link relative ratio – to the result of A2. The expression for the field is mAmount/mAmount[-1]. In esProc, [N] or [-N] is used to refer to Nth record after or before the current one. Thus mAmount in the code represents the sales amount of the current period and mAmount[-1] represents the sales amount of the last period. Result of A3 is as follows:Please note that the link relative ratio of the initial month (January of the year 2011) is empty.

A4=A3.sort(m)

This line of code sorts the data in A3 by the month and the year so as to compute the year-on-year comparison. The complete code is =A3.sort(m,y). Because the data in A3 have been sorted by the year, we just need to sort the data here by the month, that is, A3.sort(m), which gets a higher performance. Part of the result is as follows:

A5=A4.derive(if(m==m[-1],mAmount/mAmount[-1],null):yoy)

This line of code adds a new field yoy – the year-on-year comparison of monthly sale amount – to the result of A4. The expression for the field is if (m==m[-1],mAmount/mAmount[-1],null), which means the year-on-year comparison has value only when same months are compared. if function has three parameters, among which m==m[1-] is a boolean expression. When the value of this expression is true, return mAmount/mAmount[-1]; when it is false, return null. Please note that the value of year-on-year comparison of each month of the initial year (the year of 2011) is always null. Part of the result is as follows:

For the convenience of viewing, a line of code – A6=A5.sort(y:-1,m) – can be added, which means sorting the result of A5 by the year in descending order and by the month in ascending order. Please note the data are valid by the end of July of the 2014. Result is as follows:

## No comments:

## Post a Comment