August 10, 2015

esProc Assists Report Development – Transpose Operation for Crosstab Creation

It’s difficult to handle unconventional statistical tasks using simply the reporting tool, like Jasper or BIRT, or SQL. One of the cases is that the source data don’t meet the crosstab’s requirements and thus need to be transposed for display. Having powerful computing engine to process structured data and being integration-friendly, esProc is very useful in assisting the handling of the case. An example will be cited to explain the transposition for designing a crosstab report.

The database table booking holds the summary data of goods orders in every year with four fields that include the year and three types of order status. Some of the data are as follows:

The report table should display the order information of the specified year and the previous one, in which the row headers are the three types of order status and the column headers include years and the growth rate for each order status in the specified year. The measurement is the order data of the current year. The layout and appearance of the report is as follows: 

The difficulty of creating this crosstab report is that the source data cannot be used directly and the values in the summary column need to be computed dynamically based on relative positions. However the difficulty will be significantly reduced if the column and row data in the source table can be rotated and summary values are computed, as shown below:   

Then use esProc code to compute the necessary data for the report: 

A1=yearBegin=yearEnd-1

yearEnd is a user-defined report parameter representing the specified year, such as the year of 2014. A1’s code is used to determine the previous year, which can be defined as yearBegin for the convenience of reference.

This line of code retrieves data of the specified year and the previous one from the database. myDB1, the data source name, points to MySQL. query function can not only execute the SQL statement but accept the parameters. Suppose that the value of yearEnd is 2014, A2’s result will be as follows: 

A3=create(row,col,value)

This line of code creates a table sequence with three fields – row, col and value – to store the transposed data and the summary values. The new table sequence is as follows: 

Note: Similar to the database result set, a table sequence is also a structured two-dimensional table. But its genericity allows a field to have data of different types and its orderliness allows the data being accessed by their sequence numbers. These two features of table sequence are conveniently made use of in implementing this task.


Through accessing the set ["visits","bookings","successfulbookings"] by loop and appending data to A3’s table sequence, this line of code gets data ready for report creation. The working range of for statement, B4-C7, is represented by indentation instead of the parentheses or identifiers like begin and end. Within the working range, A4, the name of the cell where forstatement resides, is used to reference the loop variable. During the first loop, for instance, A4’s value is “visits”.

Now let’s look at the code in the loop body.

B4=endValue=eval("A2(1)."+A4)    

This line of code dynamically retrieves order status data of the first record from A2. eval function can parse the string into an expression. For instance "A2(1)."+A4 will be parsed into A2(1).visits during the first loop and its result is 500. “A2(1)” represents the first record and “.visits” means retrieving the record’s visits field (as shown by the red box in the following figure). 

C4=beginValue=eval("A2(2)."+A4)

Similar to endValue, beginValue dynamically retrieves order status data of the second record from A2. Its value during the first loop is 400.

B5=A3.insert(0,A4,A2(1).year,endValue)

C5=A3.insert(0,A4,A2(2).year,beginValue)

These two lines code insert records into A3’s table sequence. insert function is used to insert one or more records into a table sequence. Its first parameter specifies the position where the insertion happens. If the value of this parameter is 0, then append the record in the end.

During the first loop, for instance, B5 inserts “visits”, 2014 and 500 into the table sequence and C5 inserts “visits”, 2013 and 400 into it. Then A3 becomes this: 

B6=endValue/beginValue-1

This line of code computes the growth rate of the specified year. Its value is B6=500/400-1=0.25 for the first loop.

C6=if(B6>0:"+",B6<0:"-")+string(B6,"#%")

This line of code is used to format the result of B6. The way is to add “+” before the percentage if B6>0 and to add “-” before it if B6<0. C6’s value during the first loop is “+25%”. Note: This step is not indispensable as data formatting can be executed more conveniently by the reporting tool.

B7=A3.insert(0,A4,string(yearEnd)+"/"+string(yearBegin),C6)

This line of code appends new records, such as “visits”, “2014/2013”, “+25%” during the first loop, to A3’s table sequence, as shown below: 

Note that the type of these data is string, which is different from that of data previously inserted.

After the whole loop is executed, all data the report requires will have been appended to A3, as shown below: 

result A3

This line of code returns the result table sequence in A3 to the reporting tool. esProc provides JDBC interface for integrating with the reporting tool that will identify esProc as a database. See related documents for the integration solution. 

Then a simple crosstab will be created with JasperReport, for instance. The template is as follows: 

Define parameter pyearEnd in the report to correspond to its counterpart in the esProc script. The following is the preview of the final report: 

The reporting tool calls the esProc script in the same way as that in which it calls the stored procedure. Save the esProc script as, say booking.dfx, to be called by booking $P{pendYear} in JasperReport’s SQL designer. 

No comments:

Post a Comment