September 4, 2014

How to Clear Cell Values to Release Memory in esProc

In esProc, cell values are stored throughout a cellset. They are convenient to reference during computation, but they can occupy a lot of memory. Once the data in a cell has served its purpose, it can be cleared to reduce the memory footprint. This matters most when intermediate data has been obtained and further complex computation is still to come: cell values that are no longer needed should be deleted to reduce memory usage and avoid memory overflow.

Consider the following case. List the top 200 transaction records across all household appliance orders and food orders by total order amount, sorted by product name. The order records come from two text files: Order_Appliances.txt and Order_Foods.txt. First combine the data from the two files, get the top 200 order records by total order amount, and then sort them by product name.

Computed results of all cells are as follows:
The table sequence in A1 contains order records of Order_Appliances.txt:

The table sequence in A2 contains order records of Order_Foods.txt:
A3 simply combines the records of the two table sequences so that they can be filtered in the next step.

A4 selects the top 200 order records by total sales amount and takes only the needed fields from them to generate a new table sequence. Because the records must be ordered by sales amount in descending order, Amount in the top() function is preceded by a minus sign, and the results come out sorted by sales amount:

A5 sorts the top 200 order records by product name, as required:

In fact, the only data we need in the end is in A5. Once A4 has extracted all the necessary order information, the data in the original cells A1 and A2 is no longer needed. Deleting such data after the intermediate result is obtained releases memory and makes the computation more stable.
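For readers who do not have the cellset at hand, the A1 to A5 steps described above can be approximated in Python. This is only a rough sketch and not esProc code: the field name Product and the sample rows are assumptions made for illustration (only Amount appears in the text), and the limit of 3 stands in for 200.

```python
import heapq
from operator import itemgetter

def top_orders_by_product(appliances, foods, n=200):
    """Return the n highest-Amount orders from both lists, sorted by Product."""
    # Like A3: one list holding references to the records of both inputs.
    combined = appliances + foods
    # Like A4: pick the n largest by Amount, then copy only the needed
    # fields into new records, so the inputs are no longer required.
    top = heapq.nlargest(n, combined, key=itemgetter("Amount"))
    selected = [{"Product": r["Product"], "Amount": r["Amount"]} for r in top]
    # Like A5: order the selected records by product name.
    return sorted(selected, key=itemgetter("Product"))

# Hypothetical sample data standing in for the two text files.
appliances = [
    {"Product": "Fridge", "Amount": 900.0, "Client": "C01"},
    {"Product": "Kettle", "Amount": 40.0, "Client": "C02"},
]
foods = [
    {"Product": "Coffee", "Amount": 120.0, "Client": "C03"},
    {"Product": "Apples", "Amount": 15.0, "Client": "C04"},
]

result = top_orders_by_product(appliances, foods, n=3)
print([r["Product"] for r in result])
```

Because the selection step builds new records, nothing in the result points back to the input lists, which is why the inputs can be dropped afterwards.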

Therefore, the cellset program can be reorganized as follows:
Setting a cell value to null deletes the data in the cell, as the statements in C3 and B4 show. After the statement in C3 clears the value of A2 and the statement in B4 clears the data in A3, which references the food order records, the original food orders are removed from memory.
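The point that the food orders disappear only after both A2 and A3 are cleared can be illustrated with a small Python analogy. This is a sketch under assumptions: the Record class and the variable names are hypothetical stand-ins for the cells, and the immediate release shown here relies on CPython reference counting. A garbage-collected runtime may reclaim the memory later, but the rule is the same: data becomes reclaimable only when nothing references it.

```python
import weakref

class Record:
    """Hypothetical stand-in for one order record."""
    def __init__(self, product, amount):
        self.product = product
        self.amount = amount

# a2 plays the role of cell A2 (food orders).
a2 = [Record("Coffee", 120.0), Record("Apples", 15.0)]
# a3 plays the role of cell A3: it references the same record objects.
a3 = list(a2)

# A weak reference lets us observe the record without keeping it alive.
probe = weakref.ref(a2[0])

a2 = None                                # like setting A2 to null
alive_after_a2 = probe() is not None     # a3 still holds the record

a3 = None                                # like setting A3 to null
alive_after_a3 = probe() is not None     # no holder left, record released

print(alive_after_a2, alive_after_a3)
```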

The T.reset() function used in B3 works a little differently: it deletes all records in the table sequence but retains its data structure. After B3 is executed, the value of A1 is as follows:

We can choose either method for deleting cell values as needed. Setting the cell value to null is more common; T.reset() is used only when the table sequence’s data structure really needs to be retained.
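The difference between the two methods can be pictured with a toy table type in Python. This is an analogy only and not how esProc implements T.reset(): the Table class, its reset method, and the field names are hypothetical, chosen to show "records gone, structure kept" next to "everything gone".

```python
class Table:
    """Hypothetical table: a list of field names plus a list of rows."""
    def __init__(self, fields, rows=None):
        self.fields = list(fields)
        self.rows = list(rows or [])

    def reset(self):
        # Drop every row but keep the field definitions.
        self.rows.clear()

a1 = Table(["Product", "Amount"], [("Fridge", 900.0), ("Kettle", 40.0)])
a2 = Table(["Product", "Amount"], [("Coffee", 120.0)])

a1.reset()   # like T.reset(): empty, but the structure is still usable
a2 = None    # like setting the cell to null: structure is gone as well

a1.rows.append(("Toaster", 60.0))   # the emptied table accepts new rows
print(a1.fields, a1.rows, a2)
```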

Note that although the statement in B5 sets the value of A4 to null, it does not reduce the memory footprint. The result in A5 is a record sequence whose records come from the table sequence in A4, so these records cannot be deleted and remain in use by A5 even after A4 is set to null. Therefore, before setting a cell value to null, we must check whether the data in the cell is still being used elsewhere.

In addition, although A5 sorts the records in A4, the sort does not produce new records. A5 stores only references to the existing records in sorted order; these references take little memory and add almost nothing to memory usage.
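Both observations about A4 and A5 have a close analogue in Python, where sorting a list yields a new list of references to the same objects. The sketch below is an illustration under assumptions (the Record class and the sample values are hypothetical), not a description of esProc internals.

```python
class Record:
    """Hypothetical stand-in for one order record."""
    def __init__(self, product, amount):
        self.product = product
        self.amount = amount

# a4 plays the role of the table sequence in A4.
a4 = [Record("Kettle", 40.0), Record("Coffee", 120.0), Record("Fridge", 900.0)]
# a5 plays the role of A5: the same records, referenced in sorted order.
a5 = sorted(a4, key=lambda r: r.product)

# Every element of a5 is one of the objects in a4, not a copy.
shares_records = all(any(r is s for s in a4) for r in a5)

a4 = None   # like setting A4 to null

# The records are still reachable through a5, so they are not released.
products_after_clear = [r.product for r in a5]
print(shares_records, products_after_clear)
```

Clearing the source variable here removes only one list of references; the records themselves stay in memory for as long as the sorted list needs them.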