September 3, 2013

Powerful and Easy-to-use Data Computation Tool

esProc is a development tool for database computation. The esProc IDE is simple and easy to use, and offers higher development efficiency than SQL. It is especially well suited to reaching complex computational goals, serving as the data source computation tool for reports, or acting as the data computation layer of applications.

Characteristics: Tailored for Database Computation

1. Its basic data type supports structured data

The TSeq is the most common basic data type in esProc. As the result set of esProc, a TSeq is a set of structured records, just like a SQL result set. However, a TSeq is superior to a SQL result set in many respects because it supports access by sequence number, generic data, and explicit sets. The TSeq can be used for common computations on structured data, and is particularly suited to simplifying complex database computation.

2. Its syntax is tailored for database computation

The syntax of esProc is agile and efficient, and is designed specifically for database computation. For example:

Basic filtering:
Dichotomic filtering:

Dichotomic filtering and filtering out the complementary set:

Getting the amount field for the last record: A1.m(-1).(amount)

The intersection of 2 result sets: A1^B1
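For readers who do not know esProc, the operations listed above can be approximated in Python. This is only a rough sketch of the ideas, not esProc code: the sample records, the `id` field, and the filter threshold are assumptions made for illustration, and only the `amount` field comes from the text.

```python
# Hypothetical sample result sets; only the "amount" field appears in the article.
a1 = [
    {"id": 1, "amount": 120.0},
    {"id": 2, "amount": 80.0},
    {"id": 3, "amount": 200.0},
]
b1 = [
    {"id": 2, "amount": 80.0},
    {"id": 3, "amount": 200.0},
    {"id": 4, "amount": 50.0},
]

# Basic filtering: keep the records whose amount exceeds an assumed threshold.
large = [rec for rec in a1 if rec["amount"] > 100]

# Amount field of the last record, comparable to A1.m(-1).(amount).
last_amount = a1[-1]["amount"]

# Intersection of two result sets, comparable to A1^B1.
# Records are compared by value and the order of a1 is preserved.
intersection = [rec for rec in a1 if rec in b1]
```

The esProc expressions are shorter because positional access and set operators are built into the TSeq, whereas the Python version spells each one out.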

The loop functions of esProc can simplify and streamline complex SQL, for example:

Add the computed column LRR, holding the yearly link relative ratio of amount: A1.derive(amount/amount[-1]-1: LRR)

Compute the moving average of the most recent 3 days, and assign it to a column (ma= ~{-1,1}.( amount).avg())
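The two calculations above can be sketched in Python to show what they compute. This is an illustration under assumptions, not a description of esProc internals. The sample data and the function names `with_lrr` and `moving_average` are hypothetical. The first record has no predecessor, so the sketch leaves its LRR as `None`, and the moving-average window is assumed to cover the previous, current, and next record, shrinking at the edges.

```python
# Hypothetical yearly amounts used only for illustration.
rows = [
    {"year": 2010, "amount": 100.0},
    {"year": 2011, "amount": 110.0},
    {"year": 2012, "amount": 99.0},
    {"year": 2013, "amount": 132.0},
]

def with_lrr(records):
    """Return copies of the records with an added LRR column:
    amount divided by the previous record's amount, minus 1."""
    result = []
    for i, rec in enumerate(records):
        if i == 0:
            lrr = None  # assumption: no previous record, so no ratio
        else:
            lrr = rec["amount"] / records[i - 1]["amount"] - 1
        result.append({**rec, "LRR": lrr})
    return result

def moving_average(records, before=1, after=1):
    """Average amount over a window from `before` records back to
    `after` records ahead; the window is truncated at both ends."""
    averages = []
    for i in range(len(records)):
        window = records[max(0, i - before): i + after + 1]
        averages.append(sum(r["amount"] for r in window) / len(window))
    return averages
```

In both cases the point is that each record needs to refer to its neighbours by relative position, which is awkward to express in plain SQL.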

3. Convenient database functions
In the esProc IDE, users can compose SQL statements directly and take advantage of the database structure browser, the SQL wizard, and other convenience features. The figure is shown below:

4. Retrieve and modify structured data directly

Structured data is stored mainly in databases, and partly in Excel and Txt files. esProc supports direct retrieval and modification for all three types of data.

For example, first retrieve the order information from the database and the client details from a Txt file. Second, compute which clients have bought every kind of product. Last, write the result to an Excel file.
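The core of this example, finding the clients who have bought every kind of product, can be sketched in Python. The sketch makes several assumptions: the orders and client details are hypothetical in-memory data rather than a real database and Txt file, and the result is written as CSV text with the standard library instead of to an Excel file.

```python
import csv
import io

# Hypothetical order records (client id, product) standing in for the database query.
orders = [
    ("C1", "apple"), ("C1", "pear"),
    ("C2", "apple"),
    ("C3", "pear"), ("C3", "apple"), ("C3", "apple"),
]

# Hypothetical client details standing in for the Txt file.
clients = {"C1": "Alice", "C2": "Bob", "C3": "Carol"}

def clients_with_all_products(order_rows):
    """Return the sorted ids of clients whose purchases cover every product."""
    all_products = {product for _, product in order_rows}
    bought = {}
    for client_id, product in order_rows:
        bought.setdefault(client_id, set()).add(product)
    return sorted(cid for cid, items in bought.items() if items == all_products)

def to_csv(client_ids, client_names):
    """Render the result as CSV text (a stand-in for the Excel output)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["client_id", "name"])
    for cid in client_ids:
        writer.writerow([cid, client_names[cid]])
    return buffer.getvalue()

result = clients_with_all_products(orders)
report = to_csv(result, clients)
```

The set comparison is the step that is hard to write in a single SQL statement, since it requires comparing each client's purchases against the full product set.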

Characteristics: Easy-to-use Development Environment

1. Grid-style cellset

esProc scripts are stored in a grid-like cellset, which saves the effort of formatting and makes scripts clear and readable by nature. For example, scripts are automatically aligned to the ruler. When composing judgment statements, loop statements, and other long statements, users benefit from indentation, which shows the scope of each computation at a glance. The figure is shown below:

2. Step-by-step computation

Step-by-step computation means decomposing a computational goal into several simple steps. This is the most effective way to solve a complex computation, and support for it is the most important measure of a tool's ability to do so. Because SQL lacks support for step-by-step computation, complex computational goals are quite difficult to achieve with it. With its grid style, esProc makes step-by-step computation and result reuse easy.

As shown above, the data filtering in cell B5 is completed easily. Click B5 to view its result directly on the right. Other cells, B6 for example, can reference the result of B5 directly by cell name, with no need to define variables. B6 groups and summarizes the result of B5, and B7 can continue to work on B6. In this way, interactive computation can proceed continuously. Each step only requires one simple computation, and together the steps reach the complex computational goal.
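The flow from B5 to B7 can be imitated in Python by giving each intermediate result its own name, much as each cell holds its own result. The sketch below is an analogy only. The sample orders, the filter condition, and the ranking performed in the last step are assumptions, since the text does not say what the cells actually compute beyond filtering and grouping.

```python
from collections import defaultdict

# Hypothetical order data used only for illustration.
orders = [
    {"client": "C1", "amount": 120.0},
    {"client": "C2", "amount": 40.0},
    {"client": "C1", "amount": 90.0},
    {"client": "C3", "amount": 300.0},
    {"client": "C2", "amount": 75.0},
]

# Step 1 (like cell B5): filter the data with an assumed condition.
b5 = [o for o in orders if o["amount"] >= 50]

# Step 2 (like cell B6): group and summarize the result of step 1.
totals = defaultdict(float)
for o in b5:
    totals[o["client"]] += o["amount"]
b6 = dict(totals)

# Step 3 (like cell B7): keep working on the result of step 2.
# Ranking clients by total is an assumed follow-up computation.
b7 = sorted(b6.items(), key=lambda item: item[1], reverse=True)
```

Each intermediate value can be inspected on its own, which is the property the cellset provides without requiring the user to name variables.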

3. Debug functions
esProc provides full debugging functions, with support for breakpoints, single stepping, and run to cursor, as shown in the figure below:

Many long SQL/SP statements can only be composed and comprehended as a whole, and their internal working details cannot be monitored, so debugging them is not practical. For example, esProc can implement grouping and summarizing in 2 separate steps, while SQL cannot.

4. Instant computation mode

esProc supports an instant computation mode that computes automatically after each scripting step. The result appears automatically in the result column on the right. With the instant computation mode, users can script and monitor the result at the same time, so the algorithm for the next step can be decided on the basis of the characteristics of the result. The computational procedure becomes more focused, and scripting becomes smoother and more natural. Neither R, SQL, nor other computational utilities have this ability.

5. JDBC output

Since esProc supports the JDBC interface, other tools (reporting tools, for example) or Java programs can retrieve results from esProc through JDBC. From the perspective of code reuse and maintenance, esProc can serve as a loosely coupled data computation layer in an application. In terms of performance, esProc can offload complex data computing from the database, greatly relieving the pressure on the database server.

6. Big data computation

esProc implements the Hadoop interface, can be called by MapReduce, allows easy reading from and writing to HDFS, and supports big data computing directly. esProc is especially suited to complex data computation, which makes it superior to other big data computation tools.

7. Plotting arbitrarily
esProc supports graphic plotting functions and a graphic parameter editor. Not only can common statistical charts be generated directly, but the underlying chart element control functions are also available to esProc users, enabling them to plot personalized charts of any kind.