June 18, 2014

Benefits or problems? Data computing of stored procedure

It owns higher performance in big data computing. The efficiency of database IO is generally poor. 

To perform computation after moving a large amount of data out of database will lead to a higher IO cost. However, computing by stored procedure will retain data in the database to get a better performance.

1. It is harder to program and debug stored procedure; there is no good IDE for developing stored procedure.

2. It is difficult to be transplanted; as the stored procedure strongly depends on the database syntax, it won’t work in another type of DB.

3. It is harder to be managed; you need to compile the stored procedure in advance, also have to bother dba when adding or modifying temporary computations. And besides, the stored procedures in the database are hardly managed in hierarchies, so easy to mess up over a long time.

4. For a large amount of traversal involved computation, its performance is also not good. The performance of interpreted stored procedure is far lower than ordinary high-level languages. It only benefits from the fact that it does not need to move the data out of database for computing.

In a word, the stored procedure could do more harm than good. In principle, if the computation is not involving a large amount of data, it would be better not to use it if it is not necessary; even if necessary, you should try to avoid using it.

In addition to ordinary high-level languages, alternative solution for stored procedure may also be esProc, a script language customized for structured data computing. esProc enables all kinds of SQL-like computing functions, and is much easier to be programmed and debugged than stored procedure. Besides, esProc can execute computing out of database in order to make management more convenient.