However, the engineers found that the single-threaded program does not take full advantage of the server's computing power. In practice, esProc's multi-threading capability can make use of all the cores of a dual-core or quad-core server, or of a machine with even more CPUs. Moving from single-threaded to multi-threaded computation requires very little work.
The Operation Department provided the following requirements for computing user online time:
1. Login is the starting point of online time, and sessions that run overnight must be taken into account.
2. If the interval between two successive operations is less than 3 seconds, that interval is not added to the online time.
3. If, after login, the interval between two successive operations is longer than 600 seconds, the user is considered to have logged out.
4. If there is a login without a logout, the time of the last operation is treated as the logout time.
5. For a user who completed a post operation, the current online time is tripled in the computation.
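To make the five rules concrete, here is a minimal Python sketch of how they could be applied to one user's operations. It is not the esProc program from this series. The event format (`(unix_seconds, operation)` tuples sorted by time), the operation names `login`, `logout` and `post`, and the function name `online_seconds` are all assumptions made for illustration. The sketch also assumes that operations seen after a timeout are ignored until the next login, and that absolute timestamps are used, so an overnight session needs no special handling.

```python
from typing import List, Tuple

MIN_GAP = 3       # rule 2: intervals shorter than this are not counted
MAX_GAP = 600     # rule 3: a longer interval means the user has logged out
POST_FACTOR = 3   # rule 5: a session that contains a post is tripled


def online_seconds(events: List[Tuple[int, str]]) -> int:
    """Return the online time in seconds for ONE user.

    `events` is a hypothetical list of (unix_seconds, operation) tuples,
    sorted by time. Absolute timestamps make overnight sessions work
    without extra logic (rule 1).
    """
    total = 0           # accumulated online time over all sessions
    session = 0         # online time of the session in progress
    posted = False      # whether the session in progress contains a post
    in_session = False
    prev = 0            # timestamp of the previous operation

    def close_session() -> None:
        nonlocal total, session, posted, in_session
        total += session * (POST_FACTOR if posted else 1)
        session, posted, in_session = 0, False, False

    for ts, op in events:
        # Rule 3: a gap above 600 s ends the session at the previous operation.
        if in_session and ts - prev > MAX_GAP:
            close_session()
        if op == "login":
            if in_session:          # assumption: a new login closes the old session
                close_session()
            in_session = True       # rule 1: login starts the clock
        elif in_session:
            gap = ts - prev
            if gap >= MIN_GAP:      # rule 2: skip very short intervals
                session += gap
            if op == "post":
                posted = True
            if op == "logout":
                close_session()
        prev = ts

    # Rule 4: no logout, so the last operation acts as the logout.
    if in_session:
        close_session()
    return total


if __name__ == "__main__":
    sample = [(0, "login"), (2, "view"), (102, "post"),
              (1000, "view"), (2000, "login"), (2050, "logout")]
    print(online_seconds(sample))
```

Because every user is computed independently of all others, this per-user logic is what makes the later split into parallel tasks straightforward.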
To move from single-threaded computing to parallel computing, the following steps are needed:
The first step: Adjust the log file preprocessor to use the @g option of the export function, so that one week of log data is written into a segmented binary file. In the subsequent parallel processing, the log data can then be retrieved block by block for different users. The @g option ensures that segmented retrieval is aligned to group boundaries, so that data for the same user is never assigned to two blocks. The actual procedure is as follows:
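Separately from the esProc script, the idea behind group-aligned segmentation can be illustrated with a small Python sketch. It does not show how esProc implements @g internally. The function name `split_by_group`, the in-memory record format, and the simple "fill each block up to its share" strategy are assumptions chosen for illustration. The input is assumed to be already sorted by user ID.

```python
from itertools import groupby
from typing import Callable, List, Sequence, Tuple

Record = Tuple[str, int]  # hypothetical record: (user_id, timestamp)


def split_by_group(records: Sequence[Record],
                   n_blocks: int,
                   key: Callable[[Record], str] = lambda r: r[0]
                   ) -> List[List[Record]]:
    """Split records (sorted by user ID) into n_blocks blocks.

    A block boundary is only ever placed between two groups, so all
    records of one user land in the same block.
    """
    target = len(records) / n_blocks          # ideal block size
    blocks: List[List[Record]] = [[] for _ in range(n_blocks)]
    current = 0
    for _, group in groupby(records, key=key):
        # Advance to the next block once the current one holds its share,
        # but never in the middle of a group.
        if current < n_blocks - 1 and len(blocks[current]) >= target:
            current += 1
        blocks[current].extend(group)
    return blocks


if __name__ == "__main__":
    log = [("a", 1), ("a", 2), ("a", 3), ("b", 4), ("c", 5), ("c", 6)]
    for number, block in enumerate(split_by_group(log, 2), start=1):
        print(number, block)
```

Blocks produced this way can differ in size when groups are uneven, which is the price of never splitting a user's data.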
The second step: Rewrite the online time computing program as a parallel subroutine. The part in the following red box is what needs to be modified for parallel processing. Because different parallel tasks compute for different users, very few changes are required. The only change is to read different blocks of the binary file instead of reading separate files.
First, we add parameters to the subroutine so that the main program can pass in the log file name for the week, the block number, and the total number of blocks when it calls the subroutine.
Then we modify the program as follows:
The screenshot above illustrates that:
1. Because export@g was previously used to write the file grouped by user ID, the cursor's @z option, which handles a specific block (the value is the block number) out of the total (the value is the total number of blocks) of the file, as shown in the red box, retrieves complete groups for the same user ID. Data for one user is never split across two blocks.
2. A16 returns the resulting file as a cursor to the main program.
The third step: Write the main program for parallel computing, which calls the parallel computing subroutine. Because the server CPU has 8 cores in total, the IT engineers decided to use six threads for parallel computing. This takes advantage of the multi-core CPU to improve performance.
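The shape of such a main program can be sketched in Python as follows. This is an analogy, not the esProc main program: `process_block` is a hypothetical stand-in for the parallel subroutine, the blocks are small in-memory lists rather than segments of a binary file, and one block per worker is an assumption. Note also that in CPython, threads do not speed up CPU-bound pure-Python code; swapping in `ProcessPoolExecutor` is the usual choice in that case.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

N_WORKERS = 6  # the article's choice on an 8-core server

Block = List[Tuple[str, int]]  # hypothetical: (user_id, online_seconds) rows


def process_block(block: Block) -> Dict[str, int]:
    """Hypothetical stand-in for the subroutine: total seconds per user."""
    result: Dict[str, int] = {}
    for user, seconds in block:
        result[user] = result.get(user, 0) + seconds
    return result


def run_parallel(blocks: List[Block], workers: int = N_WORKERS) -> Dict[str, int]:
    """Dispatch one task per block and merge the partial results."""
    merged: Dict[str, int] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(process_block, blocks):
            # A plain update is enough because no user appears in two blocks.
            merged.update(partial)
    return merged


if __name__ == "__main__":
    demo_blocks = [[("a", 10), ("a", 5)], [("b", 7)], [("c", 1), ("d", 2)]]
    print(run_parallel(demo_blocks))
```

The merge step stays trivial only because the blocks are aligned to user groups; if a user could span two blocks, the partial results would have to be summed per user instead.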
Note: for specific measurements of esProc's performance gain with parallel computing, please refer to the related test reports for esProc.
With this requirement met, the IT engineers at the Web Company face a new problem: the number of users of the online application has grown explosively. Colleagues from the Operation Department complain that the online time computation program still runs too slowly. The single-machine, multi-threaded approach can no longer improve computing speed significantly. Can these IT engineers solve the performance issue effectively with esProc's multi-machine parallel computing capability? Is it too costly to move to a multi-machine parallel mode? See "Computing the Online Time for Users with esProc (IV)".