esProc, A Script Language for Data Analytics with Parallel Mechanism: esProc Divides Letters into Segments

The RBP_VALUE field in the rbd_dtl table stores the string “all” and some letters. We want to convert values in the field to strings of segments. The desired result is A-F,H,J-K,N-O,Q-S,U-V,X-Z.

Below is the original data：

Rules to be followed are：

If RBD_VALUE= “ALL”, then discard the value;

According to the alphabetical order if certain RBD_VALUE values are consecutive letters , then make them a segment with both the first and last letter included and “-” between them, like “A-F”;

If two values of RBD_VALUE are inconsecutive, like K and N between which L and M are absent, then unite the former with the letter or letters before it and join the latter with the letter or letters after it. Separate the two segments with a comma. This is like “J-K”, “N-O”;

If a value in RBD_VALUE has not consecutive letters both before and after it, then take it as a separate segment, like the letter H. G and I are absent respectively before and after it.

esProc approach：

A1: Execute the SQL statement.

A2: Align RBD_VALUE field in A1 with the alphabet. Below is a selection of the result:

A3: Group rows in A2. Group rows where RBD_VALUE is null together and put the other rows where RBD_VALUE is non-null into another group. @o means performing merge-style data grouping, which only compares the neighboring values. The result is as follows:

A4: Select the group in which RBD_VALUE is non-null.

A5: Compose strings according to the number of members in each group. ~.m(-1) means getting the last record from the current group. The result is as follows: