Test data preparation is a critical part of
software testing. High-quality test data simulates the business scenario more
faithfully, which helps testers evaluate software performance promptly and
effectively and find potential issues in software builds. Most of the time,
the amount of data used in testing is relatively large, and the data must be
randomly generated according to specific requirements. Sometimes the data items
are related to one another, or some of the data must be retrieved from an
existing database. As a result, preparing test data often involves considerable
complexity and a heavy workload.
esProc is a handy tool for test data preparation.
Now we need to prepare test data for employee
information in text format, including employee number, name, gender, date of
birth, and city and state of residence. This example shows how test data can
be prepared.
We have the following requirements for the test
data: employee numbers are generated sequentially. Names and genders are randomly
generated. Birthdays are randomly generated, but the current age of each
employee must be between 18 and 55. Cities and states are
randomly picked from a table in the database.
Three text files, Top100MaleNames.txt, Top100FemaleNames.txt
and Top100Surnames.txt, store the 100 most common male names, female names,
and surnames respectively.
The cities of employees need to be
retrieved randomly from the CITIES table in the database:
According to the STATEID field in the CITIES
table, we can retrieve the state abbreviations for the employees from the STATES
table:
The code for preparing test data is as follows:
The following is an explanation of the code in the cellset.
The first two lines generate the raw name
data. Note that when generating employee information, an employee's name is
related to his or her gender. Therefore we first retrieve the text
data, combine the most common male and female names, and add a gender
field to them:
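The esProc cells for this step are not reproduced in this text. As a rough Python analogue (not the esProc code itself), the sketch below tags each name with a gender and merges the two lists into one pool. It assumes each file holds one name per line, which the description above does not state, and the helper names `tag_names` and `build_name_pool` and the gender codes `"M"` and `"F"` are hypothetical.

```python
def tag_names(lines, gender):
    """Turn an iterable of text lines into name/gender rows, skipping blank lines."""
    # Assumption: one name per line; surrounding whitespace is not significant.
    return [{"name": line.strip(), "gender": gender} for line in lines if line.strip()]


def build_name_pool(male_lines, female_lines):
    """Combine male and female names into a single pool that carries the gender."""
    # "M" and "F" are illustrative codes, not values taken from the original data.
    return tag_names(male_lines, "M") + tag_names(female_lines, "F")
```

In practice the two arguments could be open file objects for Top100MaleNames.txt and Top100FemaleNames.txt, since iterating a text file yields its lines. Keeping name and gender in the same row means that one random pick later yields a consistent pair.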
After the data is arranged, C2 holds
the following table sequence of names and genders:
Similarly, cities and state abbreviations
are related. After the data is retrieved from the database in line 3, the
state abbreviations are added to the city information:
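To illustrate the idea outside esProc, here is a minimal Python sketch of attaching a state abbreviation to each city through the shared STATEID field. Only STATEID comes from the description above; the `ABBR` and `STATE` column names and the function name `add_state_abbr` are assumptions, and the rows are plain dictionaries rather than real database results.

```python
def add_state_abbr(cities, states):
    """Return copies of the city rows with the matching state abbreviation added."""
    # Build a lookup from STATEID to abbreviation once, so each city is resolved in O(1).
    # "ABBR" is an assumed column name for the abbreviation in the STATES table.
    abbr_by_id = {state["STATEID"]: state["ABBR"] for state in states}
    # Cities whose STATEID has no match get None instead of raising an error.
    return [{**city, "STATE": abbr_by_id.get(city["STATEID"])} for city in cities]
```

Doing the lookup once, before the generation loop, means that picking a random city later also gives the correct state without another query.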
Then line 4 prepares the basic information for
generating data, including the data structure of the employee information
table and the number of test records to generate:
Here, the number in C4 defines the
cache size: after every 1,500 records are generated, the data is
written to the text file, which keeps memory use under control.
B5 writes the data structure of the employee information table to the text
file.
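The same buffering idea can be sketched in Python: write the column header first, collect rows in memory, and flush them to the file each time the buffer reaches the cache size. This is an illustration under assumptions, not the esProc implementation; the function name `write_in_batches`, the tab delimiter, and the returned flush count are all choices made for this sketch.

```python
import csv


def write_in_batches(rows, out, fieldnames, batch_size=1500):
    """Write rows to a text stream in batches and return the number of flushes."""
    # Assumption: tab-separated text with a header line describing the structure.
    writer = csv.DictWriter(out, fieldnames=fieldnames, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    buffer = []
    flushes = 0
    for row in rows:
        buffer.append(row)
        if len(buffer) >= batch_size:
            writer.writerows(buffer)
            buffer.clear()  # release the rows so memory use stays bounded
            flushes += 1
    if buffer:  # write whatever is left after the last full batch
        writer.writerows(buffer)
        flushes += 1
    return flushes
```

Because `rows` can be a generator, at most `batch_size` records are held in memory at any time, regardless of how many records are generated in total.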
Next, a loop from
line 6 to line 15 generates the test data for each employee:
B6 generates a random sequence number that refers to a name, while C6
generates one for a surname. They are used to generate the name and gender of
an employee. According to the requirements, B9 randomly generates the age, and
line 10 selects a random date in the corresponding year as this employee's
birthday. Lines 11 and 12 randomly select a city and get the city and state
for the employee. After the required data is generated, B13 adds it to the
employee information table sequence created in A4. C14 controls the data
output, writing data to the text file after every 1,500 records are generated.
After each output, A4 is cleared to avoid excessive memory use.
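The per-employee step can be sketched in Python as follows. This is a hedged illustration rather than a translation of the cellset: the function and field names are hypothetical, the name rows are assumed to look like `{"name": ..., "gender": ...}`, and the city rows like `{"NAME": ..., "STATE": ...}`. One deliberate difference is that the birthday is drawn from the one-year window that yields exactly the chosen age on the reference date, a slightly stricter variant of picking any date in the corresponding year, so the 18 to 55 requirement holds on every day of the year.

```python
import random
from datetime import date, timedelta


def years_before(d, years):
    """Return the date `years` years before d, mapping Feb 29 to Feb 28 if needed."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:  # d is Feb 29 and the target year is not a leap year
        return d.replace(year=d.year - years, day=28)


def age_on(birthday, today):
    """Completed years between birthday and today."""
    years = today.year - birthday.year
    if (today.month, today.day) < (birthday.month, birthday.day):
        years -= 1
    return years


def random_birthday(age, today, rng):
    """Pick a random birth date for someone who is exactly `age` years old today."""
    latest = years_before(today, age)  # reaches `age` today
    earliest = years_before(today, age + 1) + timedelta(days=1)  # still `age` today
    return earliest + timedelta(days=rng.randint(0, (latest - earliest).days))


def make_employee(eid, names, surnames, cities, today, rng):
    """Build one employee record from the prepared name, surname, and city pools."""
    person = rng.choice(names)  # the name and its gender come from the same row
    city = rng.choice(cities)   # the city and its state come from the same row
    age = rng.randint(18, 55)   # both bounds are inclusive
    return {
        "EID": eid,
        "NAME": person["name"],
        "SURNAME": rng.choice(surnames),
        "GENDER": person["gender"],
        "BIRTHDAY": random_birthday(age, today, rng),
        "CITY": city["NAME"],
        "STATE": city["STATE"],
    }
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is often useful when a failing test has to be replayed against the same data.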
After all the data is output, the text file is as follows:
When preparing test data with esProc, we can run a loop to generate a large amount of random data. Within the loop, we can easily use existing database data or text data to generate data that meets business needs, without writing complex programs.