Reporting of data load errors
Case: IBM WebSphere Commerce

Tomáš Varčok

Bachelor's thesis
December 2016
School of Technology, Communication and Transport
Degree Programme in Information and Communications Technology

Description

Author(s): Varčok, Tomáš
Type of publication: Bachelor's thesis
Date: December 2016
Language of publication: English
Number of pages: 47
Permission for web publication: yes
Title of publication: Reporting of data load errors. Case: IBM WebSphere Commerce
Degree programme: Information and Communications Technology
Supervisor(s): Salmikangas, Esa
Assigned by: Solteq Oyj

Abstract

The assigner, Solteq Oyj, has been struggling with the process of handling errors that occur while data is loaded into the IBM WebSphere Commerce web store solution. This process has required a great deal of manual work on a daily basis; therefore, it was proposed to identify the possibilities of improving it.

At first, the IBM WebSphere Commerce Data Load utility had to be studied and understood in order to identify the available options. After an evaluation and decision-making process, it was possible to continue to planning, designing and developing the improvements.

The main functionality was implemented in the form of a module connected to the Data Load utility. The programming languages Java and Python were utilized during the implementation phase.

The thesis resulted in a major automation possibility. Based on the provided configuration, the module is, to a certain extent, capable of performing all the previously manual steps, such as gathering information, analyzing it and forming understandable, self-explanatory error messages. The employees can then focus on other tasks, which offers the possibility of saving time and reducing costs to a considerable extent.

A solution to a real problem was provided, and the results are very beneficial for the company. Exploring and understanding the Data Load utility also produced a great deal of useful knowledge for future use.

Keywords/tags (subjects): data processing, automation, web store, e-commerce

Contents

1 Introduction
  1.1 Objective
  1.2 Hosting company
  1.3 Outline of the thesis
2 IBM WebSphere Commerce
  2.1 E-commerce
  2.2 IBM WebSphere Commerce
3 Data Load utility in WebSphere Commerce
  3.1 Main data entities
  3.2 Data Load utility
    3.2.1 Utility in general
    3.2.2 Working with the Data Load utility
    3.2.3 Architecture of the Data Load utility
    3.2.4 Log files
    3.2.5 Data load errors
    3.2.6 Handling of data load errors
4 Assignment
  4.1 Current situation and background
  4.2 Initial idea for improvement
  4.3 Initial requirements
5 Research and planning
  5.1 Simple architectural overview
  5.2 Choosing the best approach
    5.2.1 Available options
    5.2.2 Post-processing the log file
    5.2.3 Making modifications directly to the Data Load utility
    5.2.4 Chosen solution
  5.3 Vision of the solution after the exploring and planning phase
  5.4 Technical planning
6 Implementation
  6.1 Used software development methodologies
  6.2 Connecting the module to the Data Load utility
    6.2.1 Architecture analysis
    6.2.2 Connecting in the mediators
    6.2.3 Connecting in a central place of the utility
    6.2.4 Decided solution
  6.3 Functional analysis (algorithm)
  6.4 Components
    6.4.1 Module overview
    6.4.2 Connecting components
    6.4.3 Processing components
    6.4.4 Assisting components
    6.4.5 Configuration components
    6.4.6 Configuration testing tool
7 Results
  7.1 Current status and results
  7.2 Putting into use
8 Conclusion
References

Figures

Figure 1. Data flow
Figure 2. User roles' interaction with the utility and flow of the processes
Figure 3. Data Load utility architectural overview
Figure 4. Example of an entry in a log file
Figure 5. Part of a default error message written to the log in case of an error
Figure 6. Initial idea for providing more understandable error reports
Figure 7. Simple overview of the data load solution architecture
Figure 8. Class diagram of an important part of the data load solution
Figure 9. Activity diagram of the module's algorithm
Figure 10. Processing of the message part of the algorithm
Figure 11. Log file script post-processing part of the algorithm
Figure 12. Schema of the module
Figure 13. Examples of messages used in the Dynamic parameters component
Figure 14. Processing of messages by the Dynamic parameters component
Figure 15. Error report with a list of affected items
Figure 16. Configuration of the module in the XML file
Figure 17. Configuration of known errors, reporting messages and other related settings for a specific error
Figure 18. Error where the source of the error is not clear
Figure 19. Information from the reporting e-mail

1 Introduction

1.1 Objective

Automation is a big trend today, since it saves time and other resources in general and, especially, money. Even though information technologies are in many cases the tool used to set up automation, they themselves also contain many processes and areas which can be automated.

One of the vital parts of running a web store is uploading data to it and keeping the current data up to date. Sometimes these operations have to be done on a daily basis. Data integration brings big benefits but, on the other hand, various challenges and issues as well. This also applies to the data loading process into the IBM WebSphere Commerce web store solution.

The current process of handling these kinds of errors in the hosting company Solteq Oyj consists of many manual procedures. Information is gathered from different sources and analyzed, and the problem is explained to the data provider by telling what happened, why it happened and what steps are needed to fix it.
Even though the same types of problems reoccur, the necessary steps have to be repeated in almost the same form every time, which means that the time and cost requirements of this process are not decreasing despite the growing quantity and quality of the available knowledge. For these reasons, the need for improvements emerged from the employees of the company.

To solve this problem, it would be very beneficial to introduce a solution automating all, or at least some, of the steps which are now performed manually. This solution should be able to collect all the necessary data and transform it into easily understandable error messages.

1.2 Hosting company

Solteq Oyj is a company providing various information technology solutions. It is a Finnish company with several sites in Finland and nowadays also in other European countries. It characterizes itself as a middle-sized company. Its customers include large Finnish companies as well as international clients.

1.3 Outline of the thesis

The thesis starts with an introduction of the product IBM WebSphere Commerce and its Data Load utility, which is the main point of interest in the whole complex solution. The problems of the data loading process are introduced there as well. The latter chapters explain the current process and why improvement was needed. Additionally, the analysis and the implementation of a new module providing certain improvements are discussed.

2 IBM WebSphere Commerce

2.1 E-commerce

E-commerce is the term commonly used for electronic commerce, an umbrella term covering all the areas belonging to or operating electronic business: offering and selling products or services using electronic communication, especially over the Internet. Nowadays the core of e-commerce consists of web stores providing an online option to browse a catalog of available products and services, make orders and payments, track the status of an order, etc. However, there are also other parts of e-commerce, for example online marketing, which tries to promote the online store and its offers in order to increase the number of sales and the generated profit. (What is e-commerce, 2016)

Today, e-commerce is not only about providing the basic options of browsing a catalog of products and being able to buy them. Much more complex solutions are currently used to satisfy the business needs of sellers of all sizes, from small businesses to huge retail companies. The online store is no longer a separate system operating on its own; it is strongly integrated with other systems and services. This cooperation of multiple software solutions brings a need for big data integration and migration. The easy flow of data brings benefits to both sellers and customers. Sellers benefit from less manual work and the automation of many different processes. This in turn brings faster processing and delivery of orders to end customers, which is a very important aspect and a competitive advantage over other online stores.

E-commerce at its highest level is divided into several groups according to the categorization of the buyer of the products or services. In relation to standard online stores, two of them are applicable. The first option is business to business (B2B), which means that both the seller and the buyer are business units of some kind. The other one is business to consumer (B2C), where the end customer is a single person.
There are many differences between these two areas, because the whole online store (the same applies to regular physical stores) should always be adjusted to its customers, their needs and their habits. (What is e-commerce, 2016)

2.2 IBM WebSphere Commerce

IBM® WebSphere® Commerce Enterprise is an omni-channel eCommerce platform that enables business-to-consumer (B2C) and business-to-business (B2B) sales to customers across channels—web, mobile, social, call center or store. It supports seamless marketing, selling and fulfillment with precision marketing, merchandising tools, site search, customer experience management, catalog and content management, social commerce and advanced starter stores. It dynamically optimizes content for various device types and formats including web, mobile and tablet. (WebSphere Commerce Enterprise, 2016)

IBM WebSphere Commerce, in its various versions, provides companies of all sizes with a complex set of tools for operating their e-commerce businesses. It is a huge software solution trying to provide all the options which sellers might need to reach their targets, be it a basic type of service like catalog management, various marketing tools, etc.

3 Data Load utility in WebSphere Commerce

3.1 Main data entities

The data model of an online store can be complicated. However, there are still main data entities which form the core of everything, both from the technical and the end user's point of view. Not all of them have to be used, or they can be used only to a certain extent, depending on the size of the shop, the seller-buyer relation (B2B or B2C), the range of offered products, etc.

Category
Categories (catalog groups) represent sets of products. They can be formed into a more complex architecture when subcategories are used.

Product
A product (catalog entry) is the central entity in the web shop. It has relations to many other entities, and what the end user sees as a product in the storefront is usually a combination of all the related entities.

Attributes, attachments
A product can be described by different attributes (length, color, etc.), which are loaded separately, and the product can then have different values of these attributes (105 cm, blue). Also, attachments such as specifications or documents with instructions can be provided to end customers.

Prices
As in the real world, several types of prices can be found in an online store as well, differentiated especially by their relation to taxes. It is also very usual that not all potential buyers see the same price, as the seller can specify different discounts from regular prices, or entirely different prices for each customer (especially in B2B stores).

Customers (organizations)
Customer organizations are used in so-called B2B (business-to-business) stores, where the end customer is another organization and not a single private person.

Users
The roles and capabilities of the users of WebSphere Commerce strongly depend on the type of the store from a business point of view, i.e. whether it is a B2B or a B2C store. When the user is the end customer, he/she can do basically all the operations in the web store, in particular buying products. In B2B stores, all the users of the application always have to belong to some organization, for which they can make purchases, see already placed orders, etc. Users within organizations can have different roles, which specify the actions they can perform in the web store.
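To make the relations between these entities concrete, the following sketch models them as plain Java types. The names and fields are illustrative simplifications for the discussion in this chapter, not the actual WebSphere Commerce data model, which is far richer.

import java.math.BigDecimal;
import java.util.List;
import java.util.Map;

// Illustrative simplification of the core entities; not the real
// WebSphere Commerce schema. A product references categories by code,
// carries attribute values and can have several prices.
record Category(String code, String parentCode) {}

record Price(BigDecimal amount, boolean taxIncluded, String customerSegment) {}

record Product(String partNumber,
               List<String> categoryCodes,           // references to Category.code
               Map<String, String> attributeValues,  // e.g. "color" -> "blue"
               List<Price> prices) {}

The point of the sketch is the web of references: a product is only complete when the categories, attributes and prices it refers to exist as well, which is exactly what makes loading this data error-prone.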
3.2 Data Load utility

3.2.1 Utility in general

The Data Load utility is an enhanced business object based loading utility. This utility provides an efficient solution for loading information into your WebSphere Commerce database. You can also customize the Data Load utility to load other types of data. The Data Load utility is the recommended loading utility. (Overview of the Data Load utility, 2016)

The Data Load utility is a set of components responsible for loading data into an existing web store. Technically, it provides the capability of moving data from input sources to the database in the form and structure required by WebSphere Commerce. (See Figure 1.)

Figure 1. Data flow

Typically, the data load is a part of a more complex architecture consisting of various tools and programs with different responsibilities. Their cooperation then finally provides the needed behavior.

In real use, it is not enough to get some data and load it into the database, because the source data can come from multiple systems, usually ERP (Enterprise Resource Planning) systems. Those systems are often not capable of providing data in the format requested by the Data Load utility, which brings the need to make modifications or to put another system in place to take care of translating the source data into the needed format and structure. After the translation, the data can be delivered to the data load solution itself, and loading into the database can start. The area between the translation and the start of the data load has space for various adjustments and modifications, and different programs, scripts or software solutions can be placed there.

A certain set of data is always given to the Data Load utility. The data is loaded in groups of entity types, in the order required by the relations between them. For example, all the categories have to be loaded before the products which reference them.

3.2.2 Working with the Data Load utility

Several user roles are needed to successfully set up the Data Load utility, provide the data in the expected format, load the data and verify the results of the data load. The diagram in Figure 2 illustrates the different user roles and the flow of the processes. The user roles' responsibilities can be defined as follows:

Business user is responsible for managing the business data. Developer is responsible for defining the data source template, business object mappings, and customizing the Data Load utility. Site administrator is responsible for the day-to-day operation of the Data Load utility. (Overview of the Data Load utility, 2016)

Figure 2. User roles' interaction with the utility and flow of the processes (adapted from Overview of the Data Load utility, 2016)

As is visible in the diagram, there are many different processes regarding data handling in IBM WebSphere Commerce. All the steps and processes are explained in the following list.

1. The business user provides the developer with the business data.
2. The developer creates a data source template, which defines how the source data must be formatted before the data is loaded.
3. The developer also creates the business object configuration file. The business object configuration file defines how the Data Load utility maps the input data to the business object and how to transform the business object to physical data.
4. The site administrator uses the business object configuration file to define and create the load order configuration file.
5. The site administrator sets the store and database settings in the environment configuration file.
6. The business data is formatted according to the rules of the data source template before the data is loaded to the database.
7. The formatted source data is provided to the site administrator.
8. The site administrator runs the Data Load utility along with the three configuration files (environment, load order, and business object configuration files) to load the formatted source data into the WebSphere Commerce database. After the utility runs, the site administrator also verifies the results of the load.
9. The business data is available in WebSphere Commerce to be managed by the business user. (Overview of the Data Load utility, 2016)

3.2.3 Architecture of the Data Load utility

In order to work with the Data Load utility and do customizations, it is first crucial to understand the way it works and to know its structure (see Figure 3). There are several components working together within the Data Load utility to perform all of its tasks.

Figure 3. Data Load utility architectural overview (adapted from Data Load utility architectural overview, 2016)

Business object builder layer

The business object builder layer contains the data reader and the business object builder. The data reader is responsible for reading the raw data and passing it to the business object builder for processing and building business objects. The business object builder takes the data as input from the data reader, then populates and instantiates the business objects. Each business object is defined as a common entity throughout the WebSphere Commerce data model. In other words, you only have to understand a single representation of the data through the store front, authoring tools, and the data load infrastructure. (Data Load utility architectural overview, 2016)

The business object builder layer supports different types of input data sources. The data can come from CSV (comma-separated values) files, XML files, external databases and other systems (mainly ERP systems). CSV and XML data readers are provided out of the box with the Data Load utility. However, these readers work only with a specified data format. If the data which is about to be loaded into the WebSphere Commerce database has a different format, it is necessary to make some modifications to the configuration files or to create a new custom data reader. If the input data does not come stored in the CSV or XML format, a new custom data reader is always necessary.

Business object mediator layer

The business object mediator layer contains the business object mediator. The business object mediator converts the business objects into objects that represent the physical database schema, also referred to as physical objects. Several mediators are available for catalog, inventory, and price components. (Data Load utility architectural overview, 2016)

The Data Load utility also offers an option to be modified in a way that allows it to load data into any table using the TableObjectMediator. This might be useful when some custom table is created in the database for a WebSphere Commerce customization. The data there can then be loaded in the same way as for any other entity, which makes the process of regularly loading all types of data much simpler, because there are no exceptions, different processes, etc.

The ID resolver, which is a part of this layer, is used to retrieve the primary key of the physical object. The physical object represents a row in a table. If the object already exists in the database, its primary key is returned. If it does not, either a new primary key is returned (for this new physical object) or, in some cases, the resolution results in an error which stops the processing of the current object. This situation can occur if there is a reference to a non-existing object in the database, e.g. when a product being loaded refers to a category which does not exist in the database. The ID resolver then tries to obtain the primary key of this specific category entry, which results in an error.
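The described ID resolution behavior can be summarized in the following minimal Java sketch. The class and method names are hypothetical stand-ins chosen for illustration; this is not IBM's actual ID resolver API.

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the ID-resolution behavior described above;
// not IBM's actual ID resolver implementation.
class IdResolverSketch {
    private final Map<String, Long> existingKeys = new HashMap<>(); // identifier -> primary key
    private long nextKey = 1000L;

    // Resolve the primary key of a physical object. mayCreateNew is true when
    // the object itself is being loaded, false when it is only referenced by
    // another object being loaded.
    long resolve(String identifier, boolean mayCreateNew) {
        Long key = existingKeys.get(identifier);
        if (key != null) {
            return key; // the object already exists in the database
        }
        if (mayCreateNew) {
            long newKey = nextKey++;
            existingKeys.put(identifier, newKey); // a new physical object gets a fresh key
            return newKey;
        }
        // A referenced object (e.g. a product's parent category) does not exist:
        // resolution fails and processing of the current object stops.
        throw new IllegalStateException("Cannot resolve primary key for: " + identifier);
    }
}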
Persistence layer

The persistence layer saves the physical objects received from the previous layer into the database using data writers.

3.2.4 Log files

Each run of the Data Load utility produces a log file with the sum of all the information related to that specific run. This file is key for debugging problems with the data, the performance or the Data Load utility itself. An example of summary information from the log file can be seen in Figure 4. Some parts of this log entry were modified to hide sensitive information.

Figure 4. Example of an entry in a log file

The log shows the start and end times of the whole process as well as the duration of the loaders of each entity type. These can be useful for performance analysis. There is summary information about the amount of data that has been processed and how many business objects and database tables have been affected. The log also contains information about which operations were performed (inserting a new item, updating an existing one, removing one) and how many times, based on the provided data. (Data Load Utility in WebSphere Commerce Introduction, 2016)

If everything goes smoothly, the log does not have to be inspected very often. The main reasons to do so are monitoring performance, gathering statistical data, and planning and doing performance improvements. If any problem occurs while loading the data, error messages, exception messages and stack traces of exceptions are written into the log file (see Figure 5). The log file then needs to be checked properly, and all the important information there has to be investigated in order to identify the real cause of the problem.

Figure 5. Part of a default error message written to the log in case of an error

The messages in the log file can sometimes be quite clear to a person with better knowledge of this area; however, they are still not really understandable for a less skilled person, as they are not very user-friendly or easy to interpret. Some error messages can be a complicated case even for a specialist who knows the product, and it can take a long time to fully understand what is wrong and what the cause of the error is.

The Data Load utility is able to receive data mainly in the XML and CSV formats. Even though XML is worse from a performance point of view, it is better for investigating errors of the data loading process thanks to its human readability and better structure. Finding an error or any other kind of problem there is not so hard if the person knows what he/she is looking for. The CSV format brings much smaller amounts of data, which is a big benefit for the performance of the data load process. If the load runs without problems, it really is more beneficial; however, in the case of frequent errors in the data, investigation and data analysis become harder.
3.2.5 Data load errors

The term "data load error" is defined here to cover all the possible errors and problematic situations which can occur while running the Data Load utility. These errors can be distinguished into two main groups: errors not caused by data and errors related to data.

Data load errors not caused by data

There are several causes of errors which prevent the Data Load utility from handling the data correctly. The first group is not caused by the data itself. The main groups of causes are:

Wrong configuration: the Data Load utility provides many options for settings, and if it is not set up properly, this can cause different problems.

Incorrect customizations: different parts of the data load solution can be customized in many ways, and various problems can occur there, for example because of an error in the code.

Database problems: the database of IBM WebSphere Commerce is quite huge, and the amount of data can be big. It can happen that some database process or script is running at the same time as the data load, which can result in tables being locked and inaccessible for the data writers of the Data Load utility. In these cases, it is usually enough to re-run the data load with the same data a bit later.

Data load errors related to data

The amount, variety and structure of data for a web store are big and complex. There is different information related to single entities. For example, products can have relations to each other, multiple attachments, several prices linked to them, etc. In such a complex data model, the occurrence of problems can be quite high.

The Data Load utility is responsible for loading the data into the database, and if there are any restrictions on the data which are not met, this is the place where the problems will be found. Also, relations between entities are transformed into database relations between tables, and if there is an incorrect reference (to a non-existing entity, etc.), it will not be possible to load the data correctly. The most typical problems with data are:

Incorrect references: this problem arises when some part of the data currently being loaded (processed by the Data Load utility) refers to something that does not yet exist in the database. As an example, the relation between a product and the category it belongs to can be used. If the category referred to by the product does not exist, the cause can be a simple mistake in its name (or identifier), or the fact that even though the category had to be loaded before the product, this did not happen for some reason.

Incorrect format of data: for some information, there are specific restrictions on the format. An example of this kind of error can be a date or an e-mail address in an invalid format.
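As a compact summary of this taxonomy, the error causes named above can be expressed as a simple Java enumeration. This is only an illustrative classification aid for the reader, not a type used by the Data Load utility itself.

// The error causes described above, grouped by whether the data itself is at fault.
enum DataLoadErrorCause {
    WRONG_CONFIGURATION(false),
    INCORRECT_CUSTOMIZATION(false),
    DATABASE_PROBLEM(false),      // often solved by re-running the load later
    INCORRECT_REFERENCE(true),    // e.g. a product referring to a missing category
    INCORRECT_DATA_FORMAT(true);  // e.g. an invalid date or e-mail address

    private final boolean causedByData;

    DataLoadErrorCause(boolean causedByData) {
        this.causedByData = causedByData;
    }

    boolean isCausedByData() {
        return causedByData;
    }
}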
4 Assignment 4.1 Current situation and background After the previous introduction into WebSphere Commerce, the Data Load utility and other information, this chapter describes a specific case in hosting company. What was the situation before the idea for the topic of this work, how was it handled, what were the possible areas for improvements, etc. After some web store is developed and finished, the support team starts being in charge of taking care of any problems when the store is “live”, i.e. it is accessible, used by end customers and real purchases are made there. This live usage brings the necessity to update the data in the store sometimes even on a daily basis. Any prob- lem or delay with updating the data in running store is critical for the store owner and therefore, any data load errors have to be handled (investigated and solved) as soon as possible. However, there are many other responsibilities for the support team members as well and if the data load error is caused by data, sometimes some- body else responsible for preparing or generating the data has to act and solve it, which means that the process of solving data load errors can take some time. The current process of handling the data load errors is as follows. Data is received from a data provider, certain translations and modifications are done and then the data load runs. When the Data Load utility finishes with some error status, an e-mail is generated and sent with information about the incident. A ticket is created in an issue tracking system. Some member of the support team has to start investigating what happened and what is the cause of the problem. It usually starts with reading the log file, searching in the source data and also in the database. After the cause is identified either some actions are made on the support team side or, in case it is 19 needed, the problem has to be explained to the data provider with explanation what happened, why and what needs to be done to fix it. From the point of view of a member of the support team, in some case data load problems are complicated, however, on the other hand, some of the problems often reoccur in very similar form. The time to solve the error (find its cause) then varies very much, however, there is still certain time spent regardless of the error itself just by processing related activities, which means that even without complicated logs and not very helpful error messages, handling these errors can be quite demanding espe- cially from time consumption point of view and if there is any room for improve- ment, it is more than welcome. For some of the errors, there is already an existing document, which basically con- tains a table with information about error message written from the Data Load utility to the log and some instructions or information to that specific error; nevertheless, this document serves only as guidance for members of the support team and it has to be manually checked every time when needed. Look to the future shows that this situation will not get any easier as complexity and amount of data loads including related errors will most probably increase all the time. On a longer scale, this can result in higher workload on the support team members and longer times for solving problems in general. Anything that would make handling data load errors easier can, therefore, save time and costs. 
4.2 Initial idea for improvement

The basics of the idea for improvements were brought up by two employees: the company's data load specialist and the manager in charge of the support team. The perfect situation would be to have some sort of automation which would recognize errors and provide more understandable error messages. Figure 6 illustrates the basic idea of translating the default error message. A wish for some kind of improvement had existed for some time already, but a clear and precise idea of what could actually be done, and how, was missing. It was also not clear whether anything could be done at all, because the Data Load utility, as well as the whole WebSphere Commerce solution, is really complex, and many parts cannot be modified at all. However, both employees emphasized that any improvement making the work of the support team members easier is more than welcome.

Figure 6. Initial idea for providing more understandable error reports

The specialist of the company who is responsible for the data load area had already made certain improvements; however, they were usually only "ad hoc" (for a specific single case). He was also not aware of options for building a better solution, where exactly to place it, and so on.

4.3 Initial requirements

Because it was not clear what the possibilities of making any kind of improvement to the Data Load utility or the company's processes were, no very precise requirements for the final solution were specified at the beginning of the work. The main tasks could be grouped into these phases:

• studying and understanding the current process of handling data load errors,
• understanding and exploring the Data Load utility and the options for extending or modifying it,
• trying to identify options for improvements (especially technical ones) in order to achieve time and cost reductions,
• evaluating the previous phases and the identified improvements,
• planning, designing and developing the technical improvements.

In addition to the above, discussions and consultations were expected during the phases to keep the supervising persons up to date with the current progress and to eventually modify or add requirements or change the approach.

5 Research and planning

5.1 Simple architectural overview

The basic architecture of the used data loading solution can be represented by the following diagram (see Figure 7).

Figure 7. Simple overview of the data load solution architecture

The whole process of loading the data is started by custom data load scripts, which do certain activities before the Data Load utility is started; when everything is ready, they start the load via IBM's original dataload.sh script. This bash script starts the Java process of the Data Load utility itself. The log file is constructed during the loading process, and after the process is finished, it is given to a Python script which is responsible for post-processing the log file's content and also for picking up the information which will be placed into an e-mail informing about data load errors.

The places marked in blue were identified as fully accessible and modifiable. No major technical or license restrictions apply to them. The other parts might be modifiable to a certain extent; however, some restrictions or complications have to be taken into consideration.
5.2 Choosing the best approach

5.2.1 Available options

To achieve the goal of having a solution providing more understandable messages from the Data Load utility, it was necessary to come up with a new solution in the form of some kind of module which would be connected to the Data Load utility and which would perform operations on the identified errors. Two basic approaches were available from the beginning, based on the placement of the new solution and its basic way of working.

5.2.2 Post-processing the log file

Taking a standard Data Load utility log file and processing its content, probably with some scripting language, is one of the options. Small beginnings in the form of a few ad-hoc solutions for specific errors and situations were already in place: a Python script reading the log file's content line by line and trying to identify predefined text. The searched strings were specified directly in the source code, as the purpose of this solution was nothing more complex than taking care of a few single cases (a sketch of this kind of matching follows at the end of this subsection).

This Python script could have been modified to bring some improvements into the process; however, it would be quite limited to the information contained in the log, and it would be able to perform mostly only data extraction and translation. Nevertheless, even translating the error messages contained in the log into more understandable explanations and instructions could bring some benefits if that information were separated and nicely structured.

From a technical point of view, having the new solution totally separated from the Data Load utility itself would be a big benefit, as it could not affect the loading of the data in any way in case of an error or failure. Also, it would run after the data load finishes and the data is loaded into the database, so it would not affect the data load run time at all.

The Data Load utility always works with single messages, and so the log file is built incrementally. It could be beneficial to have the option to access it as one final set and be able to make summaries, analyses, etc., and not just work on the level of simple messages being written into the log.

The benefits of this approach are as follows:

• Deployment and changes are much easier.
• It would be a totally separated solution not influencing the Data Load utility.
• It cannot affect the data load run time.
• There is an option to work not only on the level of single error messages.

The disadvantages of this approach:

• There is a very limited amount of information available.
• Queries into the database are a bit more complicated but possible.
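The sketch referred to above shows the line-by-line matching idea in minimal form. The company's existing ad-hoc scripts were written in Python; the version below is in Java for consistency with the other examples in this thesis, and the searched strings are placeholders, not real Data Load utility messages.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Sketch of the pre-existing ad-hoc approach: scan the finished log file
// line by line for hard-coded error strings.
class LogPostProcessorSketch {
    // In the ad-hoc solution these strings lived directly in the source code.
    private static final List<String> KNOWN_ERROR_MARKERS =
            List.of("example known error text A", "example known error text B");

    static List<String> findKnownErrors(Path logFile) throws IOException {
        List<String> hits = new ArrayList<>();
        for (String line : Files.readAllLines(logFile)) {
            for (String marker : KNOWN_ERROR_MARKERS) {
                if (line.contains(marker)) {
                    hits.add(line); // collect the raw line for later translation
                }
            }
        }
        return hits;
    }
}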
5.2.3 Making modifications directly to the Data Load utility

At first impression, making modifications to the Data Load utility, or connecting an external solution to it, would seem a much better way to achieve the specified goals. In particular, it seemed possible to access more detailed information than what is provided in the error messages in the log file, e.g. the precise name or identifier of the item which failed, as this information is usually missing from the error messages and is important for solving the cause.

The main problem is that the Data Load utility has only a few extendable or modifiable components. The core of the solution cannot be touched from a license point of view, and its configurability is really weak. There are more things to configure or modify in the Data Load utility; however, they are limited to a few certain areas of the whole solution. A major part of the utility was simply not meant to be modified, configured or extended, because normally there is no need to do that.

The data available in the different parts of the utility varies very much, and this would primarily be the main aspect of choosing the place where to connect the new module for improved reporting. With the above-mentioned strong limitations in place, finding a connection point gets very hard, which can basically turn the whole idea into being impossible to implement, or lead to much smaller improvements than originally expected.

Being part of the Data Load utility, modifications of the module would become a bit harder because of the deployment processes and practices. If there were some problem with this solution, it could negatively affect the run of the data load; thus, the requirements for a safe implementation are much higher. Also, as a part of the utility and the run of the data load, it could affect the run time. However, as the slowest part of the whole data loading process is the many queries to the database, the effect should be minimal unless many extra database queries are performed by the module.

The components of the Data Load utility usually work with the single items being loaded. If there is an error message, the main point of interest in this study, the message is processed as a single item going from the place of origin through some processing classes to the log.

The benefits of this approach are as follows:

• There is an option to access more detailed information.
• It is easier to make queries into the database.

The disadvantages of this approach are:

• Deployment and changes are complicated.
• There is a possibility that it can negatively affect the data load run.
• It affects the data load run time.
• It brings the limitation of working on the single error message level.
• It is very hard to connect into the data load solution.
• There are many limitations because of the non-extendibility of the Data Load utility.

The list of disadvantages is much longer than in the first option and also longer than the list of benefits; however, the option to gather more precise data is very important, and it outweighs most of the negatives.

5.2.4 Chosen solution

As the precision and amount of the provided information about a data load error is a crucial aspect for evaluating the benefits of this module, it is necessary to connect the module in direct touch with the Data Load utility. On the other hand, some possible improvements can be done only by post-processing the log file, and sending the e-mails is also handled on that side; therefore, it was decided to keep some functionality on the post-processing side as well.

5.3 Vision of the solution after the exploring and planning phase

After exploring the data load solution and the available options, it was possible to form a more precise vision of the solution. A solution capable of providing more detailed information about data load errors will be created in the form of a module which receives the error messages from the Data Load utility in their original form. Different processes and activities will be performed in order to process and eventually enrich the contained information about the error. The target is to provide detailed information about the error to the end user analyzing the data load error reports. The module will be connected close to the Data Load utility in order to be able to perform its operations on the resulting error messages.
It will be implemented in Java, like the rest of the Data Load utility. A certain part of the solution will be placed in the Python script which post-processes the log file after the data loading process ends.

At first, the only people in touch with the module and its outcomes will be support team members. After testing and filling the database of known errors, explanations and instructions, other people can also start working with the module's outcomes, as these will already be precise and informative enough even for less experienced people without strong technical knowledge. The whole module can save time and costs in the process of handling data load errors, and later it can eventually make the process simpler and reduce the amount of work to be done exclusively by the support team.

5.4 Technical planning

The module will receive single logging messages and check them for the presence of known error messages. If one is identified, the module will take further steps to process the data and provide as much information as possible for this specific error. Basic processing means adding a configured explanation to the error. If possible and applicable, any other more detailed information should be provided as well, to make understanding and solving the cause of the error as simple and fast as possible.

The settings will be kept in XML configuration files, and it must be possible to configure the database of known data load errors, as well as the module itself, via these files. After a message is processed and the more detailed information is written into the log file, the post-processing script reads the file and picks certain parts of the information to put into the e-mail informing that an error has occurred during the data loading process.

6 Implementation

6.1 Used software development methodologies

During the development, the principles of some well-known software development methodologies were used. Some were used on a larger scale than others, and from some of them only certain principles or practices were used, based on this specific project's needs.

Exploratory programming

Exploratory programming is an important part of the software engineering cycle: when a domain is not very well understood or open-ended, or it's not clear what algorithms and data structures might be needed for an implementation, it's useful to be able to interactively develop and debug a program. (Exploratory Programming, 2016)

As mentioned, the whole Data Load utility is quite a closed solution with only some areas modifiable or configurable. For the purpose of developing a module reporting data load errors, these constraints presented critical obstacles which might have caused the whole development to fail at any moment when some new critical obstacle was identified. In this situation, exploratory programming seemed a very good and necessary approach to planning and developing the solution.

At first, it was necessary to explore and understand the Data Load utility. After that, it was necessary to identify the important points of the utility (steps in the process, places in the data flow and specific classes) and to test the available options and data there. Even though some places are modifiable, it does not mean they are suitable for the planned purpose. The main problem, besides the few modifiable places, was the availability or accessibility of the data at those places.
As the documentation for the non-modifiable places is very weak or not available at all, the only way to find out what options and data are available was to use the approaches of exploratory programming.

Agile methodologies

As precise planning and specification of the solution in advance was almost impossible due to the aforementioned constraints and the lack of knowledge of the domain, non-flexible and non-adaptable methodologies such as the waterfall model could not be considered. It was necessary to be able to modify the requirements, plans and specifications during the development process, which corresponds to one of the key ideas of agile methodologies: regular and expected adaptation to changing circumstances.

Precise and complete requirements could not be specified at the beginning of the assignment, which made it harder to deliver something that would really satisfy the needs, as there could have been some misunderstanding of those needs, and some of them might have been forgotten. For that reason, it was necessary to be able to prepare and show a demonstration of the current status in order to have a base over which the discussions and further planning could be held.

Agile software methodologies offer many methods and practices which perfectly fit the specific needs of this project, making the development possible and making sure the expected product is delivered in the end. The following steps in the development were gathered in a backlog, and the sprint backlog was always derived from it according to the current situation. As the solution is not so huge and the exploratory programming approach had to be used very often, the sprints were usually very short compared to bigger projects.

Iterative and incremental development

Because of the need to deliver demo versions for presentations and consultations about the current status, developing in iterations and building the product incrementally was very useful.

6.2 Connecting the module to the Data Load utility

6.2.1 Architecture analysis

Connecting the module to the Data Load utility posed a critical obstacle. It was necessary to find a class with access to the majority of the information and, especially, a possibility to be modified in order to establish a connection to the module. A closer look at the architecture of the solution can be seen in the following diagram (see Figure 8).

Figure 8. Class diagram of an important part of the data load solution

The diagram is divided vertically into the Java part on the right and the script part on the left. The horizontal division separates the generic and the mediators' levels. At the mediators' level, the flow of processes is distributed among different mediators. It is then hard to have a central place to work with the information; on the other hand, the mediators have access to the details of the data. The generic part contains the core of the Data Load utility. For formatting the log messages, the utility uses a defined formatter class. The blue color shows the modifiable parts of the solution.

There were basically two options to consider: either to connect the module at the low level represented by the mediators, or at some higher level, where the flow of processes and especially of information is united.
6.2.2 Connecting in the mediators

The business object mediators (described in the chapter about the architecture of the Data Load utility) are an ideal place from the information point of view, as they are very close to the data, and it is possible to access more detailed information about the item which caused a problem and whose loading failed. The problem is that at this level, the information and process flow is distributed on a large scale into many different places (mediators). Some of them are customized; however, in many cases the default ones are used. These mediators sometimes have a complicated, but mainly very different, inheritance hierarchy. The parent classes are usually IBM's default implementations, which cannot be modified. This means that it was impossible to identify a single common place for all the mediators. Any connection of the module at this level would require modifying basically all the custom mediators and overriding the other ones, which would mean a large-scale modification affecting the whole data loading solution.

The benefit is as follows:

• It allows easy access to the detailed data of a problematic item.

The disadvantages are the following:

• It would require modifying all the custom mediators.
• It would require overriding all the default mediators which are used.

6.2.3 Connecting in a central place of the utility

The ideal connecting point would be located in the central part of the diagram, where the information flow is united but there is still access to some additional information. As the amount of changes in this common central place would be very small, this modification would not have a very big impact on the data loading solution. However, as shown in the diagram, this area has no modifiable classes, because modifying this part of the Data Load utility was never expected or meant to be allowed.

The benefits of this solution are:

• There is a possibility to catch all the error messages in one central place.
• Only a low amount of changes to the Data Load utility is needed.

The disadvantage is as follows:

• Access to information about the problem is reduced.

6.2.4 Decided solution

Extending the logging formatters was identified as the best possible way to connect. Exploration showed that the formatters are used to format all the messages written into the log files, which means that all the information contained in the log files goes through a formatter before being written there. That makes it the perfect place to catch and investigate all the messages. There are more such places; however, they are not accessible because their code is not modifiable. On the other hand, the formatter class used by the Data Load utility can be set in the data load logging properties file.

Connecting the module in this central place brings the capability of processing all the data load errors in one place, which is a very big advantage. However, the possibility to collect more detailed information in this place is strongly reduced. For some errors, it is not necessary to provide much additional information; however, sometimes the default error messages do not contain enough information, which is why it was decided to also modify certain mediators. This step is to be performed only for errors which occur often and where the manual work will be strongly reduced or totally eliminated once the more detailed information from the mediator is provided.
6.3 Functional analysis (algorithm)

The implemented algorithm of the module is presented in the diagram in Figure 9. At first, the module receives a message which is about to be written into a log file. This message is checked for the presence of known errors. If no error is identified, the algorithm ends. If one of the configured errors is found, it is processed in order to get the reporting message, ideally with as many details as possible. As the same error message can be written into the log file multiple times, the algorithm checks whether the same error was already identified and stored to be reported. After the whole data load ends and the log file is finished, its post-processing by a script starts in order to extract the important and summary information. The steps of processing the message and the script post-processing of the log file are described separately in their own activity diagrams and descriptions.

Figure 9. Activity diagram of the module's algorithm

The sub-activity of processing the message is illustrated in Figure 10. After the error message is identified, it is translated into a pre-configured report message. This message is more understandable and explains the problem in a way that enables solving it fast. If there is some useful dynamic information (related to the specific occurrence of that error type) contained within the message, it is extracted and placed into the reporting message. It can be, for example, the ID of the item which caused the error. Both dynamic and static parameters are explained later. If a database query is specified for this error type, it is performed, and the result can be written into the final reporting message.

Figure 10. Processing of the message part of the algorithm

After the data load is finished and the log file is saved, the Python script goes through its content. See Figure 11 for the activity diagram of this process. During the reading process, it collects statistical information (e.g. the number of processed objects, error counters, ...) and it also extracts the module's reporting messages. After the file reading is finished, it utilizes all the collected information to form an informative e-mail providing the most important information at first glance.

Figure 11. Log file script post-processing part of the algorithm

6.4 Components

6.4.1 Module overview

Figure 12 represents the schema of the module. The central component controlling all the other ones is ReportingModule. There are a few exceptions, but usually the other components do not communicate directly with each other. The connecting components are placed in the top part of the schema. The components related to the configuration are on the left. The processing and assisting components are in the right and bottom parts.

Figure 12. Schema of the module

6.4.2 Connecting components

These components are used to allow the connection of the module to the Data Load utility.

Data Load console formatter

The class CustomDataLoadConsoleFormatter can be considered the connection point for the new reporting module. This class extends the originally used DataLoadConsoleFormatter. The purpose of this class is to receive a LogRecord object in its format method, which is then responsible for formatting the message content into the desired form before it is written into a log file.

There have been no changes to the message formatting, which stayed completely within the competence of the logic inherited from the parent class. The only change was connecting the new module here and passing the processed message to it. Eventual error handling logic was also placed here in case some problem occurs in the module: in any possible situation, an error in the module cannot have any effect on the data loading process.
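A minimal sketch of this connection point is shown below. It assumes that IBM's DataLoadConsoleFormatter class, named in the text, is available on the classpath, and ReportingModule with its process method is an illustrative stand-in for the module's entry point; the essential property is that a failure inside the module is contained and can never break the load.

import java.util.logging.LogRecord;

// Sketch only: DataLoadConsoleFormatter comes from WebSphere Commerce, and
// ReportingModule/process are illustrative names for the module's entry point.
public class CustomDataLoadConsoleFormatter extends DataLoadConsoleFormatter {

    private final ReportingModule reportingModule = new ReportingModule();

    @Override
    public String format(LogRecord record) {
        // Formatting itself stays entirely with the inherited parent logic.
        String formatted = super.format(record);
        try {
            // Hand the formatted message to the reporting module for analysis.
            reportingModule.process(formatted);
        } catch (Exception e) {
            // Any problem inside the module is deliberately swallowed here so
            // that it can never affect the data loading process itself.
        }
        return formatted;
    }
}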
Data Load file formatter

Some of the exceptions and error messages are written into separate log files stored in a different location than the main data load log file. The class originally responsible for formatting these messages was the default Java SimpleFormatter. The same operations as with CustomDataLoadConsoleFormatter were performed here, so that the messages written into these log files can also be caught and analyzed.

6.4.3 Processing components

The following components are responsible for the main activities of error message processing.

Reporting module

This is the central component of the whole solution. It controls the whole process and holds the module together with references to the other components.

Dynamic parameters

Some of the error messages already contain important information that is worth transporting into the final reporting message. For this reason, the basic process of translating one message (the error message) into another (the reporting message) is extended by the functionality of the Dynamic parameters component. This component is responsible for identifying and extracting the important information and placing it into the reporting message at specified places and in a specified order.

See Figure 13 for examples of the messages which come into, or are produced by, this component. The "String to match" specifies the positions where the targeted information is located. The "Report message template" is a template for the final reporting message with placeholders for the information extracted from the error message; the placeholders can be placed anywhere, in any order, which is controlled by indexes. The "Final report message" is then a final message containing a static explanation of the error with instructions, enriched by the dynamic parameters of the specific case. Figure 14 shows the processing of these messages.

Figure 13. Examples of messages used in the Dynamic parameters component

Figure 14. Processing of messages by the Dynamic parameters component

The component receives an error message and a "String to match". Based on these two strings, the algorithm extracts the information from the error message and provides it to another algorithm, which places it into the report message template.
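The following is a minimal sketch of this idea. It assumes a placeholder syntax of {0}, {1}, ... numbered consecutively in order of appearance; the real configuration uses its own markers, which are only shown in the figures. MessageFormat conveniently resolves the same indexes in the report template, so the extracted values can be placed in any order.

    import java.text.MessageFormat;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class DynamicParamsSketch {

        // Extracts the values marked by placeholders in stringToMatch from
        // the error message and inserts them into the report template.
        public static String buildReport(String errorMessage, String stringToMatch,
                                         String reportTemplate) {
            // Turn the match template into a regex: everything is literal
            // except the placeholders, which become capturing groups.
            String regex = Pattern.quote(stringToMatch)
                    .replaceAll("\\{\\d+\\}", "\\\\E(.+?)\\\\Q");
            Matcher m = Pattern.compile(regex).matcher(errorMessage);
            if (!m.find()) {
                return null; // the message does not match this known error
            }
            Object[] values = new Object[m.groupCount()];
            for (int i = 0; i < values.length; i++) {
                values[i] = m.group(i + 1);
            }
            // MessageFormat places {0}, {1}, ... anywhere in the template.
            return MessageFormat.format(reportTemplate, values);
        }
    }

Quoting the literal parts of the "String to match" also takes care of the special characters mentioned later in connection with the configuration testing tool.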
Static parameters

The basic information used for reporting is usually taken only from the logging messages written into the log file, and in certain cases extra information about a failed entity can be connected as well. Nevertheless, it is sometimes necessary or beneficial to use other information too. If such information is not related only to certain cases or items but is common, for example, to the whole data load process, it can be called a "static parameter".

A store identifier or a store name (in the WebSphere Commerce Extended Sites information model) can be named as an example. This information can be loaded only once, as it is the same for all the processes within one data load. The values can then be printed into the final reporting message. Another usage is in SQL queries to the database, where the identifier of the store might be necessary in the WHERE clause to restrict the set of results.

The Static parameters component contains the logic for loading the information of a certain static parameter, and also the method replaceStaticParams, which receives a string possibly containing static parameter placeholders and replaces them with the expected information. This method can also be used for preparing the SQL queries for the database. Currently, the way the information is loaded strongly depends on the case. For example, the store name is available as a data load process parameter via the DataLoadHelper class; the store identifier, on the other hand, cannot be retrieved from anywhere else than the database, so a database query is necessary to obtain it.

Reports stack

If, for instance, several products have the same problem causing a data load error of one type, it would be neither clear nor efficient to print every occurrence of that error separately. It is much better to collect all the information and then print only one report about the error type, simply listing all the products which had the same problem (see Figure 15). That is the purpose of this component: it stores the information about the error and the affected items.

Figure 15. Error report with a list of affected items

6.4.4 Assisting components

The assisting components provide various supporting functions needed in the module.

Database connector

This component encapsulates the logic for performing queries to the database.

Logging formatter

This formatter provides the various text modifications needed in the module.

XmlLoader

This is a loading component responsible for loading the data from the XML files. It contains the file paths and file names and the logic for opening and reading the files. File-type specific logic is also implemented here in order to save the loaded data into the correct structures.

6.4.5 Configuration components

The configuration of the module is stored in XML files, and during the data loading process it is handled by the following components.

Configuration of reporting module

This component stores settings which affect the module's functionality, its way of working, etc. Whenever possible and useful, values and configurations are stored in this component rather than in the source code; changing a value then requires no code change and no longer, more complicated deployment process.

Figure 16. Configuration of the module in the XML file

An example of such a setting (visible in Figure 16) is a debug switch, which controls whether debugging messages should be written into the log; this can be useful during development and troubleshooting. The time expiration of the known error settings is also stored here and can be changed very quickly at almost any time.

Reporting settings

All the known errors which should be automatically processed by the module are configured in an XML file; a part of the configuration file can be seen in Figure 17. This component stores all of that information.

The class ErrorItem is used to represent a single item of the module's reporting settings. It is constructed from the data contained within the Item tag in the settings. It contains an error message as provided by default by the Data Load utility. A report message is an explanation of the error providing easier understanding and, possibly, guidance on what to do in order to fix the problem; its tag serves only for marking the final message. If it can help to provide important information, a database query can be configured. If there are multiple stores on the same platform and this error handling configuration is to be targeted only at a specific one, it can be specified in the stores property. Finally, it can be configured whether the message is to be printed into an e-mail or only to the log file.

Figure 17. Configuration of known errors, reporting messages and other related settings for a specific error
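As a rough illustration, one Item tag could map onto an ErrorItem along the following lines. The class name and the listed properties come from the description above; the field names and the constructor are illustrative, as the real class is built by the XML loading component.

    import java.util.List;

    public class ErrorItem {

        private final String errorMessage;   // error text as produced by the Data Load utility
        private final String reportMessage;  // configured explanation and fixing instructions
        private final String databaseQuery;  // optional SQL for additional information, may be null
        private final List<String> stores;   // optional restriction to specific stores
        private final boolean printToEmail;  // report in the e-mail, or only in the log file

        public ErrorItem(String errorMessage, String reportMessage,
                         String databaseQuery, List<String> stores, boolean printToEmail) {
            this.errorMessage = errorMessage;
            this.reportMessage = reportMessage;
            this.databaseQuery = databaseQuery;
            this.stores = stores;
            this.printToEmail = printToEmail;
        }

        public String getErrorMessage()  { return errorMessage; }
        public String getReportMessage() { return reportMessage; }
        public String getDatabaseQuery() { return databaseQuery; }
        public List<String> getStores()  { return stores; }
        public boolean isPrintToEmail()  { return printToEmail; }
    }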
Sources of errors

In some cases, the default errors describe a certain problem but do not mention the entity in which the problematic or corrupted data is present. Naturally, this applies only to data load errors caused by some problem in the loaded data. In such a case it is not immediately clear where to look for the cause of the data load error. Information about which entity was being loaded when the error happened is useful, because it makes it easier to identify the source data file where the problem may be present.

This component was developed to identify and provide information about which entity was being loaded when the error occurred. The identification is based on a manually defined relation between the class names occurring in the exceptions' stack traces and the entities. Basically, it is a process of finding a pre-set string in the message being written into a log file and providing the information this string is mapped to. This key-value mapping is configured in the XML file and then loaded into and stored in this component.

Figure 18. Error where the source of the error is not clear

An example of the benefit of this component can be seen in Figure 18. A reference to a non-existing category can come either from a product (the category where the product belongs) or from another category (a reference to a parent category). When the information about the source is provided, it is clear which source data file to check.

6.4.6 Configuration testing tool

It is expected that the number of errors the module is able to recognize and process will grow every time some new error occurs. In that case, a new item has to be added to the module's error configuration. As there can be special characters which need to be handled (in order to prevent their standard interpretation), and as dynamic parameters can be used, it can be quite uncertain whether the configuration will work as expected. For this purpose, a simple testing tool was developed: it is enough to configure the settings there and provide a message from a log file, as sketched below.
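A minimal sketch of the idea behind such a tool, reusing the buildReport method from the Dynamic parameters sketch above; all the names and messages here are illustrative:

    public class ConfigTesterSketch {

        public static void main(String[] args) {
            // Candidate configuration for a new known error.
            String stringToMatch = "The catalog entry {0} does not exist.";
            String reportTemplate =
                "Catalog entry {0} is referenced but missing; check the source data file.";

            // A line copied from a real data load log file.
            String logLine = "The catalog entry SKU-42 does not exist.";

            String report = DynamicParamsSketch.buildReport(logLine, stringToMatch, reportTemplate);
            System.out.println(report != null
                ? report
                : "No match - check the escaping and the placeholders.");
        }
    }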
7 Results

7.1 Current status and results

The module's development phase is now finished. Based on its configuration, the module is capable of automated processing of the data load errors that occurred and of providing reporting information about them. The output of the module for a data load with errors can be seen in Figure 19.

Figure 19. Information from the reporting e-mail

These example reports immediately provide the necessary information about the error. All the texts are configurable, and the most important part is the data. Each report contains a tag which provides brief first information about the error and also helps to order the reports. There is a configurable explanation of the specific error, which should tell the reader what exactly happened, why, and what can be done to solve the issue. Then there is information about where it happened, i.e. which entity was being loaded; in certain cases, precise item identifiers or names are provided as well. The combination of all this information provides everything necessary to understand the problem, locate it in the data and fix it. If more information is needed, especially by support team members, they can easily find the log file as well.

7.2 Putting into use

The testing phase has recently started, which means that the module will be tested by more people with much bigger and more varied sets of data. The main objectives of the testing phase are to confirm the expected behavior in different situations, to verify the correct handling of error states, to identify possible performance issues, etc. The error configuration will also be built up to recognize different types of errors. It is expected that there will be some new requirements or change requests during the testing period, and certain problems with the solution may also be discovered. After everything has been tested and works as expected, real usage can start.

8 Conclusion

The original objectives were successfully met. The Data Load utility was studied and explored, and options for improvements were identified, designed and implemented. The developed module is capable of automatically processing the data load errors and presenting them in an understandable way, enriched by additional data, to enable fast resolution of the error causes.

The module was welcomed by the company representatives, as it solves a real issue which had existed for a long period of time. It successfully reduces the time and cost of the process of solving data load errors.

The process leading to the results comprised the research of the Data Load utility, analysis, the design of the intended improvements, and their implementation. Communication with the stakeholders in the form of consultations, presentations and feedback gathering was also necessary. The combination of all of these was a very interesting professional experience.

There were many obstacles and limitations during all the phases. The main problems were that not enough information was available about certain parts of the Data Load utility, and that many of its parts were impossible to modify due to technical and license restrictions. These issues brought a constant risk of failure, as at almost any point some newly discovered issue or limitation could have made it impossible to achieve the specified goals.

The main functionality is now finished; however, as mentioned in the previous chapter, some requests can still arise during testing. Other functionality can also be added to the module later, when new needs are discovered.

The whole thesis resulted in a solution which is beneficial for the company and meets a real need. From a personal point of view, the challenges arising especially from the faced issues offered a great opportunity to learn and develop useful skills. I have already used the gained knowledge about the Data Load utility for tasks not related to the thesis.

References

Data Load utility architectural overview. Page on IBM Knowledge Center. Accessed on 27.09.2016. Retrieved from http://www.ibm.com/support/knowledgecenter/en/SSZLC2_8.0.0/com.ibm.commerce.data.doc/concepts/cmldataloadovdev.htm

Data Load Utility in WebSphere Commerce Introduction. Video on YouTube. Accessed on 12.10.2016. Retrieved from https://www.youtube.com/watch?v=jCVOwqH0Rhw

Exploratory Programming. Exploratory Programming with Collaborative Programming Languages. Accessed on 05.10.2016.
Retrieved from http://steak.place.org/dougo/thesis/plan/

Overview of the Data Load utility. Page on IBM Knowledge Center. Accessed on 27.09.2016. Retrieved from http://www.ibm.com/support/knowledgecenter/SSZLC2_8.0.0/com.ibm.commerce.data.doc/concepts/cmlbatchoverview.htm

WebSphere Commerce Enterprise. Page on IBM website. Accessed on 29.09.2016. Retrieved from http://www-03.ibm.com/software/products/fi/websphere-commerce-enterprise

What Is E-Commerce. Business News Daily. Accessed on 12.10.2016. Retrieved from http://www.businessnewsdaily.com/4872-what-is-e-commerce.html