Automated Functional Testing based on the Navigation of Web Applications

Web applications are becoming more and more complex. Testing such applications is an intricate, hard and time-consuming activity. Therefore, testing is often poorly performed or skipped by practitioners. Test automation can help to avoid this situatio…

Authors: Boni García (UPM), Juan Carlos Dueñas (UPM)

L. Kovács, R. Pugliese, and F. Tiezzi (Eds.): Workshop on Automated Specification and Verification of Web Systems (WWV 2011), EPTCS 61, 2011, pp. 49–65, doi:10.4204/EPTCS.61.4. © B. García & Juan C. Dueñas. This work is licensed under the Creative Commons Attribution License.

Boni García, Juan C. Dueñas
Departamento de Ingeniería de Sistemas Telemáticos
ETSI Telecomunicación - Universidad Politécnica de Madrid
Avda. Complutense 30, 28040 Madrid, Spain
bgarcia@dit.upm.es, jcduenas@dit.upm.es

Web applications are becoming more and more complex. Testing such applications is an intricate, hard and time-consuming activity. Therefore, testing is often poorly performed or skipped by practitioners. Test automation can help to avoid this situation. Hence, this paper presents a novel approach to perform automated software testing for web applications based on their navigation. On the one hand, web navigation is the process of traversing a web application using a browser. On the other hand, functional requirements are actions that an application must do. Therefore, evaluating the correct navigation of a web application amounts to assessing its specified functional requirements. The proposed method automates testing at four levels: test case generation, test data derivation, test case execution, and test case reporting. This method is driven by three kinds of inputs: i) UML models; ii) Selenium scripts; iii) XML files. We have implemented our approach in an open-source testing framework named Automatic Testing Platform. The validation of this work has been carried out by means of a case study, in which the target is a real invoice management system developed using a model-driven approach.

1 Introduction

The World Wide Web (or simply the Web) has become one of the most influential instruments not only in computing but in the history of mankind [22]. The development of web applications has in general been ad hoc, resulting in poor-quality applications. As the reliance on larger and more complex web applications increases, so does the need for methodologies and guidelines to develop applications that are delivered on time, within budget, and with a high level of quality.

Software testing is the main technique to ensure quality and find bugs. It is in general a difficult and time-consuming task. Web testing may be even more difficult, due to the peculiarities of such applications. A significant conclusion has been reached in the survey of web testing presented in [18]: “further research efforts should be spent to define and assess the effectiveness of testing models, methods, techniques and tools that combine traditional testing approaches with new and specific ones”.

Following this statement, this piece of research presents an approach to test web applications by automating their navigation. This approach saves testing effort, since it is automated at different levels. Firstly, unit test cases are generated for each path in the navigation of the web site. Secondly, depending on the data handled by the System Under Test (SUT), test data are generated and stored in a spread-sheet. These data are used later by the generated unit test cases. Thirdly, test case execution is performed using Selenium, which allows running the web interaction in a real browser. Finally, a complete test report is generated for each unit test case, which corresponds to each path in the SUT navigation.

The remainder of this paper is structured as follows. Section 2 introduces the context in which this research has been performed (i.e.
automated software testing, web testing and modeling, and graph theory). Section 3 shows a comparative study of the existing choices to find the paths in the web navigation. This section also presents a laboratory experiment carried out in order to select the algorithm to be used in the proposed approach. Section 4 presents the proposed methodology to perform automated web testing. Section 5 describes how the proposed method has been implemented in an open-source tool named Automatic Testing Platform (ATP). Section 6 details the validation of the presented approach by means of an industrial case study, in which ATP has been employed to test an invoice management web system. Finally, Section 7 presents the conclusions reached.

2 Background

2.1 Automated Software Testing

Software testing is the main activity performed for evaluating product quality, and for improving it, by identifying defects in software-intensive systems [1]. Manual testing is usually a hard, time-consuming and expensive activity. Some studies show that testing is considered one of the most costly development processes, sometimes exceeding fifty per cent of total development costs [6]. Consequently, testing is often poorly performed or skipped by practitioners. This situation suggests an industry-wide deficiency in testing, and automated testing is proposed as one possible solution to overcome this problem [20].

A definition of Automated Software Testing (AST) can be found in [12]: the “Application and implementation of software technology throughout the entire Software Testing Lifecycle (STL) with the goal to improve efficiencies and effectiveness”. AST is most effective when implemented within a framework.
Testing frameworks may be defined as a set of abstract concepts, processes, procedures and environment in which automated tests will be designed, created and implemented.

According to the Automated Testing Institute (ATI; see footnote 1), there are three different generations of AST frameworks. The 1st generation comprises the linear approach to automated testing. This approach is typically driven by the Record & Playback (R&P) method. First, the linear scripts corresponding to the actions performed in the application are recorded (record). After that, the automation stage replays the recording while exercising the SUT (playback). The 2nd generation comprises two kinds of frameworks: data-driven and functional decomposition. Frameworks built on data-driven scripting use test data typically stored in a database or an external file. Functional decomposition refers to the process of producing modular components in such a way that automated test scripts can be constructed to achieve a testing objective by combining these existing components. The 3rd generation includes the keyword-driven and model-based frameworks. The keyword-driven frameworks process automated tests that are developed with a vocabulary of keywords. These keywords are associated with functions that are interpreted with application-specific data. The automated scripts execute the interpreted statements in the SUT. The model-based frameworks go beyond creating automated tests in a semi-intelligent manner. Model-Based Testing (MBT) is the software testing technique where test cases are derived in whole or in part from a model that describes some aspects of the SUT [24].

2.2 Web Testing and Modeling

Web applications follow a client-server application protocol.
The web client (using a web browser, such as Internet Explorer, Opera, Safari, Firefox or Chrome) sends an HTTP request through a TCP/IP network (typically the Internet) to a web server. The server receives this request and determines the page, which usually contains some script language to connect with a database server. A middleware component connects the web server with the database to submit the query and get the requested data. These data are used to generate an HTML page, which is sent back to the client in the form of an HTTP response.

Footnote 1: http://www.automatedtestinginstitute.com/

Web testing consists of executing the application using combinations of input and state to reveal failures. These failures are caused by faults in the running environment or in the web application itself. The running environment mainly affects the non-functional requirements of a web application, while the web application is responsible for the functional requirements. Web applications are difficult to test, due to their peculiarities [18]: i) A large number of users distributed all over the world accessing concurrently. ii) Heterogeneous execution environments (different hardware, network connections, operating systems, web servers and browsers). iii) Heterogeneous nature because of different technologies, programming languages, and components. iv) Dynamic nature, since web pages can be generated at run time according to user inputs and server status.

Regarding web modeling, in some cases new models have been proposed while in other cases existing modeling techniques have been adapted from other software domains [2]. The de facto standard notation for modeling is UML (Unified Modeling Language). The standard diagrams in UML 2.0 are the following [19]: use case, activity, class, sequence, interaction, communication, object, state machine, composite, deployment, package, and timing.
Hence, UML 2.0 does not provide in a standard way any diagrams to model some specific aspects of web applications. For that reason, specific UML extensions have been created, for example:

• UML-based Web Engineering (UWE, http://uwe.pst.ifi.lmu.de/) is a software engineering approach aiming to cover the whole life-cycle of web application development [16]. It is based on the Unified Process (UP) [17], and defines a UML notation by means of a UML profile.
• W2000 is an approach that also extends UML notation to model multimedia elements. These multimedia elements are inherited from HDM (Hypermedia Design Model) [5].
• Web Modeling Language (WebML, http://www.webml.org/) is a high-level specification language for designing complex web applications. It offers a visual notation both in Entity-Relationship and UML, although UML is preferred by the authors [9].
• Navigational Development Techniques (NDT, http://iwt2.org/) is a methodological approach oriented to web engineering. It is mainly focused on the requirements and the analysis phases using the model-driven paradigm [13].

2.3 Graph Theory

In mathematics and computer science, graph theory is the study of graphs. A graph is the abstract representation of a set of vertices (or nodes) connected by arcs (edges or links). A graph is a pair $G = (V, E)$ of sets such that the elements of $V$ are the vertices and the elements of $E$ are the edges. The usual way to picture a graph is by drawing a dot for each vertex and joining two of these dots by a line if the corresponding two vertices form an edge [11].

On the one hand, a graph in which the edges have no orientation is known as an undirected graph. On the other hand, if the edges have orientation, the graph is known as a directed graph (digraph) [4]. A digraph is acyclic if it has no cycle.
A digraph is strongly connected (or just strong) if every vertex is reachable from every other vertex, i.e. there is a path from each vertex in the graph to every other vertex. A multigraph is a graph in which multiple edges (two or more edges that are incident to the same two vertices) and/or loops (an edge that connects a vertex to itself) are permitted. If the multigraph is directed, then it is known as a multidigraph. In a weighted graph a number is assigned to each edge. This number (weight) could represent costs, lengths and so on. A path is a sequence of vertices such that from each vertex there is an edge to the next vertex in the sequence. If the start node is the same as the end node, then the path is known as a cycle. A walk is a path in which nodes or links may be repeated. A circuit is a closed walk.

3 Finding the Paths in a Multidigraph

We will use graph theory to represent and work with the navigation of web applications. A web site can be modeled by means of a finite multidigraph, that is, a finite directed graph (a finite set of web pages acting as nodes) in which multiple edges and/or loops are allowed. Given a multidigraph, we need a method or algorithm to find its independent paths. The coverage criterion in this path decomposition is that each edge is traversed at least once. This condition also implies that each vertex is visited at least once. This section studies the algorithms and methods found in the literature to solve this problem. Some discussion is provided in order to select the best option.

Graph traversal is the facility to move through a structure visiting every vertex once. There are two possible traversal methods for a graph: Breadth-First Search (BFS) and Depth-First Search (DFS) [8]. BFS visits all the vertices, beginning with a specified start vertex. No vertex is visited more than once. BFS makes use of a queue (First-In First-Out, FIFO) data structure.
DFS works in a similar way, except that the neighbors of each visited vertex are added to a stack (Last-In First-Out, LIFO) data structure.

The Traveling Salesman Problem (TSP) tries to find the most efficient (i.e., least total distance) cycle through every vertex of a graph [14]. TSP is a variation of the Hamiltonian tour problem (to find a cycle that visits each vertex exactly once in a graph), and it belongs to the class of NP-hard problems.

The Shortest Path Problem (SPP) is the problem of finding a path between two nodes within a graph such that the sum of the weights of its constituent edges is minimized [8]. The main algorithms employed in the different categories of SPP are: Dijkstra, Bellman-Ford, A* (pronounced “A star”), and Floyd-Warshall.

The Chinese Postman Problem (CPP), also known as the postman tour or route inspection problem, is the problem of finding a shortest circuit that visits every edge of a graph at least once, i.e. the Chinese Postman Tour (CPT). Finding an optimal solution of these problems is NP-complete [7]. Thimbleby proposes a solution for CPP in the form of a deterministic algorithm in [23], providing an executable Java implementation to solve this problem. The constraint imposed by this algorithm is that the input digraph has to be strongly connected with no negative weight cycles. It considers a graph as a collection of arcs $\langle label, i, j, c \rangle$, where $label$ is an identifier for an arc from vertex $i$ to vertex $j$, and $c$ is the cost associated with it.

The node reduction algorithm [6] finds the path between two nodes, typically the entry and exit nodes, by reducing the rest of the graph connecting these nodes. It employs graph algebra to achieve this goal. The multiplicative operator in graph algebra means concatenation: if edge $a$ is followed by edge $b$, their product is $a \cdot b$ (path product). The additive operator is selection: if either edge $a$ or edge $b$ can be taken, their sum is $a + b$.
A path expression contains path products and zero or more additive operators, and is usually represented by an upper case letter (e.g. $A = a \cdot b$) [3]. Finally, graph algebra usually employs the graph matrix representation, which is a square array with one row and one column for every node in the graph. Each row-column combination corresponds to a relation between the node corresponding to the row and the node corresponding to the column [6]. The node reduction algorithm has basically two steps: i) remove self-loops (any node n that has an edge to itself); ii) eliminate intermediate nodes, replacing them with a set of equivalent links. An example of this algorithm is detailed below, and it has been employed for web navigation in [21].

Figure 1: Digraph Example (Original and Strongly Connected)

None of the methods above fits exactly the problem at hand: selecting the different paths within a graph. BFS and DFS algorithms traverse each vertex within a graph, but they do not ensure that each edge is visited at least once. Applied to web navigation this is not acceptable, because we need to visit each web link. The same issue arises with TSP: Hamiltonian tours are concerned with vertex coverage, not edge coverage. The different algorithms of SPP are not useful in this domain because they look for the shortest path between nodes. CPP fits exactly the objective of 100% edge coverage, but it has a strong constraint that cannot be ensured for every multidigraph modeling web navigation: the graph should be strongly connected. Node reduction could be an alternative, but we cannot assume that web navigation always has an exit page.

Nevertheless, CPP and node reduction can be modified to solve the problem. Consider the digraph labeled as “i) Original” in Figure 1.
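The limitation of BFS noted above can be made concrete with a small sketch. It assumes a hypothetical five-link site (the page names and arcs are illustrative and are not the digraph of Figure 1) and reports which links a BFS from the home page never uses.

```python
from collections import deque

# Hypothetical site, NOT the graph of Figure 1: arcs are (label, source, target).
ARCS = [
    ("E0", "home", "about"),
    ("E1", "home", "list"),
    ("E2", "list", "detail"),
    ("E3", "about", "detail"),  # second link leading to "detail"
    ("E4", "list", "list"),     # loop, allowed in a multidigraph
]

def bfs_tree_edges(arcs, start):
    """Return (visited vertices, labels of the arcs BFS used to discover them)."""
    visited, used = {start}, []
    queue = deque([start])          # FIFO queue, as BFS requires
    while queue:
        node = queue.popleft()
        for label, src, dst in arcs:
            if src == node and dst not in visited:
                visited.add(dst)
                used.append(label)
                queue.append(dst)
    return visited, used

visited, used = bfs_tree_edges(ARCS, "home")
# Links that were never followed although every page was reached.
uncovered = [label for label, _, _ in ARCS if label not in used]
print(sorted(visited), used, uncovered)
```

In this sketch BFS reaches all four pages, yet one of the two links into the detail page and the loop on the list page are never exercised, which is the gap an edge-covering tour is meant to close.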
A simple and effective way to convert this graph into a strongly connected one is to add virtual links from the leaf nodes (those with no outgoing links) to the start node (“home” in the navigation). These virtual links are labeled “R”, which means “reset”. They will be substituted by the additive operator when reducing the graph to its paths. The new equivalent digraph is shown in Figure 1, labeled as “ii) Strongly connected”.

In this new situation, CPP suits the problem of checking web sites: a vertex is a page, an arc is a link, the label of the arc could be hot text or a URL, and the weight represents a cost (e.g. the estimated time in seconds a user takes to check the link). The goal is to determine a list of labels that, in order, constitute an optimal CPT, i.e. the shortest tour with few repeated street visits that covers every edge in the graph. The cost of a CPT is defined as the total arc weight, summed along the circuit. The optimal test sequence for this web site is therefore an open CPT. Assigning a weight of 1 to each link and applying the CPP algorithm given in [23] to the proposed example, the resulting path expression is the following:

$$E_1 \cdot E_3 \cdot E_4 \cdot E_5 \cdot R \cdot E_1 \cdot E_7 \cdot R \cdot E_1 \cdot E_2 \cdot E_6 \cdot R \cdot E_0 \cdot R = E_1 \cdot E_3 \cdot E_4 \cdot E_5 + E_1 \cdot E_7 + E_1 \cdot E_2 \cdot E_6 + E_0$$

That is, four different paths with a total cost of 10 links (4 + 2 + 3 + 1).

Moreover, node reduction can be applied to the strongly connected graph in order to reduce the equivalent graph matrix. The complete explanation of how this process is done can be found in [6], and the resolution for this example is illustrated in Figure 2.

Figure 2: Node Reduction Solution for the Proposed Example

Therefore, the path expression resulting from applying the node reduction algorithm to the proposed example is:

$$(E_1 \cdot E_7 \cdot R + E_1 \cdot E_2 \cdot E_6 \cdot R) \cdot (E_0 + E_1 \cdot E_2 \cdot E_3) \cdot E_5 \cdot E_6 \cdot R = E_1 \cdot E_7 + E_0 \cdot E_5 \cdot E_6 + E_1 \cdot E_3 \cdot E_4 \cdot E_5 \cdot E_5 + E_1 \cdot E_2 \cdot E_6$$

That is, four different paths with a total cost of 13 links (2 + 3 + 5 + 3). This solution is similar to the one provided by CPP, although more costly (13 vs. 10 links). This fact suggests that CPP gives better results than node reduction. In order to confirm this statement, we have carried out a laboratory experiment.

The experiment consists of a comparison between node reduction and CPP, using random multidigraphs (i.e., with loops and multiple edges). These graphs have been created using an increasing number of links (from 1 to 50). For each digraph, node reduction and CPP are executed, comparing their cost (number of links employed in the resulting set of paths) and their computation time (milliseconds needed to reach the solution). This experiment has been carried out on a PC with an Intel Core2 Quad (2.66 GHz) and 4 GB of RAM. It has been repeated 100 times, and the mean of the values (cost and time) is shown in Figures 3 and 4.

Figure 3: Node Reduction vs. CPP Costs

Figure 4: Node Reduction vs. CPP Time

CPP behaves better than node reduction because it is more linear. It always yields a lower-cost solution than node reduction (Figure 3). In addition, the resolution time of node reduction keeps growing, while CPP always finishes within a few milliseconds (Figure 4). All in all, CPP is the algorithm selected in our method to find the set of paths in a digraph.

4 Proposal Statement

Di Lucca and Fasolino draw an important conclusion about functional testing for web applications [18]: “As to the functional testing, existing tools main contribution is limited to manage test case suites manually created, and to match the test case results with respect to a manually created oracle.
Therefore, greater support to automatic test case generation would be needed to enhance the practice of testing Web applications”. Following this statement, we propose an approach to perform AST for web applications based on their navigation.

To achieve automated functional testing, requirements should be described in a form that can be understood by software programs. Hence, the first way we propose to model web navigation in order to automate the testing process is by means of UML 2.0 diagrams, concretely the following:

• Use case diagrams. These diagrams offer a perspective of the functional requirements of the application's interaction with the actors.
• Activity diagrams. These diagrams describe the flow within a use case. Therefore, each activity diagram will describe the navigation structure of the SUT.
• Presentation diagrams. These diagrams model the data handled by the web application. Since UML 2.0 does not implement this feature, a UML profile is required that enhances the syntax of standard UML 2.0 to model web pages.

As described in Section 2.2, presentation diagrams are not standard in UML 2.0. Therefore, we need to use one of the specific UML extensions for web applications described in that section. Recent research shows that software project success is directly tied to requirement quality [15]. Requirement Engineering (RE) involves all lifecycle activities devoted to the identification and analysis of user requirements, the documentation of the requirements as a specification, and the validation of the documented requirements against user needs. Thus, we compare the presented UML-based technologies according to the types of requirements handled by each approach [10]. Table 1 presents this comparison.
Each column shows whether or not the technology manages the following types of requirements: i) data requirements (also known as conceptual requirements, they establish how information is stored and administrated); ii) user interface (interaction requirements); iii) navigation (users' navigation needs); iv) personalization (customization, describing how requirements are dynamically adaptable); v) transactional; vi) non-functional.

Table 1: UML-Based Web Modelling Technologies
Columns: Data, UI, Navigation, Personalization, Transactional, Non-functional
UWE: X X X
W2000: X X X X X
WebML: X X
NDT: X X X X X X

Having seen these results, NDT seems to be the choice to model web applications, since it covers every kind of requirement studied. Therefore, the UML models that guide the MBT approach of this paper will be based on NDT.

States in activity diagrams are connected by links. These links are characterized by a label called guard. This guard describes how the web state changes, and it will be used to describe the HTML elements involved in the transitions between states. A transition can be composed of several atomic actions. In order to be able to describe this behavior, the guard of the activity diagrams will follow the notation below:

$$[\, target_1, event_1, \langle key_1 \rangle;\ target_2, event_2, \langle key_2 \rangle;\ \ldots\ target_n, event_n, \langle key_n \rangle \,]$$

The meaning of these fields is the following:

• Target: Identifier of the HTML target element.
In order to translate the target element, the following procedure will be used:

    Function LocateHTMLElement(Target)
        Found = nothing
        For each frame in the frameset (if frames exist)
            For each HTML element in the frame
                Found = Look for Target in the id/name/value attribute
                If Not Found
                    Found = Look for Target as text
                    If Not Found
                        Found = Execute Target as XPath expression
                    End If
                End If
                If Found
                    Return Found
                End If
            End For
        End For
        Return Found
    End Function

• Event: Literal that describes the action performed. These literals are based on the DOM events specified by the W3C (http://www.w3.org/TR/DOM-Level-2-Events/events.html), i.e. click, dblclick, keypress, keydown, keyup, mousedown, mousemove, mouseout, mouseover, and mouseup.
• Key: Optional field containing the button that triggers the key events.

The second way of modeling the web navigation is the R&P approach, which is a useful way to represent the structure of a web application by recording interactions with the application through the browser. This method is more agile than UML, since the application can be developed without the formal design phase.

Halfway between the UML models and R&P, we have created a syntax-neutral way of modelling the navigation using a purpose-built XML notation. XML (Extensible Markup Language) provides an easy way to store and share information. To provide the formal declaration of this XML format, the XML Schema language (also known as XML Schema Definition, XSD) is employed to formalize the navigation constraints. This XSD schema defines a web site as a collection of states (pages) and transitions (links). The initial page is called home, and it is unique. In addition, there is a finite number of web pages connected by links, as depicted in Figure 5, which represents the XSD type for a web site.

Figure 5: XSD Graphic Representation
There is a mandatory XML attribute in the definition of a web site named base. This attribute is the starting URL for the navigation. The automation of the browsing will be carried out from this URL. Each page is recognised by a unique identifier. Each state can contain a set of data fields, and each data field contains the following information:

• Id: Data field identifier.
• Locator: Optional identifier used to make a reference to a specific path (to element within a transition).
• Type: Data type. It corresponds to the following HTML input elements: text, textarea, password, checkbox, radio, file, select-one, and select-multiple.
• Required: Boolean value that indicates whether or not the data field is mandatory.
• Value: Collection of values of the data field.
• Stereotype: One of the following types: email, date, name, surname, address, string, integer.

Moreover, each state can contain a set of oracles, which perform assertions on the web page. These oracles are described using the following attributes:

• Id: Oracle identifier.
• Locator: Optional reference. It has the same meaning as in a data field.
• Type: Oracle category. It can be one of the following literals: text (assertion that a text is present in the locator element), notText (the opposite of text), textPresent (assertion that a text is present in the web page), textNotPresent (the opposite of textPresent), value (assertion that a value is present in the locator element), and notValue (the opposite of value).

Finally, web transitions are composed of an attribute called from (the identifier of the source web page) and a collection of actions and web targets (attribute to). Each action is composed of the fields target, key, and event. The meaning of these fields is the same as in the guard of UML activity diagrams.
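The six oracle types listed above can be read as simple predicates. The following is a minimal sketch of that reading, not ATP's implementation: the function name `check_oracle` and its parameters are hypothetical, and it assumes that the text oracles test substring containment while the value oracles test equality.

```python
def check_oracle(kind, expected, element_text="", element_value="", page_text=""):
    """Evaluate one oracle and return True when its assertion holds.

    Assumptions (not taken from the paper): 'text' oracles use substring
    containment; 'value' oracles compare for equality.
    """
    checks = {
        "text": lambda: expected in element_text,            # text in locator element
        "notText": lambda: expected not in element_text,
        "textPresent": lambda: expected in page_text,        # text anywhere in the page
        "textNotPresent": lambda: expected not in page_text,
        "value": lambda: expected == element_value,          # value of locator element
        "notValue": lambda: expected != element_value,
    }
    if kind not in checks:
        raise ValueError(f"unknown oracle type: {kind}")
    return checks[kind]()

# Usage: a page-level assertion and an element-level assertion.
print(check_oracle("textPresent", "Welcome", page_text="Welcome, admin"))
print(check_oracle("notValue", "guest", element_value="admin"))
```

Keeping the checks in a lookup table makes an unknown oracle type fail loudly instead of silently passing.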
A simple example of navigation based on this XSD schema is illustrated in the following snippet:

    Administrador admin Welcome

All in all, the approach we propose to automate the functional testing of web applications can be seen as an aggregation of the following automated methods:

• R&P. Linear scripts are created using a record and playback method. This approach is considered the 1st generation of AST frameworks.
• Data-driven approach (2nd generation). This testing approach uses a single test case driven by input and expected values from an external data source, instead of using the same hard-coded values each time the test runs.
• MBT (3rd generation). UML models from the design phases (use case, activity, and presentation diagrams) will be reused to guide the automation approach.

Figure 6: Schematic Diagram of the Test Case Automation

In order to achieve the data-driven approach, the automation separates the generation of test cases from the generation of test data and expected outcomes. A tabular data file is used to store the test data (input) and the expected outcomes (output).

This method has one strong prerequisite: there should be a model of the navigation behaviour of the web application under test. As described before, this navigational model uses one of three notations: UML (using NDT), XML, or R&P. This requisite is labelled as pre-automation in the red box illustrated in Figure 6.

Once test cases for the navigation paths are generated, additional input and output data can be manually added to drive more test cases with the same test logic. These data (input and output) can be stored as new rows in the tabular file described before. This process is shown schematically in the yellow box labelled as post-automation in Figure 6.
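The separation between test logic and tabular test data can be sketched as follows. This is an illustration only: the table layout, the column names and the `fake_login` stub are hypothetical stand-ins, and a real test case would drive the browser instead of calling a local function.

```python
import csv
import io

# Stand-in for the tabular data file: inputs plus the expected outcome per row.
TABLE = """user,password,expected
admin,secret,Welcome
admin,wrong,Login failed
,secret,Login failed
"""

def fake_login(user, password):
    """Hypothetical SUT stub used only to make the sketch self-contained."""
    return "Welcome" if (user, password) == ("admin", "secret") else "Login failed"

def run_data_driven(table_text, action):
    """Run the same test logic once per row and collect (row number, verdict)."""
    verdicts = []
    rows = csv.DictReader(io.StringIO(table_text))
    for number, row in enumerate(rows, start=1):
        actual = action(row["user"], row["password"])
        verdicts.append((number, "pass" if actual == row["expected"] else "fail"))
    return verdicts

verdicts = run_data_driven(TABLE, fake_login)
print(verdicts)
```

Adding a row to the table, as in the post-automation stage, yields one more test execution without touching the test logic.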

Regarding test case generation, the automation is done in three different stages: i) test logic generation; ii) test data generation; iii) test oracle generation.

Test logic generation is illustrated in the green box in Figure 6. This step takes as input the model from the pre-automation stage, i.e. a model in UML, XML or R&P. The logic generation passes through the following steps:

• White-Box Parser. This entity is in charge of translating the different models used by testers (UML, XML, or R&P) to the internal way of modelling web applications, that is, multidigraphs.
• CPP. This module contains the deterministic CPP Java algorithm created by Thimbleby [23], used to find the paths within the multidigraph.
• Paths. As a result of applying the CPP algorithm, a set of independent paths is found. These paths correspond to sequences of web pages that should be exercised against the SUT to verify the navigation requirements.

Test data generation is illustrated in the purple box in Figure 6. This stage is also fed with the navigation model from the pre-automation stage. The process is as follows:

• Black-Box Parser. This module extracts test data and expected outcomes from the input model. XML models can include test data, and R&P models can include test data and expected outcomes. Regarding test data (input), this black-box parser should extract the value and the data type.
• The data type feeds a test data dictionary. This dictionary contains a collection of data that can be used as input for test cases. To select a specific value, besides the type of data, a module that generates a random pointer is used (randomizer).
• Therefore, the data required for the test cases consist on the aggre gation of three dif ferent sources: i) Data from the XML and R&P models; ii) Randomly generated data from on a test data dictio- nary; iii) Manual data included as ne w rows in the tab ulated file (post-automation). T est oracle generation is illustrated in the green box on Figure 6. This module has the following parts: • Outcome analyser . This module collects data from the response of the SUT and extracts the fol- lo wing information: i) Navigation state; ii) Actual data returned by the application. In order to find out the real state navig ation, the aggregation of data field will be used. • White-Box Oracle. This module will establish verdicts by comparing the expected to the actual state. The expected state is set by the na vigation path pre viously extracted in the test logic module. The real state is e xtracted from the SUT’ s response by the outcome analyser using the procedure described before (aggregation of data fields). • Black-Box Oracle. This module will establish verdicts by comparing expected with actual data. The expected data comes from the black-box parser of the test data module. In addition, additional expected data can be added in the post-automation stage by adding ne w information in the tab ular data file. • V erdicts from white and black-box oracles will become the test report. B. Garc ´ ıa & Juan C. 
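The link between test logic and the white-box oracle can be sketched in Python under clearly stated assumptions. The sketch is not Thimbleby's algorithm: it covers only the special case in which the navigation multidigraph is already balanced (every page has as many incoming as outgoing links), where an optimal postman tour is simply an Eulerian circuit, found here with Hierholzer's algorithm. The general CPP first duplicates edges to balance the graph, and that step is omitted. The sketch also returns one closed walk rather than the set of paths that ATP produces. The page names, link labels, and function names are hypothetical.

```python
from collections import defaultdict


def eulerian_circuit(edges, start):
    """Return a closed walk that uses every edge of a balanced multidigraph once.

    edges: list of (source_page, target_page, link_label); parallel links are allowed.
    The result is a list of (page, link_used_to_arrive); the first link is None.
    """
    outgoing = defaultdict(list)
    balance = defaultdict(int)
    for source, target, label in edges:
        outgoing[source].append((target, label))
        balance[source] += 1
        balance[target] -= 1
    if any(balance.values()):
        raise ValueError("graph is not balanced; the full CPP must duplicate edges first")

    # Hierholzer's algorithm with an explicit stack of (page, arriving link).
    stack, walk = [(start, None)], []
    while stack:
        page, _ = stack[-1]
        if outgoing[page]:
            stack.append(outgoing[page].pop())  # follow an unused link
        else:
            walk.append(stack.pop())            # page exhausted: emit it
    walk.reverse()
    if len(walk) != len(edges) + 1:
        raise ValueError("some links are not reachable from the start page")
    return walk


def white_box_verdicts(expected_walk, visited_pages):
    """Compare the expected page of each step with the page actually reached."""
    return [expected == actual
            for (expected, _), actual in zip(expected_walk, visited_pages)]


# Hypothetical navigation model: two parallel links lead from home to invoices.
NAVIGATION = [
    ("login", "home", "submit"),
    ("home", "invoices", "menu link"),
    ("home", "invoices", "banner link"),
    ("invoices", "home", "back"),
    ("invoices", "home", "cancel"),
    ("home", "login", "logout"),
]

walk = eulerian_circuit(NAVIGATION, "login")
for page, link in walk:
    print(page if link is None else f"--{link}--> {page}")
```

The expected state at each step comes from the walk itself, so a white-box verdict is a comparison between that expected page and the page the outcome analyser reports for the same step.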
5 Automatic Testing Platform

The tool that implements the testing approach proposed in this paper is named Automatic Testing Platform (ATP, http://atestingp.sourceforge.net/) and has been released as open source under the terms of the Apache License 2.0. ATP has been built using existing open-source components, summarized in Table 2.

Table 2: ATP Components

| Function | Library | URL |
|---|---|---|
| Unit framework | JUnit | http://www.junit.org/ |
| Web browsing | Selenium | http://seleniumhq.org/ |
| Test case generation | Freemarker | http://freemarker.sourceforge.net/ |
| Random data generation | dgMaster | http://dgmaster.sourceforge.net/ |
| Test case execution | Ant | http://ant.apache.org/ |
| Test case reporting | iText | http://itextpdf.com/ |
| Graph manipulation | JUNG | http://jung.sourceforge.net/ |
| XML parsing | JDOM | http://www.jdom.org/ |
| Spreadsheet access | JExcelAPI | http://jexcelapi.sourceforge.net/ |

ATP accepts three kinds of input: i) XML navigation files; ii) NDT files (UML); iii) Selenium scripts in HTML format. The NDT approach uses Enterprise Architect (EA) to build its models [13]. ATP accepts these EA models in XMI (XML Metadata Interchange) format. As output, ATP creates a Java Eclipse project from scratch containing the following components (see Figure 7):

• JUnit test cases, one per path.
• Tabular test data (input and output), in one Excel spreadsheet per path.
• Script runner (Apache Ant). This script starts the Selenium server before running the unit test cases.

Figure 7: ATP Process

ATP is a command-line tool with six main commands. Typing only "atp" in the shell shows the following help:

    > atp
    [INFO] ATP (Automatic Testing Platform) v2.0
    [INFO] [http://atestingp.sourceforge.net]
    [INFO] Copyright (c) 2011 UPM. Apache 2.0 license.
    [INFO]
    [INFO] Use one of these options:
    [INFO]    atp create
    [INFO]    atp run
    [INFO]    atp clean
    [INFO]    atp list
    [INFO]    atp set <key> <value>
    [INFO]    atp report

These commands work as follows:

• Create: creates the test cases and test data for each path. The Eclipse project that contains these artifacts is also created by this command.
• Run: executes the previously created test cases using the generated Ant script. Before execution, a Selenium server is launched. As a result, test reports are created in several formats (XML, HTML, and PDF).
• Clean: deletes the previously created Eclipse project.
• List: shows the configuration parameters of ATP. The most important parameters are: sut, the URL of the web application under test; root, the folder where the Eclipse project with all the output artifacts is created; navigation-type, the type of input (xml, xmi, or html); navigation-folder, the path to the input file(s).
• Set: changes the value of a configuration parameter. For example: atp set navigation-type xmi.
• Report: opens the previously generated HTML reports.

6 Case Study: Management of Electronic Invoices

ATP's method and implementation have been validated using a complete web application called "Factura", developed for the Spanish company Telvent in the context of the IT Factur@ innovation project. This application is an electronic invoice web management system that has been developed using a Model Driven Engineering (MDE) approach [21]. The Research Questions (RQ) that have driven this case study are the following:

i) RQ1: Does the Factura application fulfil its functional requirements?
ii) RQ2: Is ATP capable of finding defects in a finished web application?
iii) RQ3: What are the advantages and disadvantages of the different types of input (UML, XML, and R&P) to ATP?
The input model for ATP is the set of XMI models created with EA following the NDT approach. Five use cases have been identified, and each use case has been refined using an activity diagram. ATP uses the information in each activity diagram to create an equivalent navigation graph. Figure 8 shows the use case diagram and an example of an activity diagram.

Figure 8: Use Cases and Activity Diagrams

Then, using the CPP algorithm, ATP breaks the graph into its paths, each of which is written as a unit test case (JUnit). Test data are created from the information attached to the model when possible, and from random data taken from test data dictionaries otherwise. Finally, the test cases were executed. As a result, 6 navigation errors were found. In addition, 8 JavaScript notifications were reported by ATP. These notifications include JavaScript alerts, prompts, and confirmations. A snapshot of the generated report is shown in Figure 9.

Figure 9: Case Study Report

Regarding RQ1, we can conclude that Factura is well implemented, since the number of navigation errors is low. Regarding RQ2, the case study shows that ATP can be a useful tool for discovering functional failures in web applications. RQ3 concerns the way ATP works; the pros and cons of each kind of input are summarized in Table 3.

Table 3: ATP Pros and Cons

| Input | Pros | Cons |
|---|---|---|
| XMI (UML models in NDT) | Analysis/design models are reused for assessment. Every possible path is depicted in the models. | It is not possible to attach test data or oracles to the models. The post-automation step is mandatory. |
| XML (based on XSD schema) | Every possible path can be depicted using XML files. Data and oracles can be attached to XML files. | The XML files must be coded and maintained by hand. |
| HTML (Selenium R&P scripts) | Scripts are created using Selenium IDE against the real application. Data and oracles can be attached to HTML scripts. | Each recording is linear, so there is always a single path per HTML script. Error paths must be defined in separate scripts. |

7 Conclusion

This paper has presented a method for the automated testing of web applications based on their navigation. The basic idea behind this approach is to exercise the SUT using a real browser, navigating from page to page by means of web links. During this navigation, a functional validation is carried out, since the links and the underlying logic are exercised and checked for correctness. Each web page corresponds to a state in the navigation model, and that state is verified during the navigation.

The first kind of input for this method is a UML model of the SUT. Concretely, three kinds of UML diagrams are needed: use case, activity, and presentation diagrams. Since presentation diagrams are not standard in UML 2.0, we studied the state of the art in web modelling and decided to use NDT to model web applications with UML. The second alternative is an R&P script recorded using Selenium IDE, since Selenium RC is the tool in charge of automating the web navigation. Third and last, an XML-based file describing the navigation can be employed. This XML file is a simple way of structuring the navigation following an XSD schema.

This approach has been implemented in an open-source framework named Automatic Testing Platform (ATP). This tool automates testing at four levels: i) test case generation (test logic and test oracles); ii) test data derivation; iii) test case execution driven by Ant scripts; iv) test case reporting. To validate the proposed method, a web application for invoice management has been employed. ATP has proven capable of finding functional defects in a real web application.
Future work will extend the presented approach by automating the testing and analysis of non-functional requirements such as performance, security, compatibility, usability, and accessibility.

Acknowledgment

This work has been carried out in the context of the European project ITEA-MOSIS (project number 06035), under a grant from the Spanish Ministerio de Industria, Turismo y Comercio in the PROFIT program.

References

[1] Alain Abran, Pierre Bourque, Robert Dupuis, James W. Moore & Leonard L. Tripp (2004): Guide to the Software Engineering Body of Knowledge - SWEBOK, 2004 version edition. IEEE Press, Piscataway, NJ, USA. Available at http://www.swebok.org/ironman/pdf/SWEBOK_Guide_2004.pdf.
[2] Manar H. Alalfi, James R. Cordy & Thomas R. Dean (2008): Modeling methods for web application verification and testing: State of the art. doi:10.1002/stvr.v19:4.
[3] Paul Ammann & Jeff Offutt (2008): Introduction to Software Testing, 1 edition. Cambridge University Press, New York, NY, USA.
[4] Jørgen Bang-Jensen & Gregory Z. Gutin (2008): Digraphs: Theory, Algorithms and Applications, 2nd edition. Springer Publishing Company, Incorporated.
[5] L. Baresi, F. Garzotto & P. Paolini (2001): Extending UML for Modeling Web Applications. In: Proceedings of the 34th Annual Hawaii International Conference on System Sciences (HICSS-34) - Volume 3, HICSS '01, IEEE Computer Society, Washington, DC, USA, pp. 3055–. Available at http://portal.acm.org/citation.cfm?id=820558.820674. doi:10.1109/HICSS.2001.926350.
[6] Boris Beizer (1990): Software testing techniques (2nd ed.). Van Nostrand Reinhold Co., New York, NY, USA.
[7] Paul E. Black (1999): Chinese postman problem. Algorithms and Theory of Computation Handbook. Available at http://www.nist.gov/dads/HTML/chinesePostman.html.
[8] John Adrian Bondy (1976): Graph Theory With Applications. Elsevier Science Ltd.
[9] Stefano Ceri, Piero Fraternali, Aldo Bongio, Marco Brambilla, Sara Comai & Maristella Matera (2002): Designing Data-Intensive Web Applications. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
[10] María José Escalona Cuaresma & Nora Koch (2004): Requirements Engineering for Web Applications - A Comparative Study. J. Web Eng. 2(3), pp. 193–212.
[11] R. Diestel (2005): Graph Theory. Springer.
[12] Elfriede Dustin, Thom Garrett & Bernie Gauf (2009): Implementing Automated Software Testing: How to Save Time and Lower Costs While Raising Quality, 1st edition. Addison-Wesley Professional.
[13] Maria J. Escalona & Gustavo Aragón (2008): NDT. A Model-Driven Approach for Web Requirements. IEEE Trans. Softw. Eng. 34, pp. 377–390, doi:10.1109/TSE.2008.27. Available at http://portal.acm.org/citation.cfm?id=1383055.1383293.
[14] Herbert Fleischner (1991): Eulerian Graphs and Related Topics. North-Holland.
[15] Mayumi Itakura Kamata & Tetsuo Tamai (2007): How Does Requirements Quality Relate to Project Success or Failure? Requirements Engineering, IEEE International Conference on 0, pp. 69–78, doi:10.1109/RE.2007.31.
[16] N. Koch & A. Kraus (2002): The expressive Power of UML-based Web Engineering. Available at http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.7588.
[17] Per Kroll & Philippe Kruchten (2003): The Rational Unified Process Made Easy: A Practitioner's Guide to the RUP. Addison-Wesley Professional.
[18] Giuseppe A. Di Lucca & Anna Rita Fasolino (2006): Testing Web-based applications: The state of the art and future trends. Information & Software Technology 48(12), pp. 1172–1186. doi:10.1016/j.infsof.2006.06.006.
[19] Russ Miles & Kim Hamilton (2006): Learning UML 2.0. O'Reilly Media, Inc.
[20] Bruce Posey (2002): Just Enough Software Test Automation. Prentice Hall PTR, Upper Saddle River, NJ, USA.
[21] Filippo Ricca & Paolo Tonella (2001): Analysis and testing of Web applications. In: Proceedings of the 23rd International Conference on Software Engineering, ICSE '01, IEEE Computer Society, Washington, DC, USA, pp. 25–34. Available at http://portal.acm.org/citation.cfm?id=381473.381476. doi:10.1109/ICSE.2001.919078.
[22] Gustavo Rossi, Oscar Pastor, Daniel Schwabe & Luis Olsina, editors (2008): Web Engineering: Modelling and Implementing Web Applications. Springer, London, doi:10.1007/978-1-84628-923-1.
[23] Harold Thimbleby (2003): The directed Chinese postman problem. In journal of Software Practice and Experience 33, p. 2003. doi:10.1002/spe.540.
[24] Mark Utting & Bruno Legeard (2006): Practical Model-Based Testing: A Tools Approach. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
