Benchmarking Web-testing - Selenium versus Watir and the Choice of Programming Language and Browser


📝 Abstract

Context: Selenium is claimed to be the most popular software test automation tool. Past academic works have mainly neglected testing tools in favour of more methodological topics. Objective: We investigated the performance of web-testing tools to provide empirical evidence supporting choices in software test tool selection and configuration. Method: We used a 4×5 factorial design to study 20 different configurations for testing a web store. We studied 5 programming language bindings (C#, Java, Python, and Ruby for Selenium, while Watir supports Ruby only) and 4 browsers (Google Chrome, Internet Explorer, Mozilla Firefox and Opera). Performance was measured with execution time, memory usage, length of the test scripts and stability of the tests. Results: Considering all measures, the best configuration was Selenium with the Python language binding for Chrome. Selenium with Python bindings was the best option for all browsers. The effect size of the difference between the slowest and fastest configuration was very high (Cohen's d = 41.5, a 91% increase in execution time). Overall, Internet Explorer was the fastest browser while having the worst results in stability. Conclusions: We recommend benchmarking tools before adopting them. The weighting of factors, e.g. how much test stability one is willing to sacrifice for faster performance, affects the decision.

📄 Content

Benchmarking Web-testing - Selenium versus Watir and the Choice of Programming Language and Browser

Miikka Kuutila, M3S, ITEE, University of Oulu, Finland

Mika Mäntylä, M3S, ITEE, University of Oulu, Finland

Päivi Raulamo-Jurvanen, M3S, ITEE, University of Oulu, Finland

Email: firstname.lastname@oulu.fi, Postal address: P.O. Box 8000, FI-90014 University of Oulu

Abstract

Context: Selenium is claimed to be the most popular software test automation tool. Past academic works have mainly neglected testing tools in favour of more methodological topics.
Objective: We investigated the performance of web-testing tools, to provide empirical evidence supporting choices in software test tool selection and configuration.
Method: We used a 4×5 factorial design to study 20 different configurations for testing a web store. We studied 5 programming language bindings (C#, Java, Python, and Ruby for Selenium, while Watir supports Ruby only) and 4 browsers (Google Chrome, Internet Explorer, Mozilla Firefox and Opera). Performance was measured with execution time, memory usage, length of the test scripts and stability of the tests.
Results: Considering all measures, the best configuration was Selenium with the Python language binding for Google Chrome. Selenium with Python bindings was the best option for all browsers. The effect size of the difference between the slowest and fastest configuration was very high (Cohen's d = 41.5, a 91% increase in execution time). Overall, Internet Explorer was the fastest browser while having the worst results in stability.
Conclusions: We recommend benchmarking tools before adopting them. The weighting of factors, e.g. how much test stability one is willing to sacrifice for faster performance, affects the decision.
Keywords: Software testing, Selenium, Watir, Webdriver, test automation, web-testing
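The 4×5 factorial design in the Method section can be enumerated directly as a cross product of bindings and browsers. The sketch below is illustrative only; the configuration labels are ours, not the authors':

```python
from itertools import product

# Hypothetical labels for the study's 5 language bindings (four Selenium
# bindings plus Watir, which supports Ruby only) and 4 browsers.
bindings = ["Selenium/C#", "Selenium/Java", "Selenium/Python",
            "Selenium/Ruby", "Watir/Ruby"]
browsers = ["Chrome", "Internet Explorer", "Firefox", "Opera"]

# The full factorial design: every binding paired with every browser.
configurations = list(product(bindings, browsers))
print(len(configurations))  # 5 * 4 = 20 configurations
```

Enumerating the grid this way makes it easy to run the same test suite once per configuration and compare the measures reported in the paper.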


1. Introduction

"If it could save a person's life, could you find a way to save ten seconds off the boot time?" - Steve Jobs

The Internet reached over 3 billion users in 2014 and the number of users has since grown (InternetLiveStats, 2016). With the increased number of users and applications on the web, e.g. social media (van Dijck, 2013), the Internet of Things (Gubbi et al., 2013) and cloud-based solutions (Gunawi et al., 2014), there comes a growing need for testing these services and applications.
A recent online survey states that Selenium is the most popular software testing tool in the industry (Yehezkel, 2016), a fact reflecting how the popularity of web-based solutions also affects the popularity of the testing tools. Similarly, our recent paper indicates that Selenium is the most popular pure testing tool when using combined criteria consisting of things like the number of survey responses, Google web hits, Twitter tweets and Stack Overflow questions (Raulamo-Jurvanen et al., 2016). Watir also appeared in the responses of those surveys, but based on references by the respondents it was a less popular tool than Selenium (Raulamo-Jurvanen et al., 2016; Yehezkel, 2016).
Speed is highly important to software development success, and to web development in particular. This is supported by several reports. According to another industrial survey (Vaitilo and Madsen, 2016), investments in test automation "will be mandatory for coping with the growing demand for velocity". With this demand for velocity, the speed of software testing becomes an important problem. Furthermore, testing is often part of a development practice called continuous integration (CI), where the development team integrates their work frequently and the build is automated along with the tests (Fowler, 2006). Continuous integration also makes testing continuous. Martin Fowler (2006) sees rapid feedback as one of the primary benefits behind CI, while in his experience testing is the bottleneck behind increased build times. Therefore, faster performance in testing can lower build times and enable more rapid feedback, or can allow for more time-consuming and comprehensive test sets. Fowler's post is ten years old, but support for this notion is found in more recent advice from test automation professionals highlighting the importance of feedback, not only from the CI machine but also from the developers' personal test environments:
"Fast feedback loops while you work are incredibly important. In many ways, the length of time to run a single test against my local changes is the biggest predictor of my productivity on a project" (McIver, 2016).
Past work on software testing has mostly focused on more methodological issues. For example, plenty of academic work has focused on regression test selection, and a survey by Yoo and Harman (2012) contains as many as 189 references on this topic. We claim that focusing on methodological issues alone is not enough if we want to do industrially relevant software engineering research.
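The measures the study reports for each configuration, mean execution time, stability as the share of passing runs, and Cohen's d as the effect size between two configurations, can be sketched as a small, tool-agnostic harness. This is an illustrative reconstruction, not the authors' measurement code; `benchmark` and `cohens_d` are hypothetical names:

```python
import statistics
import time

def benchmark(test_fn, runs=30):
    """Run test_fn repeatedly; record per-run wall-clock time and failures."""
    times, failures = [], 0
    for _ in range(runs):
        start = time.perf_counter()
        try:
            test_fn()
        except Exception:
            failures += 1
        times.append(time.perf_counter() - start)
    return {
        "mean_time": statistics.mean(times),
        "stability": 1 - failures / runs,  # share of runs that passed
        "times": times,
    }

def cohens_d(sample_a, sample_b):
    """Effect size between two samples, using the pooled standard deviation."""
    na, nb = len(sample_a), len(sample_b)
    va, vb = statistics.variance(sample_a), statistics.variance(sample_b)
    pooled_sd = (((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)) ** 0.5
    return abs(statistics.mean(sample_a) - statistics.mean(sample_b)) / pooled_sd
```

In a real setup, `test_fn` would drive a browser through a web-store scenario (e.g. via a Selenium language binding), and `cohens_d` would be applied to the execution-time samples of the slowest and fastest configurations.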
