A Study of Language Usage Evolution in Open Source Software


📝 Original Info

  • Title: A Study of Language Usage Evolution in Open Source Software
  • ArXiv ID: 1102.2262
  • Date: 2011-02-14
  • Authors: Siim Karus (University of Tartu, Estonia; University of Zurich, Switzerland); Harald Gall (University of Zurich, Switzerland)

📝 Abstract

The use of programming languages such as Java and C in Open Source Software (OSS) has been well studied. However, many other popular languages such as XSL or XML have received little attention. In this paper, we discuss some trends in OSS development that we observed when considering the evolution of multiple programming languages in OSS. Based on the revision data of 22 OSS projects, we tracked the evolution of language usage and of other artefacts such as documentation files, binaries and graphics files. These systems use several different languages and artefact types, including C/C++, Java, XML, XSL, Makefile, Groovy, HTML, Shell scripts, CSS, Graphics files, JavaScript, JSP, Ruby, Python, XQuery, OpenDocument files, PHP, etc. We found that the amount of code written in different languages differs substantially. Some of our findings can be summarized as follows: (1) JavaScript and CSS files most often co-evolve with XSL; (2) most Java developers but only every second C/C++ developer work with XML; (3) more generally, we observed a significant increase in the usage of XML and XSL during recent years and found that Java or C is hardly ever the only language used by a developer. In fact, a developer works with more than 5 different artefact types (or 4 different languages) in a project on average.

📄 Full Content

A Study of Language Usage Evolution in Open Source Software

Siim Karus, University of Tartu, Estonia; University of Zurich, Switzerland (siim.karus@ut.ee)
Harald Gall, University of Zurich, Switzerland (gall@ifi.uzh.ch)

ABSTRACT

The use of programming languages such as Java and C in Open Source Software (OSS) has been well studied. However, many other popular languages such as XSL or XML have received little attention. In this paper, we discuss some trends in OSS development that we observed when considering the evolution of multiple programming languages in OSS. Based on the revision data of 22 OSS projects, we tracked the evolution of language usage and of other artefacts such as documentation files, binaries and graphics files. These systems use several different languages and artefact types, including C/C++, Java, XML, XSL, Makefile, Groovy, HTML, Shell scripts, CSS, Graphics files, JavaScript, JSP, Ruby, Python, XQuery, OpenDocument files, PHP, etc. We found that the amount of code written in different languages differs substantially. Some of our findings can be summarized as follows: (1) JavaScript and CSS files most often co-evolve with XSL; (2) most Java developers but only every second C/C++ developer work with XML; (3) more generally, we observed a significant increase in the usage of XML and XSL during recent years and found that Java or C is hardly ever the only language used by a developer. In fact, a developer works with more than 5 different artefact types (or 4 different languages) in a project on average.

Categories and Subject Descriptors

D.2.7 [Software Engineering]: Distribution, Maintenance, and Enhancement – Restructuring, reverse engineering, and reengineering, version control; D.3.2 [Programming Languages]: Language Classifications – object-oriented languages, extensible language; K.2 [Computing Milieux]: History of Computing – Software, People

General Terms

Management, Measurement, Documentation, Design, Experimentation, Human Factors, Languages.

Keywords

Programming language, Open source software, evolution, software archives.

  1. INTRODUCTION

Much effort has gone into studying the use of procedural languages such as C and object-oriented languages such as Java. Even less common languages such as Perl, Python, or Ruby have received their fair share of attention. However, the statistics on the most used languages show that one language, far more common than any of those mentioned above, stands out. According to ohloh.net¹, which tracks more than 400,000 open source software (OSS) repositories, about 15% of actively developed OSS projects contain XML, while less than 10% contain HTML and every other language is present in less than 8% of projects. Moreover, XML is also the language with the most lines of code changed per month. The use of XML in OSS projects, however, has not received considerable attention so far.

As XML is a mark-up language with little meaning on its own, it would be interesting to understand which other languages it is used with. Looking at co-evolving file types lets us investigate that issue. More generally, we can ask which languages and file types are used together and, therefore, co-evolve in OSS projects.

To address this research question, we studied 22 OSS software repositories over 12 years. Our study focused on two levels of file type coupling: the developer level and the commit level.

At the developer level, developers in the projects were studied regarding their language experience in the projects. For that, we addressed the following questions:

  • Which languages and artefacts are commonly used in OSS development and in what proportions?
  • How many file types does a developer typically work with and are there some usage patterns for file types?
  • How has the language usage and, as a consequence, the language expertise requirements for developers changed during the observation period?

At the commit level, co-changing files appearing together in commits were studied. For that, we addressed the following questions:

  • Which co-evolution patterns can be observed in OSS projects (e.g., are there distinct dependencies between languages or artefact types commonly edited together)?
  • How have the dependencies between file types used in the projects changed during the observation period?

Additionally, on a more general level of OSS projects studied, we were interested in what are the most common languages or artefact

1 http://www.ohloh.net
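
The two coupling levels named in the introduction (developer level and commit level) can be illustrated with a small Python sketch. It is not the authors' tooling: the extension-to-type mapping `FILE_TYPES`, the helper names, and the toy commit data are all assumptions made for illustration, and this excerpt does not describe how the paper classifies files.

```python
from collections import Counter, defaultdict
from itertools import combinations
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to language/artefact type.
# The paper's actual classification is not given in this excerpt.
FILE_TYPES = {
    ".java": "Java", ".c": "C/C++", ".h": "C/C++", ".cpp": "C/C++",
    ".xml": "XML", ".xsl": "XSL", ".css": "CSS", ".js": "JavaScript",
}


def file_type(path):
    """Classify a path by its extension; unknown extensions become 'Other'."""
    return FILE_TYPES.get(PurePosixPath(path).suffix.lower(), "Other")


def commit_level_pairs(commits):
    """Commit level: count how often two file types change in the same commit."""
    pairs = Counter()
    for _author, paths in commits:
        types = sorted({file_type(p) for p in paths})
        pairs.update(combinations(types, 2))
    return pairs


def developer_level_types(commits):
    """Developer level: collect the set of file types each developer touched."""
    types_by_dev = defaultdict(set)
    for author, paths in commits:
        types_by_dev[author].update(file_type(p) for p in paths)
    return dict(types_by_dev)


# Invented toy history: (author, changed paths) per commit.
commits = [
    ("alice", ["src/Main.java", "conf/build.xml"]),
    ("alice", ["web/page.xsl", "web/style.css", "web/app.js"]),
    ("bob", ["lib/util.c", "lib/util.h"]),
]

if __name__ == "__main__":
    print(commit_level_pairs(commits).most_common())
    print({dev: len(t) for dev, t in developer_level_types(commits).items()})
```

Counting unordered pairs of types per commit is only one plausible way to operationalize "co-evolution"; a real analysis over revision data would also need to handle very large commits and files without extensions (such as Makefiles), which this sketch simply labels "Other".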

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republ

Reference

This content is AI-processed based on open access ArXiv data.
