Learning Program Component Order


📝 Abstract

Successful programs are written to be maintained. One aspect of this is that programmers order the components in the code files in a particular way. This is part of programming style. While the conventions for ordering are sometimes given as part of a style guideline, such guidelines are often incomplete and programmers tend to have their own more comprehensive orderings in mind. This paper defines a model for ordering program components and shows how this model can be learned from sample code. Such a model is a useful tool for a programming environment in that it can be used to find the proper location for inserting new components or for reordering files to better meet the needs of the programmer. The model is designed so that it can be fine-tuned by the programmer. The learning framework is evaluated both by looking at code with known style guidelines and by testing whether it inserts existing components into a file correctly.

📄 Content

Steven P. Reiss and Qi Xin
Brown University, Providence, RI
{spr,qx5}@cs.brown.edu

Keywords—Program style, component ordering, programming environments.

I. MOTIVATION

Programming style is key to program maintenance. Reading and understanding code for maintenance is significantly easier if the code follows a consistent style. Part of this style is the way the various program components, i.e., fields, functions, methods, classes, interfaces, etc., are ordered. Using a consistent program order can greatly simplify understanding code. For example, Google notes that “the order you choose for the members and initializers of your class can have a great effect on learnability” [10].

For these reasons, the order of program components is often included in the style guidelines that are developed by individuals, projects, and companies. While concentrating on local style, indentation, and naming conventions, these guidelines also specify how files should be organized.
They might specify, for example, that files start with a particular style of block comment; that imports are ordered in a particular manner; that the first item in a class is the main program if the class has one; that the next items are field definitions, preceded by a block comment of a certain form, with public fields preceding private ones; that any constructors follow the fields, again preceded by a particular block comment; and so on. Examples of such standards include [6,9,14,19,21]. Even a standard that does not prescribe an order, such as the Google Java Style Guide [10], notes that “what is important is that each class uses some logical order, which the maintainer could explain if asked”.

What is in the various standards, however, is not particularly comprehensive or complete. Programmers typically follow their own, more detailed ordering conventions [2]. For example, while the above guidelines provide general ordering information, they do not include all types of components (e.g., enumerations, factory methods, annotations). Moreover, they do not differentiate between orderings of components in an interface versus in a class, and they do not differentiate between the orderings and comment styles for inner classes versus outer classes. Both our own code and code retrieved from open source repositories show that these orderings are not generally the same.

Programming environments benefit from understanding and being able to use program orderings. Today’s environments provide a number of facilities that insert code, for example for automatically fixing errors or doing refactorings. If the environment does not do the insertion the way the programmer might, it can make more work for the programmer than it was designed to save. Programming environments typically include a simplified program ordering model. IntelliJ has the most complex one [11].
This model lets one define sections with before and after comments and define the order of elements within each section using element type and modifiers. It includes the abilities to group getters and setters together and to group methods implementing a common interface together. But even this complex model does not meet the needs of programmers. It does not differentiate between classes and interfaces or outer and inner classes as programmers do. It does not provide the semantic-based ordering options that programmers often use, for example, grouping a private method used only once with its caller. Moreover, setting up the model is a complex process and requires considerable interaction on the part of the programmer.

The goal of our research is to develop a model of program ordering that is flexible enough to order components in the same way that the programmer might. At the same time, we want to be able to define this model automatically by learning it from the programmer’s existing code base.

II. OVERVIEW AND CONTRIBUTIONS

In order to determine and use program order, one

This content is AI-processed based on ArXiv data.
