Revisiting the Panko-Halverson Taxonomy of Spreadsheet Errors
Raymond R. Panko (University of Hawaii, Panko@hawaii.edu)
Proceedings of the EuSpRIG 2008 Conference "In Pursuit of Spreadsheet Excellence", ISBN: 978-905617-69-2. Copyright © 2008 European Spreadsheet Risks Interest Group (www.eusprig.org) and Author.

1. Introduction

The purpose of this paper is to revisit the Panko–Halverson [1996] taxonomy of spreadsheet errors and suggest revisions. There are several reasons for doing so. First, the taxonomy has been widely used. Therefore, it should have scrutiny. Second, the taxonomy has not been widely available in its original form [Panko & Halverson, 1996]. Consequently, most users refer to secondary sources, and they often equate the taxonomy with the simplified extracts used in particular experiments or field studies. Third, perhaps as a consequence, most users use only a fraction of the taxonomy. In particular, they tend not to use the taxonomy's life-cycle dimension. Fourth, the taxonomy has been tested against spreadsheets in experiments and spreadsheets in operational use. It is time to review how it has fared in these tests. Fifth, the taxonomy was based on the types of spreadsheet errors that were known to the authors in the mid-1990s. Subsequent experience has shown that the taxonomy needs to be extended for situations beyond those original experiences. Sixth, the omission category in the taxonomy has proven to be too narrow.

Although this paper will focus on the Panko–Halverson taxonomy, this does not mean that it is the only possible error taxonomy or even the best error taxonomy.

1.1 Taxonomies

Taxonomies have long been used in science. Senders and Moray [1991], writing about human error, wrote that:

… a taxonomy is a fundamental requirement for the foundation of empirical science.
If we want a deep understanding of the nature, origins, and causes of human error, it is necessary to have an unambiguous classification scheme for describing the phenomenon we are studying. [Senders and Moray, 1991, p. 82.]

For our purposes, we will define a taxonomy as the division of a large number of entities into a number of related categories whose differences are useful for a particular purpose.

The first emphasis is the ordering of many entities into categories. Ideally, the categories will be comprehensive, encompassing all entities. In addition, the categories ideally should be mutually exclusive, without overlap. In mathematical terms, there should be a one-to-one correspondence between entities and categories.

The second emphasis is usefulness for a particular purpose [Senders and Moray, 1991]. There is no such thing as a "best" error taxonomy for spreadsheets [Grossman and Özlük, 2003] or any other type of human cognitive activity. Researchers and professionals with different focuses may need different things from error taxonomies. For instance, designers need error taxonomies that distinguish between types of errors that need different amelioration strategies. The legal system, in contrast, needs distinctions that help assign responsibility for damages [Senders & Moray, 1991]. Researchers with different purposes need different things from taxonomies and so may need different taxonomies.

In addition, several taxonomies are needed because each taxonomy will illuminate some aspects of the phenomenon while blinding the researcher or practitioner to other aspects.
This occurs because theories in general tend to illuminate some things while ignoring others. For example, Graham Allison and Philip Zelikow [1999] analyzed the Cuban missile crisis from the viewpoint of several different theories about decision making. They showed how each theory was shockingly oblivious to certain types of evidence.

1.2 Phenomenological versus Theory-Based Taxonomies

Senders and Moray [1991] distinguished between different levels of taxonomy. The most superficial level consists of phenomenological taxonomies, which are based on simple descriptions of error manifestations. For instance, typing errors at this level would be described by such things as key transpositions and other visible manifestations of errors. At the phenomenological level, there is no explanation for why different errors occur, but taxonomies at this level may spur research into why specific types of errors occur.

Although one would prefer deeper taxonomies, phenomenological taxonomies can be very useful. Most obviously, they can focus subsequent research. In the human error field, if a particular type of error proves to be particularly frequent, then it may merit stronger attention. Conversely, if a type of error that was considered to be important actually turns out to be infrequent, then shifting resources to the study of other errors may be important.

Deeper taxonomies are informed by theory. This is especially valuable if theory predicts manifestations of results. In error research, for instance, theory may suggest different error occurrence rates for different types of errors, different detection rates, or different mechanisms for amelioration. Unfortunately, there is no complete theory of human error, so creating fully deep taxonomies for spreadsheet errors is not possible.

Nearly all spreadsheet error research uses the post hoc analysis of spreadsheets that have already been developed.
As a consequence, all of the error evidence is phenomenological. This would suggest that we should only be able to have phenomenological taxonomies. However, the human mind wants explanations. For better or worse, nearly all published taxonomies of error and spreadsheet error try to explain observed post hoc errors in terms of underlying theories, both formal and informal. While this may be fundamentally undesirable, it is also undesirable to use taxonomies that describe errors but give no clues as to why different types of errors occur or how they can be redressed.

1.3 Reliability in Classification

Taxonomies, like any other research methodology, can be judged on a number of methodological criteria. Every taxonomy should face the entire battery of tests required to assess its internal and external validity. We will mention only one of these methodological issues: reliability. Reliability means that if different people use the taxonomy to classify the same events or items, they will classify individual items in the same way. A taxonomy that cannot be applied reliably by different people is a failed taxonomy.

The simplest way to test reliability is to conduct an inter-rater reliability study. In these studies, two (or preferably more) people conduct a classification, and their consistency is compared statistically. There are several statistical tests available for testing inter-rater reliability. In general, an inter-rater reliability of 90% or higher is the goal, although an inter-rater reliability of 70% may make a study publishable as an exploratory study.

In error analysis, using multiple raters has another benefit.
It can allow us to estimate statistically how many errors remain undiscovered in the spreadsheet. However, this only works if a reliable error taxonomy is used, because different types of errors have different detection rates. In general, ignoring error types will give an underestimate of the number of errors remaining, but even an estimate of the number of errors remaining based on all errors can be eye-opening for anyone who believes that they have found most or all errors in a spreadsheet.

2. General Human Error Taxonomies

There have been many taxonomies of human errors. We cannot cover them all. However, we will cover a few that appear to be of particular importance for building a taxonomy of spreadsheet errors.

2.1 What is an Error?

Senders and Moray [1991] defined an error as an action that is "not intended by the actor; not desired by a set of rules or an external observer; or that led the task or system outside its accepted limits" [Senders and Moray, 1991, p. 25].

Note that there needs to be an external standard for determining whether a result is an error. This can be a general consensus that something is an error, or there can be a definitive test of whether a requirement has been satisfied or not. Not all errors have good external standards. This is particularly true for qualitative errors that violate good practice but do not, in the case of spreadsheets, generate an incorrect numerical answer.

Along with Senders and Moray [1991], we distinguish between errors and accidents. Accidents typically involve a series of errors and may even occur when no error has been made.

In programming, there is a distinction between faults and errors. A fault is a problem in the program. An error is a human action that leads to the fault. Most spreadsheet error taxonomies require the post-hoc analysis of spreadsheet models. For consistency, we could refer to problems found in post-hoc analysis as faults.
However, outside of programming, this is not common terminology. We will use the term "error" for something that is incorrect in a spreadsheet model rather than for the human action that causes the problem.

2.2 Mistakes versus Slips and Lapses

In his book Human Error, Reason [1990] presented a taxonomy of human error types based on prior work by Reason and Mycielska [1982] and Norman [1981, 1984]. This taxonomy, which is shown in Figure 1, begins with a basic distinction between planning and implementation. If the plan is wrong, this is a mistake, regardless of how good the implementation is. However, if the plan is correct but the implementation is wrong, this is a slip or lapse.

Figure 1: Mistakes versus Slips and Lapses

The distinction between slips and lapses was proposed by Norman [1984]. A slip is an error during a sensory-motor action, such as typing the wrong number or pointing to the wrong cell. In contrast, a lapse occurs within the person's head. Typically, a lapse is a failure in memory, and this failure is often caused by overloading the limited human memory capacity.

This taxonomy has possible implications for automated spreadsheet analysis, which only works on final spreadsheet artifacts. It is likely that errors involving planning and storage that occur "off the spreadsheet" will leave few if any artifacts in the spreadsheet for automated analysis tools to find. Even slips during execution may or may not leave artifacts for automated spreadsheet analysis programs to find. For human error hunters, too, the three types of errors suggest that looking only at the spreadsheet is likely to miss many or even most errors.
It is important to examine requirements, designs, and algorithms to understand whether they have been executed properly in the spreadsheet.

2.3 Rasmussen and Jensen

Rasmussen and Jensen [1974] observed highly experienced electricians doing troubleshooting. They used protocol analysis in order to understand why errors occurred. This provided data below the level of phenomenological errors. They observed that the troubleshooters used different strategies for solving different types of problems. Figure 2 shows these three strategies.

Figure 2: Rasmussen's Taxonomy of Cognition and Errors

Much of the time, the troubleshooters were simply applying sensory-motor skills, such as using a voltmeter. When this was not sufficient, they applied one of many rules they had learned over time. A typical rule might be, "First, check if the device is plugged in and turned on." While this rule is obvious, many rules are quite subtle and are developed only after years of practice. When no existing rules were applicable, the troubleshooters had to use their general knowledge about the specific devices being studied and about electronics in general.

This taxonomy is useful in expanding our understanding of the mental processes that come into play when people work and, therefore, how they make errors. Mistakes, in other words, can occur when doing rule-based thinking or knowledge-based thinking. Errors during rule-based and knowledge-based activities may be very different and may require different error reduction strategies.

Although the Rasmussen and Jensen typology is attractive, applying it tends to run into two significant problems. First, subjects must be experienced. Unless the subject is experienced, he or she is not likely to have developed many rules. In addition, if subjects are comparative novices, they may not yet have developed an understanding of the knowledge domain sufficient to allow them to do knowledge-based work.
Due to these problems, any researcher who wishes to use the Rasmussen typology needs to use suitable subjects.

The second problem is that the Rasmussen and Jensen study observed work as it was being done. (It was a form of protocol analysis.) To use this taxonomy post hoc, based on observed errors in a completed spreadsheet, would require a great deal of justification, if such justification were possible at all. How is it possible, for example, to tell whether an activity was rule-based or knowledge-based when an error occurred, simply on the basis of artifacts?

2.4 Allwood

To build his taxonomy, Allwood [1984] did a protocol analysis with students solving mathematical problems. This is somewhat more specific than the other human error taxonomies we have seen because it only deals with mathematical errors. Of course, many spreadsheet errors are likely to be mathematical errors.

Figure 3 shows that Allwood's students made 327 errors as they worked. Six out of ten errors were execution errors, which involved something like doing a calculation incorrectly. The subjects spontaneously caught 83% of these errors as they worked. Consequently, execution errors accounted for only 29% of final errors.

Figure 3: Allwood's Study of Mathematical Errors

Errors that involved mathematical thinking, namely solution method errors and higher-level math errors, only accounted for a quarter of all errors made, but their relatively low error detection rates (48% and 25%, respectively) resulted in their accounting for 40% of all final errors.

Skip errors involved subjects skipping a step in a solution process.
These errors were comparatively rare, accounting for only 9% of all errors. However, none were detected spontaneously, so they resulted in 29% of all final errors.

Allwood [1984] showed that different types of errors in this taxonomy had radically different occurrence rates and detection rates. That certainly is attractive in error taxonomies. In addition, of course, spreadsheets generally perform mathematical computations. It may be possible to extend the Allwood framework to spreadsheet error taxonomies. Panko and Halverson [1996] did exactly that.

2.5 Flower and Hayes

Another limited but intriguing error taxonomy comes from Flower and Hayes [Flower and Hayes, 1980; Hayes and Flower, 1980], who studied the process of writing. In another protocol analysis, Gould [1980] noted that writers spend 40% to 70% of their time thinking instead of actually writing. He found that they were reviewing what they had just written (similar to Allwood's [1984] standard check in mathematics) and planning where they would go next. Gentner [1988] also noted that people spend time pausing when they are typing.

Flower and Hayes looked in depth at the non-writing time in the writing process. They found that their subjects had to work at several levels of abstraction simultaneously. They had to select specific words while generating sentences, and sentence production had to fit into the author's plan for the paragraph, for larger units of the document, and for the document as a whole. Planning had to be done at all levels of abstraction, and it had to be done simultaneously. Each level of abstraction created constraints that had to be obeyed when considering other levels.

Figure 4 shows that the Flower and Hayes taxonomy of concerns can be viewed as an inverted context pyramid, placing the weight of all context levels on the writing of a word.
This can create enormous overload on the writer's memory and planning resources.

Figure 4: The Flower and Hayes Context Pyramid

In spreadsheet development, the same mental load is generated. Whenever a developer types a formula, he or she has to be cognizant of the algorithm for the formula, the algorithm for a larger section of the spreadsheet, and the design of the spreadsheet as a whole.

2.6 Jambon

The final human error taxonomy we will mention was created by Jambon [1998]. In contrast to the other taxonomies we have seen, the Jambon taxonomy focuses on testing and remediation after development rather than during development. Jambon noted that testing and remediation is a fairly complex process involving two stages:

Error diagnosis consists of error detection, followed by error explanation. At the end of this stage, the tester knows that a problem exists and what it is.

Error recovery consists of the actions needed to fix an error. Jambon [1998] divided error recovery into planning and execution.

In addition to noting the complexity of testing and remediation, Jambon [1998] noted that there are two different approaches during error explanation. The first is forward error correction, which consists of doing things to get the correct results. The other is backward error detection, which consists of working from the error back to its cause.

2.7 Perspective on Human Error Taxonomies

The research that has been done to date on human error taxonomies suggests that human error is a complex process. The Norman–Reason taxonomy of mistakes, lapses, and slips appears to be very widely accepted and is backed by both theory and experimental data.
However, each error taxonomy that we have seen (and many more) provides important insights into human error issues.

3. Spreadsheet Error Taxonomies

So far, we have been looking at general human error taxonomies. Building on this, we will now look specifically at spreadsheet error taxonomies.

3.1 Galletta

When Galletta et al. [1993] conducted an experiment in which MBA students and accountants working on their accreditation examined spreadsheets looking for errors, they divided errors into two types. First, there were domain errors, which occurred when a formula required knowledge of accounting. Second, there were device errors, which occurred when the error involved using the computer and the spreadsheet program: typing errors and pointing errors.

3.2 Panko and Halverson

For their research on errors in spreadsheet development and inspection, Panko and Halverson [1996] created a taxonomy of spreadsheet research risks as a three-dimensional cube. The three sides of this cube were research issue, life cycle stage, and methodology (experiment, survey, etc.) for addressing the research issues.

Figure 5: Panko and Halverson Spreadsheet Risks Research Cube

Research Issues

Research issues included structural concerns (poor structure), actual errors, user work practices, assumptions, spreadsheet model characteristics (size, percentage of cells that are formulas or data, complexity of formulas, one-time versus many-time use, number of people who use the spreadsheet, purpose, and so forth), and control policies.

Measuring Error Rates

Under "actual errors," the taxonomy noted several ways to quantify errors and noted that each has advantages and disadvantages.
The metrics listed were:

- Percentage of models containing errors
- Number of errors per model
- Distribution of errors by magnitude
- Cell error rate

Figure 6: Panko and Halverson Metrics for Measuring Errors

For error magnitude, Panko and Halverson noted that "Some errors are important, others unimportant. One measure is the size of the error as a percentage of the correct bottom-line value. Another is whether a decision would have been different had the error not been made. We suspect that quite a few errors are either too small to be important or still give answers that lead to the correct decisions." Many field audits have found that significant errors (such as errors that are material in financial statements or that can affect a decision) are very widespread but that "show stopper" errors only occur in about 5% of all spreadsheets.

In terms of the cell error rate, which is the percentage of cells that contain an error, Panko and Halverson were taking a cue from software development research, which has long measured faults per thousand lines of noncomment source code (faults/KLOC). Within limits, the rate of faults/KLOC is reasonably the same across programs. This allows software developers to get a rough count of the number of errors they can expect to find when inspecting a module of code with known length.

In their taxonomy, Panko and Halverson did not discuss how to count errors. In the research in which this taxonomy was created, Panko and Halverson [1997, 2001] argued that errors should be counted in the cells in which they occurred.
Even if an error is repeated in copied cells or makes dependent cells incorrect, it should only be counted once, in the cell in which it was made.

Basic Error Types

Figure 7 shows the Panko and Halverson [1996] taxonomy of error types. The taxonomy first divides errors into quantitative and qualitative errors. Their demarcation of the two types of errors was very simple. If something made a final (bottom-line) value incorrect, then it was a quantitative error. If it did not, it was a qualitative error.

Figure 7: Panko and Halverson Taxonomy of Error Types

The most common qualitative error is putting a number into a formula instead of a cell reference. This does not cause an error at the time, but it makes errors more likely later, say when assumptions have to be changed in what-if analysis. In fact, Teo and Tan [1997] later found that students who did jamming (hardcoding numbers in formulas) did make more errors in subsequent what-if analyses. Reason [1990] calls such errors latent errors because they cause no errors at the time they are made but increase the likelihood of an error occurring later. Pryor [2003] gives an excellent list of qualitative errors.

Following Allwood [1984] broadly, Panko and Halverson divided quantitative errors into three basic types: mechanical, logic, and omission errors. They gave the following definitions for these error types:

Mechanical errors are typing errors, pointing errors, and other simple slips. Mechanical errors can be frequent, but they have a high chance of being caught by the person making the error.

Logic errors are incorrect formulas due to choosing the wrong algorithm or creating the wrong formula to implement the algorithm.

Omissions are things left out of the model that should be there. They often result from a misinterpretation of the situation. Human factors research has shown that omission errors are especially dangerous because they have low detection rates.
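The count-once rule and the cell error rate described above can be made concrete with a short sketch. This is our illustration, not code from the original studies; the cell names, the copied_from map, and the 50-cell model size are invented for the example:

```python
def count_root_errors(error_cells, copied_from):
    """Count each mistake once, in the cell where it was made.

    copied_from maps a cell to the cell its (erroneous) formula was
    copied from, so copies and dependents of one mistake fold into a
    single counted error, following the Panko-Halverson counting rule.
    """
    roots = set()
    for cell in error_cells:
        while cell in copied_from:   # walk back to the original cell
            cell = copied_from[cell]
        roots.add(cell)
    return len(roots)


def cell_error_rate(error_cells, copied_from, n_cells):
    """Cell error rate: distinct root errors per cell, the spreadsheet
    analogue of faults/KLOC in software inspection."""
    return count_root_errors(error_cells, copied_from) / n_cells


# Hypothetical example: one mistake in B2, copied into C2 and then D2,
# in a model with 50 occupied cells.
rate = cell_error_rate(["B2", "C2", "D2"], {"C2": "B2", "D2": "C2"}, 50)
```

Multiplying such a rate by the number of cells in a module gives the kind of rough expected-error count that software developers get from faults/KLOC.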
Logic errors occurred frequently, and Panko and Halverson used two different ways to distinguish among them. First, Lorge and Solomon [1955] had talked about errors that are obvious when pointed out to the person who made the error. Lorge and Solomon called these Eureka errors. In their first study, Panko and Halverson observed students as they worked. They found that for certain errors, even if one of the team members warned about the error, he or she would often be ignored. The researchers called these Cassandra errors, after the Homeric character who was cursed to warn of disasters but never be believed. As Lorge and Solomon noted, groups are very good at reducing Eureka errors. Panko and Halverson showed that groups were very poor at reducing Cassandra errors.

Panko and Halverson also distinguished between pure logic errors and domain logic errors. Domain logic errors stemmed from the builder's misunderstanding of the knowledge domain for the spreadsheet, such as accounting. In contrast, pure logic errors resulted from the incorrect use of mathematics or logic in general. Panko created the Wall task, which was simple and free of domain knowledge requirements, in order to avoid the complications of requirements for domain knowledge. Panko and Sprague [1998] conducted an experiment using the Wall task. Subjects still made logic errors, indicating that errors in spreadsheets were not simply due to poor domain knowledge.

Using the Taxonomy

Panko and Halverson's first study using the taxonomy was a development experiment in which subjects developed a spreadsheet working alone, in groups of two, or in groups of four [Panko and Halverson, 1997]. The authors conducted an inter-rater reliability test on the taxonomy's definition of quantitative errors (errors that changed a final value), the tripartite distinction between mechanical, logical, and omission errors, and the distinction between Eureka and Cassandra errors. The two researchers identified the same 209 quantitative errors, for a 100% reliability rate. Within these quantitative errors, the researchers initially disagreed on the classification of a single error that occurred in three spreadsheets. This represented 99.6% reliability. The point of disagreement was a mistake made by three subjects who all added expenses to revenues to get income, instead of subtracting expenses. One researcher classified this as a logic error, the other as a mechanical error. After a discussion, the authors agreed to call it a mechanical error.

Teo and Tan [1997] reported no problems when they used the taxonomy in a replication of the Panko and Sprague [1998] Wall task.

Panko [1999] later conducted an inspection study, using a modification of the Galletta et al. [1997] inspection task. This time, Panko tested the distinction between omission errors and other types of errors (mechanical and logical). The data showed that omission errors were indeed detected much less frequently than other types of errors.
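Agreement figures like those above are straightforward to compute. The sketch below shows simple percent agreement and, for comparison, a chance-corrected alternative, Cohen's kappa; the function names and category labels are ours, and the raters' labels are invented:

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of items that the two raters classify identically."""
    assert len(rater_a) == len(rater_b)
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for chance: kappa = (p_o - p_e) / (1 - p_e),
    where p_e is the agreement expected from each rater's label
    frequencies alone."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

Percent agreement is what the reliability figures in the text report; kappa discounts the agreement two raters would reach by chance, which matters when one error category dominates.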
Hicks [1995] used the Panko [1999] inspection method and the Panko and Halverson [1996] error taxonomy to inspect a large capital budgeting spreadsheet about to become operational in a multi-billion dollar company. They used three inspectors working apart and then together to compare their results. (Unfortunately, they did not report the amount of time spent or the number of cells in the spreadsheet, contrary to the Panko and Halverson [1996, 1997] methodology.) They reported that the taxonomy worked well for them, although they did not report inter-rater reliability.

Errors by Life Cycle

The third dimension in the spreadsheet risks cube was life cycle stage. Based on the prior spreadsheet literature, they divided the spreadsheet life cycle (not just the spreadsheet development life cycle) into five stages:

- Requirements and Design
- Cell Entry
- The Draft Stage (before testing)
- Debugging
- Operation

Panko and Halverson [1996] suggested that the error rate varies strongly across this life cycle. For early stages, they note that experiments indicate that when people enter formulas in cells, the error rate is about 10%, but that most of these errors are caught [Olson and Olson, 1990]. Consequently, when people finish development (this is called the draft stage), the error rate is half of that or less. Then, when subjects inspect spreadsheets to look for errors, they find a majority of them, further reducing the error rate.

Problems with the Panko and Halverson Taxonomy

Although the Panko and Halverson taxonomy has been fairly well validated by experiments, some limitations have become obvious over time.

First, although the taxonomy has both an error type dimension and a spreadsheet life cycle perspective, this was not fleshed out until later. For instance, Panko and Halverson focused on development and inspection. They did not look at the types of errors that occur during initial analysis and requirements.
More concretely, probably because they did not study ongoing use, they were not aware until later of overwriting errors, in which a user overwrites a formula with a numerical value.

Second, they focused on omission errors because these were the subject of earlier human error research. However, given the work of Flower and Hayes, omission errors are only one type of error that is likely to occur as people with limited memory resources must cope with the simultaneous needs of entering a formula, keeping the full algorithm in mind, and keeping the flow of the entire spreadsheet in mind.

Third, the taxonomy did not recognize the important distinction between sensory-motor slips and memory lapses. Particularly in the Wall task [Panko and Sprague, 1998], dividing mechanical errors into slips and lapses would change the picture considerably.

3.3 Rajalingham

Shortly after the Panko and Halverson [1996] taxonomy, Rajalingham et al. [2000] created a more complex taxonomy of spreadsheet errors. This taxonomy is shown in Figure 8.

Figure 8: The Rajalingham, Chadwick, Knight, and Edwards 2000 Taxonomy

This taxonomy also begins with the distinction between qualitative and quantitative errors. It then draws a distinction between accidental errors and reasoning errors. This is similar to the Panko and Halverson [1996] mechanical versus logical distinction, but its terminology (accidental versus reasoning) may be better connotatively.

An important addition in this taxonomy is the distinction between developer and end-user errors. Panko and Halverson [1996] only focused on developer errors.
They did not consider the types of errors that end users would make after development. Most obviously, they failed to consider data entry errors, which can be very important. These errors can include inputting incorrect data or even overwriting a formula with a number. Rajalingham et al. [2000] also considered errors that users make in interpreting the results of spreadsheets. If a spreadsheet gives the correct result but this correct result is misinterpreted, say because of poor output labeling, this is just as bad as a development error.

Later, Rajalingham [2005] revisited the taxonomy. He actually came up with two follow-up taxonomies. Figure 9 shows his "bushy" taxonomy. He gave it this name because it often branches into three or more alternatives. Rajalingham argued that this approach made it difficult to decide where to place an error, and it also tended to require an error to be placed in two or more end nodes.

Figure 9: Rajalingham's 2005 "Bushy" Taxonomy

Rajalingham, in the same paper, also presented his "binary" taxonomy. He argued that having to make a binary choice at each step made classification easier and more predictable.

Figure 10: Rajalingham's 2005 "Binary" Taxonomy

3.4 Howe and Simkin

In a code inspection experiment, Howe and Simkin [2006] created a new taxonomy of errors. Figure 11 shows this taxonomy.

Figure 11: Howe and Simkin Taxonomy

Obviously, this taxonomy is very different from earlier taxonomies. Its clerical and non-material errors are such things as spelling errors in labels, incorrect dates, and so forth. Most previous studies ignored such errors. There is an especially important addition in the rules violations category. These are basically parts of the model that violate requirements. Omission errors do this, but so do many other types of errors. Giving evidence for the usefulness of the taxonomy, subjects had different detection rates for different types of errors.
In another code inspection study, Bishop and McDaid [2007] used the Howe and Simkin [2006] taxonomy. They also found differences in error detection rates, and they found that experienced spreadsheet developers from industry had a higher detection rate than students for rules violations and formula errors. However, the Bishop and McDaid subjects had a far lower detection rate for clerical/non-material errors, perhaps because they had not been explicitly instructed to look for them.

3.5 Powell, Lawson, and Baker

For their series of projects involving the creation, testing, and use of a code inspection (auditing) methodology, Powell, Lawson, and Baker [2007] developed another taxonomy of errors. Figure 12 illustrates this taxonomy.

Figure 12: Powell, Lawson, and Baker Taxonomy

Note that this taxonomy's use of omission errors is very different from the use of omission errors in Panko and Halverson [1996]. In Panko and Halverson, something in the requirements is left out of the spreadsheet. This is not likely to be detectable by looking at the spreadsheet. In contrast, in the Powell, Lawson, and Baker [2007] taxonomy, an omission error means that a formula points to a blank cell. Hard coding is described as a qualitative error in the Panko and Halverson [1996] and Rajalingham [2005] bushy taxonomies. In the Powell, Lawson, and Baker [2007] taxonomy, hard coding is usually not a quantitative error but sometimes is, "if it is sufficiently dangerous" (Page 60). The spreadsheets studied to develop this taxonomy were operational spreadsheets in use for some time. However, there is no category for overwriting a formula with a constant.
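The Powell, Lawson, and Baker notion of an omission error (a formula that points to a blank cell) is, unlike the Panko and Halverson notion, mechanically checkable. The following is a minimal sketch of such a check, assuming a toy model of a sheet as a dictionary from cell address to content; it is not the authors' auditing tool, and the regular expression only handles simple A1-style references:

```python
import re

def omission_errors(sheet):
    """Return formula cells that reference a blank (absent or empty) cell.

    `sheet` maps cell addresses like "B2" to contents; formulas are
    strings beginning with "=". This is an illustrative toy model only.
    """
    flagged = []
    for addr, content in sheet.items():
        if isinstance(content, str) and content.startswith("="):
            refs = re.findall(r"[A-Z]+[0-9]+", content)
            if any(sheet.get(r) in (None, "") for r in refs):
                flagged.append(addr)
    return flagged

sheet = {"A1": 100, "A2": "", "B1": "=A1*2", "B2": "=A1+A2"}
print(omission_errors(sheet))  # prints ['B2']: B2 refers to the blank cell A2
```

A check like this also hints at why the two omission concepts differ: a blank-cell reference leaves an artifact in the spreadsheet itself, whereas a requirement left out entirely leaves nothing to scan for.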
Nor is there any indication that this has happened.

3.6 Madahar, Cleary, and Ball

At the EuSpRIG conference during which Powell, Lawson, and Baker [2007] presented their taxonomy, Madahar, Cleary, and Ball [2007] also presented their taxonomy. In contrast to other taxonomies, this was a taxonomy of spreadsheets rather than of errors in spreadsheets. Writers have long argued that different types of spreadsheets need greater or lesser degrees of control (e.g., Schultheis and Sumner, 1994). Madahar, Cleary, and Ball [2007] considered three models for describing the different types of spreadsheets they found in one department of a university. They described Model 3 as their best model. This model had three dimensions.

Figure 13: Madahar, Cleary, and Ball Taxonomy of Spreadsheets

Dependency means how fundamentally the organization depends on the spreadsheet. Values can be operational, tactical, or strategic. Magnitude is the severity of consequences for potential errors. Time/Urgency refers to deadlines that have to be met using the spreadsheet.

4. Revising the Panko and Halverson Taxonomy

Although the Panko and Halverson [1996] taxonomy has worked relatively well, it is more than a decade old and is showing its age. In particular, it has two obvious problems.

First, it was developed for a specific purpose: to classify quantitative errors in spreadsheet development and inspection in a way that reflects human differences in commission rate and detection rate. However, because it was limited in its approach, it did not consider errors that occur in other stages of the spreadsheet systems life cycle. In addition, because it was developed in an effort to prove that quantitative errors are in fact common and difficult to detect, it paid little attention to qualitative errors, which are arguably more important.

Second, even the taxonomy's view of quantitative errors was too limited.
One specific problem is that its definition of mechanical errors included slips but not lapses. This is a serious problem. In addition, following Flower and Hayes [1980], the taxonomy was too focused at the cell level. However, errors can also occur if the developer loses focus on the broader flow of the spreadsheet.

4.1 Violations and Errors

Figure 14 shows our revised taxonomy of violations and errors. It has many components.

Figure 14: Violations, Errors, and Context Levels

Violations versus Errors

In software testing, Beizer's [1990] advice to hold developers blameless is widely followed. This reflects the realization that nobody is immune to error and that using testing to assign blame is counterproductive. However, in the study of automobile accidents, researchers have long used a distinction between errors and violations. Violations are acts that break the law, such as speeding or driving while under the influence. While errors are inevitable and are not considered blameful, violations are considered blameful, even if they do not lead to accidents.

This distinction between errors and violations may be useful in spreadsheet development. In spreadsheet development, violations would normally consist of not complying with the organization's policies for spreadsheet development. Of course, this assumes that the organization is mature enough to have policies. In cases where the company is subject to external compliance regulations, a violation would exist if a spreadsheet breached an external compliance regulation.
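One way to make the revised scheme concrete is to record each observed finding along the taxonomy's dimensions: violation versus error, qualitative versus quantitative, mistake/slip/lapse, context level, and life-cycle stage. The field names and value lists in this sketch are assumptions made for illustration, not definitions taken from the taxonomy itself:

```python
from dataclasses import dataclass

# Context levels from the revised taxonomy, innermost to outermost.
LEVELS = ("cell", "algorithm", "module", "spreadsheet", "business system")

@dataclass
class Finding:
    """One classified observation. Field names are this sketch's assumptions."""
    description: str
    violation: bool     # breaks a policy or regulation (blameful), vs. an error
    quantitative: bool  # True if a dependent cell gets a wrong value
    kind: str           # "mistake", "slip", or "lapse" (for errors)
    level: str          # one of LEVELS
    stage: str          # life-cycle stage, e.g. "operation"

# Example: an overwriting error observed during operational use.
overwrite = Finding(
    description="user overwrote a formula with a constant",
    violation=False, quantitative=True,
    kind="slip", level="cell", stage="operation",
)
print(overwrite.kind, overwrite.stage)
```

Recording findings this way makes the later point about the life-cycle dimension tangible: two findings with the same kind and level can still call for different countermeasures if their stages differ.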
Qualitative versus Quantitative Errors

This taxonomy continues the distinction between qualitative and quantitative errors. Quantitative errors, quite simply, are incorrect formulas and data cells that cause subsequent dependent cells to have the wrong values. If an error does not cause a subsequent value to be wrong, then the error is not a quantitative error. It is a qualitative error. Nor is the issue the seriousness of the error. Qualitative errors can be extremely serious.

Mistakes, Slips, and Lapses

Given the widespread use of the Reason and Norman distinction between mistakes, slips, and lapses, the taxonomy should be revised to reflect this set of categories. For mistakes, it is important to realize that mistakes in formulas can come from many sources, including domain misunderstandings, logic failures, mathematical errors, and errors in using the software (usually by misusing built-in functions).

The impact of dividing the Panko and Halverson mechanical error category into slips and lapses is shown in Figure 15. In a corpus of spreadsheets described by Panko [2000], 82 subjects each developed a spreadsheet to provide a decision maker with a pro-forma income statement. In the corpus, 28% of the errors were logic errors, 21% were omission errors, and 41% were mechanical errors.

Figure 15: Mechanical Errors, Slips, and Lapses

The high percentage of mechanical errors was good news for error detection, because some mechanical errors, such as pointing errors, leave discoverable artifacts on the spreadsheet. However, the figure shows that when a classification based on slips and lapses is used, slips account for only 19% of the total errors, while 22% of the errors were lapses. Many of the lapses, by the way, occurred in reading requirements for unit costs for two years for labor and materials. This series of numbers seemed to create frequent overloads on the memory capacities of the developers.
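The arithmetic behind these percentages can be checked directly. Figure 15 reports 28 mechanical errors dividing into 13 slips and 15 lapses; treating 28 as 41% of all errors implies roughly 68 errors in total, which is an inference from the stated percentages, not a count given in the text:

```python
# Counts from Figure 15; the total of 68 errors is inferred from
# the statement that the 28 mechanical errors were 41% of all errors.
mechanical, slips, lapses = 28, 13, 15
total_errors = 68

assert slips + lapses == mechanical  # the split exhausts the category

print(f"mechanical: {mechanical / total_errors:.0%}")  # ~41%
print(f"slips:      {slips / total_errors:.0%}")       # ~19%
print(f"lapses:     {lapses / total_errors:.0%}")      # ~22%
```

The check confirms internal consistency: the slip and lapse shares reported in the text (19% and 22%) are exactly the mechanical share (41%) split along the Figure 15 counts.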
4.2 Level of Analysis

Flower and Hayes [1980] noted that developers constantly must take into account multiple levels of context. Figure 14 notes that the same is true in spreadsheet development. When someone works at the cell level, they also need to keep in mind what is happening at the algorithm level (most algorithms require a group of formula and data cells), at the module level, and at the level of the spreadsheet as a whole. The developer also needs to take into account the entire business system in which the spreadsheet will be used. This includes management, organization, procedures, hardware, sources of data, and other matters.

4.3 Life Cycle Stages and Roles

Spreadsheets generally go through a system life cycle that begins with the analysis of the current situation and needs and ends when the spreadsheet is terminated or replaced. Figure 16 shows the main stages in the system life cycle.

Figure 16: The Spreadsheet Life Cycle and Types of Errors

The first part of this life cycle is the system development life cycle, which includes initial analysis, the specification of requirements, the development of modules, the development of the full spreadsheet (by combining modules), and implementation. However, most of a spreadsheet's life is spent in operational use, and maintenance also has to be done occasionally. Finally, the spreadsheet is replaced or simply terminated.

The original Panko and Halverson [1996] taxonomy noted that the number of errors typically varies over a spreadsheet's life cycle. During development, many errors will exist.
However, testing, inspection, and use experience tend to reduce the number of errors during development. By the time a spreadsheet is released for use, good practice should substantially decrease the number of residual errors. During operational use, however, errors may increase if people input incorrect data or overwrite formulas with numbers.

More importantly, as Figure 16 attempts to illustrate, different types of errors will occur at different stages of the systems life cycle. Panko and Halverson [1996] only focused on development and testing. Consequently, they only considered the errors that occur during that stage of the spreadsheet life cycle. Although the main violation and error categories are likely to occur over the entire life cycle, their specific manifestations will be very different at each stage. One of the main jobs of spreadsheet researchers must be to enumerate the kinds of errors that can and do exist at each stage.

Arguably the most important stage to understand is operational use. Many specific errors, such as entering the wrong number for a variable or incorrectly importing data, occur primarily during operational use. Violations also must be anticipated, such as violations of privacy or the use of spreadsheets to commit fraud.

Figure 17 shows another aspect of life cycle thinking. This is the fact that there are several possible organizational roles involved. We need to think about violations and errors for each of these roles during each stage of the life cycle. Although these roles may be combined in many cases, it may still make sense to think in terms of logical roles to envision errors.

Figure 17: Life Cycle Stages and Roles

Perspective

This paper has considerably revised and expanded the Panko and Halverson [1996] taxonomy of spreadsheet errors.
The purpose of that taxonomy was to support quantitative research studies to demonstrate that quantitative spreadsheet errors are frequent, that quantitative spreadsheet errors are difficult to detect, and that many spreadsheet errors are significant. To some extent, these ideas have been broadly accepted. In any case, people who still reject the experimental evidence regarding them are not likely to have their opinions changed by further quantitative research.

It is now time to shift our focus toward identifying the large number of different types of errors that are possible in different life cycle stages and by people with different roles to play. For this, we do not need tight taxonomies as much as we did previously. Although experiments and other quantitative research must have well-formed and tight logical taxonomies, as we move from proof of danger to providing guidance to corporations, we will need more expansive taxonomies that suggest issues rather than tight taxonomies to confirm issues.

References

Allison, Graham T. and Zelikow, Philip. (1999). Essence of Decision: Explaining the Cuban Missile Crisis, 2nd ed., Longman: Englewood Cliffs, NJ.

Allwood, C. M. (1984). "Error Detection Processes in Statistical Problem Solving," Cognitive Science, 8(4), 413-437.

Ayalew, Yirsaw; Clermont, Markus; & Mittermeir, Roland T. (2000, July 17-18). "Detecting Errors in Spreadsheets," Symposium Proceedings EuSpRIG 2000, University of Greenwich, London, UK, European Spreadsheet Risks Interest Group, 51-62.

Beizer, B. (1990). Software Testing Techniques, 2nd ed. New York: Van Nostrand.
Bishop, Brian and McDaid, Kevin. (2007, July). "An Empirical Study of End-User Behaviour in Spreadsheet Error Detection and Correction," Proceedings of the European Spreadsheet Risks Interest Group, EuSpRIG 2007 Conference, University of Greenwich, London, pp. 165-176.

Croll, G. (2005). "The Importance and Criticality of Spreadsheets in the City of London," Proceedings of the EuSpRIG Conference, London.

Flower, L. A., & Hayes, J. R. (1980). "The Dynamics of Composing: Making Plans and Juggling Constraints," Cognitive Processes in Writing. Eds. L. W. Gregg & E. R. Steinberg. Hillsdale, NJ: Lawrence Erlbaum Associates. 31-50.

Galletta, D. F.; Abraham, D.; El Louadi, M.; Lekse, W.; Pollailis, Y. A.; & Sampler, J. L. (1993, April-June). "An Empirical Study of Spreadsheet Error-Finding Performance," Journal of Accounting, Management, and Information Technology, 3(2), 79-95.

Gentner, D. R. (1988). "Expertise in Typewriting," in Chi, M. T. H., R. Glaser, and M. J. Farr (eds.), The Nature of Expertise, Hillsdale, NJ: Lawrence Erlbaum Associates, pp. 1-22.

Gould, John D. (1980). "Experiments on Composing Letters: Some Facts, Some Myths, and Some Observations," Chapter 5 in Lee W. Gregg and Erwin Steinberg (eds.), Cognitive Processes in Writing, Lawrence Erlbaum: Hillsdale, NJ, pp. 97-127.

Grossman, Thomas A. and Özlük, Özgür. (2003, July 24-25). "Research Strategy and Scoping Survey on Research Practices," Proceedings of EuSpRIG 2003, European Spreadsheet Risks Interest Group, Trinity College, Dublin, Ireland, pp. 23-32.

Hayes, J. R. & Flower, L. (1980). "Identifying the Organization of Writing Processes," Cognitive Processes in Writing. Eds. L. W. Gregg & E. R. Steinberg. Hillsdale, NJ: Erlbaum. 31-50.

Howe, Harry & Simkin, Mark F. (2006, January). "Factors Affecting the Ability to Detect Spreadsheet Errors," Decision Sciences Journal of Innovative Education, 4(1), 101-122.

Jambon, Francis. (1998, May). "Taxonomy for Human Error and System Fault Recovery from the Engineering Perspective," Proceedings of the International Conference on Human-Computer Interaction in Aeronautics, Montreal, Canada, 55-60.

Madahar, Mukul; Cleary, Pat; and Ball, David. (2007, July). "Categorisation of Spreadsheet Use within Organisations, Incorporating Risk: A Progress Report," Proceedings of the European Spreadsheet Risks Interest Group, EuSpRIG 2007 Conference, University of Greenwich, London, pp. 37-45.

Norman, Donald A. (1981). "Categorization of Action Slips," Psychological Review, 88, 1-15.

Panko, R. R. (2008a). Human Error Website (http://panko.shidler.hawaii.edu/humanerr.htm). Honolulu, HI: University of Hawai`i.

Panko, R. R. (2008b). Spreadsheet Research (SSR) Website (http://panko.shidler.hawaii.edu/panko/ssr/). Honolulu, HI: University of Hawai`i.

Panko, Raymond R. (1999, Fall). "Applying Code Inspection to Spreadsheet Testing," Journal of Management Information Systems, 16(2), 159-176.

Panko, Raymond R. & Sprague, Ralph H., Jr. (1998, April). "Hitting the Wall: Errors in Developing and Code Inspecting a 'Simple' Spreadsheet Model," Decision Support Systems, 22(4), 337-353.

Panko, R. R. (1988). End User Computing: Management, Applications, and Technology. New York: Wiley.

Panko, Raymond R. and Halverson, Richard P., Jr. (2001, July). "An Experiment in Collaborative Spreadsheet Development," Journal of the Association for Information Systems, 2(4).

Panko, Raymond R. & Halverson, Richard, Jr. (1997, Spring). "An Experiment in Team Development to Reduce Spreadsheet Errors," Journal of Management Information Systems, 15(1), 21-32.

Panko, Raymond R. & Halverson, Richard, Jr. (1997, Spring). "Are Two Heads Better than One? (At Reducing Errors in Spreadsheet Modeling)," Office Systems Research Journal, 15(1), 21-32.

Panko, Raymond R. and Halverson, R. H., Jr. (1996, January). "Spreadsheets on Trial: A Framework for Research on Spreadsheet Risks," Proceedings of the Twenty-Ninth Hawaii International Conference on System Sciences, Volume II, Kihei, Maui, pp. 326-335.

Powell, Stephen G.; Lawson, Barry; and Baker, Kenneth R. (2007, July). "Impact of Errors on Operational Spreadsheets," Proceedings of the European Spreadsheet Risks Interest Group, EuSpRIG 2007 Conference, University of Greenwich, London, pp. 57-68.

Pryor, Louise. (2003, July 24-25). "Correctness is not Enough," Proceedings of EuSpRIG 2003, European Spreadsheet Risks Interest Group, Trinity College, Dublin, Ireland, pp. 117-122.

Rajalingham, Kamalasen; Chadwick, David; Knight, Brian; and Edwards, Dilwyn. (2000a, January). "Quality Control in Spreadsheets: A Software Engineering-Based Approach to Spreadsheet Development," Proceedings of the Thirty-Third Hawaii International Conference on System Sciences, Maui, Hawaii.

Rajalingham, Kamalasen; Chadwick, David R.; & Knight, Brian. (2000b, July 17-18). "Classification of Spreadsheet Errors," Symposium Proceedings EuSpRIG 2000, University of Greenwich, London, UK, European Spreadsheet Risks Interest Group, pp. 23-34.

Rajalingham, Kamalasen. (2005, July). "A Revised Classification of Spreadsheet Errors," Proceedings of the 2005 European Spreadsheet Risks Interest Group, EuSpRIG 2005, Greenwich, London, 185-199.

Rasmussen, J. & Jensen, A. (1974). "Mental Procedures in Real-Life Tasks: A Case Study of Electronic Troubleshooting," Ergonomics, 17, 293-307.

Reason, James T. (1990). Human Error, Cambridge University Press: Cambridge, England.

Reason, James T. & Mycielska, K. (1982). Absent-Minded? The Psychology of Mental Lapses and Everyday Errors, Prentice Hall: Englewood Cliffs, NJ.

Senders, John W. and Moray, Neville P. (1991). Human Error: Cause, Prediction, and Reduction, Lawrence Erlbaum: Hillsdale, NJ.

Teo, T. S. H. & Tan, M. (1997, January). "Quantitative and Qualitative Errors in Spreadsheet Development," Proceedings of the Thirtieth Hawaii International Conference on System Sciences, Kihei, Hawaii.

Figures

Figure 1: Mistakes versus Slips and Lapses

Stage of Error      Type of Error
Error in Planning   Mistake   Logic or mathematical error, etc.
Error in Execution  Slip      Sensory-motor error
                    Lapse     Error caused by memory overload

Figure 2: Rasmussen's Taxonomy of Cognition and Errors

Type of Work     Characteristics
Knowledge-Based  Applies knowledge of the system when no rules exist.
Rule-Based       Applies rules. Example: Things work better when they are plugged in and turned on.
Skill-Based      Applies well-learned sensory-motor skills. Example: Measuring a voltage on a voltmeter.

Figure 3: Allwood's Study of Mathematical Errors

                   Execution  Solution Method  Higher-level math  Skip  Other  Total
Total Errors             202               67                 16    29     13    327
% of Errors Made         62%              20%                 5%    9%     4%   100%
Found                    168               32                  4     0      6    210
% Found                  83%              48%                25%    0%    46%    64%
Not Found                 34               35                 12    29      7    117
% of Final Errors        29%              30%                10%   25%     6%   100%

Figure 4: The Flower and Hayes Context Pyramid

Figure 5: Panko and Halverson Spreadsheet Risks Research Cube

Figure 6: Panko and Halverson Metrics for Measuring Errors

Percentage of models containing errors
Number of errors per model
Distribution of errors by magnitude
Cell error rate
Note: Errors are recorded in the cell in which they originally occur. Consequent inaccuracies in copied cells or descendent cells due to this error are not counted as errors.

Figure 7: Panko and Halverson Taxonomy of Error Types

Figure 8: The Rajalingham, Chadwick, Knight, and Edwards 2000 Taxonomy

System-Generated
User-Generated
  Quantitative
    Accidental
      Developer: Omission; Alteration (makes an incorrect change); Duplication
      End-User
        Data inputter (input): Omission; Alteration; Duplication
        Interpreter (output): Omission; Alteration; Duplication
    Reasoning
      Domain Knowledge: Real-world knowledge; Mathematical representation
      Implementation: Syntax; Logic
  Qualitative
    Semantic: Structural; Temporal (based on information that has not been updated)
    Maintainability

Figure 9: Rajalingham's 2005 "Bushy" Taxonomy

Software Errors
User Errors
  Qualitative Errors
    Formatting errors
    Update errors
    Hard-coding errors
    Semantic errors
  Quantitative Errors
    Mechanical Errors
      Overwriting Errors: Unreferenced data; Referenced data
      Data Input Errors: Unreferenced data; Referenced data
    Logic Errors
      Errors in enabling skills
      Errors in planning skills
    Omission Errors

Figure 10: Rajalingham's 2005 "Binary" Taxonomy

Quantitative
  Accidental
    Structural: Insertion; Update; Modification; Deletion
    Data input: Insertion; Update; Deletion; Modification
  Reasoning
    Domain knowledge: Real-world knowledge; Mathematical representation
    Implementation: Logic; Syntax
Qualitative
  Temporal
  Structural: Visible; Hidden

Figure 11: Howe and Simkin Taxonomy

Type of Error                     Seeded Errors  Percentage Found  Description
Data Entry Errors                             5               72%  Out-of-range values, negative values, one value entered as a label
Clerical and Non-Material Errors             10               66%  Spelling errors, incorrect dates, etc.
Rules Violations                              3               60%  Cell entries that violate a stated company policy for an ineligible employee
Formula Errors                               25               54%  Inaccurate range references, embedded constants, illogical formulas
Total Errors                                 43               67%

Figure 12: Powell, Lawson, and Baker Taxonomy

Error Type   Description
Logic        A formula is used incorrectly, leading to an incorrect result.
Reference    A formula contains one or more incorrect references to other cells.
Hard-Coding  One or more numbers appear in formulas, and the practice is sufficiently dangerous.
Copy/Paste   A formula is wrong due to an incorrect cut and paste.
Data Input   An incorrect data input is used.
Omission     A formula is wrong because one of its input cells is blank.

Figure 13: Madahar, Cleary, and Ball Taxonomy of Spreadsheets

Dimension     Description
Dependency    How fundamentally the organization depends on the spreadsheet. Values can be operational, tactical, or strategic.
Magnitude     The severity of consequences for potential errors.
Time/Urgency  Deadlines that have to be met using the spreadsheet.

Figure 14: Violations, Errors, and Context Levels

Violations
Errors
  Qualitative
  Quantitative
    Mistakes: Domain; Logic; Math; Software
    Slips/Lapses: Slips; Lapses
Levels: Business System; Spreadsheet; Module; Algorithm; Cell

Figure 15: Mechanical Errors, Slips, and Lapses

Type of Error (1996 taxonomy: Mechanical)                 Mechanical  Slip  Lapse
Pointing errors                                                    8     8      0
Year 1 and Year 2 sales salaries translated into two
  salespeople instead of two years                                 1            1
Owner salary = 60,000 instead of 80,000                            1            1
Typing incorrect value for unit materials and labor cost
  (usually due to a transposition)                                12           12
Units sold value for Year 2 used in Year 1                         1            1
Units sold value 32,000 instead of 3,200*                          1     1
Sign incorrect                                                     2     2
Parenthesis error                                                  1     1
Rent = 3,600 instead of 36,000*                                    1     1
Total Mechanical/Slip/Lapse Errors                                28    13     15
Percentage of errors                                             41%   19%    22%
Note: * means the error could be categorized either as a slip or as a lapse.

Figure 16: The Spreadsheet Life Cycle and Types of Errors

Stages: Analysis; Requirements Development; Module Development; Spreadsheet Development; Implementation; Operation; Maintenance; Termination/Replacement
Error columns at each stage: Violations; Qualitative Errors; Mistakes; Slips and Lapses

Figure 17: Life Cycle Stages and Roles

Stages: Analysis; Requirements Development; Module Development; Spreadsheet Development; Implementation; Operation; Maintenance; Termination/Replacement
Roles: Development (Manager; Developer; Tester); Operation (Owner; Customer; Operator)