A New Approach to Spreadsheet Analytics Management in Financial Markets
Spreadsheets in financial markets are frequently used as database, calculator and reporting application combined. This paper describes an alternative approach in which spreadsheet design and database technology have been brought together in order to …
Authors: Brian Sentence
A N ew A pp roach to S preadsheet A n al ytics Management in F inancial Market s Brian Sentence, Xenomorp h Software Ltd bsentance@xe nomorph.co m ABSTRACT Spreadsh eets in fin ancial markets are freque ntly used as da tabase, calcula tor and reporting applica tion combined. This paper describes an alternative approach in which spreadsheet design and database tec hnology have been brou ght togeth er in o rder to a lleviate ma nagem ent and regulatory concerns over the ope rational ris ks of spreadshe et usage. I n particula r, the paper focuses on the ra pid creation a nd centralised dep loyment o f statistical a nalytics within a software system no w in use by major investment ban ks, a nd p resents a nov el techniq ue for th e manipula tion in spreadsh eets of hig h volumes of intra day ma rket data. 1 INTRODUCTION Whilst spreadsheets are use d extensively in many p rofessions, nowhere is this u sage more pervasive or more critical than in the financial mark ets [ Croll, 2005 ]. There are m an y factors that contribute to w hy this is the case. Client, r egulatory and competitive pressures are forcing trading desk s to analyse ever- higher volumes of market data in orde r to demonstrate custom er value, to show adherence to m arket rules and to identify new trading opportunities. The complexity and breadth o f the data being ana lysed is also increasing, due to the innov ative nature of the new fin ancial products that are now b eing created. Another key fac tor driving the usage of spreadsheets i n finan cial markets is the r elative extremes of specialisation within the industry. End- use rs such as derivativ es traders, product controllers, risk managers and quantitative analy sts ordinarily need to have a very firm foundation in the understanding of mathematics, financial theory and market behaviour. This specialisat ion of end-user knowledg e, combined with extremely short commercial delivery t imeframes, pre sents a huge challenge to even the most advanced of system designers. This na turally leads to increased spreadsheet usage as t he spreadsheet becomes the only platform t hat can meet the deadlines imposed by market and competitive pressures. The background described above would only be of passing importance if it were not f or the huge sums of money bein g managed out of spreadsheets on a day-to-day ba sis. One seemingly s mall error in a trader’s spreadsheet could potentially cause (and has caused) very significant losses for a fi nancial institution [ Wilmott, 2005 ]. Even in the absence of any spreadsheet errors, due to a spreadsheet’s lack of transparency then a financia l institution is open to t he operational risk t hat an individual t rader may deliberately mis- quote or mis-represent the instrum ent pricing and risk levels being undertaken [ Mittemeir, 2005 ]. It is therefore no surprise that regulators are now paying very direct attention to the use of spreadsheets by banks [ Buckner, 2004; PW C, 2004 ] This paper d escribes a ne w approach to thi s problem of spreadsheet m anagement i n financial markets. It combines the best of spreadsheet produc tivity with the best of database technology to provide c onsistent, centralised and transparent access to data f or all users. Additiona l ly, an ob ject spreadsheet approach is described which can greatly reduce the number of spreadsheet formulas r equired to manipulate large amounts of array data. The approach taken puts spreadsheet design at the heart of t he data management process for a financial ins titution rather than as an “ad- hoc” or tactical add-on solution. 2 SOLUTION DESIGN GOAL S Whilst many of the spreadsheet issues described above are as much procedural as technical, it is technically possible to address many of the negative sides of their usage whilst also le aving many of their positives aspects in place. This paper describes a very recent spreadsheet- related enhancement to a data managem ent system that it is currently in use at some of the ma jor global investment banks. T raders and risk managers us e the system, known as Ti meScape, to perform statistical analysis on hist orical market data and to perform derivatives valuation in b oth pre- and post-trade decision support. What follows below are some of the key design considerati ons of th is centralised spreadsheet environment known to the author as a “Formula Grid” and illustrated in Figure (1). 2.1 Spreadsheet Interface One major design goal was to ensure that users could still benefit from the productivity of using a spreadsheet interface to define calculations and analytics. T his is particularly vital in financial markets where s pecialist business knowledge com bined w ith co mmercial timeframes often preclude t he transfer of this knowledge to system designers. Hence Formula Grid calculation s can be edited by end-users alongside of their usage of spreadsheets a nd o ther application s. This is shown in Figure (1) as the Fo r mula Grid Editor. A f urther st ep was also taken to make the Formula Grid spreadsheet calculation a n in- line, core part of the process of market data managem ent and a nalysis, rather than an ad - hoc tool into which data is imported, analysed and exported out i nto the business process. This was implemented b y means of mapping a nd hi ding s preadsheet c alculations beh ind analytical functions and dat a fields, more of which will be seen later in this paper. 2.2 Data Centralisation Here the approach taken was to move the data out of t he spreadsheet and to locate the data within a centralised database as shown in Figure (1). There are a few key considerations when doing this. The first is to mak e sur e that the database can support the typical data types used in spreadsheets such as arrays, li sts and matrices. These datatypes are not typically foun d o ut of t he box with an y traditional database management systems. The second is to make it easy for t he user t o get data from existing spreadsheets into the new centralised s preadsheet environment. Once again, dat abases tend to be very t echnical in nature so this naturally alienates the comm uni ty of end-users that we wish to a ssist. Given that these two a spects are in place it i s then possible to reduce the amount of data actually stor ed “within” the spreadsheet environment (e.g. “ce ll = data value” ) and all ow more of this data available outside of t he s preadsheet for other non-spreadsheet users. Once the data is contained within a centralised database it is then easy to make us e of the usual things that databases do well such as user access permissioning, backing up and restoring data, providing tr ansparent programm ing access t o data and so o n. Figure (1) – Server-Side S pre adsheet Calculation with Client-Side Editing 2.3 Calculation Centralisation Whilst end-users like to use spreadsheet interfaces to define data and calculations, this in itself does not preclude t he possibility of the calculation being defined by the user being run in a ce ntralised, server- si de manner. This is approach taken here, where spreadsheet operations defi ned by t he user are stored centrally in the database in Figure (1) and a lso run centrally in the calculation server of Figure (1). This means t hat these spreadsheet calculations are available not just to end users of spreadsheets but to all users t hroughout an organisation, as again illustrated by Figure (1). Calculation centralisation also leaves the architecture o pen to further i mprovem ent as software and hardwa re infrastructure improve, al l without t he end user needing to be aware of any material ch anges other than improved perform ance. 2.4 Data Objects in Cells In order to cope with large am o unts of array data found in financial markets, an additiona l behaviour was introduced. T his a llowed array and other more complex data types to b e contained within a sin gle spreadsheet cell. For exam ple, a time series of bond or equity prices could be contained within a single spreadsheet cell (e.g. cell “A1 = Closing Pri ce Series” or cell “B1 = Hist oric FX Series”). This t hen allows vector arithmetic, such as converting t he historic prices of an equity from one currency into another, t o be defined through simple cell operations s uch as cell C1 = A1 * B1. Such an example operation would take all of the historic equity prices contained in cell A1, all the historic FX ra tes stored in cell B1 and multiple them together throughout all time to produce an array result of the correct currency in cell C1. 3 SOLUTION EXAMPLE What follows is an example of calculating a V olume We ighted Average Pric e (VWAP) measure which is often used by portfolio traders to demonstrate to clients how wel l a client’s order t o s ell or buy st ock has been placed in the market against average price levels observed. The example shown is greatly s implified in order to best illustrate the principles o f how a centr alised sprea dsheet calcu lation is defi ned and execut ed as a Formula Grid within the data manag ement environment utilised. V WAP practitioners should note t hat much more com plex, flexib le and param e terised versions of VWAP calculation can also be implemented in t he Formula Grid, but these are outside of the scope of this introductory p a per. Figure (2) Creating a Formula Grid Data Attribute for Equity Instruments Figure (2) above shows the schema (standard data attributes) f or all “Equity” instruments contained in a database of market and static data known as LSE. Already set up for equity instruments are data attributes such as “TradePrice ” and “TradeSize”, both of which are numeric time series stored externally to the For mula Grid calculation we ar e about to design. In particular, Figure (2) illustrates how a new dat a attribute is being c reated called “VWAP” a nd its data type is being assigned t o type Formula Gri d fr om the dropdown shown. After the a ttribute has been created, a tab appears containing a spreadsheet environment upon which calculations can be defined. In t he example s hown i n Figure (3) below, an equity attribute of “ TradePrice” has been pla ced in cell A1. Figure (3) Entering the S preadsheet Calculation in a Formula Grid Given that this Formula Grid calculation i s centralised and i s going to apply to all equity instruments contai ned in t he database, then it is useful to prev iew the outpu t for an example equity. Boots PLC has been chosen as the example equity and in Figure ( 4) below it shows how 25,000 prices from the “TradePric e” attribute of Boots PLC are being accessed in one Form ul a Grid spreadsheet cell. Figure (4) 3D Preview of Array Data Containing in Cells Whilst the preview in Figure ( 4) may be a useful way of r epresenting d ata as pa rt of manipulating it, t hen Figure (5) shows how t he data contained in cell A1 can be unfolded to present a more traditional and indeed “human-readable” representation of “TradePrice” as a two column array of tim es and r ecorded prices for Boots PLC. Figure (5) Unfolded 2D Preview of Array Data Con t aining in Cells Now that we can see how arrays c an be accessed within single spreadsheet cells in the Formula Grid, it is now possible t o define a simple sprea dsheet formula for the VWAP calculation as shown in F igure (6) below. Figure (6) VWAP Calcula tion Showing Vector Mu ltiplication Figure (6) sh ows how the “TradePrice” has bee n referenced in cell A1 and the “TradeSize” in cell A2. In cell A3 we multiple the price and the volume cells together to produce a vector result. In cells A 4 we sum all o f the histo ric array values in A3 to produce a scalar result, and sim ilarly we sum the total volum e o f all transactions from cell A2 in ce ll A5. Finally, we divide the product of price and volume by the total volume to give us the VWAP result in cell A6. Figure (7) 3D Preview V WAP Calculation Results Figure (7) a bove shows t he results preview (in a rray fo rm) o f executing the spreadsheet calculation defined in Figur e (6) for Boots PLC. The first th ree cells contain vector result s and the last three contain sc alars, as expected from the formula definition. Figure (8) Hiding Inter mediate VWAP Calculation s Figure (8) shows that the intermediate calculations in the first five cells , A1 t o A5, can be hidden from the end user and this i s shown by the cells being greyed to mark their “hidden” status. One passing poin t of note is that when m ultiplying time vectors to produce the result in cell A3, it may prove necessary to align data through time which can be handled by an extensiv e set of data rules built into the Form ul a Grid Engine. Figure (9) Viewing VWA P Results For Any Equit y In Figure (9) above we move away f rom the da tabase schema and the preview of a single instrument with Boots PLC shown in pr evious screen shots, in order to browse and view an entire data universe of e quity instruments c ontained within a n equity database. O n browsing to find the equity AstraZenica PLC we then select the data a ttribute called “VWAP”, which causes the Formula Grid spreadsheet calculation defined in Figure (6) to execute i n t he context of the particular equity being viewed. Hence AstraZeneca PLC’s “TradePrice” and “ TradeSize” series hav e b een loaded i n background a nd the V WAP calculated as a value of 271 7. All of t his has occurred without the end user needing to be a ware of the complexity of t he calculation or the way in which it was defined, and is a vailable to all users of the system whether browsing equity da ta as above or loading da ta through programm ing interfaces. 4 CONCLUSION This paper has presented an alternative method t o the us age of spreadsheets as database, calculator and reporting application combined. The approach described is already in use within a commercial data manag ement software system, and combines spreadsheet design with database technology t o achieve ce ntralised and trans parent deploym ent of financial analytics. In particular it has illustrated a novel technique for the manipulation of large amounts of array data, which is becoming more important i n financial m arkets a s practitioners become increasing l y interested in the analy sis high-frequency intraday market prices and quotes. 5 REFERENCES Buckner, D (2004),. “Why Banks Use Spr eadsheets” EuSpRIG pr oceedings 2004 pp 49- 66 PWC (2004) “The Use of S preadsheets: Consideration s for Section 404 of the Sar ba nes- Oxley Act”, PriceWaterhou seCoopers, London, http://www.pwc.com/im ages/gx/eng/fs/insu/rt5.pdf , A ccessed 20 Feb 2008 19: 54 Croll, Grenville J. ( 2005), “T he Importance and Criticality of Spreadsheets in t he City of London”, EuSpRI G Proc eedings 2005, arXiv:0709.4063v 1 Mittermeir, Roland & Clerm ont, Markus & Hodnigg , K arin (2005), “Protecting Spreadsheets Against Frau d”, EuSpRI G Proceedings 2005, arXiv:0801.4268v 1 Wilmott, Paul., (2005), www.wilmott.com , General Forum, “Spreadsheet Erro rs” thread
Original Paper
Loading high-quality paper...
Comments & Academic Discussion
Loading comments...
Leave a Comment