Corporate insolvency can have a devastating effect on the economy. With an increasing number of companies expanding overseas to capitalize on foreign resources, a multinational corporate bankruptcy can disrupt the world's financial ecosystem. Corporations do not fail instantaneously; objective measures and rigorous analysis of qualitative (e.g. brand) and quantitative (e.g. econometric factors) data can help identify a company's financial risk. Gathering and storing data about a corporation has become less difficult with recent advancements in communication and information technologies. The remaining challenge lies in mining relevant information about a company's health hidden in the vast amounts of data, and using it to forecast insolvency so that managers and stakeholders have time to react. In recent years, machine learning has become a popular field in big data analytics because of its success in learning complicated models. Methods such as support vector machines, adaptive boosting, artificial neural networks, and Gaussian processes can recognize, with a high degree of accuracy, patterns in the data that may not be apparent to human analysts. This thesis studied corporate bankruptcy of manufacturing companies in Korea and Poland using experts' opinions and financial measures, respectively. Using publicly available datasets, several machine learning methods were applied to learn the relationship between a company's current state and its fate in the near future. Results showed that predictions with accuracy greater than 95% were achievable using any machine learning technique when informative features like experts' assessments were used. However, when purely financial factors were used to predict whether or not a company would go bankrupt, the correlation was not as strong.
This is the century of data. Harvard Business Review published an article that named Data Scientist the "sexiest job" of the 21st century (Davenport & Patil, 2012). With big data about consumers, marketing, operations, accounting, economics, etc. already widely available to most corporations, the last piece of the puzzle appears to be extracting valuable information that can be interpreted by humans, using techniques such as data mining or machine learning. Corporations are already restructuring their strategies to reap the benefits of machine learning. A survey conducted by the Accenture Institute for High Performance indicated that more than 40% of large corporations are already using machine learning to boost their marketing, and that they attribute approximately 38% of their sales improvement to machine learning. In addition, 76% of these corporations believe that machine learning will be a key component of their future sales growth (James Wilson, Mulani, & Alter, 2016).
The applications of machine learning to business are broad; aside from targeted sales and market segmentation, it can be used for inventory optimization based on demand forecasting, personalized customer service, customer segmentation, and much more (Chen, Chiang, & Storey, 2012). The domain studied extensively in this thesis is financial credit risk assessment. Terms such as credit rating/scoring, bankruptcy prediction, and corporate financial distress forecasting will be used interchangeably, and together they will be referred to as “financial credit risk assessment” (Chen, Ribeiro, & Chen, 2016). The reason for this simplification is that, from a probabilistic machine learning perspective, all of these problems can be cast as a binary classification problem in the final stage, e.g. Will this company be bankrupt by next quarter? Answer: Yes or No.
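The binary classification framing above can be illustrated with a minimal sketch. It is not the method used in this thesis; it assumes a plain logistic regression trained by batch gradient descent, and the two features (a debt ratio and a profit margin), the six data points, the function names, and the 0.5 decision threshold are all hypothetical choices made for illustration only.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=1.0, epochs=5000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one company: P(bankrupt) minus true label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def bankruptcy_probability(w, b, x):
    """Estimated probability that a company with features x goes bankrupt."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def will_go_bankrupt(w, b, x, threshold=0.5):
    """Final-stage binary decision: True means 'Yes', False means 'No'."""
    return bankruptcy_probability(w, b, x) >= threshold


# Made-up toy data (NOT from the Korean or Polish datasets):
# each row is [debt ratio, profit margin]; label 1 = bankrupt, 0 = healthy.
X = [[0.90, -0.20], [0.80, -0.10], [0.85, 0.00],
     [0.30, 0.15], [0.20, 0.20], [0.40, 0.10]]
y = [1, 1, 1, 0, 0, 0]

w, b = train_logistic(X, y)

new_company = [0.85, -0.15]  # hypothetical company to assess
answer = "Yes" if will_go_bankrupt(w, b, new_company) else "No"
print("Bankrupt by next quarter?", answer)
```

Whatever the upstream task is called (credit scoring, distress forecasting, bankruptcy prediction), the last step in this sketch is the same: a probability is compared against a threshold to produce a Yes or No answer. The threshold would in practice depend on the relative cost of false alarms and missed bankruptcies.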
Bankruptcy prediction dates back more than two centuries, to a time when most assessments were done qualitatively (Bellovary, Giacomino, & Akers, 2007; Li & Miu, 2010; de Andrés, Landajo, & Lorca, 2012). It was not until the 20th century that more quantitative (and less subjective) techniques became popular; examples include the seminal univariate analysis of Beaver (1966) and the multiple discriminant analysis of Altman (1968). Their work demonstrated the ability to predict a company’s failure up to five years in advance. Such information is an asset not only to creditors, auditors, stockholders, and senior management, who are directly affected by a failure, but also to many other stakeholders such as suppliers and employees (Wilson & Sharda, 1994).
To understand the significance and possible impacts of corporate bankruptcy on the rest of society, it is worthwhile to revisit the largest bankruptcy in world history, that of Lehman Brothers Holdings Inc. On September 15, 2008, the fourth largest investment bank in the United States declared bankruptcy, a failure caused by social irresponsibility in management and triggered by the bank's exposure to the subprime mortgage crisis in the United States (Williams, 2010). The global economy went from bad to worse. Almost six million jobs were lost (the U.S. unemployment rate doubled), the Dow Jones Industrial Average dropped 5000 points, and an estimated $14 trillion of wealth was destroyed (Shell, 2009). On the same day that Lehman Brothers declared bankruptcy, the European Central Bank and the Bank of England injected more than $50 billion into the market to calm the world economy (Ellis, 2008). American Broadcasting Company (ABC) News described it as a “financial tsunami” and even compared it to the Great Depression of the 1930s. To many people this event may have seemed sudden, but such a financial disaster did not happen overnight; there were patterns in the data months (even years) prior to the incident that most people failed to recognize (Demyanyk & Hasan, 2010).
A reliable financial distress forecasting system could have identified financial issues and challenges prior to the actual bankruptcy. Such a system would be beneficial to companies in various industries worldwide, as company failures are certainly not exclusive to the American economy.
As stated in a recent business article from Forbes: “Machine learning is redefining the enterprise in 2016” (Columbus, 2016). Therefore, it is critical for business managers, banks, investors, and other stakeholders to develop an understanding and intuition about how these algorithms can be beneficial to their decision-making process. This thesis will investigate the value and limitations of machine learning techniques for businesses, with a focus on financial credit risk assessment. Popular machine learning techniques such as logistic regression, support vector machines, decision trees, AdaBoost, artificial neural networks, and Gaussian processes will be explored. The efficacy of such tools for expressing business well-being and improving business