Editing Knowledge in Large Mathematical Corpora. A case study with Semantic LaTeX (sTeX)


📝 Original Info

  • Title: Editing Knowledge in Large Mathematical Corpora. A case study with Semantic LaTeX (sTeX)
  • ArXiv ID: 1010.5935
  • Date: 2015-06-01
  • Authors: Michael Kohlhase, Christoph Lange

📝 Abstract

Before we can realize the full potential of employing computers in managing mathematical 'knowledge', we have to convert informal knowledge into machine-oriented representations. How exactly to support this process so that it becomes as effortless as possible is one of the main unsolved problems of Mathematical Knowledge Management. Two independent projects in the formalization of mathematical content showed that many of the time-consuming tasks could be significantly reduced if adequate tool support were available. It was also established that similar tasks are typical for object-oriented languages and that they are to a large extent solved by Integrated Development Environments (IDEs). This thesis starts by analyzing where the formalization process can benefit from software support. A list of research questions is compiled along with a set of software requirements, which are then used for developing a new IDE for the semantic TeX (sTeX) format. The result of the current research is that IDEs can indeed be very useful in the process of formalization; the thesis also presents a set of best practices for implementing such IDEs.

📄 Full Content

I used to come up to my study, and start trying to find patterns. I tried doing calculations which explain some little piece of mathematics. I tried to fit it in with some previous broad conceptual understanding of some part of mathematics that would clarify the particular problem I was thinking about. Sometimes that would involve going and looking it up in a book to see how it's done there. Sometimes it was a question of modifying things a bit, doing a little extra calculation. And sometimes I realized that nothing that had ever been done before was any use at all. Then I just had to find something completely new; it's a mystery where that comes from.

Solving Fermat, Sir Andrew John Wiles [wil10]

One of the distinctive features of Mathematics is its intrinsic dependence on existing knowledge [Soj10,Bou10]. As part of everyday scientific life, mathematicians use books, journals, the internet, etc. to look up formulas, methods and tools and use them to discover new patterns, formulate conjectures and establish truths. Reporting results back to the community by writing papers, participating in conferences or even blogging is a way to get community feedback as well as recognition. Hence consulting and contributing to sources of mathematical knowledge is a vital part of a mathematician’s research life.

The most recent estimate for the volume of mathematical knowledge produced each year is about 3 million pages [Bou10]. Obviously there is no chance for a person to even read (not to mention digest) such volumes of information. Of course not all 3 million pages are relevant for a particular researcher, so mathematicians select only certain conferences or journals which they follow closely. A serious drawback of such an approach is that mathematical results discovered in some branch of mathematics stay unknown in other communities solving similar problems. There is no simple way of solving this problem because in many cases even a mathematician familiar with both topics might not see how the problems are related, as the conceptual mapping involved may be non-trivial. Yet, I conjecture that there is a lot of potential to be uncovered by getting better computer support in structuring mathematical knowledge and searching it for relevant documents.

Given the importance of mathematical knowledge for the whole scientific community, it is no coincidence that there is an emerging research field known as Mathematical Knowledge Management (MKM), which defines its objective as “to develop new and better ways of managing mathematical knowledge using sophisticated software tools”. As the topic of this thesis fits well within this objective and often refers to the results achieved in the MKM community, I will dedicate section 1.1 to introducing the main research directions and identifying which of them are important to the current research. In section 1.2, one of the main long-term objectives of MKM is presented, namely the creation of a Universal Digital Mathematical Library (UDML). In sections 1.3 and 1.4 I will present the paradigms used today to work with mathematics and identify some weak points which could hinder the successful implementation of UDML. I will finish this chapter (section 1.5) by presenting changes to current ways of thinking about mathematical structures and formality which alleviate the weak points identified in section 1.4.

In this section I would like to specify the scope of the MKM research field by giving an overview of the challenges it is trying to address. As the set of challenges is quite large, only the questions relevant to the current research will be mentioned. In the next section (1.2) I will also introduce the “Grand Challenge” of MKM, which is a project integrating all the aspects of MKM.

There are four levels at which the subject of Mathematical Knowledge Management can be addressed:

  • document level: addresses low-level document issues like format, level of formality, context and representation. Questions relevant for the current research are:
    D1 What software support is needed to convert informal mathematical documents to formal ones?
    D2 When do the benefits of formalizing mathematical knowledge outweigh the costs?
    D3 How should the context of mathematical knowledge be expressed?
  • organization level: concentrates on knowledge reuse, inter-document linking, and dealing with the theoretically infinite size of mathematical knowledge. On this level I am interested in the questions:
    O1 How should formal as well as informal documents be linked to avoid redundancy?
    O2 What tools are needed to deal efficiently with highly interconnected structures?
  • dissemination level: deals with administrative questions like certification of knowledge, effective dissemination of mathematical knowledge, and ownership of data. These questions are outside the scope of the current thesis.
  • end-user tools level: establishes end-user requirements for tools and services to efficiently work with MKM c

Reference

This content is AI-processed based on open access ArXiv data.
