Title: MRA - Proof of Concept of a Multilingual Report Annotator Web Application
ArXiv ID: 1704.01748
Date: 2017-04-13
Authors: ** - No author information is given in the paper. (The lasigeBioTM team behind the GitHub repository likely took part in the development.) **
📝 Abstract
MRA (Multilingual Report Annotator) is a web application that translates Radiology text and annotates it with RadLex terms. Its goal is to explore translating non-English Radiology reports as a way around the fact that most Text Mining tools are developed for English. In this brief paper we explain the language barrier problem and briefly describe the application. MRA can be found at https://github.com/lasigeBioTM/MRA .
📄 Full Content
Radiology reports describe the results of radiography procedures and have the potential to be a useful source of information (1), which can benefit health care systems around the world. However, these reports are usually written in free text, which makes it hard to extract information from them automatically. Nonetheless, most reports are now digitally available, which makes them amenable to Text Mining tools. Another advantage of Radiology reports is that, even when written in free text, they are usually well structured. A lot of work has been done on Text Mining of Biomedical texts, including health records (2), but although Radiology reports are usually written in the native language of the radiologist, Text Mining tools are mostly developed for English.
English reports that depends on RadLex (4), a lexicon for Radiology terminology, which is freely available in English. Given this dependence, the system cannot easily be applied to reports written in other languages. And even if the system did not depend on an English lexicon, it is not certain that the results would be the same for another language, because of, for example, differences in syntax. Other examples of tools developed with a focus on English include (5), (6) and (7). This has been an obstacle to sharing Radiology information between different communities, which is important for understanding and effectively addressing health problems.
There are two main possible solutions to this problem. One is to translate the lexicon itself (8,9) and the other is to translate the reports. MRA (Multilingual Report Annotator) is a web application that explores the latter solution. It translates Radiology text and annotates it with RadLex terms, a Named-Entity Recognition (NER) task. This task is relevant because the outputs of NER systems can be used in Image Retrieval (10) and Information Retrieval (11) systems and can help improve automatic Question Answering (12). In short, MRA is a prototype of what can be done when translation is integrated into medical applications.
The application works as follows. The user uploads a text file containing Radiology text to the application (Figure 1), and, if the text is in a non-English language, it is translated into English. In a real-life scenario, human translation could also be used for more reliable results. The text is therefore sent for translation, and after a while (approximately 2 minutes, to simulate a real human and machine translation) the translation is ready. The translated text is then sent to the BioPortal annotation services. Once this is done, the annotations can be explored in the translated text (Figure 2). The annotation interface was partly inspired by a similar project called LexMap (13).
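The annotation step can be sketched as follows. This is a minimal illustration, not MRA's actual code: it assumes the public BioPortal Annotator REST endpoint with its `text`, `ontologies` and `apikey` query parameters, and a response made of objects with an `annotatedClass` and a list of `annotations`. The function names, the sample payload, its offsets and the placeholder class identifier are all hypothetical.

```python
from urllib.parse import urlencode

# Public BioPortal REST endpoint for the Annotator service.
ANNOTATOR_URL = "https://data.bioontology.org/annotator"


def build_annotator_url(text, api_key, ontology="RADLEX"):
    """Return the GET URL asking the Annotator to tag `text` with one ontology."""
    query = urlencode({"text": text, "ontologies": ontology, "apikey": api_key})
    return f"{ANNOTATOR_URL}?{query}"


def extract_spans(response_json):
    """Flatten an Annotator-style response into (start, end, text, class_id) tuples."""
    spans = []
    for result in response_json:
        class_id = result["annotatedClass"]["@id"]
        for ann in result["annotations"]:
            spans.append((ann["from"], ann["to"], ann["text"], class_id))
    return sorted(spans)


# Hand-written payload in the assumed response shape (not real service output).
# The class identifier is a placeholder, not a real RadLex IRI.
SAMPLE_RESPONSE = [
    {
        "annotatedClass": {"@id": "http://example.org/radlex/PLACEHOLDER"},
        "annotations": [
            {"from": 20, "to": 23, "matchType": "PREF", "text": "LUNG"},
        ],
    }
]

if __name__ == "__main__":
    print(build_annotator_url("Nodule in the left lung.", "YOUR_API_KEY"))
    print(extract_spans(SAMPLE_RESPONSE))
```

The request is only built here, not sent, so the sketch runs without network access or a real API key. The extracted spans are the kind of data an interface would need to highlight annotated terms in the translated text.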
The software, as it stands, depends on the user having both an Unbabel and a BioPortal API key. The BioPortal API key is easy to obtain. If you do not have and cannot obtain an Unbabel API key, it should not be hard to adapt the software to use another translation service.
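One way to keep the translation service replaceable is to hide it behind a plain callable. The sketch below is an assumption about how such an adaptation could look, not a description of MRA's code: `to_english`, `phrase_table_translator` and the Portuguese example sentence are hypothetical, and the phrase table stands in for a real service so the example runs offline.

```python
from typing import Callable

# A translator is any callable taking (text, source_lang) and returning English text.
Translator = Callable[[str, str], str]

# Toy stand-in for a real translation service: a fixed phrase table.
_PHRASES = {
    ("pt", "Nódulo no pulmão esquerdo."): "Nodule in the left lung.",
}


def phrase_table_translator(text: str, source_lang: str) -> str:
    """Hypothetical back-end: look the whole text up in a fixed phrase table."""
    try:
        return _PHRASES[(source_lang, text)]
    except KeyError:
        raise ValueError(f"no translation available for {source_lang!r} text") from None


def to_english(text: str, source_lang: str, translator: Translator) -> str:
    """Return `text` unchanged if it is already English, else delegate to `translator`."""
    if source_lang.lower().startswith("en"):
        return text
    return translator(text, source_lang)


if __name__ == "__main__":
    print(to_english("Nódulo no pulmão esquerdo.", "pt", phrase_table_translator))
```

Swapping services then means writing one new function with the same signature that wraps the chosen API, and passing it to `to_english`.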
The back-end was developed using Python's Flask web framework. It uses Celery to handle the requests for translations and annotations. The source code and an installation guide can be found in the following GitHub repository: https://github.com/lasigeBioTM/MRA.
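The reason for a task queue is that translation and annotation take much longer than a web request should block for, so the request handler enqueues a job and the client polls for its result. The sketch below illustrates that submit-and-poll pattern using only the standard library: `ThreadPoolExecutor` is a stand-in for Celery, and `JobQueue` and `slow_job` are hypothetical names, not part of MRA. The state names mirror Celery's built-in `PENDING`, `SUCCESS` and `FAILURE` states.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobQueue:
    """In-process stand-in for a task queue: submit a job, then poll it by id."""

    def __init__(self, max_workers=2):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs = {}

    def submit(self, func, *args):
        """Schedule func(*args) in the background and return a job id immediately."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(func, *args)
        return job_id

    def status(self, job_id):
        """Report the job state without blocking, as a polling endpoint would."""
        future = self._jobs[job_id]
        if not future.done():
            return {"state": "PENDING"}
        error = future.exception()
        if error is not None:
            return {"state": "FAILURE", "error": str(error)}
        return {"state": "SUCCESS", "result": future.result()}

    def wait(self, job_id, timeout=None):
        """Block until the job finishes (or the timeout expires), then report status."""
        self._jobs[job_id].exception(timeout=timeout)  # waits; does not re-raise job errors
        return self.status(job_id)


def slow_job(text):
    """Hypothetical long-running step, standing in for translation or annotation."""
    time.sleep(0.05)
    return text.upper()


if __name__ == "__main__":
    queue = JobQueue()
    job_id = queue.submit(slow_job, "nodule in the left lung")
    print(queue.status(job_id))           # state depends on timing
    print(queue.wait(job_id, timeout=5))  # finished job
```

In a Flask and Celery deployment, `submit` would correspond to enqueuing a task from a route and `status` to a route that looks up the task by id. Unlike this in-process version, a Celery worker runs in a separate process and survives web server restarts.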
The idea is that this application can be used to bootstrap other, more useful applications. It can also be shown to clinical institutions to pique their interest in what can be done when translation is integrated into medical practice.