Citation Analysis with Microsoft Academic
📝 Abstract
We explore whether and how Microsoft Academic (MA) could be used for bibliometric analyses. First, we examine the Academic Knowledge API (AK API), an interface to access MA data, and compare it to Google Scholar (GS). Second, we perform a comparative citation analysis of researchers by normalizing data from MA and Scopus. We find that MA offers structured and rich metadata, which facilitates data retrieval, handling and processing. In addition, the AK API allows retrieving frequency distributions of citations. We consider these features to be a major advantage of MA over GS. However, we identify four main limitations regarding the available metadata. First, MA does not provide the document type of a publication. Second, the ‘fields of study’ are dynamic and too specific, and the field hierarchies are incoherent. Third, some publications are assigned to incorrect years. Fourth, the metadata of some publications do not include all authors. Nevertheless, we show that an average-based indicator (i.e. the journal normalized citation score; JNCS) as well as a distribution-based indicator (i.e. percentile rank classes; PR classes) can be calculated with relative ease using MA. Hence, normalization of citation counts is feasible with MA. The citation analyses in MA and Scopus yield consistent results: the JNCS and the PR classes are similar in both databases and, as a consequence, the evaluation of the researchers’ publication impact is congruent in MA and Scopus. Given the fast development of MA in the last year, we postulate that it has the potential to be used for full-fledged bibliometric analyses.
📄 Content
Hug, S. E., Ochsner, M., and Brändle, M. P. (2017): Citation analysis with Microsoft Academic. Scientometrics. DOI 10.1007/s11192-017-2247-8
Submitted to Scientometrics on Sept 16, 2016; accepted Nov 7, 2016
Sven E. Hug (1,2,*), Michael Ochsner (1,3), and Martin P. Brändle (4,5)
(1) Social Psychology and Research on Higher Education, ETH Zurich, D-GESS, Muehlegasse 21, 8001 Zurich, Switzerland
(2) Evaluation Office, University of Zurich, 8001 Zurich, Switzerland
(3) FORS, 1015 Lausanne, Switzerland
(4) Zentrale Informatik, University of Zurich, 8006 Zurich, Switzerland
(5) Main Library, University of Zurich, 8057 Zurich, Switzerland
(*) Corresponding author. Tel.: +41 44 632 46 85, Fax: +41 44 634 43 79, Email: sven.hug@gess.ethz.ch
Keywords: normalization, citation analysis, percentiles, Microsoft Academic, Google Scholar, Scopus
Introduction
Microsoft Academic (MA) is a new service that Microsoft has offered since 2015. It was introduced to the bibliometric research community by Harzing (2016), who assessed the coverage of the new tool by comparing the publication and citation record of her own oeuvre in Web of Science (WoS), Scopus, Google Scholar (GS), and MA. The Publish or Perish software (Harzing, 2007) was used to collect data from MA. Harzing (2016, p. 1646) finds that of the four competing databases “only Google Scholar outperforms Microsoft Academic in terms of both publications and citations” and concludes that MA is, with some reservations regarding metadata quality, an “excellent alternative for citation analysis” (p. 1647). She also conducted a citation analysis and calculated both the h-index and the hIa (Harzing, Alakangas, & Adams, 2014) for her oeuvre, but did not explore whether other bibliometric analyses are feasible with MA. Hence, in this paper, we will explore if and how MA could be used for further bibliometric analyses. We will focus on Microsoft’s Academic Knowledge API (AK API), an interface to access MA data. First, we will describe advantages and limitations of the AK API from the perspective of bibliometrics and compare it to GS, the closest competitor of MA. Second, we will perform a citation analysis of researchers by normalizing data from MA and compare the results to those obtained with Scopus, an established database for bibliometrics.
Academic Knowledge API
The AK API enables users to retrieve information from Microsoft Academic Graph (MAG). MAG is a database that models “the real-life academic communication activities as a heterogeneous graph consisting of six types of entities” (Sinha et al., 2015, p. 244). These entities are paper, field of study, author, institution (affiliation of author), venue (journal or conference series), and event (conference instances). Each of these entities is specified by entity attributes, which will be discussed below. Data for MAG is primarily collected from metadata feeds from publishers and web pages indexed by Bing (Sinha et al., 2015). MAG has grown massively from 2015 to 2016 and, according to Wade, Kuasan, Yizhou, and Gulli (2016), it contains approximately 140 million publication records (83)¹, 40 million authors (20), 3.5 million institutions (0.77), 60,000 journals (22,000), and 55,000 fields of study (50,000). Ribas, Ueda, Santos, Ribeiro-Neto, and Ziviani (201
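To put the growth figures quoted above from Wade et al. (2016) on a common scale, the following minimal sketch computes a growth factor per entity type. It rests on one assumption that is not stated in the text shown here: the figures in parentheses are taken to be the corresponding 2015 values, expressed in the same units as the 2016 values (millions for publication records, authors, and institutions). The names `MAG_COUNTS`, `growth_factor`, and `growth_table` are illustrative only.

```python
# Counts as quoted in the paragraph above (Wade et al., 2016).
# ASSUMPTION: each pair is (2015 value, 2016 value); the parenthesized
# figures in the text are read as 2015 values in the same units.
MAG_COUNTS = {
    "publication records": (83e6, 140e6),
    "authors": (20e6, 40e6),
    "institutions": (0.77e6, 3.5e6),
    "journals": (22_000, 60_000),
    "fields of study": (50_000, 55_000),
}


def growth_factor(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    if before <= 0:
        raise ValueError("the earlier count must be positive")
    return after / before


def growth_table(counts: dict) -> dict:
    """Map each entity type to its growth factor, rounded to two decimals."""
    return {
        name: round(growth_factor(before, after), 2)
        for name, (before, after) in counts.items()
    }


if __name__ == "__main__":
    for name, factor in growth_table(MAG_COUNTS).items():
        print(f"{name}: x{factor}")
```

Under this reading, the number of institutions grows the most and the number of fields of study the least, which is one way to see that the reported growth is uneven across entity types.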