An exploratory study of Google Scholar

📝 Original Info

  • Title: An exploratory study of Google Scholar
  • ArXiv ID: 0707.3575
  • Date: 2019-01-15
  • Authors: Researchers from original ArXiv paper

📝 Abstract

The paper discusses and analyzes the scientific search service Google Scholar (GS). The focus is on an exploratory study which investigates the coverage of scientific serials in GS. The study shows deficiencies in the coverage and up-to-dateness of the GS index. Furthermore, the study points out which Web servers are the most important data providers for this search service and which information sources are highly represented. We can show that there is a relatively large gap in Google Scholar's coverage of German literature as well as weaknesses in the accessibility of Open Access content.

Keywords: Search engines, Digital libraries, Worldwide Web, Serials, Electronic journals

📄 Full Content

As is now customary for new Google offerings, the launch of Google Scholar (http://scholar.google.com/) generated a great deal of media attention shortly after its debut in November 2004. Its close relation to the highly discussed topics of open access and the invisible web (Lewandowski and Mayr, 2006) ensured that many lines were devoted to this service both in the general media (Markoff, 2004; Terdiman, 2004) and among scientific publishers and scientific societies (Banks, 2004; Butler, 2004; Payne, 2004; Sullivan, 2004; Jacsó, 2004; Giles, 2005). While the initial euphoria over this new service from Google has since quieted down, the service is currently being utilized by academic search engines to integrate results that are available free of charge.

Google Scholar stands out not just for the technology employed but for the efforts made to restrict searches to scientific information. As stated on the Google Scholar webpage:

“Google Scholar enables you to search specifically for scholarly literature, including peer-reviewed papers, theses, books, preprints, abstracts and technical reports from all broad areas of research. Use Google Scholar to find articles from a wide variety of academic publishers, professional societies, preprint repositories and universities, as well as scholarly articles available across the web.” (Google 2005, see http://scholar.google.com/scholar/about.html)

Above all, it appears that Google is attempting to automatically index the totality of scientifically relevant documents with its new search service, Google Scholar. As Google does not make any information available about the coverage or currency of the content it offers, this study has been undertaken with the goal of empirically exploring the depth of search in the scientific web. We have measured the coverage of the service by testing different journal lists. The types of results and the web servers represented in the results are also analyzed.
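The kind of measurement described above (testing journal lists for coverage and counting the web servers that deliver the hits) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the function names `coverage` and `host_counts`, the input format, and all sample titles and URLs are hypothetical, and the sketch assumes the result URLs for each journal title have already been collected by some other means.

```python
from collections import Counter
from urllib.parse import urlparse


def coverage(journal_titles, hits_by_title):
    """Return the fraction of journals that have at least one result URL."""
    if not journal_titles:
        return 0.0
    found = sum(1 for title in journal_titles if hits_by_title.get(title))
    return found / len(journal_titles)


def host_counts(hits_by_title):
    """Count how often each web server (host name) appears among result URLs."""
    counts = Counter()
    for urls in hits_by_title.values():
        for url in urls:
            host = urlparse(url).netloc.lower()
            if host:
                counts[host] += 1
    return counts


# Placeholder data for illustration only; not taken from the study.
journals = ["Journal A", "Journal B", "Journal C", "Journal D"]
hits = {
    "Journal A": ["http://publisher.example/a/1", "http://publisher.example/a/2"],
    "Journal B": ["http://repository.example/b/1"],
    "Journal C": [],  # queried, but no results found
    # "Journal D" is missing entirely and therefore counts as not covered
}

print(coverage(journals, hits))
print(host_counts(hits).most_common())
```

Treating a journal as covered when it has at least one hit is a deliberately coarse criterion chosen for this sketch; a stricter definition (for example, a minimum number of articles per journal) would only change the condition inside `coverage`.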

The paper first describes the background, functions and unique features of Google Scholar. A brief literature review will bring together the current research results. Results of the second Google Scholar study from August 2006 will be presented in the second part. An initial analysis of journals in Google Scholar was conducted by the authors in the period April/May 2005 (Mayr and Walter, 2006). The results of this study were compared with certain parts of the current analysis in August 2006. This is followed by a summary of our observations on this new service.

The pilot project CrossRef Search (http://www.crossref.org/crossrefsearch.html) can be seen as a test and predecessor of Google Scholar. For CrossRef Search, Google indexed full-text databases of a large number of academic publishers such as Blackwell, Nature Publishing Group, Springer, etc., and academic/professional societies such as the Association for Computing Machinery, the Institute of Electrical and Electronics Engineers, the Institute of Physics, etc., displaying the results via a typical Google interface. The CrossRef Search interface continues to be provided by various CrossRef partners (e.g. at Nature Publishing Group).

Similar in approach, but broader and less specific in scope than Google Scholar, the scientific search engine Scirus (http://www.scirus.com) searches, according to its own information, approximately 300 million science-specific web pages. In addition to scientific documents from Elsevier (ScienceDirect server, see http://www.sciencedirect.com/), it provides freely accessible documents, many from public web servers at academic institutions. Among these are, for example, documents posted by students that do not fulfil scientific criteria such as peer review, criteria whose absence often leads to exclusion from searches. In our experience, a more than negligible fraction of the records in the Scirus index comes from non-academic web spaces. Scirus’ coverage of purely scientific sources beyond Elsevier’s ScienceDirect full-text collection is low by comparison (compare the selection of hosts in the Scirus advanced search interface, http://scirus.com/srsapp/advanced/). What Scirus declares as the ‘rest of the scientific web’ is too general and not specifically filtered, yet it makes up the majority of hits in any query.

As seen in the pilot project CrossRef Search, the chosen Google Scholar approach is to work in cooperation with academic publishers. What is significant about the Google Scholar approach?

First and foremost, what stands out is that Google Scholar, as previously mentioned, delivers results restricted exclusively to scientific documents, a constraint that has yet to be consistently implemented by any other search engine. Google Scholar is a freely available service with a familiar interface similar to Google Web Search. Much of the content indexed by Google Scholar is stored on publishers’ servers where full-text documents can be downloaded for a fee, but at least the abstracts of

…(Full text truncated)…

Reference

This content is AI-processed based on ArXiv data.
