A content analysis of Google Scholar: Coverage varies by discipline and by database
Date
2007
Authors
Wilson, Virginia
Journal Title
Journal ISSN
Volume Title
Publisher
University of Alberta Learning Services
ORCID
Type
Article
Refereed Paper
Review
Degree Level
Abstract
Objective - To ascertain the coverage by discipline, publication date, publication language, and upload frequency of the scholarly articles found in Google Scholar.
Design - Comparative content analyses.
Setting - Electronic information resources accessible via the internet (both freely accessible and for-fee databases).
Subjects - Forty-seven online databases and Google Scholar.
Methods - The study compared the content of 47 databases (21 Internet resources freely
available to the general public; 26 restricted access databases) covering a variety of
subjects with the content of Google Scholar.
Each database was assigned to one of the following discipline categories: business,
education, humanities, science and medicine, social science, and multidisciplinary. From
April through July 2005, researchers generated random samples of 50 article titles
from each of the 47 databases and searched the titles on Google Scholar to determine inclusion.
Related studies were conducted for publication date and publication language analysis, and for the Google Scholar upload frequency study. For the publication date study, random samples from one database (PsycINFO) with a high degree of variability in Google Scholar coverage were searched for 1990, 2000, and 2004. For the publication language study, Google Scholar coverage of PsycINFO articles in English was compared to coverage of PsycINFO articles published in non-English languages. For the upload frequency study, two databases chosen for their high degree of coverage (BioMed Central and PubMed) were monitored to determine how often the new content was uploaded to Google Scholar.
Main Results - This study revealed that content covered by Google Scholar varies
greatly from database to database and from discipline to discipline. Of the 47 databases studied, coverage ranged from 6% to 100%.
Mean and median values of coverage for all databases were both 60%. The mean discipline category scores varied from the
humanities databases at 10% coverage, to the social sciences and education at 39% and 41% respectively, to science and medicine databases at 76% coverage. Mean coverage was 77% for the multidisciplinary databases.
Mean coverage of open access journal databases was 95%, freely accessible databases had 84% mean coverage, and single publisher databases had 83% mean coverage. The publication language study found a bias towards English language publications. A publication date bias was also found: coverage of earlier dates was not as thorough as coverage of more recent publications. In the upload frequency study, there appears to be an approximately 15-week delay in the uploading of new material from BioMed Central and PubMed to Google Scholar.
Conclusions - The results of this study serve to alert researchers and information professionals that Google Scholar (in beta test
mode at the time of the study) has poor coverage in certain areas. To those with access to commercial databases, this serves as a cautionary tale. To those with a dearth of commercial databases, Google Scholar is a welcome site and can provide at least some
information. The researchers state that the search engine itself could make future content studies unnecessary if it decides to
make its content collection methodology transparent to users. Upload frequency, Google Scholar's linking services, the advanced search option, and the "cited by"
feature could all be subjects of future studies. For its first year in operation, Google Scholar
offers a broad range of discipline coverage with substantial depth in some areas. At the time of the study, Google Scholar was
working with libraries and vendors to connect search results to library-licensed full
text.
Description
A review of: Neuhaus, Chris, Ellen Neuhaus, Alan Asher, and Clint Wrede. "The Depth and Breadth
of Google Scholar: An Empirical Study." portal: Libraries and the Academy 6.2
(Apr. 2006): 127-41.
Reviewed by:
Virginia Wilson
SHIRP Coordinator, Health Sciences Library, University of Saskatchewan
Saskatoon, Saskatchewan, Canada
E-mail: virginia.wilson@usask.ca
Keywords
Citation
Evidence Based Library and Information Practice 2, no. 1 (2007): 134-136