Christian Kreibich
scholar.py
A parser for Google Scholar, written in Python

Introduction

Google Scholar is a great resource, but it lacks an API. Until there is one, scholar.py is a Python module that implements a querier and parser for Google Scholar's output. Its classes can be used independently, and the module can also be invoked as a command-line tool. It could definitely use a few more features, such as detailed author extraction and multi-page crawling. If you're interested in adding features, do send patches! (Thanks to those of you who have; you know who you are.)

Features

  • Can extract publication title, main online URL, number of citations, number of online versions, link to Google Scholar's main cluster for the work, and Google Scholar's cluster of all works referencing the publication.
  • Can print entries in CSV format or plain text.

Example

Try scholar.py --help for all available options. A simple example:

$ scholar.py -c 1 --txt --author einstein quantum
         Title Physics and reality
           URL http://www.sciencedirect.com/science/article/pii/S0016003236910475
     Citations 322
      Versions 5
Citations list http://scholar.google.com/scholar?cites=6799563874330167610&as_sdt=2005&sciodt=1,5&hl=en
 Versions list http://scholar.google.com/scholar?cluster=6799563874330167610&hl=en&as_sdt=1,5&as_subj=eng
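
If you want to post-process the plain-text output in your own script, the sketch below shows one way to turn a single entry into a Python dictionary. It is not part of scholar.py: the names parse_txt_entry, FIELD_LABELS, and NUMERIC_FIELDS are hypothetical, and the field labels are assumed from the sample output above, so they may need adjusting if the tool prints other fields.

```python
# Hypothetical helper, not part of scholar.py: parses one entry of the
# --txt output shown above into a dict. The labels are taken from that
# sample; other versions of the tool may print different fields.

# Longer labels come first so "Citations list" is not matched as "Citations".
FIELD_LABELS = [
    "Citations list",
    "Versions list",
    "Citations",
    "Versions",
    "Title",
    "URL",
]

# Fields whose values are converted to integers.
NUMERIC_FIELDS = {"Citations", "Versions"}


def parse_txt_entry(text):
    """Return a dict mapping each recognized field label to its value."""
    entry = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()  # labels are right-aligned, so drop the padding
        if not line:
            continue
        for label in FIELD_LABELS:
            if line.startswith(label + " "):
                value = line[len(label):].strip()
                entry[label] = int(value) if label in NUMERIC_FIELDS else value
                break
    return entry


SAMPLE = """\
         Title Physics and reality
           URL http://www.sciencedirect.com/science/article/pii/S0016003236910475
     Citations 322
      Versions 5
Citations list http://scholar.google.com/scholar?cites=6799563874330167610&as_sdt=2005&sciodt=1,5&hl=en
 Versions list http://scholar.google.com/scholar?cluster=6799563874330167610&hl=en&as_sdt=1,5&as_subj=eng
"""

if __name__ == "__main__":
    entry = parse_txt_entry(SAMPLE)
    print(entry["Title"], entry["Citations"])
```

The helper handles one entry at a time and silently skips lines it does not recognize. How multiple entries are separated in the output is not covered here.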
	 

Download

The code used to be available here, but as of November 2013 it resides on GitHub for all your fork and pull request needs. Thanks!

Updated on 06 November 13 | (cc) Christian Kreibich