Research News  |  07/26/2019

pybliometrics – New Software for Research With Big Bibliometric Data

pybliometrics gives users faster access to large and growing amounts of bibliometric data. At the same time, the software fosters the verifiability of research results, which is a hallmark of good scientific practice.

Illustrative examples using data obtained through pybliometrics: Co-author network, word cloud of terms used in scientific abstracts, geographic center of publications of a scientist, and citation distribution for three papers.

Research organizations and researchers who study science itself rely on bibliometric databases. These databases collect data on scientific publications, making it possible to “measure” scientific output. The larger a database, the more scientific activity it can capture, but the more difficult it becomes to extract research data.


Scopus, operated by the scientific publisher Elsevier, is one of the largest bibliometric databases. The pybliometrics software, developed by Michael E. Rose, Senior Research Fellow at the Institute, in interdisciplinary collaboration with John R. Kitchin (Professor of Chemical Engineering at Carnegie Mellon University), now allows researchers working with a Scopus license or Scopus Custom Data to access this database without major hurdles and to download data automatically.


The software is written in Python, a programming language that is becoming increasingly important among scientists. pybliometrics gives users faster access to the large and growing amounts of data they need.


At the same time, pybliometrics fosters the verifiability of scientific results, because it makes transparent to everyone which definitions were used to draw the research data. This facilitates the replication of research results, which is a hallmark of good scientific practice.


For the publication, see here.