Articles in Refereed Journals
Innovation and Entrepreneurship Research

Finding Doppelgängers in Scopus: How to Build Scientists Control Groups Using Sosia

Rose, Michael; Baruffaldi, Stefano Horst (2025). Finding Doppelgängers in Scopus: How to Build Scientists Control Groups Using Sosia. Scientometrics, 2025, forthcoming.

Constructing control groups of scientists is often a daunting effort. This paper presents sosia, an open-source, Python-based software package designed to efficiently query the Scopus database via its RESTful API. sosia searches for researchers whose publication profiles, up to a given year, are similar to that of a given researcher, based on all main standard bibliometric indicators. The user can flexibly choose a set of parameters to restrict the search to narrower or wider boundaries upfront, and can obtain additional similarity indicators to select a subset of authors after the search. Advanced settings also allow narrowing the search to a list of affiliations and minimizing possible errors arising from ambiguous author profiles. A basic search can be set up in a few commands, and the average computation time ranges between 60 and 300 minutes. We discuss the functioning, characteristics, limitations, and possible extensions of the software.


Also published as: Max Planck Institute for Innovation & Competition Research Paper No. 20-20