Study  |  01/26/2023

Allegations of Sexual Misconduct, Accused Scientists, and Their Research

Does the scientific community sanction sexual misconduct? While scientific work, according to Merton’s norm of universalism, should be judged regardless of who created it, the scientific community should also encourage “good citizenship” to promote an inclusive environment. The findings of a new study raise a number of ethical questions that the scientific community will need to answer going forward. 

The goal of science is to produce knowledge. To keep this process productive, science is organized around a set of principles known as the “Mertonian norms”. One of these tenets is that ideas are evaluated on their own merit, regardless of who created them. Yet, at the same time, science is also a social system, and the community of scientists may rely on additional norms to create an inclusive environment and to regulate itself. Sometimes, these norms conflict.


There is evidence that the community gives less attention to (i.e., cites less) the work of scientists who have had an article retracted. Such a penalty may be seen as consistent with the Mertonian norms, as a retraction casts doubt on the validity of one’s work. Applying a similar penalty to the contributions of scientists who have egregiously violated social norms, however, runs afoul of the norm of universalism.


In a new study, Rainer Widmann, Michael E. Rose and Marina Chugunova try to answer the question of whether the scientific community not only sanctions “bad science”, but also “bad citizenship”. They focus on sexual misconduct, which sadly is a prevalent form of social norm violation in academia as in other fields.


In their analysis, they track citations to articles by alleged perpetrators published prior to the allegations and compare them to the citations received by other articles from the same journal issue. They find that the scientific community cites the prior work of alleged perpetrators less after allegations of sexual misconduct surface. Researchers who are very close to the perpetrator in the coauthorship network (e.g., former coauthors) react the strongest and reduce their citations the most. Compared to previously documented citation penalties for scientific misconduct, the magnitudes appear similar. Finally, the authors document that alleged perpetrators face palpable career consequences: they publish and collaborate less following the allegations, and they are more likely to quit academic research altogether.


There may be several reasons why authors may withhold citations. First, they may do so to penalize. This occurs even when there is a cost associated with punishing, which in the present context would be the deviation from the usual norm in referencing relevant prior work. Second, authors may not cite to avoid being seen as condoning sexual misconduct. This motive may be particularly relevant for researchers who are close to the alleged perpetrator. Third, peers may not separate academic and non-academic misconduct, or perceive that misconduct in the two domains is correlated.


The present study is the first to provide systematic evidence on the consequences of sexual misconduct for perpetrators. The findings raise a number of ethical questions that highlight the tension between the advancement of knowledge and the advancement of science as a social institution. Is the decline in citations to the perpetrator’s body of prior work an undue distortion of the scientific process or an appropriate penalty? Is the loss of scientific output due to excluding or penalizing alleged perpetrators acceptable? Are the documented career consequences adequate, taking also into account possible deterrence benefits for (future) victims? The results of the new study provide a new basis for a discussion of these important issues.


Further information:


Directly to the publication Allegations of Sexual Misconduct, Accused Scientists, and Their Research

Max Planck Institute for Innovation and Competition Research Paper No. 22-18

Study  |  07/22/2022

How Amateur Genealogists Support Research – A Citizen Science Project

Together with Germany’s largest association for family research, the Verein für Computergenealogie, the Institute is conducting a digitization project to collect data with the help of amateur genealogists. The data from over 100 volumes of annual directories of writings published at German universities and higher education institutions open up many new, exciting research questions.

Senior Research Fellow Michael E. Rose, Ph.D., scanning a directory
Entry of the dissertation of Fritz Haber in the editing mask
Original entry of the dissertation by Hilde Mangold

Citizen Science thrives on the interaction between citizens and researchers. The interest in cooperation is steadily growing. Well-known projects in the environmental area include bird counting and bee observation.


Since December 2021, the MPI for Innovation and Competition has been cooperating with the German Association for Computer Genealogy (CompGen) on a data project to record the annual directories of publications at German universities and higher education institutions. The directories, published between 1885 and 1987, first by the Royal Library in Berlin and later by the German Library in Leipzig, cover 103 volumes. They list mainly dissertations and postdoctoral theses written at German universities and higher education institutions. The directories were subsequently discontinued in this form; an attempt at a digital continuation failed.


For citizens who conduct genealogical or family research, the lists, some of which contain rich biographical information, are interesting because they hope to encounter ancestors, namesakes, or people from their town or region. Birgit Casper, who is working on the project, reports on her motivation for collecting the data: “I know two doctors in my family. Of one, born in 1891, I know pretty much where he studied and that he submitted his dissertation ‘On cases of poisoning with American worm seed oil’ to the medical faculty in Rostock in 1920. Of the other, born in 1892, I only know where he practiced medicine as of 1924. I do not know where he studied, nor when and on what subject he did his doctorate. Here I am waiting for the corresponding volume.”


For scientists, the lists are intriguing because they provide a complete overview of researchers who were educated at German universities since 1885 and some of whom were internationally important. Since German universities were internationally leading in almost all disciplines at the turn of the century, the project promises particularly interesting insights. We find the dissertations of numerous later Nobel Prize winners, such as Walther Nernst, who received the Nobel Prize in Chemistry in 1920 and was on the board of directors of the Kaiser Wilhelm Institute (KWI) for Physics, as well as Werner Heisenberg, who later gave his name to the subsequent MPI for Physics, and also Maria Goeppert-Mayer, who was awarded the Nobel Prize in Physics in 1963 as the first female German Nobel Prize winner.


The first regular doctorate for a woman will also be found in the lists. In fact, women were severely underrepresented at first. Only a few were allowed to earn a doctorate before 1900, and only with special permission. It was only between 1901 and 1908 that the German states successively admitted women to their universities. The right to pursue doctoral studies, however, was awarded by the faculties themselves. A systematic recording of all dissertations will thus generate a complete overview of when, at the latest, women were allowed to earn a doctorate at which universities and faculties. The right to habilitation – the path to professorship – was given to them even later: Here, too, the lists can help shed light on the situation.


How does the collaboration between science and amateurs work in the project?


To find volunteers who want to work on the project, CompGen publishes calls and updates on Twitter and in the blog on the CompGen website. On a special wiki page for the project, volunteers can register, learn about the editorial guidelines, and start editing data right away.


Michael E. Rose, Senior Research Fellow at the MPI for Innovation and Competition, who leads the project and is active in the field of Science of Science, is gradually scanning the directories.


The lists are then captured with a text-recognition program and roughly segmented: what is the first name, the last name, the title of the dissertation, the date of the defense, other details? The volunteers use the infrastructure provided by the association (input mask and data repository) to proofread and complete the entries manually. The entered records are immediately available for search queries. So far, seven annual directories have been processed. After completion of the project, the lists, which are linked, for example, to the German National Library, Wikipedia, and Scopus, a multidisciplinary abstract and citation database for research literature, will be publicly available as research data.
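The rough segmentation step can be pictured with a small sketch. The example below is purely illustrative: the entry layout, field names, and regular expression are assumptions for this sketch, not the project’s actual tooling, but they show how a raw OCR line might be split into candidate fields before volunteers proofread and complete them in the input mask.

```python
import re

# Hypothetical entry layout (the real volumes vary):
#   "Haber, Fritz: Ueber einige Derivate des Piperonals. Diss. Heidelberg 1891."
ENTRY = re.compile(
    r"(?P<last>[^,]+),\s*"    # last name up to the first comma
    r"(?P<first>[^:]+):\s*"   # first name(s) up to the colon
    r"(?P<title>.+?)\.\s*"    # dissertation title
    r"Diss\.\s*"              # record-type marker
    r"(?P<place>\D+?)\s*"     # university town
    r"(?P<year>\d{4})\.?$"    # year of the defense
)

def segment(line: str) -> dict:
    """Return a dict of candidate fields, or an empty dict if no match."""
    m = ENTRY.match(line.strip())
    return m.groupdict() if m else {}

record = segment(
    "Haber, Fritz: Ueber einige Derivate des Piperonals. Diss. Heidelberg 1891."
)
print(record["last"], record["year"])  # → Haber 1891
```

Lines that the pattern cannot parse would simply fall through as unsegmented text for manual entry, which mirrors the division of labor described above: the program does the rough cut, the volunteers do the verification.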


One of the best-known personalities recorded so far is Fritz Haber, who, as founding director, headed the KWI for Physical Chemistry and Electrochemistry in Berlin for 22 years, which is now named after him. His dissertation, „Ueber einige Derivate des Piperonals“ (On some derivatives of piperonal), a fungicidal fragrance, is found in volume VI (1890/91). Fritz Haber received the Nobel Prize in 1919, awarded for the year 1918, for his research on the catalytic synthesis of ammonia, i.e., in a different field of research from his dissertation.


Max von Laue, on the other hand, who completed his doctorate under Max Planck in 1903 with the thesis „Über die Interferenzerscheinungen an planparallelen Platten“ (On interference phenomena in plane-parallel plates), continued to pursue the research begun with his dissertation – until he was awarded the Nobel Prize in 1914 for his work on X-ray interference.


However, not all doctoral graduates received the recognition they deserved. Hilde Mangold’s research in the field of embryology led to a Nobel Prize in 1935 for her doctoral advisor Hans Spemann, who was director at the KWI for Biology in Berlin-Dahlem during the First World War. Mangold herself died in a fire shortly after defending her dissertation in 1924. At least the prize-winning discovery, the Spemann organizer, is sometimes called the Spemann-Mangold organizer.


Due to their depth of detail and completeness, the data digitized in the project allow for numerous exciting research questions. Can we read problems of an era from law dissertations? How do the demographics and social origins of doctoral students change over time and at individual universities? Who were the women who were the pioneers in earning a doctorate? What is the relationship between dissertations and patent activity?


Before that, however, the dataset must be completed, and every hand and pair of eyes is still needed to accomplish this. More information is available at https://wiki.genealogy.net/Hochschulschriften.

Study  |  05/01/2022

Ruled by Robots – How Do Humans Perceive Technology-Assisted Decision-Making?

Algorithms and Artificial Intelligence (AI) have become an integral part of decision-making. Would people prefer to have moral decisions that affect them made by a human or an algorithm? In a new study, this and other questions were investigated in a laboratory experiment.

As technology-assisted decision-making becomes more prevalent, it is important to understand how the algorithmic nature of the decision-maker influences how affected people perceive these decisions. The application of algorithmic aids spans from prediction decisions of various kinds, for example, whom to hire, and what salary to offer, to moral decisions with no objectively correct solution, such as how to distribute a bonus within a team fairly.


The authors Marina Chugunova and Wolfgang J. Luhan (University of Portsmouth) use a laboratory experiment to study the preference for human or algorithmic decision-makers in redistributive decisions. Redistributive decisions can be seen as a type of moral decision, where the definition of what is correct or fair depends on the observer’s personal ideals and beliefs. In particular, the authors consider whether an algorithmic decision-maker is preferred because of its unbiasedness. Determining which decision-maker is preferred and whose decisions are perceived as fairer can potentially improve the acceptance of decisions or policies, and with it, compliance.


The Experiment


In the experiment, the main aim was to create a situation in which participants’ preference for either a human or an algorithmic decision-maker to redistribute income was observable. First, participants individually earned their initial income by completing three tasks. The three tasks mimicked three potential determinants of income that are central to major fairness theories: luck, effort, and talent. Then the players were matched into pairs and had to choose a decision-maker: either an algorithm or a third-party human. The decision-maker decided how to redistribute the pair’s total earnings between the two members. To test the role of bias, a laboratory-induced source of potential discrimination for the human decision-maker was introduced. Finally, the participants learned the decision and had to report their satisfaction and rate how fair the particular redistribution decision was.


The Findings


Contrary to previous findings, the authors find that the majority of participants – over 60% – prefer the algorithm as a decision-maker over a human. Yet this is not driven by concerns about biased human decisions. Despite the preference for algorithmic decision-makers, the decisions made by humans are rated more favorably. Subjective ratings of the decisions are driven mainly by participants’ own material interests and fairness ideals. As far as fairness ideals are concerned, the players in the experiment show remarkable flexibility: they tolerate any explainable deviation between the actual decision and their own ideals, and they are satisfied with, and consider fair, any redistribution decision that follows some fairness principle, even if it is not their own. Yet they react strongly and negatively to redistribution decisions that fit no fairness ideal.


The Conclusion


The results of the study suggest that even in the realm of moral decisions algorithmic decision-makers might be preferred over human decision-makers, but the actual performance of the algorithm plays an important role in how the decisions are rated. To “live up to the expectations” and increase the acceptance of these AI decisions, the algorithm has to consistently and coherently apply fairness principles.


Directly to the publication of the study:


Marina Chugunova, Wolfgang J. Luhan
Ruled by Robots: Preference for Algorithmic Decision Makers and Perceptions of Their Choices
Max Planck Institute for Innovation & Competition Research Paper No. 22-04

Study  |  02/04/2022

Does Copyright in the Academic Sector Need to be Redefined?

In a new comparative study, Valentina Moscon and Marco Bellia examine the copyright regulations for academic publishing in Italy, Germany and the USA. In their work, they introduce well-known approaches and arrive at a proposal to make the scientific publication system fairer and more efficient.

A long-running discussion about copyright in academic publishing has highlighted both the role of copyright and its dysfunctional effects.


The interests of commercial publishers and other information providers differ from those of academic authors, with the former usually pursuing a strategy of profit maximization, while the latter want to ensure broad access, open and timely dissemination, and reuse of scientific results. Moreover, third parties usually fund research, so academic authors do not primarily rely on income from publications – researchers publish primarily to enhance their reputation and advance their careers.


In this context, the contribution by Valentina Moscon (Max Planck Institute for Innovation and Competition) and Marco Bellia (Università Cattolica del Sacro Cuore) draws attention to new models that promise a fairer and more efficient scholarly publishing system. After reviewing the legal background in Italy, Germany, and the United States, the authors consider various possible interventions, some of which have already been adopted at the national level. These measures may be private interventions, such as university contracts and policies, or public, i.e., legislative interventions. The latter include measures outside or inside the copyright system.


“International Instrument” as a model for a fair solution


The authors conclude that the best solution is to redefine the boundaries of copyright by broadening the scope of permitted uses while defining them more precisely. This would lead to a more balanced functioning of the academic publishing system. One proposal in this direction comes from a group of copyright experts, including Valentina Moscon, who have drafted the International Instrument on Permitted Uses in Copyright Law. This instrument, conceived in the form of an international treaty, aims to create a more balanced system for the scope of international copyright protection. Among other provisions, it contains explicit rules for permissible uses in academia, including uses in the context of research, data analysis, educational purposes, and for the processing of works by libraries, museums, and archives.


To the publication:

Marco Bellia, Valentina Moscon
Academic Authors, Copyright and Dissemination of Knowledge: A Comparative Overview
Max Planck Institute for Innovation & Competition Research Paper No. 21-27

Study  |  10/01/2021

What Do Lab Disasters Tell Us about the Importance of Physical Capital in Knowledge Production?

Prior research has largely focused on the important role of human capital in the production of knowledge. Now, a new study investigates the role of physical capital in knowledge production using lab disasters, like explosions, fires, and floods, as a natural experiment. The results provide important insights for science and innovation policy.

The authors establish the importance of physical capital in knowledge production. To this end, they exploit adverse events (explosions, fires, floods, etc.) at research institutions as exogenous shocks to physical capital. Scientists experience a substantial and persistent reduction in research output if they lose specialized physical capital, that is, equipment and material they created over time for a particular research purpose. In contrast, they recover quickly if they lose only generic physical capital. Affected scientists in older laboratories, who presumably lose more obsolete physical capital, are more likely to change their direction of research and to recover in scientific productivity. These findings suggest that a scientist’s investments in their own physical capital yield lasting returns but also create path dependence in research direction.


The study suggests that science and innovation policy should give more consideration to the role of physical capital in knowledge production.


Directly to the publication by
Stefano Baruffaldi and Fabian Gaessler
The Returns to Physical Capital in Knowledge Production: Evidence from Lab Disasters
Max Planck Institute for Innovation & Competition Research Paper No. 21-19

Study  |  08/03/2021

Protection of Geographical Indications: Further Steps in the GI Research Agenda of the Institute

Since the effects of Geographical Indications (GIs) have not been sufficiently researched so far, the Institute launched a GI Initiative in 2018. Since then, a Research Group has been investigating different approaches to the protection of GIs in the European Union and in Latin American countries.

Geographical Indications are designations for products from certain geographical areas that owe their quality or reputation to their geographical origin. Because they indicate specific products’ qualities, they tend to receive more attention in the market and command higher prices. Thus, Geographical Indications are important tools to promote economic development, especially in rural areas.


Despite increased attention to GIs from policymakers as well as industry, there has been little legal research on the topic so far. The Institute, which has been researching the subject for many years, launched a research agenda in 2018 that is dedicated to exploring Geographical Indications in depth. The initiative looks in two directions: the overall functioning of the GI system in the European Union and the potential of GI systems in Latin America.


Overall EU System Assessment


A unitary protection scheme for Geographical Indications for agricultural products and foodstuffs has existed in the European Union since 1992. Two types of Geographical Indications are distinguished: so-called Protected Geographical Indications (PGIs) and Protected Designations of Origin (PDOs). Both types of designations enjoy the same scope of protection but have different registration and maintenance requirements.


Although the European GI system has proven itself in practice, there is a need to better understand its overall functioning over the last three decades. For that purpose, a research team of five scholars undertook a comprehensive quantitative and qualitative analysis of the available data.


First, the team conducted a statistical analysis of all PGIs and PDOs registered between 1996 and 2019 under the EU protection scheme for agricultural products and foodstuffs. The data source for this analysis was the so-called “Single Document”. As the core of every protection application, it includes, inter alia, a definition of the geographical area, a description of the method of production, and details on the so-called origin link – that is, the causal link between the product and the geographical area. Further research on bakery products and potatoes from selected countries also drew on the full specification. The research revealed a significant improvement over time in the quality and accuracy of the information provided in these documents, in particular about the link between the geographical region and the product.


Though the requirements for obtaining GI protection and the main procedural rules are unified within the EU, the national authorities are also involved in the registration process. Further investigation on selected countries’ national rules and related procedures revealed that national approaches and idiosyncrasies could impair the functioning of a uniform protection system.


In 2018, the European Commission announced its intention to extend the current EU GI protection system to non-agricultural products. However, for the time being, protection for those products is granted only at the national level. In anticipation of an EU proposal, the researchers looked into some of the national protection schemes in order to investigate whether the current EU system, with its distinction between PGIs and PDOs, would also be a good fit for the non-agricultural sector. The findings indicate that an expansion of the current system – with some procedural adaptations – might work.


As the next step of the Research Project, it is planned - in cooperation with the University of Alicante and the EUIPO - to investigate the interface between the GI system and the trademark system, including collective and certification marks.


Survey on GI Systems in Latin America


Because of their potential to promote economic and social development, quality differentiation systems are particularly important for Latin American countries. Origin-based production, including manufacturing, handicraft, and especially food production, is essential to their economies, in particular to small producers, craftspeople, and family farmers. In this regard, though many products from Latin America are well suited for GI protection, the integration of local needs, cultural traditions, and social aspects requires further research. Identifying other available distinctive signs, and investigating the interface, strengths, and weaknesses of each, would help to better understand the overall system.


Moreover, the fact that the protection of GIs has been increasingly the subject of Free Trade Agreements (FTAs) involving Latin American countries may restrict their leeway for the determination of national and regional policies. Thus, further investigation on FTA commitments may reveal implementation challenges at the national level.


The Institute’s “Smart IP for Latin America” (SIPLA) Initiative, launched in 2018, defined GIs as an area in need of investigation. Therefore, the first step within the SIPLA research project on “Collective Distinctive Signs” was an investigation of GIs in a comparative legal assessment of the systems of nine selected countries in the region. Because of the amount of information the team was interested in, a comprehensive questionnaire was designed by the SIPLA team and was answered by representatives from Argentina, Brazil, Chile, Colombia, Costa Rica, Mexico, Paraguay, Peru, and Uruguay. The questionnaire was focused mainly on the GI protection systems and other distinctive signs available. It included a request for information regarding national, regional and local legislation if applicable, and case law.


A comprehensive “General Comparative Report” was built on the information obtained via the questionnaire and on an analysis of the FTAs signed by the selected countries. Finally, common elements were identified across the different national and regional systems. From these elements, at least two possible areas of future research emerge. The first concerns distinctive signs other than GIs – especially signs for collective use that can benefit family farmers and small producers. The second envisages further research on GIs’ level of protection, focusing on the incorporation of TRIPS standards and FTA commitments at the national and regional levels.


More information on the Research Initiative can be found in the ePaper of the current Activity Report.

Study  |  07/29/2021

Truly Standard-Essential Patents? An Automated Semantics-Based Analysis

The identification of standard-essential patents (SEPs) poses a considerable challenge for scholars, practitioners, and policymakers. A new study introduces a semantics-based approach to evaluate the claimed standard essentiality of declared patents.

Automated analysis of text similarity between patents and standards

SEPs have become a key element of technical coordination in standard-setting organizations. Yet, it remains unclear whether a declared SEP is truly standard-essential. Strategic incentives may influence patent holders in their decision to claim standard essentiality. This may cause legal and contractual frictions during standard-setting and subsequent licensing negotiations. The new study by Lorenz Brachtendorf, Fabian Gaessler and Dietmar Harhoff addresses this issue and introduces an automated semantics-based method to approximate the standard essentiality of patents.


Manual assessments of SEPs typically require substantial technical knowledge and effort. In contrast, the introduced method is simple and inexpensive to use. The scalable, objective, and replicable approach allows for various practical applications. The authors illustrate its usefulness by estimating the share of true SEPs in firm patent portfolios for several mobile telecommunication standards. The results reveal firm-level differences that are statistically significant and economically substantial.
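The study’s own pipeline is not reproduced here, but the core idea – scoring the textual overlap between a patent claim and passages of a standard, and reading high similarity as a signal of possible essentiality – can be pictured with a minimal bag-of-words cosine similarity. The texts below are invented for illustration, and this sketch is not the authors’ actual method, which is considerably more sophisticated.

```python
import math
import re
from collections import Counter

def tokens(text: str) -> Counter:
    """Lowercase bag-of-words representation of a text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Toy texts: one patent claim, two passages from a hypothetical standard.
claim = "A method for decoding a received radio signal using turbo codes"
sec1 = "The receiver shall decode the radio signal with a turbo decoder"
sec2 = "Billing records shall be stored for each subscriber account"

s1 = cosine(tokens(claim), tokens(sec1))
s2 = cosine(tokens(claim), tokens(sec2))
# The claim scores higher against the decoding passage than against the
# unrelated billing passage, flagging sec1 as the more plausible match.
```

In practice, such scores would be computed over full claim and specification texts at scale, which is what makes an automated approach replicable and inexpensive compared to manual expert assessment.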


Beyond practical applications, the method may also provide insights of policy relevance. For instance, it can be used to examine whether certain policies achieve their goal of mitigating patent-related frictions in the standard-setting and implementation process.


This study makes an important contribution towards facilitating ex ante coordination between technology contributors and implementers of technical standards. This is of particular relevance as standardized solutions for information and communications technologies have become an important aspect of technological innovation and are ubiquitous across industries. The study will soon be presented at the USPTO 14th Annual Conference on Innovation Economics and at EPIP 2021.


See the project poster.


See the detailed project description in the Activity Report 2018 - 2020.


Hear the EPO Podcast – Talk Innovation “Research into Patents – Drilling Deeper on the Standard Essentiality of Patents” with Dietmar Harhoff.


Publications


Brachtendorf, Lorenz; Gaessler, Fabian; Harhoff, Dietmar (2020). Approximating the Standard Essentiality of Patents – A Semantics-Based Analysis. Final Report for the European Patent Office Academic Research Programme.


Brachtendorf, Lorenz; Gaessler, Fabian; Harhoff, Dietmar (2020). Truly Standard-Essential Patents? A Semantics-Based Analysis. CEPR Discussion Paper No. DP14726 and CRC Discussion Paper No. 265.

Study  |  04/21/2021

Research Group Develops Analysis on Artificial Intelligence and IP Rights

The increasing use of Artificial Intelligence (AI) has the potential to alter the parameters of the existing IP system. In an in-depth study, a Research Group of the Institute’s legal departments presents a broad overview of issues arising at the intersection of AI and IP law.

The Research Group “Regulation of the Digital Economy” is investigating the effects of Artificial Intelligence on Intellectual Property Law, Photo: Myriam Rion

The more Artificial Intelligence (AI) shapes the digital economy, the more insistently questions arise on the interplay of AI and intellectual property rights. To fully realize its potential for fostering innovation and welfare, AI needs an appropriate legal framework, which also includes property rights.


So far, the political and legal discussion has focused primarily on the output; more precisely, what is generated by the use of, or at least with the support of, Artificial Intelligence. To evaluate whether the existing IP system can still fulfill its function within the parameters of this fast-moving technology, a more holistic view is necessary. Particular consideration must be given to the individual steps of an AI-driven innovation cycle in which IP rights may play a role.


Comprehensive analysis


Against this backdrop, the Research Group “Regulation of the Digital Economy” of the Institute’s legal departments led by the two Directors Josef Drexl and Reto M. Hilty has developed a comprehensive analysis. The paper identifies potential issues that could arise at the intersection of Artificial Intelligence and IP rights and introduces different directions in which solutions can be found.


The structure of the analysis is based on the three levels that need to be distinguished with regard to innovation or creation processes. First, issues related to the input required for the development of AI systems are investigated. The second part of the paper examines protection of AI Tools, while the third part focuses on property rights for AI-generated or AI-aided output.


Focus on European IP law


The analysis focuses on substantive European IP law, in particular copyright, patent and design law, as well as the sui-generis protection of databases and the protection of trade secrets. Trade secrets can already play a role on the input side, but they are especially important with regard to AI as a tool, since the traditional IP systems hardly appear suited to the particularities that need to be considered. Property rights, however, matter primarily with regard to what is generated using AI; this also raises aspects such as the allocation of rights and, where applicable, the scope of protection.


The paper builds on insights that the Research Group has already gained in previous studies, especially with regard to the technical context. On this basis, it identifies those questions that require further – especially interdisciplinary – research. Overall, the paper emphasizes the need for a more holistic view, especially given that various IP rights play a role and may overlap in AI-driven innovation or creation.


The complete Position Statement “Artificial Intelligence and Intellectual Property Law” can be found here.

Study  |  02/26/2021

Find Your Academic Doppelgänger! How to Build Scientists Control Groups With Sosia

Econometric analysis in Economics of Science and Innovation often requires control groups. The identification of such a population often constitutes a daunting data effort. The python package sosia simplifies and automates the search in the Scopus database.

Michael E. Rose and Stefano H. Baruffaldi

Econometric analysis in Economics of Science and Innovation often requires control groups. These control groups need to have observable characteristics similar to those of a sample of researchers of interest. There are specific methodologies and tools to assist econometricians in the matching exercise. However, the identification of such a population often constitutes a daunting data effort, which may prove impossible for samples of scientists spanning multiple fields, institutions, or countries. The python package sosia (Italian for “doppelgänger”) aims to simplify and automate the search for comparable researchers in the Scopus database.
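
The underlying matching idea can be sketched in a few lines of plain Python. This is a simplified illustration of matching on observables, not sosia's actual API; the field names, margins, and data are illustrative assumptions:

```python
# Sketch of observable-based matching, as sosia automates it against Scopus.
# All field names and tolerance margins here are illustrative assumptions.

def find_doppelgangers(focal, candidates, pub_margin=0.1, cite_margin=0.1):
    """Return IDs of candidates whose first publication year equals the
    focal scientist's and whose publication and citation counts lie
    within the given relative margins."""
    matches = []
    for c in candidates:
        if c["first_year"] != focal["first_year"]:
            continue  # require the same career start
        if abs(c["n_pubs"] - focal["n_pubs"]) > pub_margin * focal["n_pubs"]:
            continue  # require similar productivity
        if abs(c["n_cites"] - focal["n_cites"]) > cite_margin * focal["n_cites"]:
            continue  # require similar visibility
        matches.append(c["id"])
    return matches

focal = {"id": "A", "first_year": 2010, "n_pubs": 20, "n_cites": 100}
candidates = [
    {"id": "B", "first_year": 2010, "n_pubs": 21, "n_cites": 105},
    {"id": "C", "first_year": 2008, "n_pubs": 20, "n_cites": 100},
    {"id": "D", "first_year": 2010, "n_pubs": 35, "n_cites": 400},
]
print(find_doppelgangers(focal, candidates))  # → ['B']
```

The data effort that sosia automates lies in assembling these observables for thousands of candidates from Scopus in the first place; the filtering step itself is conceptually simple.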


See the publication by

Michael E. Rose and Stefano H. Baruffaldi
Finding Doppelgängers in Scopus: How to Build Scientists Control Groups Using Sosia
Max Planck Institute for Innovation & Competition Research Paper No. 20-20

Study  |  07/28/2020

Identifying and Measuring Artificial Intelligence – Making the Impossible Possible

Researchers of the Institute and the OECD have published a new study on how to identify and measure AI-related developments in science, algorithms and technologies. Using information from scientific publications, open source software (OSS) and patents, they find a marked increase in AI-related developments over recent years. The growing role of China in the AI space emerges throughout.

Artificial Intelligence (AI) is a term commonly used to describe machines performing human-like cognitive functions (e.g., learning, understanding, reasoning, and interacting). AI is expected to have far-ranging economic repercussions, as it has the potential to revolutionize production, to influence the behavior of economic actors and to transform economies and societies.


The vast potential of this (now considered) general purpose technology has led OECD countries and G20 economies to agree on key principles aimed at fostering the development of ethical and trustworthy AI. The practical implementation of such principles nevertheless requires a common understanding of what AI is and is made of, in terms of both scientific and technological developments, as well as possible applications.


Addressing the challenges inherent in delineating the boundaries of such a complex subject matter, the study proposes an operational definition of AI, based on the identification and measurement of AI-related developments in science, algorithms and technologies. The analysis draws on information contained in scientific publications, open source software and patents.
 

Approach of the study


The three-pronged approach of the study relies on an array of established bibliometric and patent-based methods, and is complemented by an experimental machine learning (ML) approach implemented on purposely collected open source software data:
 

  • The identification of the science behind AI developments builds on a bibliometric two-step approach, whereby a first set of AI-relevant keywords is extracted from scientific publications classified as AI in Elsevier’s Scopus® database. This set is then augmented and refined using text mining techniques and expert validation.
  • As AI is ultimately implemented in the form of algorithms, the authors use information about open source software commits (i.e., contributions) posted on GitHub (an online hosting platform) to track AI-related software developments and applications. These data are combined with information from papers presented at key AI conferences to identify “core” AI repositories. Machine learning techniques trained on this core set are then used to scan the whole set of software contributions on GitHub and identify all AI-related repositories.
  • Information contained in patent data serves to identify and map AI-related inventions and new technological developments embedding AI-related components. Text mining techniques are used to search abstracts and patent documents referring to AI-related papers.
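
The bibliometric two-step approach in the first bullet can be sketched as follows. This is a hypothetical, heavily simplified illustration: seed terms are collected from abstracts already labelled as AI, and a new document is flagged as AI-related if enough of those terms appear. The thresholds and example texts are assumptions, not the study's actual lexicon or classifier:

```python
# Hypothetical sketch of the two-step keyword approach: (1) extract frequent
# terms from AI-labelled abstracts, (2) flag new documents containing enough
# of those terms. Thresholds and texts are illustrative assumptions.
from collections import Counter

def extract_keywords(labelled_abstracts, top_n=5):
    """Step 1: collect the most frequent terms in AI-labelled abstracts."""
    counts = Counter(
        word for abstract in labelled_abstracts for word in abstract.lower().split()
    )
    return {word for word, _ in counts.most_common(top_n)}

def is_ai_related(abstract, keywords, min_hits=2):
    """Step 2: flag a document if it contains enough AI keywords."""
    return sum(1 for word in abstract.lower().split() if word in keywords) >= min_hits

seed = ["neural network training", "deep neural network", "network learning"]
kw = extract_keywords(seed)
print(is_ai_related("A deep neural network approach", kw))  # → True
print(is_ai_related("economic policy study", kw))           # → False
```

The study's actual pipeline additionally refines the keyword set with text mining and expert validation, which this sketch omits.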
     

Selected findings of the study
 

  • The authors find an acceleration in the number of AI publications in the early 2000s, followed by steady growth of 10% a year on average until 2015, and a renewed acceleration to 23% a year since then. The share of AI-related publications rose to over 2.2% of all publications in 2018.
  • 28% of the world’s AI-related papers published in 2016–18 belong to authors with affiliations in China. Over time, the shares of AI publications originating from the EU28, the United States and Japan have declined compared to the levels observed ten years earlier.
  • Since 2014, the number of open-source software repositories related to AI has grown about three times as much as the rest of open-source software.
  • There is a marked increase in the proportion of AI-related inventions in the total number of inventions after 2015. This ratio averaged more than 2.3% in 2017.
  • “Neural networks” and “image processing” are the most frequent terms appearing in the abstracts of AI-related patents.
  • In AI-related patents, the contribution of China-based inventors has grown more than sixfold since the mid-2000s, reaching nearly 13% in the mid-2010s.

For more facts and detailed information, see the publication:
 

Stefano Baruffaldi, Brigitte van Beuzekom, Hélène Dernis, Dietmar Harhoff, Nandan Rao, David Rosenfeld, Mariagrazia Squicciarini (2020).
Identifying and Measuring Developments in Artificial Intelligence: Making the Impossible Possible.
OECD Science, Technology and Industry Working Papers No. 2020/05.


Stefano Baruffaldi is Affiliated Research Fellow in the department Innovation and Entrepreneurship Research and Assistant Professor at the University of Bath.

Dietmar Harhoff is director at the Max Planck Institute for Innovation and Competition.