Understanding the factors that affect the direction of innovation is a central aim of research in the economics and strategic management of innovation. Progress on this topic has been inhibited by the difficulty of measuring the location and movement of innovation in ideas space. We introduce and explore an approach based on an unsupervised machine learning technique, the Hierarchical Dirichlet Process (HDP), that flexibly generates categories from a corpus of text and enables the calculation of distance and movement in ideas space. We apply the approach to patent abstracts from the period 2000-2018 and demonstrate that, relative to the USPTO taxonomy of patent classes, it provides a leading indicator of shifts in innovation topics and enables a more precise analysis of movement in ideas space.
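To make "distance and movement in ideas space" concrete, the following is a minimal sketch, not the method used in this work. It assumes each patent abstract has already been assigned a vector of topic weights (for example by an HDP topic model) and measures movement as the Jensen-Shannon distance between the average topic mixtures of consecutive years. The distance measure, the function names, and the toy weights are all illustrative assumptions; the abstract does not state which measure is used.

```python
import math
from collections import defaultdict


def js_distance(p, q):
    """Jensen-Shannon distance (base-2 logs) between two distributions; lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence; terms with zero mass contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return math.sqrt(max(0.0, 0.5 * kl(p, m) + 0.5 * kl(q, m)))


def yearly_centroids(docs):
    """Average the topic mixtures of all documents filed in the same year."""
    by_year = defaultdict(list)
    for year, weights in docs:
        by_year[year].append(weights)
    return {
        year: [sum(col) / len(rows) for col in zip(*rows)]
        for year, rows in by_year.items()
    }


# Hypothetical topic mixtures (assumption): (filing year, weights over 3 topics).
docs = [
    (2000, [0.8, 0.1, 0.1]),
    (2000, [0.6, 0.3, 0.1]),
    (2001, [0.2, 0.7, 0.1]),
    (2001, [0.2, 0.5, 0.3]),
]

centroids = yearly_centroids(docs)
# Movement in ideas space between the two years, under the assumptions above.
movement = js_distance(centroids[2000], centroids[2001])
print(f"movement 2000 -> 2001: {movement:.3f}")
```

Because the topic mixtures are continuous, this kind of measure can register gradual drift that a fixed class taxonomy would only show once a patent is assigned to a different class.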
Contact Person: Michael E. Rose