Michael J. Andrews
I provide a primer on six recent large‐scale historical patent data sets for use in innovation research. I discuss how each data set is constructed, the types of patent information each includes, and the quality and completeness of each. Throughout, I emphasize where our knowledge of the history of invention depends on the data source used, and I recommend which data set is likely to be best suited to different contexts. Overall, these data sets paint a remarkably consistent picture of the history of U.S. invention. When the data sets do disagree, the differences tend to be minor, although I highlight some important exceptions. I further describe several “niche” historical patent data sets that allow researchers to study institutional contexts that cannot be studied using modern data. Finally, I discuss features of patent data that are available for modern patents but not for historical ones.