Revise & Resubmits
"Scientific Talent Leaks Out of Funding Gaps". R&R at The Review of Economics and Statistics (with Stephanie Cheng, Elisabeth Perlman, and Wei Yang Tham)
Abstract We study how delays in NIH grant funding affect the career outcomes of research personnel. Using comprehensive earnings and tax records linked to university transaction data along with a difference-in-differences design, we find that a funding interruption of more than 30 days has a substantial effect on job placements for personnel who work in labs with a single NIH R01 research grant, including a 3 percentage point (40%) increase in the probability of not working in the US. Incorporating information from the full 2020 Decennial Census and data on publications, we find that about half of those induced into nonemployment appear to permanently leave the US and are 90% less likely to publish in a given year, with even larger impacts for trainees (postdocs and graduate students). Among personnel who continue to work in the US, we find that interrupted personnel earn 20% less than their continuously-funded peers, with the largest declines concentrated among trainees and other non-faculty personnel (such as staff and undergraduates). Overall, funding delays account for about 5% of US nonemployment in our data, indicating that they have a meaningful effect on the scientific labor force at the national level.
Coverage: Marginal Revolution; Good Science Project; Nature
Other Versions: NBER Summer Institute Science of Science Funding (July 19, 2024); Center for Economic Studies (CES) Working Paper Series; NBER Summer Institute Innovation (July 19, 2022)
Publications
"Cutting the Innovation Engine: How Federal Funding Shocks Affect University Patenting, Entrepreneurship, and Publications". (2023) The Quarterly Journal of Economics, https://doi.org/10.1093/qje/qjac046 (with Tania Babina, Alex He, Sabrina Howell, and Elisabeth Perlman)
Abstract This paper studies how federal funding affects the innovation outputs of university researchers. We link person-level research grants from 22 universities to patent, publication, and career outcomes from the U.S. Census Bureau. We focus on the effects of large, idiosyncratic, and temporary cuts to federal funding in a researcher's pre-existing narrow field of study. Using an event-study design that controls for principal investigator fixed effects, we document that these negative federal funding shocks reduce high-tech entrepreneurship and publications but increase patenting. The lost publications tend to be higher quality and more basic, while the additional patents tend to be lower quality, less general, and more often privately assigned. Overall, the federal funding cuts push researchers away from more open research with greater impact on future knowledge, and towards more subsequently appropriated research. The level of funding explains the effects on publications, while the source of funding—federal vs. private—appears to play an important role in the effects on high-tech entrepreneurship and patents. Together with evidence from industry contracts, the results suggest that shifting university research funding from federal to private sources leads to more appropriation of intellectual property by corporate sponsors.
Other Versions: NBER WP (December 2020), SSRN (November 2020), NBER Summer Institute Science of Science Funding (July 16, 2020), SSRN (March 2020)
"Publish or Perish: Selective Attrition as an Unifying Explanations for Patterns in Innovation over the Career". (2023) The Journal of Human Resources, https://doi.org/10.3368/jhr.59.2.1219-10630R1 (with Huifeng Yu, Gerald Marschke, Matthew Ross, and Bruce Weinberg)
Abstract For nearly 150 years, researchers from many disciplines have studied how the quantity and quality of research output vary over the career but have reached conflicting answers. A definitive answer directly informs federal research policy and increases our understanding of a broader set of questions related to human capital accumulation over the career. Here, we study 5.6 million articles published in biomedical science between 1980 and 2009 and use rich characterizations of citations and text to measure the quality of articles. We provide evidence of selective attrition whereby low "ability" researchers stop publishing at earlier stages of their career, leaving high "ability" researchers to produce a growing share of publications. We find that controlling for selective attrition reconciles the long-standing conflicts in the existing literature. Specifically, the quality of research declines monotonically over the career for the average researcher, but this decline is masked in the cross-section because the authors publishing at later ages are the ones who produce the highest quality research. Our results have implications for efforts to shift funding from late- to early-career researchers: while such policies will provide more funding to researchers at the point when they are most creative, they must be undertaken carefully because young researchers are less "able" on average.
"Academic Entrepreneurship and Inequality: Evidence from Administrative Data". (2022) Proceedings of the 17th European Conference on Innovation and Entrepreneurship, Volume 17, Number 1, https://doi.org/10.34190/ecie.17.1.839
Abstract Over the past several decades, universities have increasingly emphasized knowledge and technology transfer. Faculty are key agents facilitating this transfer, engaging in commercial and entrepreneurial activities such as consulting, student placement, patenting, and the founding of start-ups. This paper documents the prevalence of faculty commercial engagement as well as the extent to which it widens earnings inequality among faculty. In contrast to previous work that uses surveys with low response rates to measure the commercial engagement of university faculty, this paper uses detailed administrative data from universities (UMETRICS) linked to confidential earnings data at the Internal Revenue Service (IRS) and U.S. Census Bureau (including the universe of W2 and 1099 tax records) to analyze how often university faculty engage in the types of commercial and entrepreneurial activity that catalyze knowledge/technology transfer.
"Mandating Access: Assessing the NIH's Public Access Policy". (2020) Economic Policy, Volume 35, Issue 102, April 2020, Pages 269–304, https://doi.org/10.1093/epolic/eiaa015
Other Versions: Munich Personal RePEc Archive (MPRA) (February 2019), SSRN (August 2018), Munich Personal RePEc Archive (MPRA) (November 2017)
Abstract In April 2008, the National Institutes of Health (NIH) implemented the Public Access Policy (PAP), which mandated that the full text of NIH-supported articles be made freely available on PubMed Central -- the NIH's repository of biomedical research. This paper uses 600 thousand NIH articles and a matched comparison sample to examine how the PAP impacted researcher access to the biomedical literature and publishing patterns in biomedicine. Though some estimates allow for large citation increases after the PAP, the most credible estimates suggest that the PAP had a relatively modest effect on citations, which is consistent with most researchers having widespread access to the biomedical literature prior to the PAP, leaving little room to increase access. I also find that NIH articles are more likely to be published in traditional subscription-based journals (as opposed to "open access" journals) after the PAP. This indicates that any discrimination the PAP induced, by subscription-based journals against NIH articles, was offset by other factors -- possibly the decisions of editors and submission behavior of authors.
"Occupational Classifications: A Machine Learning Approach". (2019) Journal of Economic and Social Measurement 44(2-3): 10.3233/JEM-190463 (with Akina Ikudo, Julia Lane, and Bruce Weinberg)
Other Versions: NBER WP (August 2018), Center for Economic Studies (CES) WP Series (August 2018), SSRN (August 2018)
Abstract Characterizing the work that people do on their jobs is a longstanding and core issue in labor economics. Traditionally, classification has been done manually. If it were possible to combine new computational tools and administrative wage records to generate an automated crosswalk between job titles and occupations, millions of dollars could be saved in labor costs, data processing could be sped up, data could become more consistent, and it might be possible to generate, without a lag, current information about the changing occupational composition of the labor market. This paper examines the potential to assign occupations to job titles contained in administrative data using automated, machine-learning approaches. We use a new extraordinarily rich and detailed set of data on transactional HR records of large firms (universities) in a relatively narrowly defined industry (public institutions of higher education) to identify the potential for machine-learning approaches to classify occupations.
"High-impact and transformative science (HITS) metrics: Definition, exemplification, and comparison". (2018) PLoS ONE 13(7): e0200597. 10.1371/journal.pone.0200597 (with Yu H, Light RP, Marschke G, Börner K, Weinberg BA)
Abstract Countries, research institutions, and scholars are interested in identifying and promoting high-impact and transformative scientific research. This paper presents a novel set of text- and citation-based metrics that can be used to identify high-impact and transformative works. The 11 metrics can be grouped into seven types: Radical-Generative, Radical-Destructive, Risky, Multidisciplinary, Wide Impact, Growing Impact, and Impact (overall). The metrics are exemplified, validated, and compared using a set of 10,778,696 MEDLINE articles matched to the Science Citation Index Expanded™. Articles are grouped into six 5-year periods (spanning 1983–2012) using publication year and into 6,159 fields constructed using comparable MeSH terms, with which each article is tagged. The analysis is conducted at the level of a field-period pair, of which 15,051 have articles and are used in this study. A factor analysis shows that transformativeness and impact are positively related (ρ = .402), but represent distinct phenomena. Looking at the subcomponents of transformativeness, there is no evidence that transformative work is adopted slowly or that the generation of important new concepts coincides with the obsolescence of existing concepts. We also find that the generation of important new concepts and highly cited work is more risky. Finally, supporting the validity of our metrics, we show that work that draws on a wider range of research fields is used more widely.
Book Chapters
"Automating Response Evaluation for Franchising Questions on the 2017 Economic Census". 2022. Big Data for 21st Century Economic Statistics (with Yifang Wei, Lisa Singh, Shawn Klimek, J. Bradford Jensen, and Andrew Baer)
Other Versions: Preliminary Book Draft (NBER), Center for Economic Studies (CES) Working Paper Series
Abstract Between the 2007 and 2012 Economic Censuses (EC), the count of franchise-affiliated establishments declined by 9.8%. One reason for this decline was a reduction in resources that the Census Bureau was able to dedicate to the manual evaluation of survey responses in the franchise section of the EC. Extensive manual evaluation in 2007 resulted in many establishments, whose survey forms indicated they were not franchise-affiliated, being recoded as franchise-affiliated. No such evaluation could be undertaken in 2012. In this paper, we examine the potential of using external data harvested from the web in combination with machine learning methods to mostly automate the process of evaluating responses to the franchise section of the 2017 EC. Our method allows us to quickly and accurately identify and recode establishments that have been mistakenly classified as not being franchise-affiliated, increasing the unweighted number of franchise-affiliated establishments in the 2017 EC by 22%-42%.
Working Papers
"A Tale of Two Fields? STEM Career Outcomes". (with Xuan Jian and Bruce Weinberg)
Abstract Is the labor market for US researchers experiencing the best or worst of times? This paper analyzes the market for recently minted Ph.D. recipients using supply-and-demand logic and data linking graduate students to their dissertations and W2 tax records. We also construct a new dissertation-industry "relevance" measure, comparing dissertation and patent text and linking patents to assignee firms and industries. We find large disparities across research fields in placement (faculty, postdoc, and industry positions), earnings, and the use of specialized human capital. Thus, it appears to simultaneously be a good time for some fields and a bad time for others.
Other Versions: NBER Working Paper; Center for Economic Studies (CES) Working Paper Series; SSRN
"Double-Pane Glass Ceiling: The Contribution of Commercial Engagement to the Faculty Gender Earnings Gap".
Abstract This paper analyzes the contribution of commercial engagement to gender earnings gaps among university faculty. Administrative data from universities (UMETRICS) linked to the universe of confidential W2 and 1040-C tax records allow me to precisely classify earnings sources and measure faculty commercial engagement. Female faculty are 20 pp less likely to commercially engage, with the entire gap driven by self-employment. The raw earnings gap is $63k (on a base of $162k), with non-university earnings accounting for $18k (29%). Thus, though university earnings account for most of the total gap, commercial engagement substantially expands it. Earnings gaps also exist for all components of non-university earnings, including self-employment and from incumbent, young/startup, high-tech, and low-tech firms. Large, though attenuated, gaps remain after accounting for hours worked as well as controls for publications, patents, field, university, scientific resources, age, marital status, children, and demographics. As faculty move up the earnings distribution, earnings gaps grow and commercial engagement becomes a more important contributor to the total gap. Gaps also grow throughout the career, starting very small at career outset. Finally, while both genders commercially engage with similar industries, male faculty earn more across all common industries.
"Best and Brightest? The Selectivity of Foreign-Born Ph.D. Recipients in the U.S." (with Valerie Bostwick and Bruce Weinberg)
Abstract In this paper, we address the scientific workforce's "small-n problem" that has bedeviled prior work attempting to study the labor market outcomes of particular underrepresented groups. We do this by linking population-wide data on US-trained Ph.D. recipients from the Survey of Earned Doctorates (SED) to the universe of W2 and 1040 Schedule C (1040-C) tax records. This linkage allows us to track the entire US labor market history (between 2005-2020) for all Ph.D. recipients trained at US institutions for cohorts graduating between 2004 and 2015. In contrast to traditional sources of data, such as the Survey of Doctorate Recipients (SDR), this SED-tax linkage allows us to examine the outcomes of groups that have been traditionally too small to study. In particular, we not only examine earnings gaps separately by gender, race, and ethnicity, but also the gaps that exist when these characteristics are fully interacted. Moreover, our SED-tax sample of Ph.D. recipients allows us to examine how these earnings gaps vary across fields.
"Addressing the Small-n Problem for the Scientific Workforce." (with Holden Diethorn, Gerald Marschke, Elisabeth Perlman, and Bruce Weinberg)
Abstract The foreign-born are central to the STEM workforce in the U.S. This paper examines the selectivity of U.S.-trained, foreign-born Ph.D. recipients relative to their U.S.-born and trained counterparts. We do this in terms of both the training/laboratory environments to which they were exposed during graduate school as well as their post-graduation labor market outcomes. We uncover strong evidence of selection, both pre- and post-graduation, and across the observed choices Ph.D. recipients make (e.g. field or sector of employment) as well as their unobserved characteristics (e.g. ``ability''). Moreover, we find that as the strength of Ph.D. recipients' attachment to the U.S. labor market increases, so does the selectivity of the foreign-born. Thus, at least among Ph.D. recipients, the world does not appear to be sending the United States their tired and poor, but rather their best and brightest, yearning to succeed.
"The Impacts of Open Access on Scientists, Inventors, and the Public"
Abstract The main goals of making scientific literature open access are twofold: 1) to speed scientific discovery and 2) to give access to the public who funded the research. In this paper, I use citations from articles, patents, and Wikipedia to examine whether open access achieves these goals by estimating whether innovators (scientists and inventors) or the general public increase their use of articles after those articles become freely available on PubMed Central, the largest repository of free full-text biomedical articles. Estimates suggest that innovators modestly increase their use of the typical article after it becomes freely available, but that the public substantially increases their use. Using the National Institutes of Health's Public Access Policy (PAP) as an instrument for an article becoming freely available suggests that, in contrast to the modest effects for the average article, innovators substantially increase their use of complier articles -- those articles that are freely available only because the PAP requires them to be. I unpack the sources of these citation increases by analyzing whether particular subsets of individuals disproportionately increase their citations to an article after it becomes freely available. These subsets include scientists at different types of institutions (firm/university/hospital) or in different countries (upper/middle/lower income) and different types of firms to which patents are assigned (small, young, high-tech, etc.). The latter group will be identified by creating the first-ever link of patent-to-article citation data to confidential firm-level data at the U.S. Census Bureau.
Other Versions: NBER Summer Institute Science of Science Funding (July 16, 2020)
"Estimating the Local Productivity Spillovers from Science" (with Subhra Saha and Bruce Weinberg)
Abstract We estimate the local productivity spillovers from science by relating wages and real estate prices across metros to measures of scientific activity in those metros. We address three fundamental challenges: (1) factor input adjustments using wages and real estate prices, along with Shephard's Lemma, to estimate changes in metros' productivity, which must equal changes in unit production cost; (2) unobserved differences in metros/causality using a share shift index that exploits historic variation in the mix of research in metros interacted with trends in federal funding for specific fields as an instrument; (3) unobserved differences in workers using data on the states in which people are born. Our estimates show a strong positive relationship between wages and scientific research and a weak positive relationship for real estate prices. Overall, we estimate a high rate of return to research.
"The Impact of Proposition 71 on Stem Cell Research in California" (with Wei Yang Tham)
Abstract In 2004, California passed Proposition 71, allocating $3 billion over 10 years to supporting stem cell research. We evaluate the effects of this policy using a generalization of the synthetic control method. We find that the policy led to a 20% increase in publications by the state of California as a whole. This increase in the volume of research drove similar increases in high-impact or novel papers from California. Our results suggest that government can play a substantial role in determining the direction of research.
"Scientific Talent Leaks Out of Funding Gaps". R&R at The Review of Economics and Statistics (with Stephanie Cheng, Elisabeth Perlman, and Wei Yang Tham)
Abstract We study how delays in NIH grant funding affect the career outcomes of research personnel. Using comprehensive earnings and tax records linked to university transaction data along with a difference-in-differences design, we find that a funding interruption of more than 30 days has a substantial effect on job placements for personnel who work in labs with a single NIH R01 research grant, including a 3 percentage point (40%) increase in the probability of not working in the US. Incorporating information from the full 2020 Decennial Census and data on publications, we find that about half of those induced into nonemployment appear to permanently leave the US and are 90% less likely to publish in a given year, with even larger impacts for trainees (postdocs and graduate students). Among personnel who continue to work in the US, we find that interrupted personnel earn 20% less than their continuously-funded peers, with the largest declines concentrated among trainees and other non-faculty personnel (such as staff and undergraduates). Overall, funding delays account for about 5% of US nonemployment in our data, indicating that they have a meaningful effect on the scientific labor force at the national level.
Coverage: Marginal Revolution; Good Science Project; Nature
Other Versions: NBER Summer Institute Science of Science Funding (July 19, 2024); Center for Economic Studies (CES) Working Paper Series; NBER Summer Institute Innovation (July 19, 2022)
Publications
"Cutting the Innovation Engine: How Federal Funding Shocks Affect University Patenting, Entrepreneurship, and Publications". (2023) The Quarterly Journal of Economics, https://doi.org/10.1093/qje/qjac046 (with Tania Babina, Alex He, Sabrina Howell, and Elisabeth Perlman)
Abstract This paper studies how federal funding affects the innovation outputs of university researchers. We link person-level research grants from 22 universities to patent, publication, and career outcomes from the U.S. Census Bureau. We focus on the effects of large, idiosyncratic, and temporary cuts to federal funding in a researcher's pre-existing narrow field of study. Using an event-study design that controls for principal investigator fixed effects, we document that these negative federal funding shocks reduce high-tech entrepreneurship and publications but increase patenting. The lost publications tend to be higher quality and more basic, while the additional patents tend to be lower quality, less general, and more often privately assigned. Overall, the federal funding cuts push researchers away from more open research with greater impact on future knowledge, and towards more subsequently appropriated research. The level of funding explains the effects on publications, while the source of funding—federal vs. private—appears to play an important role in the effects on high-tech entrepreneurship and patents. Together with evidence from industry contracts, the results suggest that shifting university research funding from federal to private sources leads to more appropriation of intellectual property by corporate sponsors.
Other Versions: NBER WP (December 2020), SSRN (November 2020), NBER Summer Institute Science of Science Funding (July 16, 2020), SSRN (March 2020)
"Publish or Perish: Selective Attrition as an Unifying Explanations for Patterns in Innovation over the Career". (2023) The Journal of Human Resources, https://doi.org/10.3368/jhr.59.2.1219-10630R1 (with Huifeng Yu, Gerald Marschke, Matthew Ross, and Bruce Weinberg)
Abstract For nearly 150 years, researchers from many disciplines have studied how the quantity and quality of research output varies over the career but have reached conflicting answers. A definitive answer directly informs federal research policy and increases our understanding of a broader set of questions related to human capital accumulation over the career. Here, we study 5.6 million articles published in biomedical science between 1980 and 2009 and use rich characterizations of citations and text to measure the quality of articles. We provide evidence of selective attrition whereby, low "ability" researchers stop publishing at earlier stages of their career, leaving high "ability" researchers to produce a growing share of publications. We find that controlling for selective attrition reconciles the long-standing conflicts in the existing literature. Specifically, the quality of research declines monotonically over the career for the average researcher but this decline is masked in the cross-section because the authors publishing at later ages are the ones who produce the highest quality research. Our results have implications for efforts to shift funding from late- to early-career researchers – while such policies will provide more funding to researchers at the point when they are most creative, they must be undertaken carefully because young researchers are less "able" on average.
"Academic Entrepreneurship and Inequality: Evidence from Administrative Data". (2022) Proceedings of the 17th European Conference on Innovation and Entrepreneurship, Volume 17, Number 1, https://doi.org/10.34190/ecie.17.1.839
Abstract Over the past several decades, universities have increasingly emphasized knowledge and technology transfer. Faculty are key agents facilitating this transfer, engaging in commercial and entrepreneurial activities such as, consulting, student placement, patenting, and the founding of start-ups. This paper documents the prevalence of faculty commercial engagement as well as the extent to which it widens earnings inequality among faculty. In contrast to previous work that uses surveys with low response rates to measure the commercial engagement of university faculty, this paper uses detailed administrative data from universities (UMETRICS) linked to confidential earnings data at the Internal Revenue Service (IRS) and U.S. Census Bureau (including the universe of W2 and 1099 tax records) to analyze how often university faculty engage in the types of commercial and entrepreneurial activity that catalyze knowledge/technology transfer.
"Mandating Access: Assessing the NIH's Public Access Policy". (2020) Economic Policy, Volume 35, Issue 102, April 2020, Pages 269–304, https://doi.org/10.1093/epolic/eiaa015
Other Versions: Munich Personal RePEc Archive (MPRA) (February 2019), SSRN (August 2018), Munich Personal RePEc Archive (MPRA) (November 2017)
Abstract In April 2008, the National Institutes of Health (NIH) implemented the Public Access Policy (PAP), which mandated that the full text of NIH-supported articles be made freely available on PubMed Central -- the NIH's repository of biomedical research. This paper uses 600 thousand NIH articles and a matched comparison sample to examine how the PAP impacted researcher access to the biomedical literature and publishing patterns in biomedicine. Though some estimates allow for large citation increases after the PAP, the most credible estimates suggest that the PAP had a relatively modest effect on citations, which is consistent with most researchers having widespread access to the biomedical literature prior to the PAP, leaving little room to increase access. I also find that NIH articles are more likely to be published in traditional subscription-based journals (as opposed to "open access" journals) after the PAP. This indicates that any discrimination the PAP induced, by subscription-based journals against NIH articles, was offset by other factors -- possibly the decisions of editors and submission behavior of authors.
"Occupational Classifications: A Machine Learning Approach". (2019) Journal of Economic and Social Measurement 44(2-3): 10.3233/JEM-190463 (with Akina Ikudo, Julia Lane, and Bruce Weinberg)
Other Versions: NBER WP (August 2018), Center for Economic Studies (CES) WP Series (August 2018), SSRN (August 2018)
Abstract Characterizing the work that people do on their jobs is a longstanding and core issue in labor economics. Traditionally, classification has been done manually. If it were possible to combine new computational tools and administrative wage records to generate an automated crosswalk between job titles and occupations, millions of dollars could be saved in labor costs, data processing could be sped up, data could become more consistent, and it might be possible to generate, without a lag, current information about the changing occupational composition of the labor market. This paper examines the potential to assign occupations to job titles contained in administrative data using automated, machine-learning approaches. We use a new extraordinarily rich and detailed set of data on transactional HR records of large firms (universities) in a relatively narrowly defined industry (public institutions of higher education) to identify the potential for machine-learning approaches to classify occupations.
"High-impact and transformative science (HITS) metrics: Definition, exemplification, and comparison". (2018) PLoS ONE 13(7): e0200597. 10.1371/journal.pone.0200597 (with Yu H, Light RP, Marschke G, Börner K, Weinberg BA)
Abstract Countries, research institutions, and scholars are interested in identifying and promoting high-impact and transformative scientific research. This paper presents a novel set of text- and citation-based metrics that can be used to identify high-impact and transformative works. The 11 metrics can be grouped into seven types: Radical-Generative, Radical-Destructive, Risky, Multidisciplinary, Wide Impact, Growing Impact, and Impact (overall). The metrics are exemplified, validated, and compared using a set of 10,778,696 MEDLINE articles matched to the Science Citation Index ExpandedTM. Articles are grouped into six 5-year periods (spanning 1983–2012) using publication year and into 6,159 fields constructed using comparable MeSH terms, with which each article is tagged. The analysis is conducted at the level of a field-period pair, of which 15,051 have articles and are used in this study. A factor analysis shows that transformativeness and impact are positively related (ρ = .402), but represent distinct phenomena. Looking at the subcomponents of transformativeness, there is no evidence that transformative work is adopted slowly or that the generation of important new concepts coincides with the obsolescence of existing concepts. We also find that the generation of important new concepts and highly cited work is more risky. Finally, supporting the validity of our metrics, we show that work that draws on a wider range of research fields is used more widely.
Book Chapters
"Automating Response Evaluation for Franchising Questions on the 2017 Economic Census". 2022. Big Data for 21st Century Economic Statistics (with Yifang Wei, Lisa Singh, Shawn Klimek, J. Bradford Jensen, and Andrew Baer)
Other Versions: Preliminary Book Draft (NBER), Center for Economic Studies (CES) Working Paper Series
Abstract Between the 2007 and 2012 Economic Censuses (EC), the count of franchise-affiliated establishments declined by 9.8%. One reason for this decline was a reduction in resources that the Census Bureau was able to dedicate to the manual evaluation of survey responses in the franchise section of the EC. Extensive manual evaluation in 2007 resulted in many establishments, whose survey forms indicated they were not franchise affiliated, being recoded as franchise-affiliated. No such evaluation could be undertaken in 2012. In this paper, we examine the potential of using external data harvested from the web in combination with machine learning methods to mostly automate the process of evaluating responses to the franchise section of the 2017 EC. Our method allows us to quickly and accurately identify and recode establishments have been mistakenly classified as not being franchise-affiliated, increasing the unweighted number of franchise-affiliated establishments in the 2017 EC by 22%-42%.
Working Papers
"A Tale of Two Fields? STEM Career Outcomes". (with Xuan Jian and Bruce Weinberg)
Abstract Is the labor market for US researchers experiencing the best or worst of times? This paper analyzes the market for recently minted Ph.D. recipients using supply-and-demand logic and data linking graduate students to their dissertations and W2 tax records. We also construct a new dissertation-industry "relevance" measure, comparing dissertation and patent text and linking patents to assignee firms and industries. We find large disparities across research fields in placement (faculty, postdoc, and industry positions), earnings, and the use of specialized human capital. Thus, it appears to simultaneously be a good time for some fields and a bad time for others.
Other Versions: NBER Working Paper; Center for Economic Studies (CES) Working Paper Series; SSRN
"Double-Pane Glass Ceiling: The Contribution of Commercial Engagement to the Faculty Gender Earnings Gap".
Abstract This paper analyzes the contribution of commercial engagement to gender earnings gaps among university faculty. Administrative data from universities (UMETRICS) linked to the universe of confidential W2 and 1040-C tax records allow me to precisely classify earnings sources and measure faculty commercial engagement. Female faculty are 20 pp less likely to commercially engage, with the entire gap driven by self-employment. The raw earnings gap is $63k (on a base of $162k), with non-university earnings accounting for $18k (29%). Thus, though university earnings account for most of the total gap, commercial engagement substantially expands it. Earnings gaps also exist for all components of non-university earnings, including self-employment and earnings from incumbent, young/startup, high-tech, and low-tech firms. Large, though attenuated, gaps remain after accounting for hours worked as well as controls for publications, patents, field, university, scientific resources, age, marital status, children, and demographics. As faculty move up the earnings distribution, earnings gaps grow and commercial engagement becomes a more important contributor to the total gap. Gaps also grow throughout the career, starting very small at career outset. Finally, while both genders commercially engage with similar industries, male faculty earn more across all common industries.
"Addressing the Small-n Problem for the Scientific Workforce." (with Valerie Bostwick and Bruce Weinberg)
Abstract In this paper, we address the scientific workforce's "small-n problem" that has bedeviled prior work attempting to study the labor market outcomes of particular underrepresented groups. We do this by linking population-wide data on US-trained Ph.D. recipients from the Survey of Earned Doctorates (SED) to the universe of W2 and 1040 Schedule C (1040-C) tax records. This linkage allows us to track the entire US labor market history (from 2005 to 2020) for all Ph.D. recipients trained at US institutions for cohorts graduating between 2004 and 2015. In contrast to traditional sources of data, such as the Survey of Doctorate Recipients (SDR), this SED-tax linkage allows us to examine the outcomes of groups that have been traditionally too small to study. In particular, we not only examine earnings gaps separately by gender, race, and ethnicity, but also the gaps that exist when these characteristics are fully interacted. Moreover, our SED-tax sample of Ph.D. recipients allows us to examine how these earnings gaps vary across fields.
"Best and Brightest? The Selectivity of Foreign-Born Ph.D. Recipients in the U.S." (with Holden Diethorn, Gerald Marschke, Elisabeth Perlman, and Bruce Weinberg)
Abstract The foreign-born are central to the STEM workforce in the U.S. This paper examines the selectivity of U.S.-trained, foreign-born Ph.D. recipients relative to their U.S.-born and trained counterparts. We do this in terms of both the training/laboratory environments to which they were exposed during graduate school as well as their post-graduation labor market outcomes. We uncover strong evidence of selection, both pre- and post-graduation, and across the observed choices Ph.D. recipients make (e.g. field or sector of employment) as well as their unobserved characteristics (e.g. "ability"). Moreover, we find that as the strength of Ph.D. recipients' attachment to the U.S. labor market increases, so does the selectivity of the foreign-born. Thus, at least among Ph.D. recipients, the world does not appear to be sending the United States their tired and poor, but rather their best and brightest, yearning to succeed.
"The Impacts of Open Access on Scientists, Inventors, and the Public"
Abstract The main goals of making scientific literature open access are twofold: 1) to speed scientific discovery and 2) to give access to the public who funded the research. In this paper, I use citations from articles, patents, and Wikipedia to examine whether open access achieves these goals by estimating whether innovators (scientists and inventors) or the general public increase their use of articles after those articles become freely available on PubMed Central, the largest repository of free full-text biomedical articles. Estimates suggest that innovators modestly increase their use of the typical article after it becomes freely available, but that the public substantially increases their use. Using the National Institutes of Health's Public Access Policy (PAP) as an instrument for an article becoming freely available suggests that, in contrast to the modest effects for the average article, innovators substantially increase their use of complier articles -- those articles that are freely available only because the PAP requires them to be. I unpack the sources of these citation increases by analyzing whether particular subsets of individuals disproportionately increase their citations to an article after it becomes freely available. These subsets include scientists at different types of institutions (firm/university/hospital) or in different countries (upper/middle/lower income) and different types of firms to which patents are assigned (small, young, high-tech, etc.). The latter group will be identified by creating the first-ever link of patent-to-article citation data to confidential firm-level data at the U.S. Census Bureau.
Other Versions: NBER Summer Institute Science of Science Funding (July 16, 2020)
"Estimating the Local Productivity Spillovers from Science" (with Subhra Saha and Bruce Weinberg)
Abstract We estimate the local productivity spillovers from science by relating wages and real estate prices across metros to measures of scientific activity in those metros. We address three fundamental challenges: (1) factor input adjustments using wages and real estate prices, along with Shephard's Lemma, to estimate changes in metros' productivity, which must equal changes in unit production cost; (2) unobserved differences in metros/causality using a share shift index that exploits historic variation in the mix of research in metros interacted with trends in federal funding for specific fields as an instrument; (3) unobserved differences in workers using data on the states in which people are born. Our estimates show a strong positive relationship between wages and scientific research and a weak positive relationship for real estate prices. Overall, we estimate a high rate of return to research.
"The Impact of Proposition 71 on Stem Cell Research in California" (with Wei Yang Tham)
Abstract In 2004, California passed Proposition 71, allocating $3 billion over 10 years to support stem cell research. We evaluate the effects of this policy using a generalization of the synthetic control method. We find that the policy led to a 20% increase in publications by the state of California as a whole. This increase in the volume of research drove similar increases in high-impact or novel papers from California. Our results suggest that government can play a substantial role in determining the direction of research.