China Multi-Generational Panel Datasets (CMGPD)

Since the 1980s, James Lee, Cameron Campbell, and a variety of collaborators, students, and coders have constructed and analyzed large databases of information on individuals and families who lived in northeast China and/or Beijing between the 18th and early 20th centuries. These include the China Multigenerational Panel Datasets-Liaoning (CMGPD-LN), -Shuangcheng (CMGPD-SC), and -Imperial Lineage (CMGPD-IL). All three datasets provide longitudinal life histories for individuals and follow families over multiple generations. The CMGPD-LN and CMGPD-SC also link individuals to other members of their household as well as to kin living in other households or even other villages. We have used the datasets to study the influence of individual characteristics and family and community context on key demographic outcomes such as marriage, reproduction, and mortality. More recently, we have used them to study stratification processes that unfold over multiple generations, examining how patterns of inequality persist. In addition, we have used these data to compare population and family behavior with other populations in Eurasia. The CMGPD-LN and CMGPD-SC are now publicly available for download at ICPSR. For some idea of the possibilities that these data afford, please consult the bibliography of publications using the CMGPD.



The original sources for two of these datasets (CMGPD-LN and CMGPD-SC) were household registers compiled annually or triennially in northeast China during the Qing dynasty. The original sources for the third (CMGPD-IL) were continuous ‘Jade’ registers, compiled decadally, that tracked members of the Imperial Lineage who lived in what are now Beijing and Shenyang. The original household registers from which we constructed the China Multigenerational Panel Database-Liaoning (CMGPD-LN) are held in the Liaoning Provincial Archives. The registers that comprise the China Multigenerational Panel Database-Shuangcheng (CMGPD-SC) are held in the Shuangcheng County Archives in Heilongjiang Province. The registers that are at the core of the China Multigenerational Panel Database-Imperial Lineage (CMGPD-IL) are held in the First Historical Archives in Beijing. We thank all three archives for their cooperation over many years, in some cases decades.

The first two of these three databases (CMGPD-LN and CMGPD-SC) are already in public release and available for download from ICPSR. Their detail, generational depth, and geographic breadth make them nearly unique among historical databases. Our hope in making these data publicly accessible is that other researchers will find applications for them that we never imagined, not just in the study of Chinese history, but in the study of demography, stratification, and other topics in the social sciences more generally.

The China Multigenerational Panel Dataset – Liaoning (CMGPD-LN) includes 698 communities in a swath of what is now Liaoning province between 1749 and 1909. The data are triennial and comprise 1.6 million records of 260,000 individuals. The highlight of the data is their generational depth: they cover families for as many as seven generations. The communities covered are scattered over an area roughly equivalent in size to the Netherlands or New Jersey, and were economically, ecologically, and geographically diverse. The CMGPD-LN was the basis of the book Fate and Fortune in Rural China by James Lee and Cameron Campbell as well as numerous research articles.

The China Multigenerational Panel Dataset – Shuangcheng (CMGPD-SC) covers 125 communities in Shuangcheng county in Heilongjiang province from 1866 to 1913. The data contain 1,346,829 records of 108,100 individuals. The registers are annual. Relative to the CMGPD-LN, the CMGPD-SC has much more detail on socioeconomic status, including landholding at regular intervals. The population is also of interest in its own right. It consists of recent settlers, who were a mixture of rusticated Bannermen who had been living in Beijing and relocated farmers from neighboring provinces who had been brought in to pave the way for the rusticated cityfolk. Shuang Chen joined Cameron Campbell and James Lee for the analysis of the CMGPD-SC, and wrote her University of Michigan dissertation and a book published by Stanford University Press using these data.

The China Multigenerational Panel Dataset – Imperial Lineage (CMGPD-IL) records 250,000 members of the Qing Imperial Lineage and related individuals from before the founding of the dynasty until after its end, over a total of 14 generations. These lineage members resided almost exclusively in Beijing and Shenyang. In contrast with most lineage genealogies in China, these records were compiled prospectively. As a result, they are among the most precise and detailed records of demographic behavior for a Chinese population before the 20th century. In most periods, almost all births were recorded, including those of daughters, as was the timing of exits via death and (for daughters) out-marriage. We would like eventually to release the CMGPD-IL.

Research Outputs

For a current overview of our CMGPD-related research, please see Part One of our on-line course Understanding China, 1700-2000: A Data Analytic Approach – Who Are We, as well as the many books, articles, and presentations listed in the ICPSR CMGPD bibliography.

A well-known summary of much of our initial CMGPD-based thinking, which has withstood some debate as well as the test of time, is James Lee and Wang Feng. 1999. One Quarter of Humanity: Malthusian Mythology and Chinese Realities, 1700-2000. Cambridge, Mass.: Harvard University Press; published in Chinese as 《人类的四分之一:马尔萨斯的神话与中国现实》 Beijing: Sanlian Shudian, 2000; in French as La population Chinoise: mythes et réalités (China’s Population: Myths and Realities) Montreal: Les Presses de l’Université de Montréal, 2006; and in Korean as 인류 사분의 일 Seoul: Sungkyunkwan University Press, 2013.

We have used the CMGPD to study the long-term health consequences of early-life experience over the individual life course, as well as the long-term persistence of inequality across multiple generations. For studies from a life-course perspective on health, see two articles in Social Science & Medicine: Campbell and Lee 2009 and Dong and Lee 2014. For studies on multi-generational stratification and inequality, see articles in Chinese Sociological Review (Campbell and Lee 2011) and American Sociological Review (Song, Campbell and Lee 2015).

We have also used the CMGPD in a number of comparative studies of economy, family organization, and demographic behavior in the past. For comparisons between Europe and Asia, see the three books in the MIT Press Eurasian Population and Family History Series: Bengtsson, Campbell, and Lee et al 2004; Tsuya, Wang, Alter, and Lee et al 2010; and Lundh and Kurosu et al 2014.  For comparisons within East Asia, see articles in Evolution and Human Behavior (Dong et al 2017) and Demography (Dong et al 2015).  More recently, Hao Dong has been using the CMGPD to conduct comparative studies of family systems in East Asia. See his 2016 HKUST PhD thesis in Social Science, “Patriarchy, Family System and Kin Effects on Individual Demographic Behavior throughout the Life Course: East Asia, 1678-1945.”

CMGPD-related resources


NICHHD 1R01HD070985-01. Multi-generational Demographic and Landholding Data: CMGPD-SC Public Release (Cameron Campbell PI). 2012-2017.

Hong Kong Research Grants Council Project No. 16400714. Human Agency and Population Behavior in Historical and Comparative Perspective: New Discoveries from East Asian Panel Data (James Z. Lee PI). 2014-2016.

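National Social Science Fund of China (国家社会科学基金). Project 11BZSO87. 中期以来东北地区人口与社会历史资料整理研究 (Collation and Study of Historical Materials on Population and Society in Northeast China since the Middle Period) (James Z. Lee PI). 2011-2014.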

Hong Kong Research Grants Council Project Number 642911. Differentiating Community and Family Contextual Influences on Socioeconomic Attainment and Demographic Behavior: Shuangcheng, 1855-1911 (James Z. Lee PI). 2011-2014.

NICHHD 1R01HD057175-01A1. The Liaoning Multi-Generational Panel Dataset: Public Release and User Training (James Z. Lee/Susan Leonard PI). 2009-2012.

NICHHD 1R01HD045695-01A2. Demographic Responses to Community and Family Context (James Z. Lee PI). 2007-2010.