China Multigenerational Panel Datasets Workshop, UCLA, Los Angeles, CA, January 6-8, 2016.

We are pleased to announce a workshop to be held January 6-8, 2016 to introduce the China Multigenerational Panel Datasets (CMGPD). These are major resources for the study of demography, stratification, and family. The workshop will feature the China Multigenerational Panel Dataset-Shuangcheng (CMGPD-SC), the release of which is nearing completion, as well as the previously released China Multigenerational Panel Dataset-Liaoning (CMGPD-LN). The workshop will be held at the California Center for Population Research at the University of California, Los Angeles.

The China Multi-Generational Panel Dataset – Shuangcheng (CMGPD-SC) provides longitudinal individual, household, and community information on the demographic and socioeconomic characteristics of a resettled population living in Shuangcheng, a county in present-day Heilongjiang Province of Northeastern China, for the period from 1866 to 1913. The dataset includes some 1.3 million annual observations of over 100,000 unique individuals descended from families who were relocated to Shuangcheng in the early 19th century. Distinguishing features of the CMGPD-SC include linked records of household landholding, registered ethnicity, and better registration of unmarried daughters than most microdata for pre-20th century Chinese populations.

The China Multigenerational Panel Dataset-Liaoning (CMGPD-LN), which will also be reviewed, provides 1.6 million triennial observations of approximately 250,000 individuals who lived in what is now Liaoning province between the middle of the 17th century and the beginning of the 20th century. The most distinctive features of the CMGPD-LN are its time depth, with many families covered for as many as seven generations, and its geographic breadth, covering villages spread across an area the size of the Netherlands or New Jersey.

More information about the CMGPD datasets is available at their page at ICPSR. Details of the origin and basic characteristics of the CMGPD-LN are available in its User Guide. A User Guide is also available for the CMGPD-SC.

The workshop is intended to allow interested researchers to assess the suitability of the CMGPD for their research topics, and to provide current users with additional insight into key features that may affect their use of the data or their interpretation of results. No prior quantitative training or knowledge of Chinese history is required. The workshop will not provide instruction in quantitative analysis or data management; anyone seeking such training should look for it elsewhere.

At the workshop, sessions will introduce the background and context of the populations covered by the data, review the key features of the data, outline their strengths and limitations, and assess their suitability for the study of a variety of topics in demography, sociology, and economics. Particular emphasis will be placed on the longitudinal and multi-generational features of the data. The workshop will provide examples of how the data may be manipulated to take advantage of longitudinal and kinship linkage to produce variables for specific research applications.
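To give a rough sense of what producing variables from longitudinal and kinship linkage can look like, here is a minimal Python sketch on toy person-year records. The field names (person_id, father_id, year) and the two derived variables are hypothetical illustrations, not actual CMGPD variable names, and this is not the procedure taught at the workshop. It assumes only that each observation carries an identifier for the person and one for the father.

```python
from collections import defaultdict

# Toy person-year records. All field names are hypothetical,
# not actual CMGPD variable names.
records = [
    {"person_id": 1, "father_id": 10, "year": 1870},
    {"person_id": 1, "father_id": 10, "year": 1871},
    {"person_id": 2, "father_id": 10, "year": 1871},
    {"person_id": 3, "father_id": 11, "year": 1871},
]

def add_linkage_variables(rows):
    """Return copies of rows with two illustrative derived variables.

    years_observed: years since the person's first observation
                    (uses the longitudinal link via person_id).
    siblings_present: number of other persons with the same father
                      observed in the same year (uses the kinship link
                      via father_id).
    """
    # Longitudinal linkage: earliest year in which each person appears.
    first_seen = {}
    for r in rows:
        pid = r["person_id"]
        first_seen[pid] = min(first_seen.get(pid, r["year"]), r["year"])

    # Kinship linkage: who is observed under each father in each year.
    present = defaultdict(set)
    for r in rows:
        present[(r["father_id"], r["year"])].add(r["person_id"])

    enriched_rows = []
    for r in rows:
        siblings = present[(r["father_id"], r["year"])] - {r["person_id"]}
        enriched_rows.append({
            **r,
            "years_observed": r["year"] - first_seen[r["person_id"]],
            "siblings_present": len(siblings),
        })
    return enriched_rows

enriched = add_linkage_variables(records)
for row in enriched:
    print(row)
```

The same pattern (group by an identifier, then attach a summary back to each observation) would extend to other kin or household identifiers, whatever they are called in the actual data.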


The workshop will be in the California Center for Population Research Seminar Room, Public Policy 4202.

Overview of the CMGPD – Wednesday, January 6, 2016

  • 9AM-9:30AM Welcome and participant self-introductions
  • 9:30AM-10:30AM Unique features of the CMGPD, including comparisons to other datasets
  • 10:30AM-10:45AM Break
  • 10:45AM-11:30AM Comparison of the CMGPD-SC and CMGPD-LN
  • 11:30AM-2PM Lunch (not provided)
  • 2PM-3:30PM Format and basic structure of the CMGPD
  • 3:30PM-3:45PM Break
  • 3:45PM-5PM Limitations of the CMGPD to be aware of when considering use

Contents of the CMGPD – Thursday, January 7, 2016

  • 9AM-10:30AM Demographic outcomes
  • 10:30AM-10:45AM Break
  • 10:45AM-12 Noon Household context variables
  • 12 Noon-1:30PM Lunch (not provided)
  • 1:30PM-3:15PM Social, economic and institutional status variables
  • 3:15PM-3:30PM Break
  • 3:30PM-4:30PM Constructed kinship variables
  • 4:30PM-5PM Geographic context variables

Advanced operations with the CMGPD – Friday, January 8, 2016

  • 9AM-10:30AM Identifier variables for use in linkage
  • 10:30AM-10:45AM Break
  • 10:45AM-12 Noon Constructing life history, kinship, and community contextual variables
  • 12 Noon-1:30PM Lunch (not provided)
  • 1:30PM-3:15PM Examples of applications
  • 3:15PM-3:30PM Break
  • 3:30PM-5PM Participant presentations and general Q&A

Recommended reading/viewing

Please read or view as much as possible of the following in advance of the workshop, and come prepared with questions.


We will be able to provide some support for travel expenses to registered participants. Applicants can indicate their need for support at the application portal.

There is no fee for attendance, but prospective participants must complete a simple application and submit some basic documentation.

The application portal is now open. We are still considering applications. We will normally respond to completed applications within a day or two.

If you have questions, please email me at


The workshop is being organized by the Data Sharing for Demographic Research (DSDR) project at the Interuniversity Consortium for Political and Social Research (ICPSR). DSDR is a project supported by the Population Dynamics Branch (PDB) of the Eunice Kennedy Shriver National Institute of Child Health and Human Development (U24 HD048404). The UCLA California Center for Population Research (CCPR) is providing the venue as well as logistical support. CCPR receives population research infrastructure funding (R24HD041022) from the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD).

Preparation of the CMGPD-SC and accompanying documentation for public release via ICPSR DSDR was supported by the National Institutes of Health, Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) Grant no. R01 HD070985 “Multi-generational Demographic and Landholding Data: CMGPD-SC Public Release.”

CMGPD-SC now available at ICPSR!

I am pleased to report that the China Multigenerational Panel Dataset-Shuangcheng is now available for download at ICPSR:

We would like to thank everyone who worked with the draft versions of the release and documentation and reported problems. If you have been working with a draft version of the release downloaded from my own website, I strongly recommend that you download the official release and begin working with it. It incorporates a number of fixes to address problems reported by users.

We anticipate releasing the Landholding File sometime this fall. This will include landholding records that are linked to individuals recorded in the registers. We will also be releasing updates to the User Guide and other documentation over the next year.

Over the next year, we will also overhaul the variables related to official position to reflect new information located in the registers by Shuang Chen. We will also release a price time series.

Preparation of the CMGPD-SC and accompanying documentation for public release via ICPSR DSDR was supported by the National Institutes of Health, Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) Grant no. R01 HD070985 “Multi-generational Demographic and Landholding Data: CMGPD-SC Public Release.” Contents are solely the responsibility of the authors and do not necessarily represent the official views of the NICHD.


CMGPD Training Guide Video: From the Original Registers to the Database


I recorded a third video today. This narrates the portion of the Training Guide that discusses the process by which we turned the original registers into the CMGPD-LN database. There is some discussion of the original format and content of the data, and the implications for analysis. In particular, there is discussion of the origins of the variables for entries and exits that are the basis of event-history analysis.

CMGPD Training Guide Video: Strengths and Weaknesses of the CMGPD-LN

I recorded another narration from the CMGPD Training Guide. This one is for the section that discusses the strengths and weaknesses of the CMGPD-LN. The discussion of strengths focuses on features of the CMGPD-LN that make it unique among sources for the study of historical demography. The discussion of weaknesses highlights some areas where caution needs to be exercised when carrying out analysis. Visitors in China may find it more convenient to view the video that I uploaded at Tudou.

Other videos will be available at the YouTube playlist devoted to CMGPD Training Guide videos.

CMGPD Training Guide Video: The CMGPD and Other Sources

I have begun recording and uploading narrated modules from the CMGPD Training Guide. This is my first effort. This is from the section of the Training Guide that introduces the CMGPD and compares it to other sources commonly used in the study of historical demography. Visitors in China may find it more convenient to view the video at Tudou.

A PDF of the Training Guide is available for download here, along with other documentation.