Introduction

The Lee-Campbell Research Group constructs, analyzes, and disseminates Big Social Science Data collections largely from historical and contemporary China.  Group members include faculty, postdocs, and students who collaborate with James Z. Lee and Cameron D. Campbell on these projects and attend weekly group meetings to discuss their research.  One distinctive feature of the group is that many of the faculty and postdocs who are members are former students who continue to work on major group projects while also pursuing their own research projects. Another distinctive feature is that it is interdisciplinary, with historians, sociologists, and demographers among current associates and alumni. We also work closely with other professors, students, and research assistants outside the group on specific sub-projects.

Our main data projects include the China Multi-Generational Panel Datasets (CMGPD), the China University Student Datasets (CUSD), the China Government Employee Database (CGED), the China Siqing Social Class Datasets (CSSCD), and the China Workforce Datasets (CWFD).  Please see the respective project pages for information on each project’s principal co-investigators/collaborators, background, history, funding, research output, user guide, and data access.

We currently have individual-level information for 2 million individuals who lived in China between the eighteenth century and the present.  One million are from Qing dynasty populations, largely from the middle of the eighteenth century to the beginning of the twentieth century. Another million are from the Republic of China and People’s Republic of China, almost entirely from the twentieth century.  One million are rural farming populations from North and Northeast China, half of whom are longitudinally linked over their life course and across generations.  The second million consist of educated, urban populations of government officials, professionals, and university students and faculty drawn from all over China, whose records we are in the process of linking across careers and, for a significant proportion, across generations.

Much of our early work focused on the production and analysis of multi-generational, longitudinal, individual and household-level population databases, two of which, China Multigenerational Panel Database-Liaoning (CMGPD-LN) and China Multigenerational Panel Database-Shuangcheng (CMGPD-SC), are now available for download at the Inter-university Consortium for Political and Social Research. Many of our publications, including five prize-winning or CHOICE award books, are derived from these and other similar sociodemographic data. Recent work from our ongoing research using such data has appeared in journals including American Sociological Review, Demography, Evolution and Human Behavior, History of the Family, Social Science History, and Social Science and Medicine.  See James Lee’s CV and Cameron Campbell’s website for details.

We also have three ongoing Big Social Science Data projects on

Initial results from the CUSD have been published as a Chinese article, 无声的革命: 北京大学、苏州大学的学生社会来源 1952-2002 (China’s Silent Revolution: the Social Origins of Peking University and Soochow University Undergraduates, 1949-2002), in the January 2012 issue of 中国社会科学 (Chinese Social Science) and as a book with the same title published in August 2013 by Beijing Joint Publishing. These publications inspired over one hundred news and editorial print articles, interviews, webcasts, and broadcasts posted on one thousand Chinese websites in 2012 and 2013, and five years later they continue to receive attention in the popular media.  The book version was awarded the 2014 third prize for Outstanding Achievement in Philosophy and Social Science by the Jiangsu Academy of Social Science.  The article version was reprinted by 中国社会科学 in 2017 in 中国社会科学创刊35周年论文选, its selection of papers marking the thirty-fifth anniversary of the journal’s founding, as an example of its best scholarship in the thirty-five years from 1980 to 2014.  Meanwhile, recent articles on China’s ‘silent revolution’ in popular media such as 中国新闻周刊 (China Newsweek) continue to receive over 100,000 views each.

We are completing a prequel to China’s ‘silent revolution’ during the PRC: a book on 中国知识阶层的来源与形成 (the origins and formation of China’s educated class) during the ROC, based largely on student records from 35 ROC universities collected in the CUSD and employment records from the CWFD.  The Jiangsu Academy of Social Science awarded a related article, 《江山代有才人出,各领风骚数十年:中国精英教育四段论,1865-2014》, published in 《社会学研究》 (Sociological Studies), issue 3 (May/June): 48-70, a 2017 third prize (三等奖) for Outstanding Achievement in Philosophy and Social Science.

Preliminary analyses from two other projects are equally promising in terms of producing important new findings about early modern and contemporary China. The first is our CGED-Q project on the Qing Civil Service, based on quarterly published lists of Qing salaried government officials called Jinshenlu (缙绅录), with results published in 清史研究 (2016, issue 4 and 2018, issue 4). The second is our CSSCD project on rural wealth composition and distribution in the People’s Republic of China, 1945-1966, based on household and individual records of rural wealth and political status from the Four Clean-ups Movement (1964-66) called 四清登记表.

In addition to the ICPSR public release of the CMGPD-LN and CMGPD-SC, in 2019 we publicly released 638,152 records of 50,049 officials (based on our linkage) from the China Government Employee Database – Qing (CGED-Q) 1900-1912, along with a preliminary User Guide, and we will release further data from the CGED-Q in the years to come.  For more details, including links for downloading the 2019 public release, please visit our CGED-Q Project Page.

We are confident that in the decades ahead, all five Big Data projects, including our multi-generational longitudinal individual population datasets, will continue to produce a Scholarship of Discovery, which, similar to our earlier work on Chinese population behavior, 1700-2000, should transform our understanding of the Chinese state and Chinese stratification, social organization, and social-economic mobility during the last three centuries.

For a preliminary summary of some of these research findings, please see Understanding China, 1700-2000: A Data Analytic Approach available both as a shorter Coursera MOOC and as a longer advanced undergraduate / beginning postgraduate on-line course. We use these course videos in our teaching at HKUST and elsewhere to develop flipped classroom approaches to train students to work together in groups rather than individually, to improve their oral and written communication skills as well as their thinking, and to develop their EQ as well as their IQ.

We have also started to share some of our experiences in historical dataset construction, data analysis, and data analytic teaching in a series of methodological articles published in such major Chinese language journals as 《历史研究》 (Historical Research), 《社会》 (Society), 《文史哲》 (Literature, History, Philosophy), and 《清华大学学报》 (Journal of Tsinghua University), the last of which won the 2015 Parkson Best Article Award, to encourage new efforts in construction, analysis, and teaching of Big Social Science Data from archival sources in China.

See our blog for Lee-Campbell group news and updates.