SJTU 2013 Social Demography Final Project

Social Demography
SJTU Summer Short Semester
2013

Due 7/25 at the beginning of class

You are to write an original research paper that uses the IPUMS website to carry out a comparative study of time trends and age patterns of the demographic and socioeconomic characteristics by education, income, ethnicity, race, region, sex, or some other variables.  The emphasis is on comparison.  If you are interested in a particular ethnicity, for example, you still need to compare it to other ethnicities or the population as a whole to establish what is distinct about it.

Please read the following directions carefully.  Since you have nearly two months to complete the project, there is no excuse for not complying with the instructions.

Your research paper should be 2000 words of text (roughly 4 single-spaced pages or 8 double-spaced pages) and 6 tables based on computations at the IPUMS site.   The paper should be organized as the text, followed by the references, followed by the tables, with each table on a separate page.  All tables should be publication quality according to the specifications below, not simply copied and pasted from the website.  Do not insert tables into the main text.  Please number all pages, and make sure that your name is on the first page.

The text should consist of four sections: Introduction, Background, Results, and Conclusion.  Below I suggest guidelines for the lengths of each of these sections.  These guidelines are not rigid, and depending on your topic and your findings the actual word count may differ.  You may end up with more or fewer words in each section than suggested.

The Introduction should explain the overall focus of the paper and specify the questions that you are interested in.   250 words should be adequate.

The Background section provides whatever information from other published sources you think may be necessary to help a reader understand the object of your study.  For example, if your tables focus on comparison of different ethnic groups, you might provide a brief history of each group in the United States that focuses on features relevant to the analysis.  If you are comparing several major cities, you might want to mention key features of each relevant to your analyses.  500 words should be sufficient.

The Results section discusses the tables one by one, and interprets their contents in light of the hypotheses or theories raised in the Introduction.  The tables should be numbered consecutively, and referred to in the text as Table 1, Table 2, etc.

The Conclusion reviews the most interesting results in the paper and suggests further work.  250 words should be sufficient.

Tables

Each of the tables should examine relationships among a distinct set of variables.  In other words, the tables should not be repetitions of the same basic tabulation but with different filters.  At least two tables should make use of demographic or other variables unique to the American Community Survey (ACS) data, which are annual starting in 2001.  At least two tables should make use of variables from the Decennial Census data.

You may also use the Current Population Survey (CPS) data at the IPUMS site.  It tends to have much richer detail on labor force and employment characteristics.  It may also be harder to use.

For some of your tables, you may also use General Social Survey (GSS) data, which is available at a different website (http://sda.berkeley.edu/cgi-bin/hsda?harcsda+gss10).  It can be analyzed via a web interface like the one that you are already familiar with at IPUMS.  The GSS includes questions on topics like religion, political views, and so forth that are not covered in the Census.  Keep in mind that if you want to use the GSS, the tables you create should have something to do with demographic behavior, broadly defined.

Each table should also have a self-explanatory title, and the row and column headings should be sufficient to allow a reader to interpret the table without referring to your text.  Each table should include a totals column and/or totals row as appropriate.  Please format the tables so that there are no vertical lines, and only four horizontal lines: one between the title and the column headings, one between the column headings and the table contents, one between the table contents and the totals row, and one at the bottom.  Basically the table should be formatted like the ones you see in the papers in the assigned reading.  You will notice that in publications, tables almost never have vertical lines, and generally have a limited number of horizontal lines.

Either the title of the table or a note at the bottom of the table should specify any restrictions that were applied in selecting observations to be included in the calculation.  Typically this means specifying the ages and the years that were included in the calculation.

The tables should not be copied and pasted directly from the site, but rather should be prepared to look like they were publication quality, following the guidelines above.

The tables may be frequencies or cross-tabulations like the ones you are already used to.  You are also encouraged to take advantage of some of the other tools available at the site.  You are most likely to find the comparison of means tool (https://sda.usa.ipums.org/helpfiles/helpan.htm#means) the most useful.   This allows you to calculate the mean of one variable for different combinations of other variables.  For example, you could calculate mean income (INCTOT) for different combinations of RACE and YEAR.  If you are more adventurous, you may try using the correlation or regression tools, but these can take a long time.

Filter variables to restrict the observations included in the analysis

In constructing your tables, select or filter observations so that the ones you include are relevant to your question.  You can also restrict the valid range of a variable used in the analysis to achieve the same effect as a filter: https://sda.usa.ipums.org/helpfiles/helpan.htm#range

Depending on the analysis that you are doing, you may want to use a filter to restrict to people of particular ages, or people with particular characteristics.  For example, when looking at completed education, EDUC, you will almost always want to restrict to people aged 25 or over, so you will only be looking at people who have completed their education.  Similarly, most of the income and occupation variables are only relevant for people of working ages, 18-55.  For details on using the selection filter at IPUMS, please see https://sda.usa.ipums.org/helpfiles/helpan.htm#filter

Recode continuous variables like income, age etc. into a manageable number of categories

When constructing tables that are tabulations, you will also want to use recode for any variable that is continuous (a quantity), not discrete (a category).  Examples include age, year of birth, and almost any of the income variables.  If you are working with age, instead of having a separate row or column for each single year of age (1, 2, 3, etc.) you will want to have a limited number of age groups: 1-9, 10-19 and so on.  Similarly, if you want to use total income (INCTOT), income from wages (INCWAGE), or other variables that record an amount in dollars, not a category, you will definitely need to recode the original values into categories.

If you attempt to carry out a tabulation in which one of the income variables is a row, column, or control variable, and don’t recode, the tabulation will almost certainly fail, with an error message indicating that there are too many rows or columns.  The definition of your income categories will depend on the year that you are looking at.  Because of inflation, typical incomes change dramatically over time.  See https://sda.usa.ipums.org/helpfiles/helpan.htm#recode on how to carry out a recode.
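To make the idea of income categories concrete, here is a minimal Python sketch that assembles a recode expression of the form variable(r:low-high;low-high;...), the form used in the Reminders at the end of this prompt.  The helper name build_recode and the particular category boundaries are my own illustrations, not part of IPUMS; you would still type or paste the resulting string into the website yourself.

```python
# Codes that INCTOT uses for missing / not available, as noted in this prompt.
MISSING_CODES = {9999998, 9999999}


def build_recode(variable, lower_bounds, top):
    """Return a recode string such as 'inctot(r:0-9999;10000-9999997)'.

    lower_bounds: ascending lower bound of each category (hypothetical choices)
    top:          upper bound of the highest category
    """
    # The highest category must stop short of the missing-value codes.
    if top >= min(MISSING_CODES):
        raise ValueError("highest category would include a missing-value code")
    # Each category ends one dollar below the start of the next one.
    upper_bounds = [lower - 1 for lower in lower_bounds[1:]] + [top]
    ranges = [f"{lo}-{hi}" for lo, hi in zip(lower_bounds, upper_bounds)]
    return f"{variable}(r:{';'.join(ranges)})"


expr = build_recode("inctot", [0, 10000, 20000, 30000, 40000, 50000], 9999997)
print(expr)
```

The boundaries here are only placeholders.  Because of inflation, sensible categories for 1950 will look very different from sensible categories for 2000.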

Exclude observations with missing or not available (N/A) values

You will also need to exclude missing or not available (N/A) values, especially if you are computing a mean.  In the IPUMS data, when information is missing for a variable in a particular observation, that is typically represented with a numeric value that will be included in any mean that you compute, unless you exclude it.  This is especially important for income variables.  In total income (INCTOT), missing data is represented by 9999999: https://usa.ipums.org/usa-action/variables/INCTOT/#codes_section.  For wage income (INCWAGE), missing is represented as 999999: https://usa.ipums.org/usa-action/variables/INCWAGE/#codes_section.  For the socioeconomic index (SEI), N/A is represented as 0: https://usa.ipums.org/usa-action/variables/SEI/#codes_section.  And so on.  If you fail to exclude the numeric codes for missing values from the calculation of a mean, you may get peculiarly high values (if N/A was being represented as 999999) or peculiarly low values (if N/A was being represented as 0).  If you are using other variables, you will need to check the documentation for them to see how missing or N/A was coded, and then exclude those values.
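The following short Python sketch shows how much a single missing-value code can distort a mean.  The five income values are invented for illustration and are not IPUMS data; the missing codes are the ones for INCTOT mentioned in this prompt.

```python
from statistics import mean

# INCTOT codes for missing / not available, as noted in this prompt.
INCTOT_MISSING = {9999998, 9999999}

# Hypothetical INCTOT values for five people; the last one is coded N/A.
incomes = [20000, 35000, 50000, 15000, 9999999]

# Mean with the missing code left in: badly inflated.
naive_mean = mean(incomes)

# Mean after excluding the missing codes, as a selection filter would.
valid_incomes = [x for x in incomes if x not in INCTOT_MISSING]
clean_mean = mean(valid_incomes)

print("including the missing code:", naive_mean)
print("excluding the missing code:", clean_mean)
```

One N/A case out of five is enough to push the mean above two million dollars, which is the kind of peculiar value that should prompt you to check your filter.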

Demographic and Socioeconomic Characteristics to Treat as Outcomes/Dependent Variables

Basic demographic and socioeconomic variables available in most of the decennial Censuses that you might want to consider as outcomes (dependent variables) include but are not limited to:

  • Current marital status (MARST)
  • Number of children born (CHBORN)
  • Age at first marriage (AGEMARR)
  • Total individual income (INCTOT)
  • Poverty status (POVERTY)
  • Educational attainment (EDUC)
  • Socioeconomic index (SEI) – this is a commonly used measure of the standing of an individual’s occupation.
  • Of course if you have found another variable that you are interested in, you are welcome to use that.  Some of you have mentioned school enrollment, home ownership, type of school, health insurance, and so forth.

The ACS also includes a rich set of demographic variables that could be used as outcomes.  The ACS data are the annual samples that appear for 2001, 2002, 2003, and so on.  The most interesting ones for this class are variables for very recent years that indicate whether certain events have occurred in the last year, and could be the basis for calculating rates, as opposed to percentages.

These lists are only meant as suggestions, and if you have other interests that can be addressed with other variables you have found, you may pursue them.

Demographic and Socioeconomic Characteristics to Treat as Explanatory/Independent Variables

Generally your explanatory variables should precede your outcome variables in time.  That doesn’t always  mean they have a causal effect on the outcome, but a causal interpretation is at least more plausible.  So, for example, you might examine number of children born (CHBORN) for women aged 45 according to their level of education (EDUC), but you probably won’t think about studying the education of women aged 45 according to their number of children.  The variables are of course the same in both cases, but the interpretation of which is an outcome and which is explanatory differs.

  • Race (RACE) – Note that since 2000, Race includes codes identifying people who have said they were two or more races.   There are also codes since 2000 for single races, for example, RACASIAN
  • Hispanic (HISPAN) – Note that Hispanic status is separate from race.
  • A variety of other nativity and ancestry variables are available at http://usa.ipums.org/usa-action/variables/group/race_eth.  The availability of these variables tends to change over time, so there isn’t really one nativity or ancestry variable that is available on a continuous basis since 1850.  I will post a separate guide to using some of the key variables.
  • Geographic identifiers in http://usa.ipums.org/usa-action/variables/CITY#codes_section.  Note that the IPUMS doesn’t offer any more detail than City, so with IPUMS you can’t compare different neighborhoods in the same city.
  • Of course you can use EDUC, INCTOT and other variables as explanatory variables, just make sure that your dependent variable comes after them in time.

Examples of tables you could construct

  • Use the comparison of means to look at mean number of children born for people of different races in different years.  In this case, you would select number of children as your dependent variable, and RACE and YEAR as row and column variables.  You would probably want to filter to restrict to (for example) women who were old enough to have completed their childbearing, say 50 years old.  You might want to restrict to decennial census years.
  • Use the comparison of means to look at mean income for people of different ages with different levels of education.  In this case you would select income as your dependent variable, and age and education as your rows and columns.  You would probably want to set a filter to restrict to ages when people might actually have incomes, for example, 25-55.  You would want to recode age so that instead of having thirty-one rows, one for each age, you have three rows, one for each ten-year age group.
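The second example can be sketched in a few lines of Python to show what the comparison of means tool is doing behind the scenes.  The records, the education labels, and the function name age_group are all invented for illustration; the website does this calculation for you on the real data.

```python
from collections import defaultdict

# Hypothetical records: (age, education label, income).  Not IPUMS data.
people = [
    (27, "HS", 22000), (29, "HS", 26000), (31, "College", 41000),
    (38, "HS", 30000), (44, "College", 60000),
    (49, "HS", 34000), (52, "College", 70000),
    (19, "HS", 8000),  # outside the 25-55 filter, so it is dropped below
]


def age_group(age):
    """Recode single years of age into three groups covering ages 25-55."""
    if 25 <= age <= 34:
        return "25-34"
    if 35 <= age <= 44:
        return "35-44"
    if 45 <= age <= 55:  # the last group absorbs age 55
        return "45-55"
    return None  # excluded by the age filter


# Collect incomes for each (age group, education) cell.
cells = defaultdict(list)
for age, educ, income in people:
    group = age_group(age)
    if group is not None:
        cells[(group, educ)].append(income)

# One mean per cell, which is what each entry of the table reports.
means = {cell: sum(values) / len(values) for cell, values in cells.items()}
for cell in sorted(means):
    print(cell, means[cell])
```

Each row of the resulting table is an age group, each column an education level, and each cell a mean income computed only from the people who passed the filter.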

Reminders

  • My posts with IPUMS tips and tricks are accessible via http://camerondcampbell.me/category/ipums/  Make sure to review them to see if there is anything that helps you.
  • If you are trying to use an income variable such as INCTOT as a row or column variable, you will need to recode it into a limited number of categories in order for a table to work.  If you simply specify INCTOT or another income variable as a row or column variable, the table won’t run, because there are too many distinct values, requiring thousands of columns or rows.  You will need to use the recode to regroup incomes into a manageable number of categories, and of course exclude 9999999 and 9999998.
  • Most if not all of the income variables, including INCTOT, FINCTOT, and HINCTOT, code missing values or not available as 9999999,  9999998, 999999, 999998, or some variant thereof.  INCTOT codes missing values as 9999999: https://usa.ipums.org/usa-action/variables/INCTOT/#codes_section.  If you are carrying out a comparison of means, you need to exclude those observations because the average shouldn’t include these values.  You could do this by putting inctot(*-9999997) in the filter.
  • Similarly, if you are categorizing income, make sure that the highest category of income doesn’t include 9999998 and 9999999.  For example, inctot(r:0-9999;10000-19999;20000-29999;30000-39999;40000-49999;50000-9999997)
  • Many of the fertility variables use 0 to indicate missing or no response, and 1 to indicate no births or no children.  For example, the ACS variable FERTYR is 0 for Not Available, 1 for no births in the last year, and 2 for one or more births in the last year: https://usa.ipums.org/usa-action/variables/FERTYR#codes_tab .  Similarly, CHBORN is 0 for not available, 1 for no children, 2 for one child, 3 for two children, and so forth: https://usa.ipums.org/usa-action/variables/CHBORN#codes_tab  Be attentive to this when you interpret your results.  If you are computing mean number of children, or mean numbers of births, you will often want to subtract one from the numbers you present.
  • If you are computing averages of any variables via Comparison of Means, make sure to inspect the detailed documentation for those variables to find out how missing values are coded, and use a selection filter to exclude them.
  • Again, use selection filters to make sure that the observations you include are relevant to the question you are interested in.  For example, if you want to use the SCHOOL variable to look at whether or not someone is currently enrolled in school, you would want to restrict to people who have a chance of being currently enrolled by applying a selection filter based on age.  Restricting to age(14-18), for example, would let you look at people who were eligible to be in high school.  If you are looking at completed education, normally you would want to restrict to ages 25 and above.
  • Remember that not every variable is available in every year.  For the variables you are interested in, check to see which years they are available in.  Some very interesting variables are only available in one or two years.  The variables related to ethnicity, nativity, and origin are especially prone to change.
  • Remember that 2001-2009 are based on the ACS.  If you just want to present data from the decennial Census, you would restrict to years 1850-2000, and if you just wanted ACS data, you would restrict to 2001-2009.
  • Keep in mind that the ACS has some nice variables that allow for direct computation of certain demographic rates, like whether or not someone has married in the last year, whether or not someone has had a birth in the last year, and so forth.
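As a worked illustration of the CHBORN reminder above, this Python sketch converts CHBORN codes into actual numbers of children before averaging.  The list of codes is invented, and the function name children_from_chborn is only for illustration; the coding (0 for not available, 1 for no children, 2 for one child, and so on) is the one described in the reminder.

```python
CHBORN_NA = 0  # 0 = not available; 1 = no children; 2 = one child; and so on


def children_from_chborn(code):
    """Translate a CHBORN code into a number of children, or None if N/A."""
    if code == CHBORN_NA:
        return None
    return code - 1  # codes are offset by one from the actual count


# Hypothetical CHBORN codes for six women; the first is not available.
codes = [0, 1, 2, 3, 3, 5]

counts = [children_from_chborn(c) for c in codes]
valid_counts = [c for c in counts if c is not None]
mean_children = sum(valid_counts) / len(valid_counts)

# Same answer as taking the mean of the valid codes and subtracting one.
valid_codes = [c for c in codes if c != CHBORN_NA]
mean_of_codes = sum(valid_codes) / len(valid_codes)

print(mean_children, mean_of_codes - 1)
```

Since the website reports the mean of the codes, the practical route is the second one: filter out 0, take the mean the site gives you, and subtract one before putting it in your table.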

 

First publication using the CMGPD-LN public release!

Congratulations to Wang Lei at the Chinese Academy of Social Sciences’ Institute of Labor and Population Economics!  Wang Lei has just published what we believe is the first publication using the public release of the CMGPD-LN that doesn’t have one of us as a co-author: http://www.cnki.com.cn/Article/CJFDTotal-RKJJ201302006.htm The paper is a study of bachelorhood in northeast China in the eighteenth and nineteenth centuries, taking advantage of the excellent data on marital status available in the CMGPD-LN. It appeared in 人口与经济 (Population and Economics), which is one of China’s major social science journals.

We all expect that this will be just the first of many publications by others that make use of the CMGPD-LN.

Here is the full citation for anyone who is interested:

Wang Lei.  2013.  清代辽东旗人社会中的男性失婚问题研究-基于中国多世代人口数据库—辽宁部分( CMGPD-LN) (A Study of Males’ Out-of-marriage in Bannerman Society of East Liaoning in Qing Dynasty: Based on CMGPD-LN).  人口与经济 (Population and Economics).  2013(2):35-43.

And for anyone who is interested, here is a paper we published on male marriage, which Wang Lei was kind enough to cite: http://sjeas.skku.edu/upload/200905/17-42JamesLee-1.pdf

 

2013 SJTU Summer Short Course: Social Demography

Social Demography

Shanghai Jiaotong University
Summer Short Semester 2013
7/1/2013-7/26/2013

Course description at Shanghai Jiaotong University website: http://summer.jwc.sjtu.edu.cn/web/sjtu/XJXQ/198690.htm

INTRODUCTION

This is an overview class intended to familiarize students with key concepts, major debates, and recent research in population and social demography. The focus will be on contemporary trends in marriage, childbearing, divorce, migration, and health and mortality. Issues discussed will be a balanced mixture of topics of academic interest, contemporary relevance, and policy concern. Along the way, methods and data sources used in the study of population and social demography will be introduced. Readings will include academic publications that are examples of classic or recent work in key issues of population or social demography. Students should come away from the class with an awareness of the range of issues considered in population studies and social demography, a basic understanding of relevant data and methods, and an ability to read articles related to population in an informed and critical fashion.

The emphasis will be on trends and patterns in demographic behavior in the contemporary United States, in historical and comparative perspective.

INSTRUCTOR

Cameron Campbell, camcam@ucla.edu

FORMAT

The class will meet twice a week for four weeks. Each class meeting will last for three hours. The first half of each class meeting will be devoted to lecture relevant to the topic and assigned readings. After a break, the second half will be devoted to class discussion and student presentations of optional readings.

REQUIREMENTS

  • Attendance – 10% Attendance will be taken at each lecture.
  • Discussion – 10% Part of each class meeting will be reserved for discussions of the lecture and the assigned readings. Students are also welcome to initiate discussion or ask questions during lecture, without waiting for the time dedicated to discussion.  Students will be expected to participate in discussion.
  • Research project (written) – 35% Students will complete a research paper describing and interpreting patterns and trends in demographic and socioeconomic characteristics of an ethnic group, state or other geographic region (city etc.), or other well-defined subpopulation, using data from IPUMS USA (http://usa.ipums.org/usa/). Characteristics of interest may include age and sex distribution, marital status, childbearing, and educational attainment. For the paper, students will carry out tabulations at the IPUMS website, produce tables or graphs, and write accompanying text that refers to relevant literature to interpret observed trends. The text should be about 5-7 double-spaced pages of text.
    • Tables, graphs, and references follow at the end and do not count toward the page requirement.
    • All papers must have a reference section
    • Please begin familiarizing yourself with the IPUMS website as soon as possible. In addition to visiting the main IPUMS USA page (http://usa.ipums.org/usa/), please make sure to visit the main page for the Online Data Analysis system (ODA) that you will be using to do the calculations for your research paper: http://usa.ipums.org/usa/sda/. There is also a short set of instructions for using the ODA at: http://usa.ipums.org/usa/resources/sda/sdainstructions.pdf
    • If you are especially interested in economic characteristics of your population of interest, you may also want to consider using Current Population Survey (CPS) data: http://cps.ipums.org/cps/. The Online Data Analysis system for the CPS is available at: http://cps.ipums.org/cps/sda
    • The detailed prompt for the research project is available separately.
    • You may work together on your projects in teams of 2 people.  For team projects, the length requirement is multiplied by the number of team members.  Thus, a paper from a team of two should be 10-14 pages.
  • Presentation on research project  – 15% Students will make short presentations on their research papers at the last two class meetings.
  • Assignments – 30% Assignments will introduce students to various web resources for population and demography.  Assignments should be handed in to the TA at the beginning of the class on the day that they are due.  See the class schedule later in the syllabus for descriptions of the assignments.

READINGS AND RESOURCES

Haupt, Arthur.  2004.  Population Handbook.  Fifth Edition.  Washington: Population Reference Bureau.  http://www.prb.org/pdf/PopHandbook_Eng.pdf

TOPICS AND READINGS ARE PRELIMINARY, AND MAY CHANGE.  CHECK BACK BEFORE CLASS STARTS.

SCHEDULE

Lecture 1 – 7/2/2013

Introduction
Sources for the study of social demography
Population growth over the long term
Population studies and the social sciences

Reading

  • McFalls, Joseph.  2007.  “Population: A lively introduction.  Fifth Edition.”  Population Bulletin.  62(1).  Link
  • Haupt, Chapters 1 and 2

Optional, not required

  • Preston, Samuel H.  1993.  “The Contours of Demography: Estimates and Projections.”  Demography.  30(4):593-606.  JSTOR
  • Keyfitz, Nathan. 1975. “How do we know the facts of demography?” Population and Development Review 1(Dec):267-288. J.

Discussion

Self-introductions

Lecture 2 – 7/4/2013

Demographic behavior in the past
Marriage and childbearing before the 20th century: East-West comparisons
Household and family before the 20th century
Mortality and fertility decline, and demographic transition

Reading

Optional, not required

  • Campbell, Cameron and James Lee. 2010. “Fertility control in historical China revisited: New methods for an old debate.” History of the Family. 15:370-385. doi:10.1016/j.hisfam.2010.09.003.

Discussion

Introduction to IPUMS

Assignment 1

Please review the topics in the syllabus.  Which topic do you find most interesting?  Why?  What related to that topic would you most like to learn about?  One single-spaced page.

Lecture 3 – 7/9/2013

Marriage and Cohabitation
Trends in age at marriage and non-marriage in Asia, North America, and Europe
Non-marriage
Socioeconomic, racial and ethnic differences in marriage
Interracial marriage, educational homogamy, and other aspects of partner choice
Emerging trends: living together apart

Reading

Optional, not required

Discussion

Ideas for topics for the final paper.

Assignment 2

Review the variables available for analysis at the IPUMS website.  Make sure to look at variables available for the Decennial Censuses (1850-2010) and in the American Community Survey (annually since 2000).  After you have examined the site to see what is available, write a page identifying a topic you would like to work on for your final paper and listing the variables that you plan to make use of.

Lecture 4 – 7/11/2013

Racial and socioeconomic differences in childbearing in the U.S.
Non-marital childbearing and childrearing
Changing age patterns of childbearing
Ultra-low fertility in Europe and Asia

Reading

Optional, not required

The West

China

The Rest of the World

Assignment 3

Prepare two tables at the IPUMS website using variables that you are interested in. For this exercise, I strongly encourage you to learn how to recode variables, and use filters to limit the observations included in the calculation.  Recoding variables allows you to regroup values so that for example instead of having a separate row for every year of age, you can have age groups 20-24, 25-29 etc.  If you can do all of this for this exercise, completing the project should be straightforward.  Make sure to pay attention to handling of missing values.

Make sure to read the description of the final project carefully for detailed instructions on handling variables.  Pay special attention to the discussion of recoding variables, handling missing values, and restricting observations by use of filters.

For the first table, carry out a cross-tabulation of one variable against another, with appropriate restrictions on cases and so forth.  By cross-tabulation, I mean that you should select one variable of interest as a row variable, and another variable of interest as a column variable, and use the IPUMS website to prepare a table that summarizes the distribution of one of the variables as a function of the other variable.  For example, you might choose RACE as a column variable, and YEAR as a row variable, and prepare a table that presents the percentage of the population in each race category by year.  Such a table might present the % white, % black etc. in 1850, 1860, and so forth.  Hopefully you can pick a different combination of variables based on your interests.  Most likely you will choose AGE or YEAR as a row variable, and something like education, race, or some other substantive variable as a column variable, and then calculate row percentages so that in each year, you can present the % of the population in each of the categories of interest.  Of course you might choose some other combination, like race and education.
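If row percentages are unfamiliar, the following Python sketch shows the arithmetic behind them.  The records are invented for illustration and are not IPUMS data; the website carries out the same calculation when you ask for row percentages.

```python
from collections import Counter

# Hypothetical (year, race) records.  Not IPUMS data.
records = [
    (1850, "White"), (1850, "White"), (1850, "White"), (1850, "Black"),
    (1860, "White"), (1860, "White"), (1860, "Black"), (1860, "Black"),
]

# Count the people in each cell and in each row (year).
cell_counts = Counter(records)
row_totals = Counter(year for year, _ in records)

# Row percentage: the cell count as a share of its row total.
row_pct = {
    (year, race): 100 * n / row_totals[year]
    for (year, race), n in cell_counts.items()
}

for cell in sorted(row_pct):
    print(cell, row_pct[cell])
```

Within each year the percentages add up to 100, which is what lets you compare the composition of the population across years of very different sizes.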

Make sure to apply appropriate restrictions (see the prompt for the final project for details of using filters) so that your calculation makes sense.   If you are looking at education, you will almost always want to restrict to people old enough to have finished their education, that is people 25 and above.  If you are looking at something related to marriage, you will want to restrict to people old enough to marry, that is 16 and above.  And so forth.

For the second table, use the comparison of means to calculate the mean of one variable according to the values of two other variables chosen as row and column variables.  Here is an explanation that I prepared for using comparison of means to calculate percentages/proportions.  For example, you can use comparison of means to calculate the percentage of people who have ever been married, according to their age and level of education.  You would choose age as a row variable, education as a column variable, and then compute the mean of a recoded marital status variable to get the proportion married.  Of course you could also compute the mean of some other variable, like number of children, or income.  You may need to recode so that the mean actually makes sense.
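The reason a mean can yield a proportion is that the mean of a variable coded 0 or 1 is simply the share of cases coded 1.  Here is a minimal Python sketch of that recode.  It assumes that MARST code 6 means never married and that every other code in the list indicates someone who has married at some point; please confirm this against the MARST documentation at IPUMS before relying on it.  The list of codes is invented for illustration.

```python
# Assumed MARST code for never married / single; verify in the IPUMS
# documentation for MARST before using it in a real recode.
NEVER_MARRIED = 6


def ever_married(marst):
    """Recode marital status to 1 if ever married, 0 if never married."""
    return 0 if marst == NEVER_MARRIED else 1


# Hypothetical MARST codes for eight people in one table cell.
marst_codes = [1, 6, 4, 6, 1, 5, 6, 2]

indicator = [ever_married(m) for m in marst_codes]

# The mean of a 0/1 variable is the proportion coded 1.
proportion_ever_married = sum(indicator) / len(indicator)
print(proportion_ever_married)
```

Multiply the result by 100 if you would rather present a percentage in your table.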

Lecture 5 – 7/16/2013

Divorce and Union Dissolution
Trends in divorce rates: the leveling of divorce in North America, rising divorce rates in East Asia
Racial and socioeconomic differences in divorce
Implications of divorce for couples and for children

Reading

Optional, not required

Assignment 4

Select two or three of the optional readings in the syllabus that are all on a related theme, and write a review and comparison.  What hypotheses do the authors seek to test?  What data and methods do they use?  What are their conclusions?  Which of the readings do you find most convincing?  If you were to carry out a similar analysis in China, what would you focus on?

Lecture 6 – 7/18/2013

Migration
International migration
Domestic migration, residential segregation, and neighborhood formation

Reading

Lecture 7 – 7/23/2013

Health and mortality

Reading

Lecture 8 – 7/25/2013

Research project presentations

Final project due

WEB LINKS

Information for non-SJTU students about registering for the class

Class-related resources

Evaluations from Sociology 116 in Winter 2013

UCLA has been shifting to on-line evaluations.  In the last few weeks, students are prompted to visit a website where they can rate the course and offer feedback.  This quarter I offered an incentive to students to complete their evaluations: I offered a very small amount of credit to the students who were recorded as having completed the evaluations.  I was able to do this because we were told in advance that we would receive a list of students who completed evaluations.  Obviously I couldn’t see anybody’s individual responses. Individual responses were anonymous.  I was only able to see the aggregated responses once I had submitted grades.

With this incentive, the response rate was the highest I have ever had for student evaluations: 83 percent.  Previously, when students filled in evaluations by bubbling in forms during lecture, response rates were typically 30 to 40 percent.  I guess I am somewhat surprised that even with the offer of a little bit of credit, 17 percent of the class didn’t bother to fill out an evaluation.  Go figure.

Interestingly, the results with the higher response rate are broadly similar to ones I have received before, with the lower response rate, and the manual forms.  The comments in the feedback are also in line with what I’ve seen in the way of handwritten feedback: well-organized, material is dry, somewhat boring.  Students’ views about whether I am a nice person or not seem to depend on what kind of interaction they had with me: some say that I am very nice and considerate, and others say I am not so nice and considerate.  I’m a big fan of transparency, so here is the summary report I downloaded from the course website, after I posted grades:

Course evaluations from W13 Sociology 116

And here is the summary report from my last offering of the class:

http://camerondcampbell.me/2012/01/course-evaluations-from-fall-2011-offering-of-social-demography-sociology-116/

As is always the case, I am also mystified by the students’ stated expectations re their grades.  By the time they fill out the evaluations, it is close to the end of the quarter, and the students know most of their scores.  And I specify the scale in the syllabus.  Students certainly have enough information to make a more informed assessment of their probable grade.  As a number of students noted in their written feedback, the grade is based on many small pieces that accumulate over the course of the quarter, so by week 10, it is usually fairly clear where things are headed.  This quarter, there were certainly a lot more B and C grades at the end than suggested by the distribution of students’ expected grades.  Interestingly, the distribution looked a lot more like the students’ distribution of reported GPAs.

Final Project (Sociology 116 W13)

Sociology 116
Winter 2013
Final Project

Due Friday 3/15 at 11:59pm via TurnItIn.

You are to write an original research paper that uses the IPUMS to carry out a comparative study of time trends and age patterns of the demographic and socioeconomic characteristics by education, income, ethnicity, race, region, sex, or some other variables.  The emphasis is on comparison.  If you are interested in a particular ethnicity, for example, you still need to compare it to other ethnicities or the population as a whole to establish what is distinct about it.

Please read the following directions carefully.  Since you have nearly two months to complete the project, there is no excuse for not complying with the instructions.

Your research paper should be 2500 words of text (roughly 5 single-spaced pages or 10 double-spaced pages) and 7 tables based on computations at the IPUMS site.   The paper should be organized as the text, followed by the references, followed by the tables, with each table on a separate page.  All tables should be publication quality according to the specifications below, not simply copied and pasted from the website.  Do not insert tables into the main text.  Please number all pages, and make sure that your name is on the first page.

The text should consist of four sections: Introduction, Background, Results, and Conclusion.  Below I suggest guidelines for the lengths of each of these sections.  These guidelines are not rigid, and depending on your topic and your findings the actual word count may differ.

The Introduction should explain the overall focus of the paper and specify the questions that you are interested in.   250 words should be sufficient.

The Background section provides whatever information from other published sources you think may be necessary to help a reader understand the object of your study.  For example, if your tables focus on comparison of different ethnic groups, you might provide a brief history of each group in the United States that focuses on features relevant to the analysis.  If you are comparing several major cities, you might want to mention key features of each relevant to your analyses.  500 words should be sufficient.

The Results section discusses the tables one by one, and interprets their contents in light of hypotheses or theories in the introduction.  The tables should be numbered consecutively, and referred to in the text as Table 1, Table 2 etc.

The Conclusion reviews the most interesting results in the paper and suggests further work.  250 words should be sufficient.

Tables

Each of the tables should examine relationships among a distinct set of variables.  In other words, the tables should not be repetitions of the same basic tabulation but with different filters.  At least two tables should make use of the American Community Survey (ACS) data, which are annual starting in 2001.  At least two tables should make use of Decennial Census data.

The tables should not be repetitions of ones you have already constructed for a class assignment.

You may also use the Current Population Survey (CPS) data at the IPUMS site.  It tends to have much richer detail on labor force and employment characteristics.

For some of your tables, you may also use General Social Survey (GSS) data, which is available at a different website (http://sda.berkeley.edu/cgi-bin/hsda?harcsda+gss10).  It can be analyzed via a web interface like the one that you are already familiar with at IPUMS.  The GSS includes questions on topics like religion, political views, and so forth that are not covered in the Census.  Keep in mind that if you want to use the GSS, the tables you create should have something to do with demographic behavior, broadly defined.

Each table should also have a self-explanatory title, and the row and column headings should be sufficient to allow a reader to interpret the table without referring to your text.  Each table should include a totals column and/or totals row as appropriate.  Please format the tables so that there are no vertical lines, and only four horizontal lines: one between the title and the column headings, one between the column headings and the table contents, one between the table contents and the totals row, and one at the bottom.  Basically the table should be formatted like the ones you see in the papers in the assigned reading.  You will notice that in publications, tables almost never have vertical lines, and generally have a limited number of horizontal lines.

Either the title of the table or a note at the bottom of the table should specify any restrictions that were applied in selecting observations to be included in the calculation.  Typically this means specifying the ages that were included in the calculation, and the years.

The tables should not be copied and pasted directly from the site, but rather should be prepared to look like they were publication quality, following the guidelines above.

The tables may be frequencies or cross-tabulations like the ones you are already used to.  You are also encouraged to take advantage of some of the other tools available at the site.  You are most likely to find the comparison of means tool (https://sda.usa.ipums.org/helpfiles/helpan.htm#means) the most useful.   This allows you to calculate the mean of one variable for different combinations of other variables.  For example, you could calculate mean income (INCTOT) for different combinations of RACE and YEAR.  If you are more adventurous, you may try using the correlation or regression tools, but these can take a long time.

In constructing your tables, make sure to select or filter observations correctly to make sure the ones you include are relevant.  You can restrict the valid range of a variable used in the analysis to achieve the same effect as a filter: https://sda.usa.ipums.org/helpfiles/helpan.htm#range

Depending on the analysis that you are doing, you may want to use a filter to restrict to people of particular ages, or people with particular characteristics.  For example, when looking at completed education, EDUC, you will almost always want to restrict to people aged 25 or over, so you will only be looking at people who have completed their education.  Similarly, most of the income and occupation variables are only relevant for people of working ages, 18-55.  For details on using the selection filter at IPUMS, please see https://sda.usa.ipums.org/helpfiles/helpan.htm#filter

When constructing tables that are tabulations, you will also want to use recode for any variable that is continuous (a quantity), not discrete (a category).  Examples include age, year of birth, and almost any of the income variables.  If you are working with age, instead of having a separate row or column for each single year of age (1,2,3, etc.) you will want to have a limited number of age groups: 1-9, 10-19 and so on.  Similarly, if you want to use total income (INCTOT), income from wages (INCWAGE), or other variables that record an amount in dollars, not a category, you will definitely need to recode the original values into categories.  If you attempt to carry out a tabulation in which one of the income variables is a row, column, or control variable, and don’t recode, the tabulation will almost certainly fail, with an error message indicating that there are too many rows or columns.  The definition of your income categories will depend on the year that you are looking at.  Because of inflation, typical incomes change dramatically over time.  See  https://sda.usa.ipums.org/helpfiles/helpan.htm#recode on how to carry out a recode.
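To make the idea of a recode concrete, here is a minimal Python sketch that builds a recode expression in the same form as the INCTOT example in the Reminders at the end of this assignment.  The function name build_recode and its parameters are hypothetical, made up for illustration; they are not part of IPUMS or the online analysis system.  Please check the recode help page linked above for the syntax the site actually accepts.

```python
def build_recode(variable, width, n_bands, top):
    """Return a recode expression with n_bands equal-width bands plus a top band.

    Hypothetical helper for illustration only. `top` is the largest valid
    value, chosen to sit just below the missing-value codes.
    """
    bands = []
    for i in range(n_bands):
        low = i * width
        high = low + width - 1
        bands.append(f"{low}-{high}")
    # The last band runs from the end of the regular bands up to `top`,
    # so codes such as 9999998 and 9999999 are left out of every category.
    bands.append(f"{n_bands * width}-{top}")
    return f"{variable}(r:{';'.join(bands)})"


expression = build_recode("inctot", width=10000, n_bands=5, top=9999997)
print(expression)
```

The band width that makes sense will depend on the year, since typical incomes change so much over time.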

You will also need to exclude missing or not available (N/A) values, especially if you are computing a mean.  In the IPUMS data, when information is missing for a variable in a particular observation, that is typically represented with a numeric value that will be included in any mean that you compute, unless you exclude it.  This is especially important for income variables.  In total income (INCTOT), missing data is represented by 9999999: https://usa.ipums.org/usa-action/variables/INCTOT/#codes_section.  For wage income (INCWAGE), missing is represented as 999999: https://usa.ipums.org/usa-action/variables/INCWAGE/#codes_section.  For the socioeconomic index (SEI), N/A is represented as 0: https://usa.ipums.org/usa-action/variables/SEI/#codes_section.  And so on.  If you fail to exclude the numeric codes for missing values from the calculation of a mean, you may get peculiarly high values (if N/A was being represented as 999999) or peculiarly low values (if N/A was being represented as 0).  If you are using other variables, you will need to check the documentation for them to see how missing or N/A was coded, and then exclude those values.
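The effect of a missing-value code on a mean is easy to see with a small made-up example.  The Python sketch below is only an illustration of the arithmetic, not something you run at the IPUMS site; the five income values are invented, and the only number taken from the documentation is the INCTOT missing code of 9999999.

```python
# Code for missing total income (INCTOT), as given in the text above.
INCTOT_MISSING = 9999999

# Invented INCTOT values for five people; the last one is missing.
incomes = [20000, 35000, 50000, 15000, INCTOT_MISSING]

# Mean with the missing code left in: wildly inflated.
naive_mean = sum(incomes) / len(incomes)

# Mean after dropping the missing code: the number you actually want.
valid = [x for x in incomes if x != INCTOT_MISSING]
clean_mean = sum(valid) / len(valid)

print("with missing code:", naive_mean)
print("missing excluded: ", clean_mean)
```

One missing value out of five is enough to push the mean from tens of thousands of dollars into the millions, which is why the exclusion matters.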

Demographic and Socioeconomic Characteristics to Treat as Outcomes/Dependent Variables

Basic demographic and socioeconomic variables available in most of the decennial Censuses that you might want to consider as outcomes (dependent variables) include but are not limited to:

  • Current marital status (MARST)
  • Number of children born (CHBORN)
  • Age at first marriage (AGEMARR)
  • Total individual income (INCTOT)
  • Poverty status (POVERTY)
  • Educational attainment (EDUC)
  • Socioeconomic index (SEI) – this is a commonly used measure of the standing of an individual’s occupation.
  • Of course if you have found another variable that you are interested in, you are welcome to use that.  Some of you have mentioned school enrollment, home ownership, type of school, health insurance, and so forth.

The ACS also includes a rich set of demographic variables that could be used as outcomes.  The ACS are the data that show up annually since 2000 for 2001, 2002, 2003 etc.  The most interesting ones relevant to the class are some variables for very recent years that indicate whether certain events have occurred in the last year, and could be the basis of the calculation of rates, as opposed to percentages:

These lists are only meant as suggestions, and if you have other interests that can be addressed with other variables you have found, you may pursue them.

Demographic  and Socioeconomic Characteristics to Treat as Explanatory/Independent Variables

Generally your explanatory variables should precede your outcome variables in time.  That doesn’t always  mean they have a causal effect on the outcome, but a causal interpretation is at least more plausible.  So, for example, you might examine number of children born (CHBORN) for women aged 45 according to their level of education (EDUC), but you probably won’t think about studying the education of women aged 45 according to their number of children.  The variables are of course the same in both cases, but the interpretation of which is an outcome and which is explanatory differs.

  • Race (RACE) – Note that since 2000, Race includes codes identifying people who have said they were two or more races.   There are also codes since 2000 for single races, for example, RACASIAN.
  • Hispanic (HISPAN) – Note that Hispanic status is separate from race.
  • A variety of other nativity and ancestry variables are available at http://usa.ipums.org/usa-action/variables/group/race_eth.  The availability of these variables tends to change over time, so there isn’t really one nativity or ancestry variable that is available on a continuous basis since 1850.  I will post a separate guide to using some of the key variables.
  • Geographic identifiers in http://usa.ipums.org/usa-action/variables/CITY#codes_section.  Note that the IPUMS doesn’t offer any more detail than City, so with IPUMS you can’t compare different neighborhoods in the same city.
  • Of course you can use EDUC, INCTOT and other variables as explanatory variables, just make sure that your dependent variable comes after them in time.

Examples of tables you could construct

  • Use the comparison of means to look at mean number of children born for people of different races in different years.  In this case, you would select number of children as your dependent variable, and RACE and YEAR as row and column variables.  You would probably want to filter to restrict to (for example) women who were old enough to have completed their childbearing, say 50 years old.  You might want to restrict to decennial census years.
  • Use the comparison of means to look at mean income for people of different ages with different levels of education.  In this case you would select income as your dependent variable, and age and education as your rows and columns.  You would probably want to set a filter to restrict to ages when people might actually have incomes, for example, 25-55.  You would want to recode age so that instead of having thirty-one rows, one for each age, you have three rows, one for each roughly ten-year age group.
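If it helps to see what a comparison of means is doing, the second example can be sketched in Python as follows.  This is a rough illustration of the logic only: the records are invented, the education labels and age bands are my own choices for the example, and the IPUMS site does all of this for you once you specify the dependent, row, and column variables.

```python
from collections import defaultdict

# Invented records: (age, education category, income). Not real IPUMS data.
records = [
    (27, "High school", 30000),
    (31, "High school", 34000),
    (29, "College", 45000),
    (38, "College", 60000),
    (42, "High school", 40000),
    (55, "College", 80000),
    (61, "College", 90000),  # falls outside the 25-55 filter
]


def age_group(age):
    """Recode a single year of age (25-55) into one of three bands."""
    if age <= 34:
        return "25-34"
    if age <= 44:
        return "35-44"
    return "45-55"


# Apply the age filter, then collect incomes for each (age group, education) cell.
cells = defaultdict(list)
for age, educ, income in records:
    if 25 <= age <= 55:
        cells[(age_group(age), educ)].append(income)

# The mean of the dependent variable within each cell of the table.
means = {cell: sum(values) / len(values) for cell, values in cells.items()}
for cell in sorted(means):
    print(cell, means[cell])
```

Each printed line corresponds to one cell of the table: a row (age group), a column (education), and the mean income for the people who fall in both.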

Reminders

  • My posts with IPUMS tips and tricks are accessible via http://camerondcampbell.me/category/ipums/ Make sure to review to see if there is anything that helps you.
  • If you are trying to use an income variable such as INCTOT as a row or column variable, you will need to recode it into a limited number of categories in order for a table to work.  If you simply specify INCTOT or another income variable as a row or column variable, the table won’t run, because there are too many distinct values, requiring thousands of columns or rows.  You will need to use the recode to regroup incomes into a manageable number of categories, and of course exclude 9999999 and 9999998.
  • Most if not all of the income variables, including INCTOT, FINCTOT, and HINCTOT, code missing values or not available as 9999999,  9999998, 999999, 999998, or some variant thereof.  INCTOT codes missing values as 9999999: https://usa.ipums.org/usa-action/variables/INCTOT/#codes_section.  If you are carrying out a comparison of means, you need to exclude those observations because the average shouldn’t include these values.  You could do this by putting inctot(*-9999997) in the filter.
  • Similarly, if you are categorizing income, make sure that the highest category of income doesn’t include 9999998 and 9999999.  For example, inctot(r:0-9999;10000-19999;20000-29999;30000-39999;40000-49999;50000-9999997)
  • Many of the fertility variables use 0 to indicate missing or no response, and 1 to indicate no births or no children.  For example, the ACS variable FERTYR is 0 for Not Available, 1 for no births in the last year, and 2 for one or more births in the last year: https://usa.ipums.org/usa-action/variables/FERTYR#codes_tab .  Similarly, CHBORN is 0 for not available, 1 for no children, 2 for one child, 3 for two children, and so forth: https://usa.ipums.org/usa-action/variables/CHBORN#codes_tab   Be attentive to this when you interpret your results.  If you are computing mean number of children, or mean numbers of births, you will often want to subtract one from the numbers you present.
  • If you are computing averages of any variables via Comparison of Means, make sure to inspect the detailed documentation for those variables to find out how missing values are coded, and use a selection filter to exclude them.
  • Again, use selection filters to make sure that the observations you include are relevant to the question you are interested in.  For example, if you want to use SCHOOL to look at whether or not someone is currently enrolled in school, you would want to restrict to people who have a chance of being currently enrolled by applying a selection filter based on age.  Restricting to age(14-18), for example, would let you look at people who were eligible to be in high school.  If you are looking at completed education, normally you would want to restrict to ages 25 and above.
  • Remember that not every variable is available in every year.  For the variables you are interested in, check to see which years they are available in.  Some very interesting variables are only available in one or two years.  The variables related to ethnicity, nativity, and origin are especially prone to change.
  • Remember that 2001-2009 are based on the ACS.  If you just want to present data from the decennial Census, you would restrict to years 1850-2000, and if you just wanted ACS data, you would restrict to 2001-2009.
  • Keep in mind that the ACS has some nice variables that allow for direct computation of certain demographic rates, like whether or not someone has married in the last year, whether or not someone has had a birth in the last year, and so forth.
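The reminder about CHBORN coding can be illustrated with a few lines of Python.  This is only a sketch of the arithmetic with invented values: it assumes the coding described above (0 for not available, 1 for no children, 2 for one child, and so forth) and does not account for any top codes, so please check the CHBORN documentation before applying the same adjustment to real results.

```python
# CHBORN coding as described above: 0 = not available, 1 = no children,
# 2 = one child, and so forth.
CHBORN_NA = 0

# Invented CHBORN codes for six women; not real IPUMS data.
codes = [1, 3, 2, 0, 4, 2]

# First drop the not-available code, as with any missing value.
valid = [c for c in codes if c != CHBORN_NA]

# The mean of the codes is one higher than the mean number of children,
# because every code is the number of children plus one.
mean_code = sum(valid) / len(valid)
mean_children = mean_code - 1

print("mean of codes:   ", mean_code)
print("mean of children:", mean_children)
```

Note that the subtraction is only correct once the 0 codes have been excluded; otherwise the not-available cases would be counted as if they were women with minus one children.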

 

Recoding variables at IPUMS

For my social demography class at UCLA, I have the students visit the IPUMS website to do basic analysis. I have been using SnagIt to prepare screen-capture videos demonstrating various capabilities at the site. This one introduces recoding variables. You will probably want to watch it full frame in order to make out the text. I intended this for students enrolled in my class, but hope it is useful for anyone who stumbles across it.

Sociology 116 Social Demography (W13) Syllabus

Sociology 116

Social Demography
Winter 2013

Class web page at UCLA Social Science Computing
Check enrollments at the registrar’s entry
Evaluations from my Fall 2011 offering of Sociology 116

12/14/2012 Revision

 

INTRODUCTION

This is an overview of population and social demography intended to familiarize students with key concepts, major debates, and recent research.  Topics will be a balanced mixture of academic interest, contemporary relevance, and policy concern.  Along the way, the class will introduce methods and data sources used in the study of population and social demography.  Readings will include academic publications that are examples of classic or recent work in key issues of population or social demography.  Students should come away from the class with an awareness of the range of issues considered in population studies and social demography, a basic understanding of relevant data and methods, and an ability to read articles related to population in an informed and critical fashion.

Some of the topics to be covered include:

  • The history of world population
  • Relationships between population growth, economic development, and the environment
  • Contemporary trends in the family, including marriage, cohabitation, and divorce
  • Race, ethnic, socioeconomic and other differences in demographic behavior and family organization in the United States
  • International differences in demographic trends and patterns, with an emphasis on comparison between East Asia and the West
  • Interactions between differentials in birth, death, and marriage patterns and population composition

INSTRUCTOR

Cameron Campbell, camcam@ucla.edu. See class website for location, office hours, phone number.  If you email, please review this etiquette guide for emailing professors:  http://www.wikihow.com/Email-a-Professor.  It is most important that in any email you send me, you provide your full name as it appears on the roster, and that you use the account that is on file at URSA.  I usually ignore class-related emails that do not clearly identify the sender, or which come from an email account other than the one on file with the university.  I may also ignore emails requesting information that is already provided in the syllabus.

REQUIRED TEXTS

These should be available at Ackerman, and also on reserve at Powell.

Livi-Bacci, Massimo.  2007.  A Concise History of World Population.  Fifth Edition.  Blackwell Publishing.  (Fourth edition is also OK)

Malthus, Thomas Robert. 2008. An Essay on the Principle of Population, edited by Geoffrey Gilbert. Oxford University Press. ISBN-13: 978-0199540457

Note that Malthus’ Essay is available in many versions. The one I referenced above is the one I submitted in the requisition to the bookstore and to Powell reserves.

Additional readings listed below will be available as PDF files on the web, either on a password-protected class website, or else at an external site like JSTOR (www.jstor.org).  Access to external sites will normally require that you have a computer with a UCLA IP address.  This means that you must either use a computer on campus, or if you are off campus, connect to the net through the UCLA VPN or Proxy Server.  You can download the UCLA VPN client for free here: http://www.bol.ucla.edu/services/vpn/.  If you are unsure of how to use the UCLA VPN or Proxy Server, I strongly recommend that you download all the readings when you are on campus.

There will not be a reader available for purchase.  Students are responsible for obtaining access to the readings listed below.

i>clickers

We will be using i>clickers this quarter to test comprehension, poll the class about opinions, and generally make the class more fun and interactive. You will not be graded on whether you answer questions correctly, but rather the frequency of your participation, as described below.

You can purchase a new or used i>clicker 2 at Ackerman Union for $46 (new) or considerably less for a used one.  You will then need to register the remote.  I have added an i>clicker ‘block’ at the Course website where you should be able to register your device.  To receive credit for participation, you will have to register your clicker and bring it to class.

You may not use a clicker registered to another student in class.  Neither are you allowed to give your clicker to another student to bring to class and use on your behalf.  Any such cases will be treated as academic dishonesty and referred to the Dean of Students.

I assume there may be some technical problems in the first week, so assignment of credit based on i>clicker participation will begin in the second week of class.

GRADING

  • Research paper, due at the end of 10th week: 15%
  • Completion of course evaluation at end of quarter: 1%.
    • I will receive a list of names for students who completed the evaluation by the deadline. Obviously, I will not receive anything but the list of names, and will not be given access to any individual responses.
  • Lecture attendance (demonstrated via participation in i>clicker surveys): 5%
    • Each class, I will carry out one or more surveys about topics related to class via i>clicker.
    • Participation in these surveys via i>clicker will be recorded.  Because the i>clicker will be used to promote interaction in class, not for evaluation, the content of responses will be anonymous.  In other words, I will know that you responded to a question, and what the distribution of responses for the class is, but I will not know how any individual student responded.
    • Each class will be weighted equally, and your score for a class will be based on the proportion of surveys that day which you participate in.
    • Each student is allowed two ‘free’ class absences over the course of the quarter.
  • Section attendance and participation: 4%
    • The TAs will take attendance at every section meeting.  This will account for half of the 4%
    • TAs will assign the other half of the 4% based on participation in class discussion.
  • Assignments: 45%
    • There will be one assignment due roughly every other week, for a total of 4-6 over the course of the quarter.  Early in the quarter, assignments will be more frequent because they are intended to introduce you to important web resources.
    • Prompts for the assignments will be posted on the class website.
    • Some of the exercises will require you to visit websites to gather demographic data, carry out some basic demographic calculations, and write up results.
    • Submit assignments at turnitin.com
      • Access turnitin.com via the link in the course entry at MyUCLA, not by going directly to www.turnitin.com
      • When you enter your name at turnitin.com, please make sure it matches the one on the class roster.
    • Late policy on essays and exercises
      • Within 1 day (24 hours) of the original posted time: 2.5 points will be deducted
      • Within 2 days (48 hours) of the original posted time: 5 points will be deducted
      • Within 1 week of the original posted time: 10 points will be deducted.
      • Within 2 weeks of the original posted time: 20 points will be deducted.
      • After 2 weeks, but before the last day of finals week: 25 points will be deducted.
      • This policy is designed so that if you are a few minutes or hours late on the occasional essay, or very late with one essay, it should have little impact on your grade, but if you are habitually late, or seriously late on more than one essay, it will affect your grade.
  • Midterm exam: 10%
    • There will be a multiple-choice midterm exam assessing knowledge of key facts and concepts covered during the first half of the quarter.
    • The midterm exam will be open book and open note.  Tablets, e-readers, or laptops may be used but must be in ‘airplane mode’ i.e. internet connectivity turned off.
  • Final exam: 20%
    • There will be a multiple-choice final exam assessing knowledge of key facts and concepts covered during the quarter.
    • The final exam will be open book and open note.  Tablets, e-readers, or laptops may be used but must be in ‘airplane mode’ i.e. internet connectivity turned off.
  • Scale: 96.7 A+, 93.3 A, 90.0 A-, 86.7 B+ and so forth.
  • All scores will be available at MyUCLA.
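
For anyone who wants to check the arithmetic, here is a rough Python sketch of the weights and the late policy listed above.  It rests on several assumptions of mine that the syllabus does not spell out: every component is scored out of 100, the late-policy boundaries are inclusive (exactly 24 hours late counts as within 1 day), and the cutoff at the last day of finals week is ignored.  The names WEIGHTS, late_deduction, and course_total are made up for this illustration.

```python
# Weights from the grading list above, in percent of the course grade.
WEIGHTS = {
    "research_paper": 15,
    "course_evaluation": 1,
    "lecture_attendance": 5,
    "section": 4,
    "assignments": 45,
    "midterm": 10,
    "final": 20,
}


def late_deduction(hours_late):
    """Points deducted from a 100-point essay or exercise under the late policy."""
    if hours_late <= 0:
        return 0.0
    if hours_late <= 24:        # within 1 day
        return 2.5
    if hours_late <= 48:        # within 2 days
        return 5.0
    if hours_late <= 7 * 24:    # within 1 week
        return 10.0
    if hours_late <= 14 * 24:   # within 2 weeks
        return 20.0
    return 25.0                 # after 2 weeks


def course_total(scores):
    """Weighted course score out of 100, assuming each component is out of 100."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Example: an essay that earned 88 but was submitted 30 hours late.
essay_score = 88 - late_deduction(30)
print(essay_score)

# Example: a student who scores 90 on every component.
print(course_total({name: 90 for name in WEIGHTS}))
```

The weights add up to 100, so a student with the same score on every component ends up with that score overall.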

Research paper

  • The research paper will describe and interpret patterns and trends in demographic and socioeconomic characteristics of an ethnic group, state or other geographic region (city etc.), or other well-defined subpopulation, using data from IPUMS USA (http://usa.ipums.org/usa/).
  • Characteristics of interest may include age and sex distribution, marital status, childbearing, and educational attainment.  For the paper, students will carry out tabulations at the IPUMS website, produce tables or graphs, and write accompanying text that refers to relevant literature to interpret observed trends.
  • The paper should include about 8-10 double-spaced pages of text, that is 2000-2500 words.  Tables, graphs, and references will follow at the end.
  • Please begin familiarizing yourself with the IPUMS website as soon as possible.  In addition to visiting the main IPUMS USA page (http://usa.ipums.org/usa/), please make sure to visit the main page for the Online Data Analysis system (ODA) that you will be using to do the calculations for your research paper, and some of the assignments: http://usa.ipums.org/usa/sda/.  There is also a short set of instructions for using the ODA at: http://usa.ipums.org/usa/resources/sda/sdainstructions.pdf
  • If you are especially interested in economic characteristics of your population of interest, you may also want to consider using Current Population Survey (CPS) data: http://cps.ipums.org/cps/.  The Online Data Analysis system for the CPS is available at: http://cps.ipums.org/cps/sda

Essays

  • Some assignments will require you to write an essay.
  • Essays are expected to be 500-700 words each.
  • Essays outside the word range may be penalized. The penalty is not automatic, and depends on the essay content.  If the essay is too long because it is unfocused, rambling, or includes irrelevant material, that may be penalized.  But an essay that is on target and well-written may not be penalized.  Essays under the minimum are more likely to be penalized because in my experience they usually leave out key material that I am looking for.
  • Topics for essays will be announced via email and posted on the announcements section of the class web page. They will normally cover the material in the week in which they were posted. They will be due the following week, usually on Thursday or Friday.
  • Essays and exercises will be graded out of 100 points.
  • The lowest grade will not be dropped.
  • Spelling errors, incorrect or inconsistent word usage, incoherent writing, run-on sentences and other typographical and grammatical errors will all be penalized. You are strongly encouraged to make use of the spell-checker that is no doubt already part of the software you are using for word processing. You should also make use of a grammar checker such as Grammatik. Microsoft Word, and many other packages, now include one.
  • Your essays should demonstrate that you have read all of the assigned material and paid attention in lecture. Failure to demonstrate a careful reading of the assigned material will be penalized.
  • Essays that are substantially off-topic will not receive credit, no matter how long or well-written.
  • Please do not use unusual fonts, line spacing, or other special effects.
  • Essays will be submitted via TurnItIn.
  • Mysterious software and hardware problems that cause your work to vanish after you have completed it but before you have had a chance to submit it, or after you think you submitted it, are not acceptable as excuses for turning in late work.  I strongly suggest that you keep a copy of whatever you submit on your computer, and also confirm after uploading that it has been received by TurnItIn.
  • The written work you submit each week must be your own. Unattributed use of the work of others is plagiarism, and is not acceptable. If you do feel the need to include text from another source, set it off in quotes and include a proper citation. If you have any questions about how to attribute sources, how to use quotations, etc., PLEASE ASK ME OR THE TA! Do not put yourself in jeopardy by submitting an essay that includes material that appears to be plagiarized. I will refer to the Dean any essays that appear to contain material that is not original. Keep in mind that I have complete files of every essay submitted in this class since I began teaching it and electronically compare essays with those submitted in previous years via TurnItIn.
  • In general, I prefer you to paraphrase, not quote. By successfully paraphrasing, you demonstrate your understanding of the material. By providing quotations, you just demonstrate that you can type. If your essay has too many quotations, it will be penalized.
  • If you make a claim or assertion that is not clearly based on material from lecture or the reading, and the validity of it is not self-evident, you must provide evidence to back it up, in the form of a citation or a brief argument. If you can’t do that, you at least must clarify that what you are saying represents a personal opinion by prefacing the claim with “I believe that…” or something equivalent.
  • I will grade each essay on a 100 point scale.

POLICIES

  • Announcements will be made via the class web page, and all assignments posted there. You are responsible for checking the web page frequently.
  • If you have an inquiry the answer to which you think would be of general interest to the class, please post it to the discussion board. Thus questions about grading policies, due dates, assignments, lecture material, and so forth should all go to the discussion board. If you contact me with a question that I believe should be posted to the discussion board, I will tell you to post it.
  • Otherwise, the best way to reach me is via email.
  • I will try to leave some time at the end of each lecture for questions and discussion. Because the class is large and time is limited, if you have additional questions about the readings or the content of the lectures, please post them to the discussion board. I will do my best to respond promptly. Your classmates are also encouraged to respond.
  • You are always welcome to come to my office. I am guaranteed to be there during office hours.  I am in my office most days 9-5 except when I am teaching or in a meeting.

Starting from the 2nd week of class, I will require participation in class polls via  i>clicker.

SCHEDULE

Important dates

2/12     Tuesday         Midterm, covering material through the end of Week 5.
3/15     Friday             Research Paper Due at Midnight via TurnItIn

Location and time of final TBD.  Please check MyUCLA rather than asking me.

There will be some adjustment to the readings before or during the quarter. 

Changes during the quarter will be announced via email or a post at the class website.  Please check the syllabus before printing out/downloading the readings for each week.

Week 1 – What is social demography?

Introduction

Sources for the study of social demography

Population growth over the long term

Population studies and the social sciences

Reading

Livi-Bacci, Chapter 1

Preston, Samuel H.  1993.  “The Contours of Demography: Estimates and Projections.”  Demography.  30(4):593-606.  J

McFalls, Joseph.  2007.  “Population: A lively introduction.  Fifth Edition.”  Population Bulletin.  62(1).  Link

Optional, not required

Keyfitz, Nathan. 1975. “How do we know the facts of demography?” Population and Development Review 1(Dec):267-288. J.

Week 2 – Population, the economy, and the environment

Reading

Livi-Bacci, Chapters 2 and 3

Malthus, An Essay on the Principle of Population, Chapters I-VII, XVI-XIX.  If you have the Geoffrey Gilbert edition, please read his introduction.

Boserup, Ester. 1976. “Environment, population, and technology in primitive societies.” Population and Development Review. 2(March): 21-36. J.

Optional, not required

De Souza, Roger-Mark, John S. Williams, Frederick A.B. Meyerson.  2003.  “Critical Links: Population, Health, and the Environment.”  Population Bulletin.  58(3):1-48.
http://www.prb.org/Source/58.3CriticalLinksPHE_Eng.pdf

Week 3 – Demographic measures and methods

Fertility: Crude birth rates, total fertility rates

Measuring mortality: crude rates, age-specific rates, cause-specific rates and standardized rates, the life table

Reading

Haupt, Arthur.  2004.  Population Handbook.  Fifth Edition.  Washington: Population Reference Bureau.  http://www.prb.org/pdf/PopHandbook_Eng.pdf  Chapters 3, 5

Week 4 – Population Composition

Population aging, causes and consequences of changes in population composition.

Reading

Haupt, Arthur.  2004.  Population Handbook.  Fifth Edition.  Washington: Population Reference Bureau.  http://www.prb.org/pdf/PopHandbook_Eng.pdf  Chapters 2, 9, 11.

Hout, Michael and Joshua Goldstein. 1994. “How 4.5 million Irish immigrants became 40 million Irish Americans: Demographic and Subjective Aspects of Ethnic Composition of White Americans.” American Sociological Review 59:64-82.  J.

Hout, Michael, Andrew Greeley, Melissa J. Wilde. 2001. “The Demographic Imperative in Religious Change in the United States.” American Journal of Sociology. 107(2):468-500. http://www.journals.uchicago.edu/AJS/journal/contents/v107n2.html

Lee, Ronald, and Shripad Tuljapurkar. 1997. “Death and taxes: Longer life, consumption, and social security.” Demography 34: 67-81. J

Week 5 – The Demographic Transition

Reading

Haupt, Arthur.  2004.  Population Handbook.  Fifth Edition.  Washington: Population Reference Bureau.  http://www.prb.org/pdf/PopHandbook_Eng.pdf  Chapter 12.

Livi-Bacci, Chapter 4

Cutler, David and Grant Miller.  2005.  “The Role of Public Health Improvements in Health Advances: The Twentieth-Century United States.”  Demography.  42(1):1-22.  http://muse.jhu.edu/journals/demography/v042/42.1cutler.pdf

Lee, Ronald.  2003.  “The Demographic Transition: Three Centuries of Fundamental Change.”  Journal of Economic Perspectives.  17(4):167-190.  J

Special topic: Fertility in historical and contemporary China, readings TBA

Optional, not required

Caldwell, John C. 1986. “Routes to low mortality in poor countries.” Population and Development Review 12(2):171-220. J.

Cleland, John and Christopher Wilson. 1987. “Demand theories of the fertility transition: An iconoclastic view.” Population Studies.   41:5-30. J.

Population Reference Bureau.  2004.  “Transitions in World Population.”  Population Bulletin.  59(1).  http://www.prb.org/Source/ACFFF4.pdf

Week 6 – Health and Mortality in Developed Countries

Health and mortality differentials by race, gender, socioeconomic status, and other characteristics

MIDTERM (2/12)

Reading

Elo, Irma T. and Samuel H. Preston. 1996. “Education differentials in mortality: United States, 1979-1985.” Social Science and Medicine.  42(1):47-57.

Hayward, Mark D.  2004.  “The Long Arm of Childhood: The Influence of Early-Life Social Conditions on Men’s Mortality.”  Demography.  41(1):87-107.  http://muse.jhu.edu/journals/demography/v041/41.1hayward.pdf

Optional

Palloni, Alberto and Elizabeth Arias.  2004.  “Paradox Lost: Explaining the Hispanic adult mortality advantage.”  Demography.  41(3):385-415.  http://muse.jhu.edu/journals/demography/v041/41.3palloni.pdf

Rogers, Richard G., Robert A. Hummer, Charles B. Nam and Kimberly Peters.  1996.  “Demographic, Socioeconomic, and Behavioral Factors Affecting Ethnic Mortality by Cause.”  Social Forces.  74(4):1419-1438. http://links.jstor.org/sici?sici=0037-7732%28199606%2974%3A4%3C1419%3ADSABFA%3E2.0.CO%3B2-L

Week 7 – Marriage and Cohabitation

Haupt, Arthur.  2004.  Population Handbook.  Fifth Edition.  Washington: Population Reference Bureau.  http://www.prb.org/pdf/PopHandbook_Eng.pdf  Chapters 7, 10.

Qian, Zhenchao and Daniel T. Lichter.  2007.  “Social Boundaries and Marital Assimilation:  Interpreting Trends in Racial and Ethnic Intermarriage.” American Sociological Review 72: 68-94.  http://www.ingentaconnect.com/content/asoca/asr/2007/00000072/00000001/art00004

Waite, Linda J. 1995.  “Does Marriage Matter?”  Demography.   32(4):483-507.
http://links.jstor.org/sici?sici=0070-3370%28199511%2932%3A4%3C483%3ADMM%3E2.0.CO%3B2-J

Axinn, William G. and Arland Thornton.  1992.  “The Relationship between Cohabitation and Divorce: Selectivity or Causal Influence?”  Demography.   29(3): 357-374.  http://links.jstor.org/sici?sici=0070-3370%28199208%2929%3A3%3C357%3ATRBCAD%3E2.0.CO%3B2-6

Optional

Schwartz, C. R., and R. D. Mare. 2005. “Trends in Educational Assortative Marriage from 1940 to 2003.”  Demography 42: 621-46.  http://muse.jhu.edu/journals/demography/v042/42.4schwartz.pdf

Oppenheimer, Valerie Kincade, Matthijs Kalmijn, and Nelson Lim. 1997. “Men’s Career Development and Marriage Timing During a Period of Rising Inequality.” Demography.  34(3):311-330.  http://links.jstor.org/sici?sici=0070-3370%28199708%2934%3A3%3C311%3AMCDAMT%3E2.0.CO%3B2-7

Qian, Zhenchao, Sampson Lee Blair, and Stacey Ruf.  2001.   “Asian American Interracial and Interethnic Marriages: Differences by Education and Nativity.” International Migration Review 35: 557-586.  http://links.jstor.org/sici?sici=0197-9183%28200122%2935%3A2%3C557%3AAAIAIM%3E2.0.CO%3B2-9

Week 8 – Reproduction

Billari, Francesco and Hans-Peter Kohler.  2004. “Patterns of Low and Lowest-Low Fertility in Europe.” Population Studies 58(2), 161-176.  http://taylorandfrancis.metapress.com/link.asp?id=67t0ppuum5f714kk

Morgan, S. Philip. 1996. “Characteristic features of modern American fertility.” Pp. 19-63 in John B. Casterline, Ronald D. Lee, and Karen A. Foote (eds.), Fertility in the United States: New Patterns, New Theories.  New York: The Population Council.  http://links.jstor.org/sici?sici=0098-7921%281996%2922%3C19%3ACFOMAF%3E2.0.CO%3B2-I

Morgan, S. Philip.  2003.  “Is Low Fertility a Twenty-First Century Demographic Crisis?”  Demography.  40(4):589-604.
http://links.jstor.org/sici?sici=0070-3370%28200311%2940%3A4%3C589%3AILFATD%3E2.0.CO%3B2-M

Week 9 – Divorce and Union Dissolution

Cherlin, Andrew. 1999. “Going to extremes: Family structure, children’s well-being, and social science.” Demography. 36: 421-428.
http://links.jstor.org/sici?sici=0070-3370%28199911%2936%3A4%3C421%3AGTEFSC%3E2.0.CO%3B2-Z

Goldstein, Joshua R. 1999. “The Leveling of Divorce in the United States.” Demography.  36: 409-14.
http://links.jstor.org/sici?sici=0070-3370%28199908%2936%3A3%3C409%3ATLODIT%3E2.0.CO%3B2-H

Martin, Teresa Castro and Larry Bumpass. 1989. “Recent Trends in Marital Disruption.” Demography 26:37-51.
http://links.jstor.org/sici?sici=0070-3370%28198902%2926%3A1%3C37%3ARTIMD%3E2.0.CO%3B2-Q

Smock, Pamela J., Wendy D. Manning, and Sanjiv Gupta. 1999.  “The Effect of Marriage and Divorce on Women’s Economic Well-Being.” American Sociological Review. 64(6):794-812.
http://links.jstor.org/sici?sici=0003-1224%28199912%2964%3A6%3C794%3ATEOMAD%3E2.0.CO%3B2-S

Optional, not required

Preston, Samuel H. and John McDonald. 1979. “The Incidence of Divorce Within Cohorts of American Marriages Contracted Since the Civil War.” Demography 16(1):1-25.
http://links.jstor.org/sici?sici=0070-3370%28197902%2916%3A1%3C1%3ATIODWC%3E2.0.CO%3B2-T

Week 10 – Migration

Durand, Jorge, William Kandel, Emilio A. Parrado, Douglas S. Massey. 1996. “International migration and development in Mexican communities.”  Demography.  33:249-264.
http://links.jstor.org/sici?sici=0070-3370%28199605%2933%3A2%3C249%3AIMADIM%3E2.0.CO%3B2-Z

Bruch, Elizabeth and Robert Mare.  2006.  “Neighborhood Choice and Neighborhood Change.”  American Journal of Sociology.
http://www.journals.uchicago.edu/AJS/journal/issues/v112n3/090192/090192.web.pdf

WEB LINKS

SJTU Summer Course in Social Demography: Final Project

Final Project
Detailed Description of Requirements
Due on paper at the last class meeting. 
You are to write an original research paper that uses the IPUMS site to carry out a basic comparative study of trends and patterns in demographic behaviors, such as marriage and reproduction, by education, income, ethnicity, race, region, sex, or some other variables.  The emphasis is on comparison.  If you are interested in a particular ethnicity, for example, you still need to compare it to other ethnicities or the population as a whole to establish what is distinct about it.
Please read the following directions carefully.  Since you have approximately 3 weeks to complete the project, there is no excuse for not complying with the instructions.
Your research paper should be based on computations at the IPUMS site.   The paper should be organized as the text, followed by the references, followed by the tables, with each table on a separate page.  You should construct four to six tables, each of which represents a different combination of variables.  Do not insert tables into the main text.  Please number all pages, and make sure that your name is on the first page.
The text should consist of four sections: Introduction, Background, Results, and Conclusion.
If you are working together, the required number of tables scales up according to the number of people in your team.
The Introduction should explain the overall focus of the paper and specify the questions that you are interested in.  
The Background section provides whatever information from other published sources you think may be necessary to help a reader understand the object of your study.  For example, if your tables focus on comparison of different ethnic groups, you might provide a brief account of each group’s history in the United States that focuses on features relevant to the analysis.  If you are comparing several major cities, you might want to mention key features of each relevant to your analyses.
The Results section discusses the tables one by one, and interprets their contents in light of hypotheses or theories in the introduction.  The tables should be numbered consecutively, and referred to in the text as Table 1, Table 2 etc.
The Conclusion reviews the most interesting results in the paper and suggests further work.  

Tables
Each of the tables should examine relationships among a distinct set of variables.  In other words, the tables should not be repetitions of the same basic tabulation but with different filters.  At least two tables should make use of ACS data, which are annual starting in 2001.  At least two tables should make use of Decennial Census data.  You may also use the Current Population Survey (CPS) data at the IPUMS site.  It tends to have much richer detail on employment and so forth.   
For your tables, you may also use General Social Survey (GSS) data, which is available at a different website (http://sda.berkeley.edu/cgi-bin/hsda?harcsda+gss10) but can be analyzed via a web interface like the one that you are already familiar with at IPUMS.  The GSS includes questions on topics like religion, political views, and so forth that are not covered in the Census.  Keep in mind that if you want to use the GSS, the tables you create should have something to do with demographic behavior, broadly defined.
Each table should also have a self-explanatory title, and the row and column headings should be sufficient to allow a reader to interpret the table without referring to the text.  Please format the tables so that there are no vertical lines, and only four horizontal lines: one between the title and the column headings, one between the column headings and the table contents, one between the table contents and the totals row, and one at the bottom.  Basically the table should be formatted like the ones you see in the papers in the assigned reading.  You will notice that in publications, tables almost never have vertical lines, and generally have a limited number of horizontal lines.  The tables should not be copied and pasted directly from the site, but rather should be prepared to publication quality, following the guidelines above.
The tables may be frequencies or cross-tabulations, or you may take advantage of some of the other analytic tools available at the site.  You are most likely to find the comparison of means tool (http://sda.usa.ipums.org/cgi-bin/sdaweb/hsda?harcsda+1850-2009) the most useful.   This allows you to calculate the mean of one variable for different combinations of other variables.  For example, you could calculate mean income (INCTOT) for different combinations of RACE and YEAR.  If you are more adventurous, you may try using the correlation or regression tools, but these can take a long time.
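To make clear what the comparison of means tool computes, here is a minimal sketch in Python. It is only an illustration and is not how the IPUMS site works internally. The records and their values are invented, the function name comparison_of_means is hypothetical, and the sketch ignores the sample weights that apply to real Census and ACS data.

```python
from collections import defaultdict

# Hypothetical person records; every value here is made up for illustration.
records = [
    {"YEAR": 1990, "RACE": "White", "INCTOT": 20000},
    {"YEAR": 1990, "RACE": "White", "INCTOT": 30000},
    {"YEAR": 1990, "RACE": "Black", "INCTOT": 18000},
    {"YEAR": 2000, "RACE": "White", "INCTOT": 40000},
    {"YEAR": 2000, "RACE": "Black", "INCTOT": 22000},
    {"YEAR": 2000, "RACE": "Black", "INCTOT": 26000},
]


def comparison_of_means(rows, dependent, row_var, col_var):
    """Return {(row value, column value): unweighted mean of the dependent variable}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for r in rows:
        key = (r[row_var], r[col_var])
        totals[key] += r[dependent]
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Mean income (INCTOT) for each combination of RACE (rows) and YEAR (columns).
means = comparison_of_means(records, "INCTOT", "RACE", "YEAR")
for (race, year), mean in sorted(means.items()):
    print(f"{race:<6} {year}  {mean:10.1f}")
```

Each cell of the resulting table is simply the average of the dependent variable among the people who fall into that row and column.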
In constructing your tables, filter observations so that only the relevant ones are included.  Depending on the analysis you are doing, you may want to use a filter to restrict to particular ages, or to people with particular characteristics.  For variables like age that take on many values, you may also want to use recode, so that instead of having a separate row or column for each age (1, 2, 3, etc.) you have a limited number of age groups (1-9, 10-19, etc.).  You will also need to handle codes that indicate that the information for a variable was missing.  Normally you will want to exclude these from your analysis.
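The three steps described above (filter, recode, exclude missing codes) can be sketched in Python as follows. This is an illustration under stated assumptions, not the syntax you type at the IPUMS site. The records are invented, the age range and group boundaries are only examples, and the constant MISSING_INCOME is a placeholder: always look up the actual missing or N/A codes for each variable in its documentation.

```python
# Placeholder missing-value code (an assumption for this sketch);
# check the variable's documentation for the real codes.
MISSING_INCOME = 9999999

# Hypothetical person records; every value here is made up for illustration.
people = [
    {"AGE": 23, "INCTOT": 15000},
    {"AGE": 27, "INCTOT": 21000},
    {"AGE": 34, "INCTOT": 30000},
    {"AGE": 41, "INCTOT": MISSING_INCOME},
    {"AGE": 48, "INCTOT": 52000},
    {"AGE": 60, "INCTOT": 40000},
]


def recode_age(age):
    """Collapse single years of age into ten-year groups (example boundaries)."""
    if 25 <= age <= 34:
        return "25-34"
    if 35 <= age <= 44:
        return "35-44"
    if 45 <= age <= 54:
        return "45-54"
    return None  # outside the age range of interest


selected = []
for p in people:
    group = recode_age(p["AGE"])
    if group is None:
        continue  # filter: keep only the ages of interest
    if p["INCTOT"] == MISSING_INCOME:
        continue  # exclude records where income was not reported
    selected.append({**p, "AGEGRP": group})

for p in selected:
    print(p["AGEGRP"], p["INCTOT"])
```

Only the records that pass both checks are kept, and each now carries an age group instead of a single year of age.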
Demographic and Socioeconomic Characteristics to Treat as Outcomes/Dependent Variables

Basic demographic and socioeconomic variables available in most of the decennial Censuses that you might want to consider as outcomes (dependent variables) include but are not limited to:
  • Current marital status (MARST)
  • Number of children born (CHBORN)
  • Age at first marriage (AGEMARR)

Of course if you have found another variable that you are interested in, you are welcome to use that.  Some of you have mentioned school enrollment, home ownership, type of school, health insurance, and so forth.
The ACS also includes a rich set of demographic variables that could be used as outcomes.  The ACS data are the samples that show up annually since 2000 (2001, 2002, 2003, etc.).  The most relevant to the class are variables, available for very recent years, that indicate whether certain events occurred in the last year.  These can serve as the basis for calculating rates, as opposed to percentages:
  • Children born within the last year (FERTYR)
  • Married, divorced or widowed within the last year (MARRINYR, DIVINYR, WIDINYR).
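As a rough illustration of how a "within the last year" variable can become a rate, the sketch below divides the number of events by the number of people at risk. The counts are invented, the function name rate_per_1000 is hypothetical, and expressing the result per 1,000 people is one common convention assumed here rather than a requirement of the assignment.

```python
def rate_per_1000(events, population_at_risk):
    """Events in the last year per 1,000 people at risk of the event."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return 1000.0 * events / population_at_risk


# Hypothetical counts: women of childbearing age in a group, and how many
# of them reported a birth within the last year.
women_at_risk = 5000
births_last_year = 325

birth_rate = rate_per_1000(births_last_year, women_at_risk)
print(f"{birth_rate:.1f} births per 1,000 women")
```

The key decision is the denominator: it should include only people who could have experienced the event, which is why filtering matters when you build the table.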

These lists are only meant as suggestions, and if you have other interests that can be addressed with other variables you have found, you may pursue them.
Socioeconomic Characteristics to Treat as Explanatory/Independent Variables
There are a vast number of variables that you could use as explanatory/independent variables in your analysis.  I have provided a few examples below.  There are many more that you can see at the IPUMS site.
  • Race (RACE) – Note that since 2000, Race includes codes identifying people who have said they were two or more races.   There are also codes since 2000 for single races, for example, RACASIAN
  • Hispanic (HISPAN) – Note that Hispanic status is separate from race.
  • A variety of other nativity and ancestry variables are available at http://usa.ipums.org/usa-action/variables/group/race_eth.  The availability of these variables tends to change over time, so there isn’t really one nativity or ancestry variable that is available on a continuous basis since 1850.  I will post a separate guide to using some of the key variables.
  • Geographic identifiers at http://usa.ipums.org/usa-action/variables/CITY#codes_section.  Note that the IPUMS doesn’t offer any more detail than City, so with IPUMS you can’t compare different neighborhoods in the same city.
  • Total individual income (INCTOT)
  • Poverty status (POVERTY)
  • Educational attainment (EDUC)
  • Socioeconomic index (SEI) – this is a commonly used measure of the standing of an individual’s occupation.
Examples of tables you could construct (please come up with your own tables, don’t just produce these)

Use the comparison of means to look at mean number of children born for people of different races in different years. In this case, you would select number of children as your dependent variable, and RACE and YEAR as row and column variables. You would probably want to filter to restrict to (for example) women who were old enough to have completed their childbearing, say 50 years old. You might want to restrict to decennial census years.

Use the comparison of means to look at mean income for people of different ages with different levels of education. In this case you would select income as your dependent variable, and age and education as your rows and columns. You would probably want to set a filter to restrict to ages when people might actually have incomes, for example, 25-55. You would want to recode age so that instead of having about thirty rows, one for each age, you have three rows, one for each ten-year age group.

2012 SJTU Summer Short Course: Social Demography

Social Demography
SJTU Summer Short Semester 2012

INTRODUCTION

This is an overview class intended to familiarize students with key concepts, major debates, and recent research in population and social demography. The focus will be on contemporary trends in marriage, childbearing, divorce, migration, and health and mortality. Issues discussed will be a balanced mixture of topics of academic interest, contemporary relevance, and policy concern. Along the way, methods and data sources used in the study of population and social demography will be introduced. Readings will include academic publications that are examples of classic or recent work in key issues of population or social demography. Students should come away from the class with an awareness of the range of issues considered in population studies and social demography, a basic understanding of relevant data and methods, and an ability to read articles related to population in an informed and critical fashion.

The emphasis will be on trends and patterns in demographic behavior in the contemporary United States, in historical and comparative perspective.

INSTRUCTOR

Cameron Campbell, camcam@ucla.edu


FORMAT

The class will meet twice a week for four weeks. Each class meeting will last for three hours. The first half of each class meeting will be devoted to lecture relevant to the topic and assigned readings. After a break, the second half will be devoted to class discussion and student presentations of optional readings.

REQUIREMENTS

  • Attendance – 10%. Attendance will be taken at each lecture.
  • Discussion – 5%. Half of each class meeting will be reserved for discussions of the lecture and the assigned readings. Students will be expected to participate in discussion. Each student will be expected to introduce and discuss at least one paper of their own choice that is relevant to the topics in class but not on the list of required readings.
  • Research project (written) – 35%. Students will complete a research paper describing and interpreting patterns and trends in demographic and socioeconomic characteristics of an ethnic group, state or other geographic region (city etc.), or other well-defined subpopulation, using data from IPUMS USA (http://usa.ipums.org/usa/). Characteristics of interest may include age and sex distribution, marital status, childbearing, and educational attainment. For the paper, students will carry out tabulations at the IPUMS website, produce tables or graphs, and write accompanying text that refers to relevant literature to interpret observed trends. The text should be about 5-7 double-spaced pages.
    • Tables, graphs, and references follow at the end and do not count toward the page requirement.
    • All papers must have a reference section.
    • Please begin familiarizing yourself with the IPUMS website as soon as possible. In addition to visiting the main IPUMS USA page (http://usa.ipums.org/usa/), please make sure to visit the main page for the Online Data Analysis system (ODA) that you will be using to do the calculations for your research paper: http://usa.ipums.org/usa/sda/. There is also a short set of instructions for using the ODA at: http://usa.ipums.org/usa/resources/sda/sdainstructions.pdf
    • If you are especially interested in economic characteristics of your population of interest, you may also want to consider using Current Population Survey (CPS) data: http://cps.ipums.org/cps/. The Online Data Analysis system for the CPS is available at: http://cps.ipums.org/cps/sda
    • The prompt for the research project will be posted separately.
    • You may work together on your projects in teams of up to 4 people.  For team projects, the length requirement is multiplied by the number of team members.  Thus, a paper from a team of two should be 10-14 pages, and a team of three should be 15-21 pages, and a team of four should produce a paper that is 20-28 pages.
  • Research project (presentation) – 15%. Students will make short presentations on their research papers at the last two class meetings.
  • Assignments – 35%. Assignments will introduce students to various web resources for population and demography.  There will be 4 to 6 assignments, and they will have equal weight.

READINGS AND RESOURCES

Haupt, Arthur.  2004.  Population Handbook.  Fifth Edition.  Washington: Population Reference Bureau.  http://www.prb.org/pdf/PopHandbook_Eng.pdf

SCHEDULE