Evaluations from my summer 2013 short course in Social Demography at SJTU

I received the evaluations from my summer 2013 short course in Social Demography at Shanghai Jiaotong University.  This undergraduate course is an abbreviated version of the one that I have taught at UCLA in the past, and am teaching at HKUST now.  If I understand the scores correctly, I don’t seem to have done too much damage.

Personally, I believe that aggregated information from teaching evaluations should be public, at least to students.  This should also be combined with efforts to maximize response rates.  I liked the system implemented at UCLA where the administration provided a list of students who had completed web-based evaluations in time for the instructor to provide a small amount of credit included in the calculation of the final grade.  Obviously, all the administration provided was a list of names.  It didn’t include the content of the responses.  We only saw the summary report on the evaluations and the collected written comments from students after we turned in grades.

If you are having difficulty viewing the embedded Excel spreadsheet, you can download it here.

You can view other entries where I have posted the class evaluations.

Transferring results from IPUMS to Excel and then into Word

Some students who have begun work on their final projects have asked how to transfer results of online data analysis from IPUMS to Word.  Fortunately, recent versions of Microsoft Excel do a fairly decent job of parsing results that are copied from the website into Excel.  I will provide an example.

Below is a screenshot from a page with output from a tabulation of year versus race:

Capture1

Here, I have used the mouse to select the text that includes the column names, the row names, and the contents of the values…

Capture2

I copy the selected text to the clipboard (Ctrl-C in Windows), go to Excel, select the upper-left cell (A1), and paste (Ctrl-V in Windows).  Excel automatically formats the text fairly nicely into columns, with no additional work on my part.  Note that I am using Microsoft Office 2010.  I think some earlier versions will do the same, but I’m not sure.

Capture3

 

Of course it is possible to copy larger amounts of material.

All that remains now is to format the table to follow the guidelines for the class, and make it look nice.  This means eliminating all vertical lines and most horizontal lines, removing the red and blue shading, stretching columns to accommodate the large numbers, retitling rows and columns, creating a title for the table, removing the boldface and italics, and perhaps a few other tweaks.

Capture4

This is now ready to be pasted into Word.

Note that this table is not yet perfect.  Because I was short of time, I didn’t copy over the totals row.  For most of your tables, you will want to include a totals row.

SJTU 2013 Social Demography Final Project

Social Demography
SJTU Summer Short Semester
2013

Due 7/25 at the beginning of class

You are to write an original research paper that uses the IPUMS website to carry out a comparative study of time trends and age patterns of the demographic and socioeconomic characteristics by education, income, ethnicity, race, region, sex, or some other variables.  The emphasis is on comparison.  If you are interested in a particular ethnicity, for example, you still need to compare it to other ethnicities or the population as a whole to establish what is distinct about it.

Please read the following directions carefully.  Since you have nearly two months to complete the project, there is no excuse for not complying with the instructions.

Your research paper should be 2000 words of text (roughly 4 single-spaced pages or 8 double-spaced pages) and 6 tables based on computations at the IPUMS site.   The paper should be organized as the text, followed by the references, followed by the tables, with each table on a separate page.  All tables should be publication quality according to the specifications below, not simply copied and pasted from the website.  Do not insert tables into the main text.  Please number all pages, and make sure that your name is on the first page.

The text should consist of four sections: Introduction, Background, Results, and Conclusion.  Below I suggest guidelines for the length of each of these sections.  These guidelines are not rigid, and depending on your topic and your findings the actual word count may differ.  You may end up with more or fewer words in each section than suggested.

The Introduction should explain the overall focus of the paper and specify the questions that you are interested in.   250 words should be adequate.

The Background section provides whatever information from other published sources you think may be necessary to help a reader understand the object of your study.  For example, if your tables focus on comparison of different ethnic groups, you might provide a brief history of each group in the United States that focuses on features relevant to the analysis.  If you are comparing several major cities, you might want to mention key features of each relevant to your analyses.  500 words should be sufficient.

The Results section discusses the tables one by one, and interprets their contents in light of hypotheses or theories in the introduction.  The tables should be numbered consecutively, and referred to in the text as Table 1, Table 2 etc.

The Conclusion reviews the most interesting results in the paper and suggests further work.  250 words should be sufficient.

Tables

Each of the tables should examine relationships among a distinct set of variables.  In other words, the tables should not be repetitions of the same basic tabulation but with different filters.  At least two tables should make use of demographic or other variables unique to the American Community Survey (ACS) data, which are annual starting in 2001.  At least two tables should make use of variables from the Decennial Census data.

You may also use the Current Population Survey (CPS) data at the IPUMS site.  It tends to have much richer detail on labor force and employment characteristics.  It may also be harder to use.

For some of your tables, you may also use General Social Survey (GSS) data, which is available at a different website (http://sda.berkeley.edu/cgi-bin/hsda?harcsda+gss10).  It can be analyzed via a web interface like the one that you are already familiar with at IPUMS.  The GSS includes questions on topics like religion, political views, and so forth that are not covered in the Census.  Keep in mind that if you want to use the GSS, the tables you create should have something to do with demographic behavior, broadly defined.

Each table should also have a self-explanatory title, and the row and column headings should be sufficient to allow a reader to interpret the table without referring to your text.  Each table should include a totals column and/or totals row as appropriate.  Please format the tables so that there are no vertical lines, and only four horizontal lines: one between the title and the column headings, one between the column headings and the table contents, one between the table contents and the totals row, and one at the bottom.  Basically the table should be formatted like the ones you see in the papers in the assigned reading.  You will notice that in publications, tables almost never have vertical lines, and generally have a limited number of horizontal lines.

Either the title of the table or a note at the bottom of the table should specify any restrictions that were applied in selecting observations to be included in the calculation.  Typically this means specifying the ages that were included in the calculation, and the years.

The tables should not be copied and pasted directly from the site, but rather should be prepared to look like they were publication quality, following the guidelines above.

The tables may be frequencies or cross-tabulations like the ones you are already used to.  You are also encouraged to take advantage of some of the other tools available at the site.  You are most likely to find the comparison of means tool (https://sda.usa.ipums.org/helpfiles/helpan.htm#means) the most useful.   This allows you to calculate the mean of one variable for different combinations of other variables.  For example, you could calculate mean income (INCTOT) for different combinations of RACE and YEAR.  If you are more adventurous, you may try using the correlation or regression tools, but these can take a long time.

Filter variables to restrict the observations included in the analysis

In constructing your tables, make sure to select or filter observations correctly to make sure the ones you include are relevant.  You can restrict the valid range of a variable used in the analysis to achieve the same effect as a filter: https://sda.usa.ipums.org/helpfiles/helpan.htm#range

Depending on the analysis that you are doing, you may want to use a filter to restrict to people of particular ages, or people with particular characteristics.  For example, when looking at completed education, EDUC, you will almost always want to restrict to people aged 25 or over, so you will only be looking at people who have completed their education.  Similarly, most of the income and occupation variables are only relevant for people of working ages, 18-55.  For details on using the selection filter at IPUMS, please see https://sda.usa.ipums.org/helpfiles/helpan.htm#filter
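To make the logic of an age filter concrete, here is a minimal Python sketch of what such a restriction does to a set of records.  This is only an illustration of the idea, not how the IPUMS site works internally.  The records, their values, and the helper name restrict_age are all invented for the example.

```python
# Hypothetical person records. The field names mirror IPUMS variable names,
# but the values are invented for illustration.
people = [
    {"AGE": 17, "EDUC": 5},
    {"AGE": 25, "EDUC": 10},
    {"AGE": 42, "EDUC": 6},
    {"AGE": 70, "EDUC": 11},
]


def restrict_age(records, low, high=None):
    """Keep records with low <= AGE <= high; high=None means no upper limit."""
    return [
        r for r in records
        if r["AGE"] >= low and (high is None or r["AGE"] <= high)
    ]


# Completed education: ages 25 and over.
adults_25_plus = restrict_age(people, 25)

# Income and occupation: working ages 18-55.
working_ages = restrict_age(people, 18, 55)

print(len(adults_25_plus), len(working_ages))
```

The point is simply that every observation outside the range drops out before any percentage or mean is calculated.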

Recode continuous variables like income, age etc. into a manageable number of categories

When constructing tables that are tabulations, you will also want to use recode for any variable that is continuous (a quantity), not discrete (a category).  Examples include age, year of birth, and almost any of the income variables.  If you are working with age, instead of having a separate row or column for each single year of age (1,2,3, etc.) you will want to have a limited number of age groups: 1-9, 10-19 and so on.  Similarly, if you want to use total income (INCTOT), income from wages (INCWAGE), or other variables that record an amount in dollars, not a category, you will definitely need to recode the original values into categories.

If you attempt to carry out a tabulation in which one of the income variables is a row, column, or control variable, and don’t recode, the tabulation will almost certainly fail, with an error message indicating that there are too many rows or columns.  The definition of your income categories will depend on the year that you are looking at.  Because of inflation, typical incomes change dramatically over time.  See https://sda.usa.ipums.org/helpfiles/helpan.htm#recode on how to carry out a recode.
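As a rough illustration of what a recode accomplishes, the Python sketch below groups dollar amounts into a handful of brackets and counts how many observations fall in each.  The cut points, the labels, the income values, and the helper name recode are assumptions made for the example.  You will need to choose brackets that suit the years you are studying.

```python
from collections import Counter

# Income brackets as (low, high, label). The cut points are illustrative only.
INCOME_BRACKETS = [
    (0, 9999, "0-9,999"),
    (10000, 19999, "10,000-19,999"),
    (20000, 29999, "20,000-29,999"),
    (30000, 39999, "30,000-39,999"),
    (40000, 49999, "40,000-49,999"),
    (50000, 9999997, "50,000+"),
]


def recode(value, brackets):
    """Return the label of the bracket containing value, or None if none does."""
    for low, high, label in brackets:
        if low <= value <= high:
            return label
    return None


# Invented incomes. The last value is a missing-data code, not a real income.
incomes = [4500, 12000, 12500, 38000, 75000, 9999999]

counts = Counter(recode(v, INCOME_BRACKETS) for v in incomes)
for label, n in counts.items():
    print(label, n)
```

Thousands of distinct dollar values collapse into six rows or columns, and any value that falls outside every bracket is left unassigned rather than being counted as a high income.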

Exclude observations with missing or not available (N/A) values

You will also need to exclude missing or not available (N/A) values, especially if you are computing a mean.  In the IPUMS data, when information is missing for a variable in a particular observation, that is typically represented with a numeric value that will be included in any mean that you compute, unless you exclude it.  This is especially important for income variables.  In total income (INCTOT), missing data is represented by 9999999: https://usa.ipums.org/usa-action/variables/INCTOT/#codes_section.  For wage income (INCWAGE), missing is represented as 999999: https://usa.ipums.org/usa-action/variables/INCWAGE/#codes_section.  For the socioeconomic index (SEI), N/A is represented as 0: https://usa.ipums.org/usa-action/variables/SEI/#codes_section.  And so on.  If you fail to exclude the numeric codes for missing values from the calculation of a mean, you may get peculiarly high values (if N/A was being represented as 999999) or peculiarly low values (if N/A was being represented as 0).  If you are using other variables, you will need to check the documentation for them to see how missing or N/A was coded, and then exclude those values.
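A small Python sketch shows how much a single missing-data code can distort a mean.  The income values are invented, and the only code assumed is the INCTOT missing value of 9999999 mentioned above.

```python
from statistics import mean

# Missing-data code for INCTOT, as given in the text above.
INCTOT_MISSING = {9999999}

# Invented incomes: three real values and one missing-data code.
incomes = [20000, 30000, 40000, 9999999]

# Naive mean: the missing code is treated as if it were a real income.
naive = mean(incomes)

# Corrected mean: drop the missing code before averaging.
valid = [v for v in incomes if v not in INCTOT_MISSING]
corrected = mean(valid)

print(naive, corrected)
```

With only one missing case out of four, the naive mean is in the millions, while the mean of the valid incomes is 30000.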

Demographic and Socioeconomic Characteristics to Treat as Outcomes/Dependent Variables

Basic demographic and socioeconomic variables available in most of the decennial Censuses that you might want to consider as outcomes (dependent variables) include but are not limited to:

  • Current marital status (MARST)
  • Number of children born (CHBORN)
  • Age at first marriage (AGEMARR)
  • Total individual income (INCTOT)
  • Poverty status (POVERTY)
  • Educational attainment (EDUC)
  • Socioeconomic index (SEI) – this is a commonly used measure of the standing of an individual’s occupation.
  • Of course if you have found another variable that you are interested in, you are welcome to use that.  Some of you have mentioned school enrollment, home ownership, type of school, health insurance, and so forth.

The ACS also includes a rich set of demographic variables that could be used as outcomes.  The ACS are the data that show up annually since 2000 for 2001, 2002, 2003 etc.  The most interesting ones relevant to the class are some variables for very recent years that indicate whether certain events have occurred in the last year, and could be the basis of the calculation of rates, as opposed to percentages.

These lists are only meant as suggestions, and if you have other interests that can be addressed with other variables you have found, you may pursue them.

Demographic and Socioeconomic Characteristics to Treat as Explanatory/Independent Variables

Generally your explanatory variables should precede your outcome variables in time.  That doesn’t always  mean they have a causal effect on the outcome, but a causal interpretation is at least more plausible.  So, for example, you might examine number of children born (CHBORN) for women aged 45 according to their level of education (EDUC), but you probably won’t think about studying the education of women aged 45 according to their number of children.  The variables are of course the same in both cases, but the interpretation of which is an outcome and which is explanatory differs.

  • Race (RACE) – Note that since 2000, Race includes codes identifying people who have said they were two or more races.   There are also codes since 2000 for single races, for example, RACASIAN
  • Hispanic (HISPAN) – Note that Hispanic status is separate from race.
  • A variety of other nativity and ancestry variables are available at http://usa.ipums.org/usa-action/variables/group/race_eth.  The availability of these variables tends to change over time, so there isn’t really one nativity or ancestry variable that is available on a continuous basis since 1850.  I will post a separate guide to using some of the key variables.
  • Geographic identifiers in http://usa.ipums.org/usa-action/variables/CITY#codes_section.  Note that the IPUMS doesn’t offer any more detail than City, so with IPUMS you can’t compare different neighborhoods in the same city.
  • Of course you can use EDUC, INCTOT and other variables as explanatory variables, just make sure that your dependent variable comes after them in time.

Examples of tables you could construct

  • Use the comparison of means to look at mean number of children born for people of different races in different years.  In this case, you would select number of children as your dependent variable, and RACE and YEAR as row and column variables.  You would probably want to filter to restrict to (for example) women who were old enough to have completed their childbearing, say 50 years old.  You might want to restrict to decennial census years.
  • Use the comparison of means to look at mean income for people of different ages with different levels of education.  In this case you would select income as your dependent variable, and age and education as your rows and columns.  You would probably want to set a filter to restrict to ages when people might actually have incomes, for example, 25-55.  You would want to recode age so that instead of having a separate row for each single year of age, you have three rows, one for each ten-year age group.
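For anyone unsure what a comparison of means actually calculates, the following Python sketch computes the mean of one variable within each combination of two others.  The records and the helper name means_by_cell are invented for illustration.  The IPUMS tool does the equivalent calculation for you and also applies weights, which this sketch ignores.

```python
from collections import defaultdict

# Invented records: (YEAR, RACE label, number of children ever born).
records = [
    (1980, "White", 2), (1980, "White", 3), (1980, "Black", 3),
    (1990, "White", 2), (1990, "Black", 2), (1990, "Black", 4),
]


def means_by_cell(rows):
    """Return the mean of the third field for each (year, race) combination."""
    totals = defaultdict(lambda: [0, 0])  # (year, race) -> [sum, count]
    for year, race, value in rows:
        cell = totals[(year, race)]
        cell[0] += value
        cell[1] += 1
    return {key: s / n for key, (s, n) in totals.items()}


cell_means = means_by_cell(records)
for (year, race), m in sorted(cell_means.items()):
    print(year, race, m)
```

Each cell of the resulting table is an average for one row and column combination, not a count or a percentage.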

Reminders

  • My posts with IPUMS tips and tricks are accessible via http://camerondcampbell.me/category/ipums/ Make sure to review to see if there is anything that helps you.
  • If you are trying to use an income variable such as INCTOT as a row or column variable, you will need to recode it into a limited number of categories in order for a table to work.  If you simply specify INCTOT or another income variable as a row or column variable, the table won’t run, because there are too many distinct values, requiring thousands of columns or rows.  You will need to use the recode to regroup incomes into a manageable number of categories, and of course exclude 9999999 and 9999998.
  • Most if not all of the income variables, including INCTOT, FINCTOT, and HINCTOT, code missing values or not available as 9999999,  9999998, 999999, 999998, or some variant thereof.  INCTOT codes missing values as 9999999: https://usa.ipums.org/usa-action/variables/INCTOT/#codes_section.  If you are carrying out a comparison of means, you need to exclude those observations because the average shouldn’t include these values.  You could do this by putting inctot(*-9999997) in the filter.
  • Similarly, if you are categorizing income, make sure that the highest category of income doesn’t include 9999998 and 9999999.  For example, inctot(r:0-9999;10000-19999;20000-29999;30000-39999;40000-49999;50000-9999997)
  • Many of the fertility variables use 0 to indicate missing or no response, and 1 to indicate no births or no children.  For example, the ACS variable FERTYR is 0 for Not Available, 1 for no births in the last year, and 2 for one or more births in the last year: https://usa.ipums.org/usa-action/variables/FERTYR#codes_tab .  Similarly, CHBORN is 0 for not available, 1 for no children, 2 for one child, 3 for two children, and so forth: https://usa.ipums.org/usa-action/variables/CHBORN#codes_tab .  Be attentive to this when you interpret your results.  If you are computing mean number of children, or mean numbers of births, you will often want to subtract one from the numbers you present.
  • If you are computing averages of any variables via Comparison of Means, make sure to inspect the detailed documentation for those variables to find out how missing values are coded, and use a selection filter to exclude them.
  • Again, use selection filters to make sure that the observations you include are relevant to the question you are interested in.  For example, if you want to use school to look at whether or not someone is currently enrolled in school, you would want to restrict to people who have a chance of being currently enrolled by applying a selection filter based on age.  Restricting to age(14-18), for example, would let you look at people who were eligible to be in high school.  If you are looking at completed education, normally you would want to restrict to ages 25 and above.
  • Remember that not every variable is available in every year.  For the variables you are interested in, check to see which years they are available in.  Some very interesting variables are only available in one or two years.  The variables related to ethnicity, nativity, and origin are especially prone to change.
  • Remember that 2001-2009 are based on the ACS.  If you just want to present data from the decennial Census, you would restrict to years 1850-2000, and if you just wanted ACS data, you would restrict to 2001-2009.
  • Keep in mind that the ACS has some nice variables that allow for direct computation of certain demographic rates, like whether or not someone has married in the last year, whether or not someone has had a birth in the last year, and so forth.
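The reminder about fertility codes can be illustrated with a short Python sketch.  It assumes a CHBORN-style coding in which 0 means not available and every other code is the number of children plus one.  The codes listed and the helper name children_from_chborn are invented for the example, and you should check the codes page for any top-coded or special values before relying on this rule.

```python
def children_from_chborn(code):
    """Convert a CHBORN-style code to a count of children; None for N/A (0).

    Assumes every nonzero code equals the number of children plus one.
    """
    if code == 0:
        return None
    return code - 1


# Invented codes: one N/A, one woman with no children, and three mothers.
codes = [0, 1, 2, 4, 3]

# Drop N/A cases, then shift the remaining codes down by one.
child_counts = [c for c in map(children_from_chborn, codes) if c is not None]
mean_children = sum(child_counts) / len(child_counts)

print(child_counts, mean_children)
```

Averaging the raw codes without dropping the N/A case and subtracting one would give a different and misleading number.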

 

2013 SJTU Summer Short Course: Social Demography

Social Demography

Shanghai Jiaotong University
Summer Short Semester 2013
7/1/2013-7/26/2013

Course description at Shanghai Jiaotong University website: http://summer.jwc.sjtu.edu.cn/web/sjtu/XJXQ/198690.htm

INTRODUCTION

This is an overview class intended to familiarize students with key concepts, major debates, and recent research in population and social demography. The focus will be on contemporary trends in marriage, childbearing, divorce, migration, and health and mortality. Issues discussed will be a balanced mixture of topics of academic interest, contemporary relevance, and policy concern. Along the way, methods and data sources used in the study of population and social demography will be introduced. Readings will include academic publications that are examples of classic or recent work in key issues of population or social demography. Students should come away from the class with an awareness of the range of issues considered in population studies and social demography, a basic understanding of relevant data and methods, and an ability to read articles related to population in an informed and critical fashion.

The emphasis will be on trends and patterns in demographic behavior in the contemporary United States, in historical and comparative perspective.

INSTRUCTOR

Cameron Campbell, camcam@ucla.edu

FORMAT

The class will meet twice a week for four weeks. Each class meeting will last for three hours. The first half of each class meeting will be devoted to lecture relevant to the topic and assigned readings. After a break, the second half will be devoted to class discussion and student presentations of optional readings.

REQUIREMENTS

  • Attendance – 10% Attendance will be taken at each lecture.
  • Discussion – 10% Part of each class meeting will be reserved for discussions of the lecture and the assigned readings. Students are also welcome to initiate discussion or ask questions during lecture, without waiting for the time dedicated to discussion.  Students will be expected to participate in discussion.
  • Research project (written) – 35% Students will complete a research paper describing and interpreting patterns and trends in demographic and socioeconomic characteristics of an ethnic group, state or other geographic region (city etc.), or other well-defined subpopulation, using data from IPUMS USA (http://usa.ipums.org/usa/). Characteristics of interest may include age and sex distribution, marital status, childbearing, and educational attainment. For the paper, students will carry out tabulations at the IPUMS website, produce tables or graphs, and write accompanying text that refers to relevant literature to interpret observed trends. The text should be about 5-7 double-spaced pages of text.
    • Tables, graphs, and references follow at the end and do not count toward the page requirement.
    • All papers must have a reference section
    • Please begin familiarizing yourself with the IPUMS website as soon as possible. In addition to visiting the main IPUMS USA page (http://usa.ipums.org/usa/), please make sure to visit the main page for the Online Data Analysis system (ODA) that you will be using to do the calculations for your research paper: http://usa.ipums.org/usa/sda/. There is also a short set of instructions for using the ODA at: http://usa.ipums.org/usa/resources/sda/sdainstructions.pdf
    • If you are especially interested in economic characteristics of your population of interest, you may also want to consider using Current Population Survey (CPS) data: http://cps.ipums.org/cps/. The Online Data Analysis system for the CPS is available at: http://cps.ipums.org/cps/sda
    • The detailed prompt for the research project is available separately.
    • You may work together on your projects in teams of 2 people.  For team projects, the length requirement is multiplied by the number of team members.  Thus, a paper from a team of two should be 10-14 pages.
  • Presentation on research project  – 15% Students will make short presentations on their research papers at the last two class meetings.
  • Assignments – 30% Assignments will introduce students to various web resources for population and demography.  Assignments should be handed in to the TA at the beginning of the class on the day that they are due.  See the class schedule later in the syllabus for descriptions of the assignments.

READINGS AND RESOURCES

Haupt, Arthur.  2004.  Population Handbook.  Fifth Edition.  Washington: Population Reference Bureau.  http://www.prb.org/pdf/PopHandbook_Eng.pdf

TOPICS AND READINGS ARE PRELIMINARY, AND MAY CHANGE.  CHECK BACK BEFORE CLASS STARTS.

SCHEDULE

Lecture 1 – 7/2/2013

Introduction
Sources for the study of social demography
Population growth over the long term
Population studies and the social sciences

Reading

  • McFalls, Joseph.  2007.  “Population: A lively introduction.  Fifth Edition.”  Population Bulletin.  62(1).  Link
  • Haupt, Chapters 1 and 2

Optional, not required

  • Preston, Samuel H.  1993.  “The Contours of Demography: Estimates and Projections.”  Demography.  30(4):593-606.  JSTOR
  • Keyfitz, Nathan. 1975. “How do we know the facts of demography?” Population and Development Review 1(Dec):267-288. J.

Discussion

Self-introductions

Lecture 2 – 7/4/2013

Demographic behavior in the past
Marriage and childbearing before the 20th century: East-West comparisons
Household and family before the 20th century
Mortality and fertility decline, and demographic transition

Reading

Optional, not required

  • Campbell, Cameron and James Lee. 2010. “Fertility control in historical China revisited: New methods for an old debate.” History of the Family. 15:370-385. doi:10.1016/j.hisfam.2010.09.003.

Discussion

Introduction to IPUMS

Assignment 1

Please review the topics in the syllabus.  Which topic do you find most interesting?  Why?  What related to that topic would you most like to learn about?  One single-spaced page.

Lecture 3 – 7/9/2013

Marriage and Cohabitation
Trends in age at marriage and non-marriage in Asia, North America, and Europe
Non-marriage
Socioeconomic, racial and ethnic differences in marriage
Interracial marriage, educational homogamy, and other aspects of partner choice
Emerging trends: living together apart

Reading

Optional, not required

Discussion

Ideas for topics for the final paper.

Assignment 2

Review the variables available for analysis at the IPUMS website.  Make sure to look at variables available for the Decennial Censuses (1850-2010) and in the American Community Survey (annually since 2000).  After you have examined the site to see what is available, write a page identifying a topic you would like to work on for your final paper and listing the variables that you plan to make use of.

Lecture 4 – 7/11/2013

Racial and socioeconomic differences in childbearing in the U.S.
Non-marital childbearing and childrearing
Changing age patterns of childbearing
Ultra-low fertility in Europe and Asia

Reading

Optional, not required

The West

China

The Rest of the World

Assignment 3

Prepare two tables at the IPUMS website using variables that you are interested in. For this exercise, I strongly encourage you to learn how to recode variables, and use filters to limit the observations included in the calculation.  Recoding variables allows you to regroup values so that for example instead of having a separate row for every year of age, you can have age groups 20-24, 25-29 etc.  If you can do all of this for this exercise, completing the project should be straightforward.  Make sure to pay attention to handling of missing values.

Make sure to read the description of the final project carefully for detailed instructions on handling variables.  Pay special attention to the discussion of recoding variables, handling missing values, and restricting observations by use of filters.

For the first table, carry out a cross-tabulation of one variable against another, with appropriate restrictions on cases and so forth.  By cross-tabulation, I mean that you should select one variable of interest as a row variable, and another variable of interest as a column variable, and use the IPUMS website to prepare a table that summarizes the distribution of one of the variables as a function of the other variable.  For example, you might choose RACE as a column variable, and YEAR as a row variable, and prepare a table that presents the percentage of the population in each race category by year.  Such a table might present the % white, % black etc. in 1850, 1860, and so forth.  Hopefully you can pick a different combination of variables based on your interests.  Most likely you will choose AGE or YEAR as a row variable, and something like education, race, or some other substantive variable as a column variable, and then calculate row percentages so in each year, you can present the % of the population in each of the categories of interest.  Of course you might choose some other combination, like race and education.
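If the idea of row percentages is unfamiliar, this Python sketch shows the arithmetic behind them: each cell count is divided by the total for its row, so the percentages in every row sum to 100.  The data and the helper name row_percentages are invented for illustration, and the sketch ignores the weights that the IPUMS site applies.

```python
from collections import Counter

# Invented (YEAR, RACE) pairs, one per person.
pairs = [
    (1850, "White"), (1850, "White"), (1850, "White"), (1850, "Black"),
    (1860, "White"), (1860, "Black"),
]


def row_percentages(observations):
    """Return {(row, column): percentage of that row's total}."""
    cell_counts = Counter(observations)
    row_totals = Counter(row for row, _ in observations)
    return {
        (row, col): 100.0 * n / row_totals[row]
        for (row, col), n in cell_counts.items()
    }


pct = row_percentages(pairs)
for (year, race), p in sorted(pct.items()):
    print(year, race, p)
```

With YEAR as the row variable, each row answers the question of how the population in that year was distributed across the categories of the column variable.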

Make sure to apply appropriate restrictions (see the prompt for the final project for details of using filters) so that your calculation makes sense.   If you are looking at education, you will almost always want to restrict to people old enough to have finished their education, that is people 25 and above.  If you are looking at something related to marriage, you will want to restrict to people old enough to marry, that is 16 and above.  And so forth.

For the second table, use the comparison of means to calculate the mean of one variable according to the values of two other variables chosen as row and column variables.  Here is an explanation that I prepared for using comparison of means to calculate percentages/proportions.  For example, you can use comparison of means to calculate the percentage of people who have ever been married, according to their age and level of education.  You would choose age as a row variable, education as a column variable, and then compute the mean of a recoded marital status variable to get the proportion married.  Of course you could also compute the mean of some other variable, like number of children, or income.  You may need to recode so that the mean actually makes sense.
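The reason a mean can stand in for a proportion is that the mean of a variable coded 0 or 1 equals the share of cases coded 1.  The Python sketch below illustrates this with invented marital status codes.  It assumes a MARST-style coding in which 6 means never married and every other code means ever married.  Confirm the actual codes in the variable documentation before using this rule.

```python
# Assumed code for "never married"; confirm against the MARST codes page.
NEVER_MARRIED = 6

# Invented marital status codes for eight people.
marst = [1, 6, 4, 6, 1, 5, 2, 6]

# Recode to an indicator: 1 if ever married, 0 if never married.
ever_married = [0 if code == NEVER_MARRIED else 1 for code in marst]

# The mean of the 0/1 indicator is the proportion ever married.
proportion = sum(ever_married) / len(ever_married)

print(proportion)
```

Multiplying the result by 100 gives the percentage ever married for that cell of the table.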

Lecture 5 – 7/16/2013

Divorce and Union Dissolution
Trends in divorce rates: the leveling of divorce in North America, rising divorce rates in East Asia
Racial and socioeconomic differences in divorce
Implications of divorce for couples and for children

Reading

Optional, not required

Assignment 4

Select two or three of the optional readings in the syllabus that are all on a related theme, and write a review and comparison.  What hypotheses do the authors seek to test?  What data and methods do they use?  What are their conclusions?  Which of the readings do you find most convincing?  If you were to carry out a similar analysis in China, what would you focus on?

Lecture 6 – 7/18/2013

Migration
International migration
Domestic migration, residential segregation, and neighborhood formation

Reading

Lecture 7 – 7/23/2013

Health and mortality

Reading

Lecture 8 – 7/25/2013

Research project presentations

Final project due

WEB LINKS

Information for non-SJTU students about registering for the class

Class-related resources

SJTU Summer Course in Social Demography: Final Project

Final Project
Detailed Description of Requirements
Due on paper at the last class meeting. 
You are to write an original research paper that uses the IPUMS site to carry out a basic comparative study of trends and patterns in demographic behaviors such as marriage and reproduction by education, income, ethnicity, race, region, sex, or some other variables.  The emphasis is on comparison.  If you are interested in a particular ethnicity, for example, you still need to compare it to other ethnicities or the population as a whole to establish what is distinct about it.
Please read the following directions carefully.  Since you have approximately 3 weeks to complete the project, there is no excuse for not complying with the instructions.
Your research paper should be based on computations at the IPUMS site.   The paper should be organized as the text, followed by the references, followed by the tables, with each table on a separate page.  You should construct four to six tables, each of which represents a different combination of variables.  Do not insert tables into the main text.  Please number all pages, and make sure that your name is on the first page.
The text should consist of four sections: Introduction, Background, Results, and Conclusion.
If you are working together, the required number of tables scales up according to the number of people in your team.
The Introduction should explain the overall focus of the paper and specify the questions that you are interested in.  
The Background section provides whatever information from other published sources you think may be necessary to help a reader understand the object of your study.  For example, if your tables compare different ethnic groups, you might provide a brief history of each group in the United States that focuses on features relevant to the analysis.  If you are comparing several major cities, you might want to mention key features of each that are relevant to your analyses.
The Results section discusses the tables one by one, and interprets their contents in light of hypotheses or theories in the introduction.  The tables should be numbered consecutively, and referred to in the text as Table 1, Table 2, etc.
The Conclusion reviews the most interesting results in the paper and suggests further work.  

Tables
Each of the tables should examine relationships among a distinct set of variables.  In other words, the tables should not be repetitions of the same basic tabulation but with different filters.  At least two tables should make use of ACS data, which are annual starting in 2001.  At least two tables should make use of Decennial Census data.  You may also use the Current Population Survey (CPS) data at the IPUMS site.  It tends to have much richer detail on employment and so forth.   
For your tables, you may also use General Social Survey (GSS) data, which is available at a different website (http://sda.berkeley.edu/cgi-bin/hsda?harcsda+gss10) but can be analyzed via a web interface like the one that you are already familiar with at IPUMS.  The GSS includes questions on topics like religion, political views, and so forth that are not covered in the Census.  Keep in mind that if you want to use the GSS, the tables you create should have something to do with demographic behavior, broadly defined.
Each table should also have a self-explanatory title, and the row and column headings should be sufficient to allow a reader to interpret the table without referring to the text.  Please format the tables so that there are no vertical lines, and only four horizontal lines: one between the title and the column headings, one between the column headings and the table contents, one between the table contents and the totals row, and one at the bottom.  Basically the table should be formatted like the ones you see in the papers in the assigned reading.  You will notice that in publications, tables almost never have vertical lines, and generally have a limited number of horizontal lines.  The tables should not be copied and pasted directly from the site, but rather should be formatted to publication quality, following the guidelines above.
The tables may be frequencies or cross-tabulations, or you may take advantage of some of the other analytic tools available at the site.  You are most likely to find the comparison of means tool (http://sda.usa.ipums.org/cgi-bin/sdaweb/hsda?harcsda+1850-2009) the most useful.   This allows you to calculate the mean of one variable for different combinations of other variables.  For example, you could calculate mean income (INCTOT) for different combinations of RACE and YEAR.  If you are more adventurous, you may try using the correlation or regression tools, but these can take a long time.
In constructing your tables, make sure to filter observations so that the ones you include are relevant.  Depending on what analysis you are doing, you may want to use a filter to restrict to particular ages, or to people with particular characteristics.  You may also want to use recode for variables like age that take on many values, so that instead of having a separate row or column for each age (1, 2, 3, etc.) you have a limited number of age groups (1-9, 10-19, etc.).  You will also need to pay attention to codes for a variable that indicate that the information was missing.  Normally you will want to exclude these from your analysis.
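
To make the three steps concrete (filter, recode, exclude missing codes), here is a minimal Python sketch.  The variable names follow IPUMS, but the records are invented, and the missing-data code is an assumption for the example.  You should always check the codes page for each variable at the IPUMS site.  The online system handles these steps through its filter and recode options, so this is only meant to show the logic.

```python
# Hypothetical person records; the values and the missing-data code
# below are invented for this example.
MISSING_INCOME = 9999999  # assumed code meaning "not available"

people = [
    {"AGE": 23, "INCTOT": 12000},
    {"AGE": 27, "INCTOT": 30000},
    {"AGE": 31, "INCTOT": 50000},
    {"AGE": 38, "INCTOT": MISSING_INCOME},
    {"AGE": 42, "INCTOT": 60000},
    {"AGE": 47, "INCTOT": 80000},
    {"AGE": 61, "INCTOT": 20000},
]

def age_group(age):
    """Recode single years of age into ten-year groups: 25-34, 35-44, ..."""
    lower = 25 + 10 * ((age - 25) // 10)
    return f"{lower}-{lower + 9}"

# Filter: keep ages 25-54 and drop records with the missing-data code.
kept = [p for p in people
        if 25 <= p["AGE"] <= 54 and p["INCTOT"] != MISSING_INCOME]

# Recode age, then compute the mean income in each age group.
incomes = {}
for p in kept:
    incomes.setdefault(age_group(p["AGE"]), []).append(p["INCTOT"])
mean_income = {group: sum(v) / len(v) for group, v in incomes.items()}
print(mean_income)
```

Note what would happen without the missing-data exclusion: the 38-year-old's code of 9999999 would be averaged in as if it were a real income.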
Demographic and Socioeconomic Characteristics to Treat as Outcomes/Dependent Variables

Basic demographic and socioeconomic variables available in most of the decennial Censuses that you might want to consider as outcomes (dependent variables) include but are not limited to:
  • Current marital status (MARST)
  • Number of children born (CHBORN)
  • Age at first marriage (AGEMARR)

Of course, if you have found another variable that you are interested in, you are welcome to use that.  Some of you have mentioned school enrollment, home ownership, type of school, health insurance, and so forth.
The ACS also includes a rich set of demographic variables that could be used as outcomes.  The ACS data are the samples that show up annually since 2000 (2001, 2002, 2003, etc.).  The most relevant to the class are some variables, available for very recent years, that indicate whether certain events have occurred in the last year and could be the basis for calculating rates, as opposed to percentages:
  • Children born within the last year (FERTYR)
  • Married, divorced, or widowed within the last year (MARRINYR, DIVINYR, WIDINYR)

 These lists are only meant as suggestions, and if you have other interests that can be addressed with other variables you have found, you may pursue them.
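
For the ACS variables above that record events in the last year, a rate is simply the number of events divided by the number of people at risk, usually scaled to be per 1,000.  Here is a minimal Python sketch.  The counts and the choice of women aged 15-44 as the population at risk are made up for illustration and are not taken from IPUMS output.

```python
def rate_per_thousand(events, population_at_risk):
    """Events during the year per 1,000 people at risk."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return 1000 * events / population_at_risk

# Hypothetical counts: women aged 15-44, and how many of them
# reported a birth within the last year.
women_15_44 = 20000
births_last_year = 1300
birth_rate = rate_per_thousand(births_last_year, women_15_44)
print(birth_rate)  # 65.0 births per 1,000 women
```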
Socioeconomic Characteristics to Treat as Explanatory/Independent Variables
There are a vast number of variables that you could use as explanatory/independent variables in your analysis.  I have provided a few examples below.  There are many more that you can see at the IPUMS site.
  • Race (RACE) – Note that since 2000, RACE includes codes identifying people who reported two or more races.  Since 2000 there are also separate variables for single races, for example, RACASIAN.
  • Hispanic (HISPAN) – Note that Hispanic status is separate from race.
  • A variety of other nativity and ancestry variables are available at http://usa.ipums.org/usa-action/variables/group/race_eth.  The availability of these variables tends to change over time, so there isn’t really one nativity or ancestry variable that is available on a continuous basis since 1850.  I will post a separate guide to using some of the key variables.
  •  Geographic identifiers in http://usa.ipums.org/usa-action/variables/CITY#codes_section.  Note that the IPUMS doesn’t offer any more detail than City, so with IPUMS you can’t compare different neighborhoods in the same city.
  • Total individual income (INCTOT)
  • Poverty status (POVERTY)
  • Educational attainment (EDUC)
  •  Socioeconomic index (SEI) – this is a commonly used measure of the standing of an individual’s occupation. 
Examples of tables you could construct (please come up with your own tables, don’t just produce these)

Use the comparison of means to look at mean number of children born for people of different races in different years. In this case, you would select number of children as your dependent variable, and RACE and YEAR as row and column variables. You would probably want to filter to restrict to (for example) women who were old enough to have completed their childbearing, say 50 years old. You might want to restrict to decennial census years.

Use the comparison of means to look at mean income for people of different ages with different levels of education. In this case you would select income as your dependent variable, and age and education as your rows and columns. You would probably want to set a filter to restrict to ages when people might actually have incomes, for example, 25-55. You would want to recode age so that instead of having a separate row for each single year of age, you have a few rows, one for each ten-year age group.

2012 SJTU Summer Short Course: Social Demography

Social Demography
SJTU Summer Short Semester 2012

INTRODUCTION

This is an overview class intended to familiarize students with key concepts, major debates, and recent research in population and social demography. The focus will be on contemporary trends in marriage, childbearing, divorce, migration, and health and mortality. Issues discussed will be a balanced mixture of topics of academic interest, contemporary relevance, and policy concern. Along the way, methods and data sources used in the study of population and social demography will be introduced. Readings will include academic publications that are examples of classic or recent work in key issues of population or social demography. Students should come away from the class with an awareness of the range of issues considered in population studies and social demography, a basic understanding of relevant data and methods, and an ability to read articles related to population in an informed and critical fashion.

The emphasis will be on trends and patterns in demographic behavior in the contemporary United States, in historical and comparative perspective.

INSTRUCTOR

Cameron Campbell, camcam@ucla.edu


FORMAT

The class will meet twice a week for four weeks. Each class meeting will last for three hours. The first half of each class meeting will be devoted to lecture relevant to the topic and assigned readings. After a break, the second half will be devoted to class discussion and student presentations of optional readings.

REQUIREMENTS

  • Attendance – 10%.  Attendance will be taken at each lecture.
  • Discussion – 5%.  Half of each class meeting will be reserved for discussions of the lecture and the assigned readings. Students will be expected to participate in discussion. Each student will be expected to introduce and discuss at least one paper of their own choice that is relevant to the topics in class but not on the list of required readings.
  • Research project (written) – 35%.  Students will complete a research paper describing and interpreting patterns and trends in demographic and socioeconomic characteristics of an ethnic group, state or other geographic region (city etc.), or other well-defined subpopulation, using data from IPUMS USA (http://usa.ipums.org/usa/). Characteristics of interest may include age and sex distribution, marital status, childbearing, and educational attainment. For the paper, students will carry out tabulations at the IPUMS website, produce tables or graphs, and write accompanying text that refers to relevant literature to interpret observed trends. The text should be about 5-7 double-spaced pages.
    • Tables, graphs, and references follow at the end and do not count toward the page requirement.
    • All papers must have a reference section.
    • Please begin familiarizing yourself with the IPUMS website as soon as possible. In addition to visiting the main IPUMS USA page (http://usa.ipums.org/usa/), please make sure to visit the main page for the Online Data Analysis system (ODA) that you will be using to do the calculations for your research paper: http://usa.ipums.org/usa/sda/. There is also a short set of instructions for using the ODA at: http://usa.ipums.org/usa/resources/sda/sdainstructions.pdf
    • If you are especially interested in economic characteristics of your population of interest, you may also want to consider using Current Population Survey (CPS) data: http://cps.ipums.org/cps/. The Online Data Analysis system for the CPS is available at: http://cps.ipums.org/cps/sda
    • The prompt for the research project will be posted separately.
    • You may work together on your projects in teams of up to 4 people.  For team projects, the length requirement is multiplied by the number of team members.  Thus, a paper from a team of two should be 10-14 pages, from a team of three 15-21 pages, and from a team of four 20-28 pages.
  • Research project (presentation) – 15%.  Students will make short presentations on their research papers at the last two class meetings.
  • Assignments – 35%.  Assignments will introduce students to various web resources for population and demography.  There will be 4 to 6 assignments, and they will have equal weight.

READINGS AND RESOURCES

Haupt, Arthur.  2004.  Population Handbook.  Fifth Edition.  Washington: Population Reference Bureau.  http://www.prb.org/pdf/PopHandbook_Eng.pdf

SCHEDULE