Comparing birth cohorts instead of time periods in the IPUMS

[Another note intended for students in my Introduction to Social Demography class who are using IPUMS-USA for their final projects, but which may be of interest to others using IPUMS in their courses]

Many students have expressed interest in examining time trends in average age at marriage, total number of children, completed education, and other phenomena that are fixed relatively early in life.  Looking at these numbers by Census year (i.e. by making year a row or column variable) is possible, but doing so mixes together people who came of age in various eras, unless there is some careful restriction on the ages of the people included.

For example, looking at total number of children born for women in a single Census mixes together relatively young people who went through their childbearing recently, when birth rates were low, and people who went through childbearing earlier, when birth rates were high.  This makes comparison across Census years problematic.  Similar problems exist for average age at marriage, and so forth.  One approach is to use a filter to limit the women included in a comparison to a narrow age range that is easy to compare across censuses.

Another approach, however, is to use a recoded variable for year of birth as a row or column variable, and thereby compare men or women according to the era in which they were born.  This is fairly straightforward.

As an example, to look at trends in average age at marriage in successive birth cohorts, I set up a comparison of means calculation.  I set the dependent variable to agemarr, the row variable to birthyr(r:1890-1899;1900-1909;1910-1919;1920-1929;1930-1939;1940-1949), the column variable to sex, and the filter to agemarr(1-99) age(40-50) birthyr(1890-1940).  I restricted to age 40-50 so that the calculation would only include people who had had an opportunity to marry.  Birthyr is limited to 1940 because agemarr is available only through 1980, and people born in 1940 were the youngest to have reached age 40 by then.  The result was the following:

If I wanted to do this by race, I could have set the column variable to race, and the control variable to sex.
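
For anyone working with a downloaded extract rather than the online tool, the same setup can be roughly sketched in Python.  This is only an illustration of the logic (filter, recode birthyr into decades, then take means by cohort and sex).  The records are made up, not real IPUMS data, and the function names decade_label and cohort_means are my own inventions for this sketch; only the field names mirror the IPUMS variables used above.

```python
from collections import defaultdict
from statistics import mean

# Made-up toy records, NOT real IPUMS data. Field names mirror the
# IPUMS variables birthyr, age, sex, and agemarr.
records = [
    {"birthyr": 1895, "age": 45, "sex": "male", "agemarr": 26},
    {"birthyr": 1898, "age": 42, "sex": "male", "agemarr": 24},
    {"birthyr": 1896, "age": 44, "sex": "female", "agemarr": 22},
    {"birthyr": 1912, "age": 48, "sex": "female", "agemarr": 20},
    {"birthyr": 1915, "age": 45, "sex": "female", "agemarr": 22},
    {"birthyr": 1915, "age": 45, "sex": "male", "agemarr": 0},   # fails agemarr(1-99)
    {"birthyr": 1935, "age": 35, "sex": "male", "agemarr": 23},  # fails age(40-50)
]

def decade_label(birthyr):
    """Recode a birth year into a ten-year cohort label such as '1890-1899'."""
    start = birthyr // 10 * 10
    return f"{start}-{start + 9}"

def cohort_means(rows):
    """Mean agemarr by (birth cohort, sex), after applying the filter."""
    groups = defaultdict(list)
    for r in rows:
        # Same filter as above: agemarr(1-99) age(40-50) birthyr(1890-1940)
        keep = (1 <= r["agemarr"] <= 99
                and 40 <= r["age"] <= 50
                and 1890 <= r["birthyr"] <= 1940)
        if keep:
            groups[(decade_label(r["birthyr"]), r["sex"])].append(r["agemarr"])
    return {key: mean(vals) for key, vals in sorted(groups.items())}

result = cohort_means(records)
for (cohort, sex), avg in result.items():
    print(cohort, sex, avg)
```

Grouping on the pair (cohort, sex) plays the role of the row and column variables; to compare by race instead, one would group on race in place of sex, assuming the extract includes that variable.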

If I wanted a graph of the average ages for people born in individual years, I could specify the row variable as birthyr with no recode, and then down below check ‘Suppress table’ (since it will have about 50 rows, one for each year) and then under ‘Type of chart’ choose line chart.  The result is the following:

Of course, you could just as easily do this by race, or education, or something else.

As another example, I redid the calculation to look at mean number of children ever born (chborn).  I set the dependent variable to chborn, the row to birthyr(r:1850-1859;1860-1869;1870-1879;1880-1889;1890-1899;1900-1909;1910-1919;1920-1929;1930-1939;1940-1949), and the filter to age(50-80) birthyr(1850-1940) chborn(1-*).  The filter for age being 50-80 restricts to women who are at least age 50, and have therefore completed their childbearing.  The restriction of chborn to 1 and higher reflects the fact that chborn is 0 for people for whom the information is not available, and 1 plus the number of children for everyone else.  Thus chborn being 1 means 0 children, chborn being 2 means 1 child, etc.  In interpreting the results below, remember that the mean of chborn is one higher than the actual mean number of children.  To get the mean number of children, you need to subtract one.
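
The chborn coding and the subtract-one step can be sketched in Python as follows.  Again, this is only an illustration under my own assumptions: the records are made up rather than taken from IPUMS, and the function name mean_children_by_cohort is hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Made-up toy records, NOT real IPUMS data.
# chborn coding as described above: 0 = not available, 1 = no children,
# 2 = one child, and so on.
women = [
    {"birthyr": 1855, "age": 55, "chborn": 6},  # 5 children
    {"birthyr": 1858, "age": 52, "chborn": 4},  # 3 children
    {"birthyr": 1905, "age": 65, "chborn": 1},  # 0 children
    {"birthyr": 1908, "age": 62, "chborn": 4},  # 3 children
    {"birthyr": 1909, "age": 61, "chborn": 0},  # not available: dropped
    {"birthyr": 1925, "age": 45, "chborn": 3},  # under age 50: dropped
]

def mean_children_by_cohort(rows):
    """Mean number of children by ten-year birth cohort."""
    groups = defaultdict(list)
    for r in rows:
        # Same filter as above: age(50-80) birthyr(1850-1940) chborn(1-*)
        if 50 <= r["age"] <= 80 and 1850 <= r["birthyr"] <= 1940 and r["chborn"] >= 1:
            start = r["birthyr"] // 10 * 10
            # Subtract one to turn the chborn code into a count of children.
            groups[f"{start}-{start + 9}"].append(r["chborn"] - 1)
    return {cohort: mean(vals) for cohort, vals in sorted(groups.items())}

result = mean_children_by_cohort(women)
for cohort, avg in result.items():
    print(cohort, avg)
```

Subtracting one from each record before averaging gives the same answer as subtracting one from the mean of chborn, which is what you would do with the output of the online tool.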

You may want to at least consider this approach for any outcome that is fixed relatively early in life, and may vary a lot according to the era in which someone grew up.  Educational attainment would be another logical choice.