Overall Estimation Strategy

The main goal of the Small Area Income and Poverty Estimates project is to make intercensal estimates of median income and numbers of poor for states and counties. For each of five key income and poverty statistics at the state level and each of four key income and poverty statistics at the county level, we have used a combination of multiple regression estimation techniques and shrinkage techniques to create these estimates. At the state level we model poverty rates (or something very close to them); to obtain estimates of numbers of poor persons we multiply these rates by demographic estimates of their denominators. At the county level we model number of poor directly; we do not model poverty rates for counties because we do not know how to gauge the quality of the population estimates for counties. Our modeling relies on administrative data derived from tax returns, counts of food stamp participants, data from the Bureau of Economic Analysis (BEA), decennial census estimates, intercensal population estimates, and the March Current Population Survey (CPS). Using these administrative and survey data, we build dependent and independent variables for our models and test our models.

Estimates from the March CPS provide the measures of income and poverty that serve as the dependent variables in the regression models. The March CPS estimates have been chosen over those from the decennial census for several reasons. First, the March CPS provides the only timely, consistent series of income and poverty estimates during the intercensal period. Second, the March CPS is the official source of national poverty estimates. Third, if we relied on estimates from the most recent census as the dependent variable, we would have to assume that the relationship between the dependent and predictor (administrative) variables remained constant during the postcensal period; events have already proven that assumption false. Hence, use of the March CPS data permits development of a new set of equations for each target year during the decade, and these equations will reflect then-current relationships between income, poverty, and their predictor variables.

Choosing the March CPS as the source of the dependent variable in our regression models has two consequences. First, postcensal estimates are not updates of the census income and poverty measures, because the CPS and the census are known to estimate different measures of income and poverty. The model-based intercensal estimates are therefore not directly comparable to the census. Further information is available by returning to the Documentation page and consulting the "Cautions" section. Second, because the March CPS sample size is relatively small, we divide the task of providing intercensal estimates for states and counties into related but separate modeling efforts. While the March CPS sample sizes for some states are large enough to permit the derivation of direct state estimates for some of the key statistics, they are not sufficient for all statistics in some states, or for any statistics in most states. Direct, usable estimates from the CPS are possible for only a handful of counties, and only slightly more than one-third of all counties contain any March CPS sample households. In short, we adopted the strategy of separating the state and county models because we felt that models constructed for states would be superior in terms of goodness-of-fit, and that their results could provide "controls" to which the weaker county estimates could be adjusted.

For the state regression models, single-year March CPS estimates are used as the dependent variable. For the county regression models, a three-year average of the March CPS income and poverty estimates (e.g., 1992-1994 for 1993) is used. A county regression equation is estimated from observations on the 1,200 to 1,500 counties included in the March CPS sample. From this estimated equation and known values of the administrative variables, a regression "prediction" is obtained for each county. For each county with sample cases in the CPS, the model prediction is combined with the direct sample estimate, with each component receiving a weight. The two weights for each county sum to 1.0; the weight on the model prediction is the ratio of the sampling variance of the direct estimate to its total variance (sampling plus "lack of fit"), and the weight on the direct estimate is the remainder. Thus, the more uncertain the direct sample estimate, the larger the contribution from the regression model. These weights are commonly referred to as "shrinkage weights" and the final estimates as "shrinkage estimates." For counties that are not in the CPS sample, the estimates are based solely on the regression equation.
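
The weighting rule described above can be illustrated with a minimal sketch. It is not the production SAIPE code: the function name, argument names, and the use of `None` to mark a county with no CPS sample are all assumptions made for illustration, and the sketch assumes the total variance is positive.

```python
def shrinkage_estimate(direct, prediction, sampling_var, lack_of_fit_var):
    """Combine a direct survey estimate with a regression prediction.

    Hypothetical helper for illustration only. `direct` is None for a
    county with no CPS sample cases, in which case the regression
    prediction is used on its own.
    """
    if direct is None:
        return prediction

    # Total variance of the direct estimate: sampling plus "lack of fit".
    total_var = sampling_var + lack_of_fit_var

    # Weight on the model prediction grows with the sampling variance,
    # so noisier direct estimates lean more heavily on the regression.
    w_model = sampling_var / total_var
    w_direct = 1.0 - w_model  # the two weights sum to 1.0

    return w_model * prediction + w_direct * direct
```

With a sampling variance three times the lack-of-fit variance, for example, the model prediction receives a weight of 0.75 and the direct estimate a weight of 0.25.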

Shrinkage techniques are commonly used in estimating values for small geographic areas. We use them to help reduce the uncertainty of our estimates, and to take advantage of all the information we have. However, significant reductions in variances are achieved only in a few counties where the March CPS sample size is relatively large. The average of the weights on the direct CPS estimate, over all counties, is less than 0.02 for related children 5-17 and all people under age 18 in poverty, and less than 0.03 for people of all ages in poverty. Of course, over half the counties have zero weights on their direct estimates because they are not in the CPS sample. For only a handful of counties (7 to 49 depending on the poverty population being estimated and the year) are the weights on the direct CPS estimates 0.25 or more, and only 1 to 3 of these weights exceed 0.50.

The last step in our state and county estimation procedure is to use a simple ratio technique to control the sum of the state-level estimates of the number of poor to the March CPS national estimate of the number of poor, and then to control the sum of the estimates of the number of poor in each state's counties to the controlled estimate for that state. Estimated medians for counties are not similarly controlled to state or national medians. For ease of reference, we usually refer to the final county estimates as "model-based."
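
The ratio technique can be sketched as a proportional scaling applied in two passes, first states to the national total and then counties to their state. The sketch below is an illustration under assumptions: the function name, the area labels, and all numbers are made up, and it covers counts of poor only, since medians are not controlled.

```python
def ratio_control(estimates, control_total):
    """Scale estimates proportionally so that they sum to control_total.

    Hypothetical helper for illustration only; `estimates` maps an area
    label to its estimated number of poor.
    """
    total = sum(estimates.values())
    if total == 0:
        raise ValueError("cannot ratio-adjust estimates that sum to zero")
    factor = control_total / total
    return {area: value * factor for area, value in estimates.items()}


# Made-up numbers, not SAIPE data.
national_total = 200.0

# Pass 1: control state estimates to the national estimate.
states = ratio_control({"State A": 30.0, "State B": 70.0}, national_total)

# Pass 2: control one state's county estimates to that state's controlled total.
counties_in_a = ratio_control(
    {"County 1": 10.0, "County 2": 40.0}, states["State A"]
)
```

Because every area in a group is multiplied by the same factor, the relative sizes of the estimates within the group are unchanged; only their sum moves to match the control total.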

We also produce School District Estimates on the basis of our county model-based estimates. Further details on the SAIPE state and county models can be found by returning to the Documentation page and consulting the "Model Details by Income Year" section.

Source: U.S. Census Bureau

**Last Revised: Thursday, 19-Aug-99 08:08:43**
