Here, however, is the detail on the third step:
What is the regression estimate? It is an analysis of what the polling data “should” be in each state based on its underlying demographics. Put differently, it is a way not to be held hostage by the results of individual polls that might defy common sense, particularly where polling data in a state is sparse.
Polls are an imperfect measure of voter sentiment, subject to the vagaries of small sample size, poor methodology, and transient blips and trends in the numbers. For example, the late February SurveyUSA polls had Barack Obama four points ahead of John McCain in North Dakota, but behind by four points in South Dakota. Since North Dakota and South Dakota are very similar, it is unlikely that there is a true eight-point differential in the polling in these states. The regression estimate is able to sniff out such discrepancies.
For general background on the process of regression analysis, see here.
What is the dependent variable in the regression analysis? Technically speaking, there are two regressions that are computed in each state. The first regression is a regression on the share of the two-way (Democrat + Republican) vote held by the Democratic candidate in that state based on our current polling averages after adjustment for present trendlines. The second is a regression on the total committed vote held by either of the major-party candidates.
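As a rough illustration of the two dependent variables described above, here is a minimal sketch. The function names and the 46/44 polling average are hypothetical and do not come from the model itself, and the trendline adjustment is not shown.

```python
def two_way_share(dem_pct: float, rep_pct: float) -> float:
    """Democratic share of the two-way (Democrat + Republican) vote."""
    return dem_pct / (dem_pct + rep_pct)


def committed_vote(dem_pct: float, rep_pct: float) -> float:
    """Total share of respondents committed to either major-party candidate."""
    return dem_pct + rep_pct


# Hypothetical polling average: 46% Democrat, 44% Republican, 10% undecided/other.
dem, rep = 46.0, 44.0
print(round(two_way_share(dem, rep), 4))  # 0.5111
print(committed_vote(dem, rep))           # 90.0
```

The first number is what the first regression would try to explain in each state; the second is the target of the second regression.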
What independent variables are included in the regression estimate? The regression models evaluate a total of 16 candidate variables. Variables are dropped via a stepwise process, until such time as each remaining variable is statistically significant at the 85% level or higher.
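The FAQ does not spell out the exact stepwise procedure, so the following is only a sketch of one common reading: backward elimination, dropping the weakest variable until every survivor clears the 85% level. The helper names, the toy data, and the use of a normal approximation in place of the t-distribution are all assumptions made for illustration.

```python
from statistics import NormalDist


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [a - f * v for a, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ols_t_stats(X, y):
    """t-statistic of each column of X; an intercept is added but not reported."""
    rows = [[1.0] + list(r) for r in X]
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    t_stats = []
    for j in range(1, k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        var_j = sigma2 * solve(xtx, unit)[j]  # j-th diagonal entry of (X'X)^-1
        t_stats.append(beta[j] / var_j ** 0.5)
    return t_stats


def backward_stepwise(X, y, names, level=0.85):
    """Drop the weakest variable until all remaining ones pass the given level."""
    # Assumption: two-sided normal critical value instead of the t-distribution.
    crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    keep = list(range(len(names)))
    while keep:
        t = ols_t_stats([[row[i] for i in keep] for row in X], y)
        worst = min(range(len(keep)), key=lambda i: abs(t[i]))
        if abs(t[worst]) >= crit:
            break
        del keep[worst]
    return [names[i] for i in keep]


# Toy data (not real state data): y depends on "useful" only.
useful = [1, 2, 3, 4, 5, 6, 7, 8]
irrelevant = [1, 1, -1, -1, 1, 1, -1, -1]
noise = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5]
y = [2 * u + e for u, e in zip(useful, noise)]
X = [list(pair) for pair in zip(useful, irrelevant)]
print(backward_stepwise(X, y, ["useful", "irrelevant"]))  # ['useful']
```

With 16 candidate variables and roughly 50 states, a real implementation would more likely lean on a statistics library and exact t-distribution p-values; the loop structure is the point here.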
The 16 variables presently considered by the model are as follows:
Political
1. Kerry. John Kerry’s vote share in 2004. Note that an adjustment is made in Massachusetts and Texas, the home states of Kerry and George W. Bush respectively, based on Al Gore’s results in Massachusetts in 2000, and Bob Dole’s results in Texas in 1996.
2. Fundraising Share. The total share of funds raised in that state by each candidate (expressed specifically as the percentage of all funds raised that were raised by the Democratic candidate).
3. Clinton. The percentage of the two-way (Obama + Clinton) Democratic primary vote received by Hillary Clinton in that state. An adjustment is made to caucus states to account for their higher proclivity to vote for Barack Obama. In Michigan, the variable is based on the results of exit polling, which indicated who voters would have selected if all candidates were on the ballot.
4. Liberal-Conservative (Likert) Score. Per 2004 exit polls, a state’s liberal-conservative orientation, wherein each liberal voter is given a score of 10, each moderate a score of 5, and each conservative a score of 0. The most liberal state, Massachusetts, has a Likert score of 5.65. The most conservative, Utah, has a score of 3.30.
Religious Identity
5. Evangelical. The proportion of white evangelical Protestants in each state.
6. Catholic. The proportion of Catholics in each state.
7. Mormon. The proportion of LDS voters in each state.
Ethnic and Racial Identity
8. African-American. The proportion of African-Americans in each state.
9. Hispanic. The number of Latino voters in each state as a proportion of overall voter turnout in 2004, as estimated by the Census Bureau. I use data based on turnout rather than on the underlying Latino population because Latino registration and turnout vary significantly from state to state. Turnout is much higher in New Mexico, for instance, which has many Hispanics who have been in the country for generations, than it is in Nevada, where many Hispanics are new migrants and are not yet registered.
10. “American”. The proportion of residents who report their ancestry as “American” in each state, which tends to be highest in the Appalachians. See discussion here.
Economic
11. PCI. Per capita income in each state.
12. Manufacturing. The proportion of jobs in each state that are in the manufacturing sector.
Demographic
13. Senior. The proportion of the white population aged 65 or older in each state. Because life expectancy varies significantly among different ethnic groups, this version has more explanatory significance than one based on the entire (white and non-white) population.
14. Twenty. The proportion of residents aged 18–29 in each state, as a fraction of the overall adult population.
15. Education. Average number of years of schooling completed for adults aged 25 and older in each state.
16. Suburban. The proportion of voters in each state that live in suburban environments, per 2004 exit polls.
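The Liberal-Conservative (Likert) score in item 4 is a weighted average, which a short sketch makes concrete. The function name and the 25/45/30 exit-poll split are hypothetical, chosen only to show the arithmetic.

```python
def likert_score(liberal: float, moderate: float, conservative: float) -> float:
    """Average ideology score: liberal = 10, moderate = 5, conservative = 0."""
    total = liberal + moderate + conservative
    return (10 * liberal + 5 * moderate + 0 * conservative) / total


# Hypothetical exit-poll split: 25% liberal, 45% moderate, 30% conservative.
print(likert_score(25, 45, 30))  # 4.75
```

A state of nothing but moderates would score exactly 5, which is why the reported range (3.30 for Utah to 5.65 for Massachusetts) clusters around the midpoint.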
How often is the regression updated? The regression updates automatically based on the latest polling data. Periodically, I will also test out new variables for potential inclusion in the model.
============
So to me, this seems like a sincere attempt to adjust for actuals, not an adjustment to make things look better for their guy. The bottom line is that the only poll that matters is the one on November 4, but this is the most comprehensive and least biased site I know of. If you have a better one, clue me in. And realize that, among the sites that predict EVs, they are FAR from the most optimistic.
It’s also easy to crap on something you don’t like, but show me an alternative that makes more sense and says what you want it to say. Oh right, sorry…it doesn’t exist.