Frequently Asked Questions
How do I get started using the state legislative ideology data?
First, download the data set you are interested in as well as the codebook (instructions). When you download the data from Dataverse, you can choose the format you want to download it in. Choose Stata or R if you are using those statistical packages, or CSV if you want to import the data into a spreadsheet like Excel or Google Docs, or some other package. You can download the entire data set or only a subset.
Technical details about how we derive our measures can be found in our American Political Science Review paper here. Alternatively, reading the blog posts on this site as well as Boris Shor’s research blog should give you a non-technical introduction to our methodology, as well as practical examples of using the data to explain current politics.
What’s the difference between the aggregate (state level) and individual level data, and when would I want to use one or the other?
These are two perspectives on the same data. The individual level data include our measures for individual state legislators who’ve served in elected office some time between 1993 and 2011. This would be useful if you are interested in something related to individual state legislators; this could include trying to understand how liberal or conservative they are, or explaining their votes on particular bills.
The aggregate, or state level, data is a more forest-level perspective. The individual data is combined together for legislative chambers in particular years. This would be useful in comparing states with each other, such as seeing which states are the most and least polarized. This data can also be useful in understanding how a state changes over time. For example, our data shows that Nebraska’s Unicameral appears to be rapidly polarizing.
What are the key variables I should be looking at?
The first thing to do is look at the codebooks for each of the two data sets. These contain brief descriptions of each variable in the data.
That said, for the individual level data, np_score is the key variable. This is our measure of the ideology of that particular state legislator. Larger (positive) numbers are more conservative; smaller (negative) numbers are more liberal. This is a typical convention in ideology research. The chamber-year variables such as “house2008” or “senate2010” indicate the years in which that legislator served in that particular chamber, denoted by a “1”.
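As a rough sketch of how these variables fit together, the Python snippet below selects the legislators who served in a given chamber-year and summarizes their np_score values. The rows are invented for illustration and are not real scores. The "name" and "party" columns and the helper served_in are assumptions made for this example, as is the idea that the indicator is stored as the text "1" in a CSV export, so check the codebook for the actual layout.

```python
import csv
import io
from statistics import median

# Invented rows for illustration only; these are NOT real scores.
# "name" and "party" are assumed column names; check the codebook.
SAMPLE_CSV = """name,party,np_score,house2008,senate2010
Legislator A,D,-1.2,1,0
Legislator B,R,0.9,1,0
Legislator C,R,1.4,0,1
Legislator D,D,-0.3,1,1
"""


def served_in(rows, chamber_year):
    """Return rows whose chamber-year indicator (e.g. "house2008") equals "1"."""
    return [r for r in rows if (r.get(chamber_year) or "").strip() == "1"]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
house_2008 = served_in(rows, "house2008")
scores = [float(r["np_score"]) for r in house_2008]

# Larger (positive) values are more conservative; smaller (negative) more liberal.
chamber_median = median(scores)
most_conservative = max(house_2008, key=lambda r: float(r["np_score"]))

print(chamber_median, most_conservative["name"])
```

With a real download you would replace the in-memory sample with the CSV file obtained from Dataverse and keep the same filtering logic.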
For the aggregate level data, there are a number of key variables, depending on the user’s interests. For comparing states against each other in terms of ideology, one would look at hou_chamber or sen_chamber, which gives the median ideology of that chamber in that state in that year. For determining how liberal or conservative Republicans and Democrats are in a state in a given year, you could look at hou_dem, hou_rep, sen_dem, or sen_rep. If you are interested in polarization, you could look at h_diffs or s_diffs, which measure how far the medians (centers) of the parties are from each other (a common measure of polarization).
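The idea behind a party-median distance can be illustrated in a few lines of Python. This is a sketch under assumptions, not the procedure used to build the published h_diffs and s_diffs variables: the scores are invented, the function name party_median_gap is hypothetical, and the absolute value is used here only so that the gap is never negative.

```python
from statistics import median


def party_median_gap(dem_scores, rep_scores):
    """Distance between the Democratic and Republican median scores."""
    return abs(median(rep_scores) - median(dem_scores))


# Invented scores for one hypothetical chamber-year (not real data).
dem = [-1.5, -1.0, -0.5]
rep = [0.5, 1.0, 1.5]

gap = party_median_gap(dem, rep)
print(gap)
```

A larger gap means the two party medians sit further apart, which is the sense of polarization described above.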
Why is there only one score per legislator in the individual level data? And how come there are sometimes two?
Our methodology assigns a single score per legislator throughout their career within a given party and a given district. If they change parties they are treated as a “new” legislator; the same is true if they switch districts as a result of redistricting.
Political scientists have found plenty of evidence that elected legislators, such as Members of Congress, barely switch their ideological position over time. In the words of Keith Poole, they “die in their ideological boots.” The sole exception to that consistency is when legislators switch parties; they then dramatically change their voting behavior, which we call ideology.
This has a practical consequence for the state level data. When states have different scores over time, this is because of the change in their composition (some legislators arrive, others leave) and not because individual legislators change their minds over time.
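A toy Python example may make this concrete. Each legislator below keeps one fixed score, yet the chamber median moves because the roster changes. All names, scores, and rosters are invented for illustration, and chamber_median is a hypothetical helper rather than part of the data set.

```python
from statistics import median

# Invented, fixed career scores (not real data); each legislator keeps one score.
scores = {"A": -1.0, "B": -0.2, "C": 0.4, "E": 1.3}

# Hypothetical rosters: A leaves and E arrives between the two sessions.
roster_2006 = ["A", "B", "C"]
roster_2008 = ["B", "C", "E"]


def chamber_median(roster):
    """Median of the fixed scores of whoever is serving."""
    return median(scores[name] for name in roster)


m2006 = chamber_median(roster_2006)
m2008 = chamber_median(roster_2008)
print(m2006, m2008)
```

No individual score changed between the two sessions; the shift in the median comes entirely from who is serving.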
How come the legislator or state I’m interested in isn’t in the data?
While the maximum extent of our data is from 1993 to 2013, we don’t have measures for all states for this time frame (because of the incredibly labor-intensive nature of obtaining the data). Most data is concentrated after 1995. For technical reasons we are missing a number of states in 2009-2010.
Hey — there’s a mistake in your data. Senator Joe Shmo didn’t serve in 2008 — he served in 2010!
Thanks — let us know below what the problem is.