DINOs DINOs where are the DINOs?

There is a lot of talk about DINOs (Democrats In Name Only).  There is, in fact, more talk than there are DINOs.  There is only ONE DINO in the senate (and it’s not Jolting Joe) and perhaps 5 or 8 in the House. And ALL those DINOs are from red red red areas.  

More on the flip
cross posted from daily Kos

Ben Nelson is the only Senate DINO
In the House, there are perhaps 5 DINOs: Boren (OK 2), Taylor (MS 4), Marshall (GA 3), Peterson (MN 7); borderline DINOs are Cramer (AL 5), Barrow (GA 12) and Melancon (LA 3).

Over at Political Arithmetik Charles Franklin made this point (look for the entry on National Journal numbers)

Here’s the key chart

Now, let’s look at those districts.

Nebraska, Nelson’s state, gave 66% of its votes to Bush over Kerry.  The other senator is Hagel, who, his statements on the war aside, got a 20 rating from the ADA and an 87 from the ACU in 2004.  In the NJ ratings for 2006, Hagel got a liberalism score of 27, Nelson a 45.  

Nelson’s not my favorite Senator. But he’s better than Hagel.

OK-2 went for Bush 59-41 in 2004, and for Boren by 66-34.  Bush or Boren?  You decide. Boren makes a safe D seat out of a solid R district.  Go Boren.

MS-4 went 68-31 for Bush in ’04, and yet gave Taylor a 64-33 victory.  Another solid D seat in a solid R district.  

GA-3  is a little closer. 55-44 for Bush.  63-37 for Marshall, against the same person who he beat by only 51-49 in 2002.

MN-7 is another district that went for Bush: 55-43.  Yet it gave Peterson a 66-34 victory in ’04

AL-5  went 60-39 for Bush, 73-27 for Cramer

GA-12 is the closest district in the DINO list: 54-46 for Bush, 52-48 for Barrow

finally LA-3 went 58-41 for Bush, and in 2004 Melancon won a nailbiter over Billy Tauzin.  In 2002, Tauzin got a 0 from ADA and a 96 from ACU.

There are the DINOs.  There aren’t many.  And, if they weren’t there, Republicans would be.

Let’s stop bashing Democrats and concentrate on bashing Repubs

What are you reading? Movies that are as good as books

Lots of books get made into movies.  Usually, at least in my opinion, the movie is worse.  Often much worse.  Sometimes it’s just as good.  On very rare occasions, the movie is better.

Less often, movies get made into books.  

Below the fold, let’s discuss books and movies
I plan to post this over at Big Orange on Friday, as part of my series.  I didn’t want it on both sites at the same time, so here it is now.

I am not nearly as much of a moviegoer as a book reader, so a lot of this will be in comments…..

Book into movie and the movie is BETTER

The Princess Bride.  The movie is WONDERFUL.  One of the very few movies that you can take your kids to (starting at, say, age 6 or so) and could also enjoy alone.  Perfect casting, wonderful lines, stuff for kids to laugh at, stuff that adults will get…..just….well go see it.  Having loved the movie, I read the book.  Feh.  It’s a nothing.  I think this is the only pairing I know of where the book is just MUCH worse than the movie.  

Tag lines: “This man is only mostly dead!”
“My name is Inigo Montoya.  You killed my father.  Prepare to die”
“As you wish”
“Inconceivable!”
“Liar! Liar Liar Liar Liar!”

Schindler’s List.    The book (in some editions entitled Schindler’s Ark) was very, very good.  The movie is magnificent. One of the great films of all time.  Go.  But expect to be haunted.   (Very brief synopsis: This is the story of Oskar Schindler, a German businessman who, before WW 2 was more or less a jerk, but rose magnificently to save many Jews from Hitler).

Book into movie – both great

The Godfather  One of the rare pairings where each form (book and film) takes great advantage of the strengths of the medium.  The movie follows the book very closely, but each offers different strengths.  

The book makes the Corleone saga into the American Dream in a way that the movie does not.  The book also gets into the minds of the leads in ways the movie cannot.  But the movie!  

2001: A Space Odyssey  I thought the beginning (where the monkeys discover the monolith) was better in the book.  But the visual effect of the movie is stunning.  Both are great.

These are a few of my favorite blogs

(cross posted from daily Kos)

Below the fold, I’ve listed a few of my favorite blogs.  But, you could see my favorites just by looking at my blog roll, so I’ve said a little about each.  

And, in the comments, you can do the same  – and if you tell us about them, that will help us all
Some that are similar to Booman, or to Kos

The Impeach Project
and
never in our names
are wonderful ideas: Impeaching Bush and Stopping Torture.   I wish there were more posts on each – but hey! that means you can post there and have your diary be on the list for a long while!  So go there and post!

Perhaps the most similar to dailyKos is Booman Tribune, but it’s slower, allowing (sometimes) a different type of discussion.  And I love the lounges, which are different from Open Threads……

(see? I mentioned Booman on Big Orange)

Progressive Historians bills itself as “a sort of Daily Kos for the historical set”.  I’m not an historian, but history is interesting, and so is this blog.  A nice place

Street Prophets is Pastor Dan’s blog.  I’m an atheist, but a respectful and curious atheist.  I’ve always been made welcome there.  Since the vast majority of Americans profess some kind of faith, we need to work together if we are going to get anywhere.  

Now, for something completely different:
Statistical modeling, causal inference, and social science is run by Andrew Gelman at Columbia U.  It’s got a lot of fairly theoretical stuff on elections and voting (also other stuff), at a not too high math level.  For stats geeks like me, it’s great!

Good math, bad math devotes itself to ‘finding the fun in good math, squashing bad math and the fools who promote it’.  Lots of interesting articles on debunking creationism, as well as a variety of topics related to math and programming.  

Of all the sites devoted to polls and other political numbers, my favorite is Political Arithmetik, where Charles Franklin does a bang-up job of using graphs and text to illuminate what’s going on in the world of politics.  This is what social science should be, but often isn’t: Informative, interesting, and compelling.

Information aesthetics takes a look at some really great graphs, and ways to use data and visualization in creative and beautiful ways.  

finally, pictures of numbers has a lot of practical advice on making graphs better (which can also be used to catch graphs that are misleading….not that anyone ever uses graphs to mislead….no….)

National Journal numbers

The National Journal has put up their ratings for 2006, and, once again, Charles Franklin over at Political Arithmetik has done some great stuff with them.  These ratings are percentiles: A liberalism rating of 97 means that 97% of Senate/House members were less liberal than you.
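To make the percentile idea concrete, here is a minimal sketch in Python.  The raw scores are hypothetical, and the function only illustrates the general notion of a percentile rating; it is not a description of how the National Journal actually computes theirs.

```python
def percentile_rating(score, all_scores):
    """Percent of members whose score is strictly lower than `score`."""
    lower = sum(1 for s in all_scores if s < score)
    return 100 * lower / len(all_scores)

# Hypothetical raw liberalism scores for a ten-member chamber.
raw_scores = [12, 35, 35, 50, 61, 70, 88, 90, 95, 99]

# The member with a raw score of 95 is more liberal than 8 of the 10 members.
rating = percentile_rating(95, raw_scores)
print(rating)
```

A rating near 100 means almost everyone else in the chamber scored lower; a rating near 0 means almost no one did.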

Some of my own analysis, and some comments, below the fold
The first two charts on the PA site are dot plots.  These are wonderful.  They should be used more often.  

What do they say?  Well, look at what Franklin has written, but some things are obvious:

  1.  Nelson (D-NE) is by FAR the most conservative Dem in the Senate.
  2.  Lieberman (??-CT) is on the conservative side for a Dem, but not outrageously so…..he is about as conservative as
  3.  Clinton (D-NY) (and if you look at the National Journal site that Franklin links to, you can see Hillary moving rightward).

on the Repub. side

4 Of the Senators who lost or left (Frist, Allen, Burns, Santorum, Talent, DeWine and Chafee), four were less conservative than at least half their Repub. colleagues – that’s not unusual.  But, NONE of the 18 most conservative lost or left, and THREE of the 18 least conservative lost (Talent, DeWine and Chafee) (the 18 wasn’t a random choice….it divides the Republican Senators into 3 roughly equal sized groups).

Looking at the next set of numbers, Franklin astutely points out that Lieberman is unusual for a Dem only on Foreign Policy

Next, there’s a dot plot showing Senate overlap.  Now that Chafee is out, there is almost no overlap – only Nelson (D-NE) belongs with the other party, in terms of conservatism score.

If you go over to the National Journal link, there’s more interesting stuff.  The most liberal Senate delegations are from IL (Durbin and Obama, total liberalism 181.2), MA (Kennedy and Kerry – total liberalism 179.4); and  MD (Sarbanes and Mikulski,  total 178.5)

Of the defeated House members, only 3 had conservatism scores under 50; the mean was 66.7.

Of the Presidential wannabes in Congress….most liberal in 2006 is Kucinich (87 rating) followed by Obama (86) and Dodd (84), but lifetime, the most liberal is Obama (84.3), and no one else is close.  Of course, given that these are percentiles, lifetime rating is somewhat unfair because it’s easier to be more liberal now than it was 15 years ago.

Lots of other good stuff at those two sites.  Enjoy!

Stats 101: Part 2

The other day, we looked at measures of central tendency.  Today, we will look at measures of spread, and tomorrow, measures of shape .  One way of looking at a measure of central tendency is as your best guess of what something will be.  Measures of spread tell you how good that guess is, and measures of shape tell you how you are likely to be wrong.

There’s more, after the fold.
Statistics can be divided into two big areas: Descriptive statistics and inferential statistics.  Descriptive statistics is about describing data, and inferential statistics is about making inferences from a sample to a population.  Suppose, for instance, you were interested in the average income of adults in the USA.  You can’t get the information on the whole population, so you take a sample.  (We’ll get into ways to do this in a later diary).  When you try to say things about the whole population based on your sample, that’s inferential statistics. When you are just talking about your sample, that’s descriptive statistics.

Sometimes, though, you do have the whole population.  If you wanted to find the average SAT score in a class of students, you could ask everyone.  Then you don’t need to infer anything.

(By the way, don’t get used to these terms being sensible.  Statisticians often use familiar words in unfamiliar ways; in particular, when statisticians use the words significance, power, random, and confidence, they don’t mean exactly what they do in everyday discourse.  Don’t blame me, I didn’t make up the terms).

OK, enough background.  Let’s say you’ve collected the data on whatever it is you are interested in.  There are often several things you are interested in.  You are interested in what a typical person is like, and for this, the measures of central tendency are good.  You can think of this as ways to formalize the idea of a best guess.  But you are also interested in how good that guess is.  For that, you need a measure of spread.  There are several popular ones.  By far the most common is the standard deviation.  Others are the variance,  range, and the interquartile range.

The standard deviation of a sample is gotten by

  1. Finding the mean
  2. Subtracting each value in your sample from the mean
  3. Squaring each of these
  4. Adding the results of step 3
  5. Dividing by n (many textbooks and programs divide by n - 1 for a sample)
  6. Taking the square root of step 5

(As an aside, is there a way to type formulas here?)

For the variance, just leave out the last step (taking the square root).
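For those who like to see the recipe run, here is a minimal sketch of the steps listed above in Python.  The data are made up, and the function names are just for illustration.  It divides by n, as in the steps; swap in n - 1 if you want the usual sample version.

```python
import math

def variance(values):
    """Mean squared deviation from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n                       # find the mean
    deviations = [mean - v for v in values]      # subtract each value from the mean
    squares = [d * d for d in deviations]        # square each of these
    total = sum(squares)                         # add the squares
    return total / n                             # divide by n

def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up numbers with a mean of 5
print(variance(data), standard_deviation(data))
```

Because the deviations are squared, the variance is in squared units (squared dollars, squared IQ points); taking the square root puts the standard deviation back in the original units, which is one reason it is the one that gets reported.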

The range is just the lowest value to the highest (it’s usually given as both numbers).  The interquartile range requires first dividing the data into quartiles, which essentially means putting them into order, then taking the bottom quarter, the middle (which is the same as the median), and the top quarter.  The interquartile range is the range from the first quartile to the third (if you remember percentiles, then the first quartile is the same as the 25%tile and the third quartile is the 75%tile).
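Here is a small sketch of the range and the interquartile range, using Python's standard `statistics` module on made-up incomes (in thousands of dollars).  Note that there are several slightly different conventions for computing quartiles; the `method="inclusive"` choice below is just one of them, so other software may give slightly different cut points on the same data.

```python
import statistics

# Made-up incomes in thousands of dollars; one person is very rich.
incomes = [20, 25, 30, 35, 40, 45, 50, 55, 1000]

lowest, highest = min(incomes), max(incomes)

# quantiles(..., n=4) returns the three cut points: Q1, the median, Q3.
q1, median, q3 = statistics.quantiles(incomes, n=4, method="inclusive")
iqr = q3 - q1

print("range:", lowest, "to", highest)
print("interquartile range:", q1, "to", q3, "(width", iqr, ")")
```

The one very rich person stretches the range all the way out to 1000, but the interquartile range does not move at all, which is why it gives a good sense of where most of the data sit.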

Enough math.  Those who want more formal definitions and examples can, of course see wikipedia or some such.

When is each of these good? Or bad?

Well, the standard deviation is usually good for the cases where the mean is a good measure of central tendency (see yesterday’s diary).  The variance is not used much in everyday reporting, it’s mostly used for further statistical work.  The range is almost always useful, and easy to interpret, and the interquartile range ought to be used a lot more, because, once you understand it, it’s easy to interpret, and it gives a good sense of the spread.  

Examples of when SD is better, and when the IQR or range is better

Briefly, if you think the mean is a good measure of central tendency, then usually the SD is a good measure of spread. If you use the median, then you often want the IQR and range in addition to (or even instead of) the SD. And, if there is no good measure of central tendency, there is likely to be no good measure of spread.

Some concrete examples: If you wanted to know the average IQ of Boomanites, then (presuming you could get a good sample, which I will talk about in another diary) the mean would be a good measure of central tendency, and the SD a good measure of spread. IQ is normally distributed (we’ll get to that in another diary, too) (actually, there is evidence that IQ isn’t exactly normally distributed, but it’s close).

OTOH, if you wanted to know about the income of people at the pond, then the median would be a good measure of central tendency, and, while the SD wouldn’t exactly be WRONG, I would want to look at IQR and range as well.

Finally, if you wanted to look at the heights and weights of professional athletes (as a whole group) then no measure of CT would be really good, nor would any measure of spread, because the group is composed of people who are too different from one another.

Statistics 101 Part 1

Just a little while ago, I asked in Froggy Bottom if there was interest in a series on Statistics. There was some! :-).  So, here’s the first entry.

All of these were originally on Daily Kos.

This series will not be for the statistical experts, it will be for those who want to be able to understand some basic statistics, without a lot of heavy-duty math.  I’ll try to emphasize aspects I think will be of interest to Kossacks, including how to tell when someone is misleading you with statistics. I welcome comments, suggestions, and thoughts both from people who are reading this as an intro to statistics and from the more statistically literate.

In today’s diary, I will discuss measures of central tendency.  See you after the fold.
There are various ways to classify variables.  One useful way is to distinguish between continuous and categorical data.  Data is continuous if it can (at least in theory) take on any number.  Data is categorical if it can only take on certain values.  For example, weight, income, age and IQ are continuous.  Political party, hair color, and marital status are categorical.

When you have continuous data, two things that you often want to know are “What values are likely?”  and “How spread out are the values?”  Today, we will look at the first question, which, in statistician’s language, is called central tendency.  The most common measure of central tendency is the mean, which is often called the average.  The other commonly quoted measure of central tendency is the median.  We’ll look at those two and a couple others.

The mean is probably  familiar.  Add up the numbers, divide by how many numbers there are, and you’ve got it.  So, for example, if the IQs of the people in your family are

155  (that would be you)
135   (your sister)
and
70   (her wingnut husband)
then the average is (155 + 135 + 70)/ 3 = 120

The median is the number that splits the data into two equal halves, with half being higher, and half lower (there are slightly more technical definitions, but this will do for our purposes).

Two other, less commonly used measures are the mode and the trimmed mean.  The mode is the most common value, and the trimmed mean is the mean after you throw out some extreme values (typically the highest 10% and the lowest 10%).  
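Here is a short sketch of all four measures in Python.  The mean, median and mode come from the standard `statistics` module; `trimmed_mean` is a small helper written for this example (it is not a library function), and the second data set is made up.

```python
import statistics

def trimmed_mean(values, proportion=0.10):
    """Mean after dropping the lowest and highest `proportion` of the values."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)       # how many to drop from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# The family from the example above.
family_iqs = [155, 135, 70]
family_mean = statistics.mean(family_iqs)
family_median = statistics.median(family_iqs)

# Made-up data with one extreme value.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 100]
plain_mean = statistics.mean(scores)
most_common = statistics.mode(scores)
trimmed = trimmed_mean(scores)

print(family_mean, family_median)
print(plain_mean, most_common, trimmed)
```

In the second data set the single value of 100 drags the plain mean up to 12.7, while the trimmed mean (3.25) and the mode (3) stay close to where most of the values are.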

When do you want each?  When do you want to use none of them?

There are some situations where no measure works well.  The most common is when the data are multimodal.  That means that the data have common values that are separated by some uncommon values.  For example, if you had a bunch of athletes from different sports (basketball players, football players, and jockeys), and were interested in their weights, then no measure of central tendency would be good.

But, more often, you want some measure of central tendency, and have to decide which one.

The mean is a bad choice if the data are skewed, which means that there are some extreme values on one side.  One common example of this is income.  Some people make a whole lot more than the average person, but no one makes that much less.  For instance, if the average income in the USA is $30,000 per year (I made that up) then there are some people who make millions more than that, but even the poorest people make at most $30,000 less.  When the data are skewed, the median and the trimmed mean are good choices.  (You don’t see the trimmed mean much, but it can be very useful).

The mode is sometimes also a good choice.  Suppose, for example, you are reporting on a country where nearly everyone is a peasant making almost nothing, and there are a few multibillionaires making a lot, and a few more people in the middle.  Like this

Income                   Number of people
$100 per year                 1,000,000
$1000 to $100,000 per year        10,000
More                                 500

then the mean would be distorted by the few people making huge amounts, and the median could be pulled up by the people making a middle amount if that group were larger; the mode would be $100 per year, and that would be a good representation of the income.
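To see how badly the mean behaves here, this sketch works from the table directly.  The table only gives ranges for the two upper groups, so the $50,000 and $1 billion figures below are assumptions picked purely for illustration.

```python
# (representative income, number of people)
# Only the $100 row comes straight from the table; the other two incomes
# are ASSUMED stand-ins for "$1000 to $100,000" and "More".
income_table = [
    (100, 1_000_000),
    (50_000, 10_000),
    (1_000_000_000, 500),
]

total_people = sum(count for _, count in income_table)

# Mean: total income divided by total people.
mean_income = sum(income * count for income, count in income_table) / total_people

def weighted_median(table):
    """Income of the person halfway through the population, poorest first."""
    half = sum(count for _, count in table) / 2
    running = 0
    for income, count in sorted(table):
        running += count
        if running >= half:
            return income

median_income = weighted_median(income_table)

# Mode: the income held by the largest number of people.
mode_income = max(income_table, key=lambda row: row[1])[0]

print(round(mean_income), median_income, mode_income)
```

Under these assumptions the mean comes out in the hundreds of thousands of dollars, an income almost nobody in the country actually has.  With the counts exactly as given in the table the median lands on $100 as well, since the peasants are well over half the population.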

Another thing that often goes wrong with the mean is to average things that can’t be averaged.  The most common is to average percentages.  This is a bad idea.  I can get into why if people ask, but this diary is already getting very long, so I will stop here and wait for questions, comments and so on.

OK, people have asked for an explanation of why averaging percentages is bad, so here is one (with made up data).  Suppose the vote in some political race is as follows:

State            Democrat    Republican
Calif               60%          40%
NY                  65%          35%
South Dakota        35%          65%
Alaska              40%          60%
(other states’ data too)

If one averages the percentages, one would get 50% each, but that isn’t right.  A percentage is a form of a fraction, and you have to add the numerators and denominators and then form a new percentage; that is, add up the NUMBER voting Dem and Repub. and then get the percentage from the total.
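Here is the same point as a small Python sketch.  The percentages are the made-up ones from the example; the total votes cast in each state are also invented, chosen only so that the big states are much bigger than the small ones.

```python
# state: (Democratic percentage, total votes cast)
# The vote totals are ASSUMED for illustration; they are not real data.
states = {
    "Calif":        (60, 10_000_000),
    "NY":           (65, 6_000_000),
    "South Dakota": (35, 400_000),
    "Alaska":       (40, 300_000),
}

# Wrong: average the percentages, treating every state as the same size.
naive_dem_pct = sum(pct for pct, _ in states.values()) / len(states)

# Right: add up the number of Democratic votes and the number of all votes,
# then form a new percentage from the totals.
dem_votes = sum(pct * total / 100 for pct, total in states.values())
all_votes = sum(total for _, total in states.values())
pooled_dem_pct = 100 * dem_votes / all_votes

print(naive_dem_pct, round(pooled_dem_pct, 1))
```

With these invented totals, averaging the percentages says the race is a 50-50 tie, while counting actual votes puts the Democrat at nearly 61%.  The two answers agree only if every state casts the same number of votes.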

What are you reading?

(cross posted from daily Kos, where this diary is a Friday morning feature)

Rereading

Quicksilver by Neal Stephenson.  This 1,000 page book is the FIRST volume of three.  And I’m reading it AGAIN.  SO, I must like it, right?  Newton, Leibniz, vagabonds, sex, history, politics, more sex, science…….great stuff.

If you haven’t read Stephenson before, I recommend starting with Cryptonomicon which actually takes place much later, but is a good introduction.

More Oral Sadism and the Vegetarian Personality edited by Glenn Ellenbogen.  Readings from the Journal of Polymorphous Perversity.  A spoof of psychology journals. If you have ever had to read scholarly psychology articles, you should find this hilarious. Even if you haven’t, you might.

Just started
Forecasting presidential elections by Steven Rosenstone.  Although it’s 25 years old, it comes highly recommended by Andrew Gelman, who runs Statistical modeling, causal inference, and social science, one of my favorite blogs.  If Andrew likes it, it’s going to be good.

In the middle of
A world without time: The forgotten legacy of Godel and Einstein by Palle Yourgrau.  I love this sort of book, and Yourgrau explains things well, although his writing is odd….not unclear, but sort of….I dunno what the word is.  He writes like an incredibly learned 8 year old….that isn’t right, that’s insulting.  Each sentence is fine, but the paragraphs are oddly joined…..I still recommend it.

the last man who knew everything by Andrew Robinson. All about Thomas Young, who was a physician, a physicist, and a philologist (and that’s just the ph s!).  Young proved that light had to be a wave, he deciphered hieroglyphics, he ran a medical practice, and formulated the 3 color theory of color perception.  He was also an expert engineer, and contributed about a dozen articles to the Encyclopedia Britannica.

(I was reading this and have lost it…..I hope to find it in some pile somewhere)

The discoveries by Alan Lightman. The greatest breakthroughs in 20th century science, with the original papers.  Frankly, I find Lightman’s explanations wonderful….clear, concise etc.  The original papers….well, I can’t understand them.

The singularity is near by Ray Kurzweil.  Kurzweil thinks that really really powerful computers will usher in utopia.  He’s been right before.  

Just finished

Out of the labyrinth: Setting mathematics free by Robert and Ellen Kaplan.  The Kaplans run The Math Circle, which is, IMHO, a stunningly good way to teach math. This book is good, but not stunningly so.  If you are interested in education or math, or, especially, math education, this is worthwhile.  It will help if you know some math yourself.  To whet your appetite, they get 5 year olds talking about the nature of infinity.

I will be writing a diary about this as part of the Ed/Up series, on 3/24 (on dailyKos)

Am Stat Assn calls for do-over in FL 13

(cross posted from dailykos)

I’m a statistician.  I belong to the American Statistical Association.  In the latest issue of Amstat News, they have called for a do-over in the race in the 13th CD of Florida

more below the fold
I will use the analysis presented in AMSTAT News together with other information; for example, AMSTAT News refers to ‘Candidate A’ and ‘Candidate B’; I’ve substituted the actual names and parties.

Florida’s 13th CD was open; it had previously been held by the infamous Katherine Harris.  The Republican was Vern Buchanan, and the Democrat was Christine Jennings.  Buchanan has claimed victory.  Jennings has not conceded defeat.  

There were approximately 240,000 votes cast.  Buchanan got 50.08%, and Jennings 49.92%; the difference was 401 votes.  In Amstat’s words, “arguably too close to call, even if only the usual errors were present.”

But, as they point out, more than the usual errors were present.  In Sarasota County (which uses touch screens) there were 18,000 votes not counted – 13% of the votes cast.  In other counties in the 13th CD, undervoting ranged from 1% to 5%.  Many voters in Sarasota County reported difficulty in finding the race.  The votes that DID get counted in that county favored Jennings by 6%, and, if the undercount had been similar to that on absentee ballots in Sarasota, Jennings would have picked up 830 votes, and won the election.

The short article concludes

We conclude that we don’t know who won.  Our professional recommendation is simple: Do it over

This was not an editorial, this was a summary of the findings of the ASA’s Scientific and Public Affairs Advisory Committee

Abraham Lincoln opposes Bush and the Iraq war

We all know that Abraham Lincoln was one of the greatest presidents.  We all know he gave some of the best speeches.  We all know about his leadership in the Civil War, and his signing of the Emancipation Proclamation.

But did you know he could see the future?  

Well, not really.   The piece I have in mind is not about Bush and Iraq, but about Polk and Mexico……but it’s eerie.

And it’s below the fold

I was rereading Garry Wills’ magnificent Lincoln at Gettysburg and ran across this speech of Lincoln’s

    I more than suspect already that he is deeply conscious of being in the wrong – that he feels the blood of this war, like the blood of Abel, is crying to heaven against him.  That originally having some strong motive – what, I will not stop now to give my opinion concerning – to involve two countries in a war, and trusting to escape scrutiny, by fixing the public gaze upon the exceeding brightness of military glory – that attractive rainbow that rises in showers of blood – that serpent’s eye that charms to destroy – he plunged into it and has swept on and on till, disappointed in his calculation of the ease with which Mexico might be subdued, he now finds himself he knows not where.  How like the half-insane mumbling of a fever dream is the whole war part of his late message!…..His mind, tasked beyond its power, is running hither and thither, like some tortured creature on a burning surface, finding no position on which it can settle down and be at ease.

and there you have it all….from a speech given in Congress in 1848.

Bush’s mind is, indeed, tasked beyond its power, and the war was, indeed, presented to us as an attractive rainbow, which, unfortunately, rises in showers of blood.  As to the motives of which Lincoln does not speak, they were, Wills opines, to expand slavery into Mexico, which had already abolished it.  

The Democrats have a bigger tent. Or do they?

The other day I posted a diary on cluster analysis of the US Senate.  This is sort of a followup, looking at how closely the Dems and Reps align on various ratings.

See below
First, let’s look at the average for each party for each group

Group   Democrat    Indep    Repub
ADA       91.98     85.00    18.75
ACLU      70.71     78.00    12.77
AFS       97.67     86.00    13.08
LCV       88.86    100.00     9.52
ITIC      61.86     75.00    94.94
NTU       15.17     23.00    70.04
COC       53.88     59.00    94.10
ACU       10.07      4.00    88.67
NTLC      10.14      5.00    88.96
CHC        7.43      0.00    95.00

So, by far the smallest difference is ITIC, which is a single-issue group.  Next smallest is COC, which is pro-business.  (can you spell ‘fundraising’?)

The biggest difference is, indeed, for the Christian Coalition.  But isn’t that just common sense?  The theocons are Repubs.

We can also look at the MINIMUM for each party by each group

Group   Dem.  Repub
ADA      25     0
ACLU     33     0
AFS      86     0
LCV      17     0
ITIC      0    82
NTU       0    48
COC       0    67
ACU       0    40
NTLC      0    63
CHC       0    50

which shows that if you’re a Dem, you are certainly sympathetic to labor (AFS), whereas if you are Repub., you are surely sympathetic to the information mega companies (ITIC), business (COC), for budget insolvency (heheehe NTLC) and Christian friendly.  It ALSO shows that, as my title says, we are the big tent party.

Next, let’s look at the maximum for each group, by party

Group   Dem.  Repub
ADA     100    65
ACLU    100    67
AFS     100    57
LCV     100    92
ITIC    100   100
NTU      34    89
COC      81   100
ACU      52   100
NTLC     43    98
CHC      83   100

There are only 2 groups with low maxima for Dems: NTU (taxes are bad) and ACU (broad based conservative).  For Repubs, only AFS (labor) has a low maximum.
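For anyone who wants to build tables like the three above, here is a minimal sketch of the group-by step in Python.  The senators and ratings below are invented placeholders for a single interest group; this is just one way to get a mean, minimum and maximum per party, not the code behind the actual tables.

```python
from collections import defaultdict

# (senator, party, rating from one interest group), all HYPOTHETICAL
ratings = [
    ("Senator A", "Dem", 100),
    ("Senator B", "Dem", 85),
    ("Senator C", "Dem", 25),
    ("Senator D", "Repub", 0),
    ("Senator E", "Repub", 20),
    ("Senator F", "Repub", 65),
]

# Collect the scores for each party.
by_party = defaultdict(list)
for _, party, score in ratings:
    by_party[party].append(score)

# One row of each table: mean, minimum and maximum per party.
summary = {
    party: {
        "mean": sum(scores) / len(scores),
        "min": min(scores),
        "max": max(scores),
    }
    for party, scores in by_party.items()
}

for party, stats in summary.items():
    print(party, stats)
```

Repeating this for each interest group gives one row per group.  The gap between a party's minimum and maximum is a crude measure of how big its tent is on that group's issues.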

To make it even clearer, we can look at a boxplot of each group’s ratings for each party.  

In each smaller chart, there are boxes for Dem., Indep., and Repub.  The line in the middle is the median.  The box is from the 25%tile to 75%tile, the dotted line going out to a horizontal line is the approximate 5%tile and 95%tile, and dots are outliers.  

Let’s go group by group: For ADA, the Dems have a smaller range, except for one with a rating of 25.  That, oddly enough, belongs to John Kerry.  I know I copied the figure right from the book….

For ACLU, the Dems have a bigger range: There’s one Dem under 40 (Ben Nelson).  For AFS (labor) the Repubs have a much bigger tent.  The Dems are solidly pro-labor (at least if you go by the AFS).  For LCV (conservation) the two parties are almost mirror images of each other.  The three outlying Dems are Kerry, Reid, and Johnson (SD).  The outlying Repubs are McCain and Kyl of AZ, Snowe and Collins of ME, and Chafee (RI) (it looks like 3 dots, but it’s really 5).

For ITIC (big media and tech), the Dems are all over the map, but the Repubs are tightly bunched at the top.

For NTU (anti-tax) the Repubs spread out a bit more than the Dems, but both have a big range.   The COC (chambers of commerce) gives high marks to almost every Repub., but low marks to almost no one of either party.  For the broad conservative ACU, the parties are again almost mirror images.  The outlying Dem is Nelson (NE) and the outlying Repub is Chafee (RI).  NTLC is much like NTU, not surprising, as they have similar issues.
Finally, for the Christian Coalition, there is one outlying Dem (again Nelson) and several outlying Repubs: Snowe, Collins and Chafee.