
Tuesday, June 19, 2007

The Prisoner's Dilemma

The Prisoner’s Dilemma is a psychological exercise in decision making that informs our daily lives and teaches us how to cope in the real world with real people.

Bill Whittle explains:

Not too long ago, just in passing, my friend Richard Riley pointed me to a famous case in game theory called The Prisoner’s Dilemma.

Now we need to really understand this, because if we do I think many of our present troubles will become clear.

Here’s how Wikipedia presents the case:

Two suspects, A and B, are arrested by the police. The police have insufficient evidence for a conviction, and, having separated both prisoners, visit each of them to offer the same deal: if one testifies for the prosecution against the other and the other remains silent, the betrayer goes free and the silent accomplice receives the full 10-year sentence. If both stay silent, both prisoners are sentenced to only six months in jail for a minor charge. If each betrays the other, each receives a two-year sentence. Each prisoner must make the choice of whether to betray the other or to remain silent. However, neither prisoner knows for sure what choice the other prisoner will make. So this dilemma poses the question: How should the prisoners act? The dilemma can be summarized thus:

 | Prisoner B Stays Silent | Prisoner B Betrays
Prisoner A Stays Silent | Each serves six months | Prisoner A serves ten years; Prisoner B goes free
Prisoner A Betrays | Prisoner A goes free; Prisoner B serves ten years | Each serves two years

In deciding what to do in strategic situations, it is normally important to predict what others will do. This is not the case here. If you knew the other prisoner would stay silent, your best move is to betray as you then walk free instead of receiving the minor sentence. If you knew the other prisoner would betray, your best move is still to betray, as you receive a lesser sentence than by silence. Betraying is a dominant strategy. The other prisoner reasons similarly, and therefore also chooses to betray. Yet by both betraying they get a lower payoff than they would get by staying silent. So rational, self-interested play results in each prisoner being worse off than if they had stayed silent.
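The dominance argument in the passage above can be checked mechanically. Here is a minimal Python sketch, assuming the sentences from the quoted table (six months, two years, ten years, or going free); the names YEARS, best_reply and dominant_move are made up for this illustration and come from no library.

```python
# Jail time in years for (my_move, other_move), taken from the quoted table.
# The labels "silent" and "betray" are illustrative names for this sketch.
YEARS = {
    ("silent", "silent"): 0.5,   # both stay silent: six months each
    ("silent", "betray"): 10.0,  # I stay silent, the other betrays: I serve ten years
    ("betray", "silent"): 0.0,   # I betray, the other stays silent: I go free
    ("betray", "betray"): 2.0,   # both betray: two years each
}
MOVES = ("silent", "betray")


def best_reply(other_move):
    """Return the move that minimizes my jail time against a fixed other_move."""
    return min(MOVES, key=lambda my_move: YEARS[(my_move, other_move)])


def dominant_move():
    """Return the move that is the best reply to every opposing move, or None."""
    replies = {best_reply(other_move) for other_move in MOVES}
    return replies.pop() if len(replies) == 1 else None


print("Best reply if the other stays silent:", best_reply("silent"))
print("Best reply if the other betrays:", best_reply("betray"))
print("Dominant move:", dominant_move())
print("Years each if both betray:", YEARS[("betray", "betray")])
print("Years each if both stay silent:", YEARS[("silent", "silent")])
```

Whatever the other prisoner does, the best reply comes out as betrayal, yet mutual betrayal costs each prisoner two years against six months for mutual silence. That gap is the dilemma.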

Okay, we can simplify this:

If I screw you, but you don’t screw me, I win very big and you lose very big.
If you screw me and I don’t screw you, I lose very big and you win very big.
If neither of us screws the other, we both suffer mild punishment.
If we both screw each other, we both suffer medium punishment.
The Prisoner’s Dilemma, therefore, is an analogy we use to test the results of how people treat each other.

Now, if this game is played one time, the winning strategy invariably is to Screw the Other Guy. If he doesn’t screw you, you get off free. If he does, you serve two years. But if you didn’t, and he decided to screw you – ten years. No one wants to risk that. Screw the Other Guy is the only smart position, and when the game is run thousands of times on computers it comes out the very clear winner.

But! What happens if the game is played again and again, against the same person? Does Screw the Other Guy continue to be the best strategy?

It does not!

The best strategy for a repeating game (called the Iterated Prisoner’s Dilemma) is not Screw The Other Guy, and -- surprisingly at first glance -- it’s not Always Cooperate With The Other Guy, either.

The winning strategy is Tit-for-Tat. That is, you do to the guy what he did to you last turn. If he cooperated, you cooperate. If he screwed you, you screw him back. Over thousands and millions of computer runs, using every strategy from complete aggression to complete forgiveness, Tit-for-Tat “wins” every time – that is, it results in the least jail time for you.
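To make the repeated game concrete, here is a small Python sketch of a single iterated match. It assumes the jail terms from the table quoted earlier and a fixed number of rounds; the function names (tit_for_tat, always_betray, always_silent, play_match) are invented for this illustration and are not Axelrod's tournament code.

```python
# Jail time in years for (my_move, other_move), from the table quoted earlier.
YEARS = {
    ("silent", "silent"): 0.5,
    ("silent", "betray"): 10.0,
    ("betray", "silent"): 0.0,
    ("betray", "betray"): 2.0,
}


def tit_for_tat(my_history, their_history):
    """Stay silent on the first round, then copy the other player's last move."""
    return their_history[-1] if their_history else "silent"


def always_betray(my_history, their_history):
    """Screw the Other Guy on every round."""
    return "betray"


def always_silent(my_history, their_history):
    """Always Cooperate With The Other Guy."""
    return "silent"


def play_match(strategy_a, strategy_b, rounds):
    """Play `rounds` rounds and return the total jail years for (a, b)."""
    history_a, history_b = [], []
    years_a = years_b = 0.0
    for _ in range(rounds):
        move_a = strategy_a(history_a, history_b)
        move_b = strategy_b(history_b, history_a)
        years_a += YEARS[(move_a, move_b)]
        years_b += YEARS[(move_b, move_a)]
        history_a.append(move_a)
        history_b.append(move_b)
    return years_a, years_b


pairings = [
    (tit_for_tat, tit_for_tat),
    (tit_for_tat, always_betray),
    (always_betray, always_betray),
    (always_silent, always_betray),
]
for strategy_a, strategy_b in pairings:
    print(strategy_a.__name__, "vs", strategy_b.__name__,
          play_match(strategy_a, strategy_b, rounds=10))
```

Over ten rounds, two Tit-for-Tat players serve 5 years each, two constant betrayers serve 20 years each, and Tit-for-Tat facing a constant betrayer is burned once (28 years against 18) before matching him move for move. Which strategy comes out ahead in a full tournament depends on the mix of strategies entered, so read this as an illustration of the mechanics rather than a replay of any published result.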

Robert Axelrod examined this outcome in a book called The Evolution of Co-operation. (That word ‘evolution’ having great power in this context, as we will see in a second.)

Wikipedia again:

By analysing the top-scoring strategies, Axelrod stated several conditions necessary for a strategy to be successful.


The most important condition is that the strategy must be "nice", that is, it will not betray [Screw the Other Guy] before its opponent does. Almost all of the top-scoring strategies were nice. Therefore a purely selfish strategy for purely selfish reasons will never hit its opponent first.


However, Axelrod contended, the successful strategy must not be a blind optimist. It must always retaliate. An example of a non-retaliating strategy is Always Cooperate. This is a very bad choice, as "nasty" strategies will ruthlessly exploit such softies.


Another quality of successful strategies is that they must be forgiving. Though they will retaliate, they will once again fall back to cooperating if the opponent does not continue to play betrayals. This stops long runs of revenge and counter-revenge, maximizing points.


The last quality is being non-envious, that is not striving to score more than the opponent (impossible for a ‘nice’ strategy, i.e., a 'nice' strategy can never score more than the opponent). Therefore, Axelrod reached the Utopian-sounding conclusion that selfish individuals for their own selfish good will tend to be nice and forgiving and non-envious. [And, they will hit back when they are hit first, and keep hitting back until the opponent stops Screwing the other Guy; upon which they will revert to cooperation.]
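The "non-envious" point can be illustrated for Tit-for-Tat in particular: in any single match it never serves less jail time than its opponent. Below is a rough Python sketch that checks this against randomly generated opponents. It assumes the jail terms from the table quoted earlier; the helper head_to_head, the seed, and the match length are arbitrary choices made for this example.

```python
import random

# Jail time in years for (my_move, other_move), from the table quoted earlier.
YEARS = {
    ("silent", "silent"): 0.5,
    ("silent", "betray"): 10.0,
    ("betray", "silent"): 0.0,
    ("betray", "betray"): 2.0,
}


def head_to_head(opponent_moves):
    """Play Tit-for-Tat against a given sequence of opponent moves.

    Returns (tit_for_tat_years, opponent_years).
    """
    tft_years = opponent_years = 0.0
    previous_opponent_move = "silent"  # Tit-for-Tat opens by staying silent
    for opponent_move in opponent_moves:
        tft_move = previous_opponent_move  # copy what the opponent did last round
        tft_years += YEARS[(tft_move, opponent_move)]
        opponent_years += YEARS[(opponent_move, tft_move)]
        previous_opponent_move = opponent_move
    return tft_years, opponent_years


rng = random.Random(0)  # arbitrary seed so the run is repeatable
worst_gap = 0.0
for _ in range(1000):
    moves = [rng.choice(("silent", "betray")) for _ in range(50)]
    tft_years, opponent_years = head_to_head(moves)
    # Tit-for-Tat never serves less time than its opponent in the same match.
    assert tft_years >= opponent_years
    worst_gap = max(worst_gap, tft_years - opponent_years)

print("Largest gap in the opponent's favor (years):", worst_gap)
```

The gap never exceeds one ten-year sentence, because Tit-for-Tat can be caught unanswered at most one more time than it gets to answer. It does well by racking up good outcomes across many matches, not by beating any single opponent.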

One of the most important conclusions of Axelrod's study of the Iterated Prisoner’s Dilemma is that Nice guys can finish first.

Now things get really interesting. In The Prisoner’s Dilemma, these behaviors are expressed as choices made by individuals. But now substitute entire cultures, where the cultural norm is made up of these choices, and what do you see?

You find the easy, knee-jerk reaction is to form a society where everyone tries to screw everyone else. It’s the short-term approach, and it makes sense in the short term. Presumably all robbers and cheats want to maintain short-term relationships with their victims. If they had to meet them again (if the game was iterated) this strategy would be, shall we say, somewhat less successful and the victims would begin to Hit Back.

When I look out into the Third World, this is what I see: short-term strategies for immediate gain at the cost of long-term success. A swarm of trinket vendors on a beach in Mexico all need to make an immediate sale in order to eat that day, even if the cost is being so annoying and frustrating to the tourists that it prevents them from ever returning. Short term gain, long term loss.

I make no value judgment on that behavior, because it works on some level or it would not be so prevalent. In societies where short term values trump long-term ones, it is easy, safe and stable to Screw the Other Guy. But in the long-term, nothing of consequence grows, because nice, forgiving and non-envious are advanced strategies that require a topsoil of general goodwill, trust, and respect for the rule of law.

But as we see from The Iterated Prisoner’s Dilemma, there is an unnatural island of stability that is far more successful, and it is not simply trusting everyone and being all-cooperating all the time. That strategy is the worst, because it rewards being screwed by competing strategies that eat it for breakfast every time. It is de-selected. It vanishes from the gene pool, so to speak. You see no society like that in the real world, and now you know why. Are you listening, Marxists? It doesn’t work.

But Tit-for-Tat combines generosity and toughness. And look at the terms used to describe the most successful strategic version of Tit-for-Tat: Nice. Retaliating. Forgiving. Non-envious.

Now, this is where my own analysis kicks in, because frankly, nice, retaliating, forgiving and non-envious pretty much sums up how I feel about the West in general and the United States in particular. The web of trust and commerce in Western societies is unthinkable in the Third World because the prosperity they produce is a fat, juicy target for people raised on Screw the Other Guy. Crime and corruption are stealing, and stealing is Screwing the Other Guy. It’s short-term win, long-term loss.

Alright, now here come the brass tacks:

If you think about it, all of our laws – and indeed, the very idea of respect for and equality under the law – are written to protect Tit-for-Tat, because Tit-for-Tat produces the best results. You may sell your product at a profit, but if you lie about what it does we will call that fraud and you will go to jail because successful societies start nice but retaliate against those that decide to Screw the Other Guy. The punishment of fraud is what gives us confidence in the claims made by other products. Retaliating against Screw the Other Guy is not mean-spiritedness or a lust for revenge. It is essential to protect the confidence needed to stay focused on long-term wins. And that’s how, in theory, you build a cooperative society.

You retaliate against those that take advantage of the common trust. In other words, you punish the cheaters.

If you do not punish the cheaters, you have an “always cooperate” society that produces, consistently and rapidly, the worst possible outcome because it encourages – it selects – competing nasty strategies, by providing them with what I can only describe as a food source. Without retaliation against cheaters, cheaters thrive because that becomes the smartest strategy. There’s nothing “kind” about non-retaliation, nothing noble or good. Non-retaliation is suicide. Plain and simple.

Remember all those stimuli I mentioned before? What do they have in common?

Cheating in class (or getting a diploma without passing the required tests), cheating by crossing the border illegally, cheating by committing crimes and not paying for it, cheating by bribery and corruption, cheating in general rewards Screw the Other Guy as a social strategy and makes chumps of the people who need a level of societal trust – they need retaliation against Screw the Other Guy – in order to continue to cooperate. Society needs to retaliate against cheaters because not to do so flips the coin from cooperation to betrayal. And that’s the end of everything we have worked for and cherish.

And – and – you don’t need to be a master of game theory to know this in your bones. Because if you are offended by cheaters, it is because you are being betrayed into – you are in fact being forced into – becoming a cheater and betrayer yourself. Always-cooperating dies quickly: if you never betray and the other guy always does, he goes free and you get 10 years every time. (In other words, he’s out getting high while you work to support him.) Sooner or later, even the most dense moralist gets the message.

When a tipping point is reached – when enough people are allowed to cheat – the system swings to a different stability mode (the default mode) and Screw the Other Guy becomes the only rational choice.

The rational choice. Think about that for a moment.

Does that make you angry? It damn well better. And if it does, then you are not alone.
