# 02 - Segregation and Peer Effects
{%hackmd G-uuuRi2RyKS_IyjBJS3Kw %}
###### tags: `Model Thinking` `Courses`
## Motivation:
Understand the empirical phenomenon:
> People who _are_ alike, hang together.
Note that the verb _are_ is interchangeable with look, think, etc.
Clearly, causality can go both ways: people hang together because they are alike, or they are alike precisely because they hang together.
It is crucial to differentiate these two forces, __sorting__ and __peer effect__:
> __Sorting:__ or what sociologists call homophily, refers to the effect of personal characteristics on group formation.
> __Peer effect:__ refers to the effect of group structure on individual characteristics. We start acting and believing like the people we hang out with.
In this section, we will study the following models:
* [Schelling's Model](https://drive.google.com/file/d/1tHMdgRe3dN-RQ74tXCLO-3srQFl1xnk5/view?usp=sharing) This is a model of racial segregation, sometimes called Schelling's tipping model. We will see that the causes of the segregation patterns observed in modern cities are more subtle than we might think.
* [Granovetter's Model](https://drive.google.com/file/d/1XYnHPESKLIvZDwRlX1yzWhHN8uDpL8np/view?usp=sharing) This model looks at people's willingness to participate in some sort of collective behavior. This could be a riot, a political uprising, or a social movement.
* [Standing Ovation Model](https://drive.google.com/file/d/1XYnHPESKLIvZDwRlX1yzWhHN8uDpL8np/view?usp=sharing) This is an extension of Granovetter's model. It is a model of peer effects where you change your behavior to match that of others around you, as when standing ovations spontaneously form in a theater. The standing ovation gives us an excellent framework within which to think about peer effects.
Finally, we analyze the _identification problem_, which refers to the problem of identifying whether sorting or peer effect is the cause of the distribution observed in the data.
### Agent-Based Models:
The models that we are going to discuss are called agent-based models (ABMs).
ABMs are composed of the following parts:
* **Individuals or Agents:** the individuals populating the model, that is, the entities whose behavior (or the outcomes of whose behavior) we want to study.
* **Behaviors or Rules:** the way in which the agents of the model behave: the things that they do and the rules they follow. In some cases, these rules are optimal rules, and then we are in the presence of a game theory or rational choice model.
* **Outcomes:** Once all these agents follow all these rules, something is created at the macro level. We can then ask what kind of outcomes we get and how those outcomes depend on the characteristics of the agents and the rules that they follow.
What we will find in these models is that the outcomes are often surprising, and that is, again, why models are so useful. We may reason that if agents follow a given behavior, we are going to get outcome $A$. But when we actually work through all the logic of the model, we may find that the opposite is true.
## Schelling's Model
It is about people choosing where to live. The model makes two simplifications:
1. The city is a checkerboard, and people live in the squares; two or more people cannot share a square.
2. People only have two options at each time: to stay where they live or to move.
There is a threshold $n\in[0,1]$ that characterizes the minimum fraction of neighbors of the same group that an agent needs to feel happy in their current neighborhood.
**Rules:** People (agents) only have one rule: if the fraction of their neighbors that belong to their own group is lower than $n$, they move; otherwise, they stay.
For example, consider the following neighborhood:

The agent in the middle has $8$ neighbors: $3$ of them belong to her group, and $5$ belong to the other group, so $3/8 = 0.375$ of her neighbors are like her. If her threshold is $n=0.35$, she will be happy and will not move out, but if $n=0.38$, she will be uncomfortable and will move out. This foreshadows a characteristic of the model: small changes in $n$ can produce large-scale changes in the outcomes. Note that the agent didn't become much more intolerant with the change in $n$.
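The rule and the example above can be written as a very small function. This is only a sketch: the name `wants_to_move` and its arguments are illustrative, and treating an agent with no neighbors as content is an assumption, since the notes do not cover that case.

```python
def wants_to_move(same_group_neighbors, total_neighbors, threshold):
    """Schelling rule: move when the share of like neighbors is below the threshold."""
    if total_neighbors == 0:
        return False  # assumption: an isolated agent has no reason to move
    share_similar = same_group_neighbors / total_neighbors
    return share_similar < threshold


# The agent in the example has 3 like neighbors out of 8 (share 0.375).
print(wants_to_move(3, 8, 0.35))  # stays
print(wants_to_move(3, 8, 0.38))  # moves
```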
Let's set up the model with $2000$ agents on a $50\times50$ grid. We have red agents and green agents, where the green agents are rich and the red agents are poor (they could equally be Asian and Latino, Red Sox and Yankees fans, or any two disjoint groups $A$ and $B$). We start the model with the agents randomly placed on the grid.
People want $30\%$ of their neighbors to look like them; that is their threshold. We also track two aggregate measures for the model:
1. _percentage similar:_ the share of people in your neighborhood of eight who are like you.
2. _percentage unhappy:_ the share of people who do not have their threshold met.

We start with a threshold of $30\%$ similar. Notice that we start around $50\%$ similar, and only $18\%$ are unhappy. $50\%$ similar makes sense because people are randomly placed on the grid. If we let this model run, the average agent ends up with about $75\%$ similar neighbors. Nobody is unhappy, so the system goes to equilibrium.
What's interesting is that more than $70\%$ of a person's neighbors are like them even though people are incredibly tolerant: they only need $30\%$ of the people in their neighborhood to be like them. This is the deep insight from Schelling's model.
What you see at the macro level, segregation like this, may not represent what's going on at the micro level. These are very tolerant people: all they want is for about a third of their neighbors to look like them, and they'll be okay. But if that's their rule, you end up with more than $70\%$ of your neighbors looking like you.
But what if we make them just slightly less tolerant? Consider now a new threshold of $40\%$. We again start with $50\%$ similar, but now about $30\%$ of people are unhappy.

If we let this run, about $85\%$ of neighbors end up being similar, so we get even more segregation.
What's interesting is what happens if we ramp this up even more, to $52\%$, a little over $50\%$. Now around $60\%$ of people are unhappy at the start, because over $60\%$ of people have $50\%$ or fewer neighbors like them. If we run this, we get unbelievable segregation.

Now, what's incredible about this is that $52\%$ isn't that intolerant. People just want to be in the majority; they might actually prefer a racially mixed or income-mixed neighborhood. But the reality is that we end up with $94\%$ of each person's neighbors belonging to the same group. And if you look closely at this picture, you see little islands of empty space (the black regions) between the red and green regions. So these people are really segregating.
Again, this is the surprising result: at the macro level we get segregation, but at the micro level people are pretty tolerant.
Finally, let's crank this way up to $80\%$. So now, people want $80\%$ of their neighbors to be like them. And if they're not, they're going to move. Well, we should get massive segregation here, right? Even worse than before.

Okay. We don't. We don't even get an equilibrium. We get a process that looks completely random: everybody is still living in neighborhoods where about $50\%$ of people are similar. The reason is that if you want almost everybody in your neighborhood to be like you, it's hard to find a place to live, so people keep moving.
__Conclusion:__ Micro-motives $\neq$ Macro-behavior
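For readers who want to experiment with different thresholds, the setup described in this section can be approximated with a short script. This is a minimal sketch and not the tool used to produce the figures in these notes. The names `run_schelling`, `similarity`, `occupied_neighbors`, `max_rounds` and `seed` are illustrative, and several details are assumptions: the grid has hard edges (no wrap-around), unhappy agents jump to a random empty cell, an agent without neighbors counts as content, and "percentage similar" is computed as the average of each agent's share of like neighbors. Its numbers will therefore not match the figures exactly.

```python
import random


def occupied_neighbors(grid, size, r, c):
    """Groups found in the up-to-eight cells around (r, c); empty cells are skipped."""
    found = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr, dc) != (0, 0) and 0 <= rr < size and 0 <= cc < size:
                if grid[rr][cc] is not None:
                    found.append(grid[rr][cc])
    return found


def similarity(grid, size, r, c):
    """Share of an agent's occupied neighboring cells that hold its own group."""
    nbrs = occupied_neighbors(grid, size, r, c)
    if not nbrs:
        return 1.0  # assumption: no neighbors means nothing to be unhappy about
    return sum(1 for g in nbrs if g == grid[r][c]) / len(nbrs)


def run_schelling(size=50, n_agents=2000, threshold=0.30, max_rounds=200, seed=0):
    """Return (percentage similar, percentage unhappy) after the run."""
    if not 0 < n_agents < size * size:
        raise ValueError("need at least one agent and one empty cell")
    rng = random.Random(seed)
    cells = [(r, c) for r in range(size) for c in range(size)]
    rng.shuffle(cells)
    grid = [[None] * size for _ in range(size)]
    for i, (r, c) in enumerate(cells[:n_agents]):
        grid[r][c] = "red" if i % 2 == 0 else "green"  # two equal-sized groups

    for _ in range(max_rounds):
        unhappy = [(r, c) for r, c in cells
                   if grid[r][c] is not None
                   and similarity(grid, size, r, c) < threshold]
        if not unhappy:
            break  # equilibrium: every threshold is met
        empty = [(r, c) for r, c in cells if grid[r][c] is None]
        for r, c in unhappy:
            k = rng.randrange(len(empty))
            er, ec = empty[k]
            grid[er][ec], grid[r][c] = grid[r][c], None  # move the agent
            empty[k] = (r, c)                            # old cell is now free

    agents = [(r, c) for r, c in cells if grid[r][c] is not None]
    shares = [similarity(grid, size, r, c) for r, c in agents]
    pct_similar = 100 * sum(shares) / len(shares)
    pct_unhappy = 100 * sum(1 for s in shares if s < threshold) / len(shares)
    return pct_similar, pct_unhappy


if __name__ == "__main__":
    for n in (0.30, 0.40, 0.52, 0.80):
        print(n, run_schelling(threshold=n))
```

Sweeping the threshold as in the last lines is one way to look for the pattern discussed above: more segregation as $n$ rises, and no equilibrium when $n$ is very high.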
### Tipping Points:
> A tipping point of a system is a threshold such that, once it is reached, the system rapidly changes its state.
In Schelling's segregation model, there are two kinds of tipping points:
* __Genesis tip:__ occurs when someone moves into your neighborhood and causes you to move.
* __Exodus tip:__ occurs when someone moves out of your neighborhood, which in turn causes you to move.
Both genesis tips and exodus tips are examples of active tipping points. As the simulations above show, although people may be very tolerant, the system will still be segregated to some degree. Also, if the threshold in the behavioral rule is very high, the system may never equilibrate, as individuals keep moving in search of a neighborhood that makes them happy.
## Granovetter's Peer Effects Model
> "... sometimes the tail wags the dog ... sometimes the people at the end of the distribution, the extremists, are the ones that really drive what happens..."
To explain this sort of situation, we will introduce the model proposed by [Mark Granovetter](https://en.wikipedia.org/wiki/Mark_Granovetter) in 1978.
Theories oriented to norms implicitly assume a simple relation between collective results and individual motives: if most members of a group make the same behavioral decision (to join a riot, for example), we can infer that most ended up sharing the same norm or belief about the situation, whether or not they did so at the beginning.
Granovetter's model, in contrast, takes as the most important causal influence on outcomes the variation of norms and preferences within the interacting group.
Consider a very simple model of collective action that will help us understand how spontaneous social movements emerge. The movement can be a political protest, a fashion trend, or anything in between.
The model is as follows:
* There are $N$ individuals.
* Each individual $j$ has a threshold $T_j$.
* Individual $j$ joins if and only if at least $T_j$ other persons have joined.
We can also think of this model in terms of cost-benefit. Take riots, for example: the cost to an individual of joining a riot decreases as the riot size increases, since the probability of being apprehended is smaller when more people are involved.
Different individuals will have different thresholds:
* "radicals" will have lower thresholds,
* there might be "instigators", people who will riot even if no one else does; we set their threshold to $0\%$.
* "conservative" people have higher thresholds,
* there are people who won't riot under any circumstance, and we set their threshold to $100\%$.
**Note:** We cannot necessarily deduce people's characteristics from their thresholds, since those thresholds can be the consequence of complicated cost-benefit relationships.
A person's threshold is simply the point where perceived benefits outweigh perceived risks.
### Model Equilibrium
#### Intuitive Idea
Start with a simple example. Suppose there are $100$ persons $(x_i)$ in a square and thresholds are distributed as follows:
$T_i = i-1 \quad \forall i \in \{1,\dots,100\}$
In simpler terms, we have one person with threshold $0$, one person with threshold $1$, and so on.
Then the outcome is clear: the person with a threshold of $0$ will start rioting immediately, which will make the person with threshold $1$ start rioting, and soon we will have the whole population in a riot. This is a "bandwagon" or "domino" effect. The equilibrium is $100$ in this example.
Now suppose that we change the threshold of person $x_2$ from $1$ to $2$. Even when the "instigator" with threshold $0$ starts the riot, there is no one left with a threshold $T_i \leq 1$, so the riot doesn't spread. The equilibrium, therefore, is $1$.
What these two examples show is that even if groups are very similar in the aggregate values of their thresholds, differences in the distribution of thresholds at the micro level can determine opposite outcomes at the macro level.
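The two examples can be checked with a few lines of code. This is a sketch under the reading that a person joins once at least $T_j$ others have joined; the function name `riot_equilibrium` is illustrative. It relies on a shortcut: after sorting the thresholds, the person in position $k$ (counting from $0$) joins exactly when her threshold is at most $k$, the number of people who joined before her.

```python
def riot_equilibrium(thresholds):
    """Number of rioters once nobody else wants to join."""
    joined = 0
    for t in sorted(thresholds):
        if t > joined:   # not enough rioters yet for this person,
            break        # nor for anyone with a higher threshold
        joined += 1
    return joined


# First example: thresholds 0, 1, 2, ..., 99.
domino = list(range(100))
print(riot_equilibrium(domino))     # 100

# Second example: the person with threshold 1 now has threshold 2.
perturbed = [0, 2] + list(range(2, 100))
print(riot_equilibrium(perturbed))  # 1
```

The two lists have almost the same average threshold, yet the equilibrium falls from $100$ to $1$.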
#### Mathematical Formulation
We show a mathematical account of how one goes from a frequency distribution of thresholds to an equilibrium outcome.
Denote thresholds by $x$, and let $f(x)$ and $F(x)$ be the frequency distribution and the cumulative distribution of $x$. Let $r(t)$ be the proportion of the population that has joined by time $t$. For simplicity, assume discrete time periods; then the process by which $r(t)$ changes is described by the difference equation:
>$r(t+1) = F(r(t))$
The equilibrium can then be found by setting $r(t+1)=r(t)$, that is, by solving $r = F(r)$.
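One way to find the equilibrium numerically is to iterate the difference equation from $r(0)=0$ until it stops changing. The sketch below does this for any cumulative distribution passed in as a function. The name `equilibrium_share`, the tolerance, and the two normal distributions (thresholds expressed as fractions of the population, mean $0.25$, standard deviation $0.10$ or $0.20$) are illustrative assumptions, not values from these notes.

```python
from statistics import NormalDist


def equilibrium_share(cdf, r0=0.0, tol=1e-9, max_steps=10_000):
    """Iterate r(t+1) = F(r(t)) from r0 until successive values agree."""
    r = r0
    for _ in range(max_steps):
        r_next = cdf(r)
        if abs(r_next - r) < tol:
            return r_next
        r = r_next
    return r  # not converged within max_steps; return the last value


# Hypothetical threshold distributions with the same mean, different spread.
narrow = equilibrium_share(NormalDist(mu=0.25, sigma=0.10).cdf)
wide = equilibrium_share(NormalDist(mu=0.25, sigma=0.20).cdf)
print(narrow, wide)
```

With these assumed parameters the narrow distribution stalls at a very small share, while the wider one tips almost the whole population: the average threshold is the same, but the spread decides the outcome.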
### Examples
First, we will run a simulation with $2000$ agents and thresholds distributed according to:
$\underbrace{0 \dots 0}_{10}\;\underbrace{10 \dots 10}_{10}\;\dots\;\underbrace{1000 \dots 1000}_{10}$
The results are:

__Note:__ this model results in the size of the riot increasing in a way that can almost be described as exponential.
Now let's change the distribution of the thresholds to $T_j \sim N(500, 250)$. A slight modification is needed: values below $0$ are set to $0$ and values above $2000$ are set to $2000$. Thresholds below or above those values can be interpreted as different political positions that yield the same behavioral rule.
Checking the simulation:

We observe that the riot ends up comprising the whole population as before, but the growth of the riot is different. In this case, growth starts more slowly, but once some threshold is reached, the riot suddenly extends to the whole population in two steps.
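A run of this kind can be sketched as follows. The helper names `clipped_normal_thresholds` and `riot_trajectory` are illustrative, and two things are assumptions: that the $250$ in $N(500, 250)$ is a standard deviation, and that everyone reacts simultaneously to the riot size of the previous period. The random draws will differ from those behind the figure, so the exact path is not expected to match.

```python
import random


def clipped_normal_thresholds(n_agents, mean, sd, seed=0):
    """Draw normal thresholds and clip them to the range [0, n_agents]."""
    rng = random.Random(seed)
    return [min(max(rng.gauss(mean, sd), 0), n_agents) for _ in range(n_agents)]


def riot_trajectory(thresholds):
    """Riot size period by period, starting from an empty square."""
    sizes = [0]
    while True:
        # Everyone whose threshold is met by the current riot size takes part.
        joined = sum(1 for t in thresholds if t <= sizes[-1])
        if joined == sizes[-1]:
            return sizes  # nobody new joined: equilibrium
        sizes.append(joined)


thresholds = clipped_normal_thresholds(2000, mean=500, sd=250)
print(riot_trajectory(thresholds))
```

Printing the trajectory rather than only the final size makes it possible to compare the shape of the growth across threshold distributions.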
Finally, consider a bimodal distribution of thresholds. This can be interpreted as two distinct groups, such as rich and poor people; poor people have much more to gain from rioting than rich people, therefore they will riot first.

In this example, since the thresholds of the two groups are far apart, the riot does not spread to the second group of people.
## The Standing Ovation Model
This is a model of peer effects by [John H. Miller and Scott E. Page](http://www.ccpo.odu.edu/~klinck/Reprints/PDF/millerComplex2004.pdf). In the authors' words, the Standing Ovation Problem (SOP) can be described as: "a brilliant economics lecture ends, and the audience begins to applaud. The applause builds, and tentatively, a few audience members may decide to stand." The question is then: does a standing ovation arise, or does the enthusiasm collapse?
First, one incontrovertible observation about the phenomenon can be made. If a person stands, there is an **information effect**; this person is telling others about the quality of the show.
### Simple model
In the simple version, there are $N$ people in the auditorium. Every person receives a signal of quality $S = Q + error$, where $Q$ is the quality of the show or presentation and $error$ represents the distortion of the signal; it can also be thought of as the diversity of preferences or tastes in the audience. Each person has a threshold of perceived quality $T$, above which a performance is judged outstanding and therefore deserving of a standing ovation, and a threshold $X$, the percentage of the audience that must be standing for the person to stand as well. In this simple version, individuals are homogeneous in $T$ and $X$ and heterogeneous in the $error$ term, which follows a distribution $f$.
Formally, an individual stands
* **(initial rule)** If $S>T$
* **(subsequent rule)** If more than $X$ percent of people stand.
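The two rules can be sketched in a few lines. The names `draw_signals` and `standing_share` are illustrative, and drawing $error$ from a uniform distribution on $[-a, a]$ is only one possible choice for $f$. Because everyone shares the same $X$ in the simple version, the subsequent rule either brings the whole audience to its feet or changes nothing.

```python
import random


def draw_signals(n_people, Q, a, seed=0):
    """Signals S = Q + error, with error uniform on [-a, a] (an assumed f)."""
    rng = random.Random(seed)
    return [Q + rng.uniform(-a, a) for _ in range(n_people)]


def standing_share(signals, T, X):
    """Percentage of the audience standing after both rules are applied."""
    # Initial rule: stand if the perceived quality exceeds T.
    standing = sum(1 for s in signals if s > T)
    share = 100 * standing / len(signals)
    # Subsequent rule: if more than X percent stand, everyone else stands too.
    return 100.0 if share > X else share


signals = draw_signals(n_people=1000, Q=60, a=20)
print(standing_share(signals, T=70, X=30))
```

Varying `Q`, `T`, `X` and `a` one at a time is a quick way to explore the claims that follow.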
The following claims can be deduced from a simple analysis of the rules and definitions of the SOP.
* **Claim 1:** Higher $Q$, more people stand.
This stems from the initial rule; if $Q$ increases, then $S$ increases for every individual, and more people will stand.
* **Claim 2:** Lower $T$, more people stand.
This statement also follows from the initial rule; if $T$ decreases, then more individuals will comply with the first rule.
* **Claim 3:** Lower $X$, more people stand.
This claim follows from the second rule; if $X$ decreases, the probability that a standing ovation occurs is greater after an initial number of people stands.
* **Claim 4:** Variation in signals, that is, diversity in the $error$ term, can produce more ovations.
This is a less intuitive claim. If $Q>T$ and there is not much diversity, then more people will stand; if there is a lot of variation, fewer people will stand. Conversely, if $Q<T$, more variation leads to more people standing, and less variation to fewer. Because the terms "more variation" and "less variation" are vague, consider, for example, an $error$ that is uniformly distributed on $[-a, a]$; more variation then means that $a$ increases, and less variation means the opposite. In this simple example, variation maps directly to the variance of the density function $f$, which is $a^2/3$.
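Claim 4 can be made concrete for the uniform case. The following is a short worked sketch, where $p(a)$ is a symbol introduced here for the share of the audience that stands under the initial rule:

$$
p(a) = \Pr(Q + error > T) = \Pr(error > T - Q) =
\begin{cases}
1 & \text{if } Q - T \ge a,\\
\dfrac{1}{2} + \dfrac{Q - T}{2a} & \text{if } |Q - T| < a,\\
0 & \text{if } T - Q \ge a.
\end{cases}
$$

When $Q > T$, $p(a)$ falls from $1$ toward $1/2$ as $a$ grows, so more variation means fewer people standing. When $Q < T$, $p(a)$ rises from $0$ toward $1/2$, so more variation means more people standing, and an ovation becomes possible once $p(a)$ exceeds $X$.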
There are general features of the actors of the model that we can manipulate to add complexity. First, there are two main features of the audience: **diversity** (already discussed) and **sophistication**. Different levels of sophistication of agents can be achieved by changing the rules, or by using distinct distribution functions for the $error$ variable. Second, the performance has two characteristics: complexity and multidimensionality. The former refers to how difficult it is to understand the signal, and the latter indicates the number of dimensions to be interpreted by the audience; for example, clarity, authenticity, and virtuosity (for artistic expressions).
### Complex versions
People usually go to theaters and social activities in groups. This is one channel through which the model has been enriched, and it lets us draw additional conclusions about standing ovations. In this context, we add the rule that if a member of a group stands, then every other member stands as well. From this it follows that
* **Claim 5:** If people go in groups, more people stand.
This follows from the addition of a new rule, especially if groups are heterogeneous, which they are in principle; it only takes one member of a group to stand for her companions to stand.
To fix ideas, imagine a cinema where everyone goes in pairs, specifically dating couples. If your date stands, you would be more likely to stand, and because everyone behaves this way, we get more ovations.
Another version of the simple model takes into account that, as in any auditorium, the row in which people sit determines which subset of the audience they can see and, conversely, which subset can see them. For example, a person who sits in the last row is watched by no one and sees almost everyone else. In particular, individuals who sit in the first row, viz. *celebrities*, are viewed by everyone else; ergo, they exert the greatest influence on the room. Under the assumption that *celebrities* stand more easily, we will have more ovations. More formally,
* **Claim 6:** If people have different visibilities and we have celebrities on the first rows, more people stand.
The behavior of the celebrities is "emulated" by the majority of the agents. Notice that if we reverse the assumption about _celebrities_, so that _celebrities_ judge the performance more harshly, fewer people will stand.
Furthermore, if we add dynamics to this model, we get cascade effects. The following are applications of the Standing Ovation Model:
* Collective action
* Academic performance
* Urban renewal (if you give a limited number of people money to fix their house, you get a cascade effect)
* Fitness/ Health
* Online courses
## Identification problem
The identification problem is the question of determining which of the two previous mechanisms produced the patterns present in the data. Given that we observe segregation (i.e., people who are alike hang together):
Does separation come from sorting (homophily) or peer effect?
Sometimes the question is easy to answer. Take a look at the graph of the political blogosphere during the 2004 US presidential election:

Dots represent different political blogs, where the color represents the political inclination of each blog (red for conservative and blue for liberal). Lines between dots represent hyperlinks (citations) between blogs. Blogs are separated by political position: conservatives are much more likely to link to conservatives, and the same holds for liberals. The most plausible explanation is that blogs link to other blogs that support their positions; therefore sorting is the likely culprit for the separation present in the data.
__Recommended Books:__ [[Bibliography#Connected|Connected]] (peer effects) and [[Bibliography#The Big Sort|The Big Sort]] (sorting).
In other cases the two forces are hard to tell apart. To illustrate, suppose we have two groups. Under peer effect, people change to match the majority of their group; under sorting, people move to the group where their type is the majority:

Under both processes, the results are the same.
Looking at a snapshot of the outcome, we will not be able to tell which model produced the result.
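The point can be illustrated with a toy example. In this sketch the function names, the two types `"A"` and `"B"`, and the starting groups are all hypothetical, and it assumes that each type is the majority in exactly one group and that there are no ties.

```python
from collections import Counter


def majority(group):
    """Most common type in a group (assumes there is no tie)."""
    return Counter(group).most_common(1)[0][0]


def peer_effect(groups):
    """Everyone stays put and adopts the majority type of their group."""
    return [[majority(g)] * len(g) for g in groups]


def sorting(groups):
    """Everyone keeps their type and moves to the group where it is the majority."""
    home = {majority(g): i for i, g in enumerate(groups)}
    result = [[] for _ in groups]
    for g in groups:
        for person in g:
            result[home[person]].append(person)
    return result


before = [["A", "A", "A", "B"], ["B", "B", "B", "A"]]
print(peer_effect(before))
print(sorting(before))
```

Both calls end with one all-`A` group and one all-`B` group. To tell the processes apart we would need to follow individuals over time: under peer effect people change type, while under sorting they change group.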