Sub i, sub j

I hope I can say this in a way that makes sense.

One kind of mathematical symbology your eyes eventually get used to is the “Σum over all individuals” concept:

$\begin{matrix} \displaystyle \sum_{i=0}^N \ x_i \\ \\ \displaystyle \sum_i \ x_i \\ \\ \displaystyle \sum_{\text{interesting set}} x_i \\ \\ \displaystyle \sum_i \ \mathrm{weight}_i \cdot x_i \\ \\ \displaystyle \sum_i \sum_j x_{i,j} \\ \\ \displaystyle \sqrt{ {1 \over n} \sum_i (x_i - \bar{x})^2} \end{matrix}$

Yes, at first it’s painful, but eventually it looks no more confusing than the oddly-bent Phoenician-alphabet letters that make up English text.
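If the notation is still in the painful stage, it may help to read each Σ as a loop. Here is a minimal sketch in Python, with made-up numbers; the names `x`, `weight`, and `grid` are my own illustrations, and I am reading the last formula as the population standard deviation.

```python
# Hypothetical data; every name and number here is illustrative only.
x = [2.0, 4.0, 4.0, 6.0]
weight = [0.1, 0.2, 0.3, 0.4]
grid = [[1, 2, 3], [4, 5, 6]]  # plays the role of x_{i,j}

# Sum over all individuals i of x_i.
total = sum(x[i] for i in range(len(x)))

# Sum over an "interesting set": here, only the values above 3.
subset_total = sum(xi for xi in x if xi > 3)

# Weighted sum: weight_i times x_i, added up over i.
weighted = sum(weight[i] * x[i] for i in range(len(x)))

# Double sum over i and j.
double = sum(grid[i][j] for i in range(len(grid)) for j in range(len(grid[i])))

# Spread around the mean (population standard deviation).
x_bar = total / len(x)
spread = (sum((xi - x_bar) ** 2 for xi in x) / len(x)) ** 0.5

print(total, subset_total, weighted, double, spread)
```

Each line walks over every individual (the sub-i) and still ends in a single number.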

I believe there is a generally worthwhile pearl of thought-magic wrapped up in the sub-i, sub-j pattern. I mean, a metaphor that’s good for non-mathematicians to introduce into their mind-swamps.

That pearl is a certain connection between the specific and the general–a way of reaching valid generality, without dropping the specificity.

Before I explain it, let me talk a bit more about the formalism and how it’s used. I’ll introduce one word: functional. A functional ƒ maps from a small, large, convoluted, or simple domain onto a one-dimensional codomain (range).

Examples (and non-examples) of functionals:

size – you can measure the volume of a convex hull, the length of an N-dimensional vector, the magnitude of a complex number, the girth of a rod, the supremum of a functional, the sum of a sequence, the length of a sequence, the number of books someone has read, the breadth of books someone has read (is that one-dimensional? maybe not), the complicatedness (Vapnik-Chervonenkis dimension) of a functional, the Gini coefficient of a country’s income distribution, the GNP of a country, the personal incomes of the lowest earning 10% of a country, the placement rate of an MBA programme, the mean post-MBA income differential, the circumference of a ball, the volume of a ball, … and many other kinds of size.
goodness / score – business metrics often rank super-high-dimensional things like the behaviour of a group of team members into a total ordering of desirable through less desirable. When businesses use several different metrics (scores) then that’s not a functional but instead the concatenation of several functionals (into a function).
utility – for homo economicus, all possible choices are totally, linearly ordered by equivalence classes of isoclines.
fitness – all evolutionary traits (a huge, huge space) are cross-producted with an evolutionary environment to give a Fitness Within That Environment: a single score.
angle – if “angle” has meaning (if the space is an inner product space) then angle is a one-dimensional codomain. In the abstract sense of “angle” I could be talking about correlation or … something else that doesn’t seem like a geometrical angle as normally prescribed.
distance … or difference – Intimately related to size, distance is kind of like “size between two things”. If that makes sense.
quantum numbers – four quantum numbers define an electron. Each number (n, l, m, spin) maps to a one-dimensional answer from a finite corpus. Some of the corpora are interrelated though, so maybe it’s not really 1-D.
quantum operators – Actually, some quantum operators are non-examples because they return an element of Hilbert space as the answer. (like the Identity operator). But for example the Energy operator returns a unidimensional value.
ethics – Do I need more non-examples of functionals? A complete ethical theory might return a totally rankable value for any action+context input. But I think it’s more realistic to expect an ethical theory to return a complicated return-value type since ethics hasn’t been completely figured out.
regression analysis – You get several β’s as return values, each mogrified by a t-value. So: not a one-dimensional return type.
logic – in the propositional calculus, declarative sentences return a value from {true, false} or from {true, false, n/a, don’t know yet}. You could argue about whether the latter is one-dimensional. But in modal logic you might return a value from the codomain “list of possible worlds in which proposition is true”, which would definitely not be a 1-dimensional return type.
factor a number – last non-example of a functional. You put in 136 and you get back {1, 2, 4, 8, 17, 34, 68, 136}. Which is 8 numbers rather than 1. (And potentially more: 1239872 has fourteen divisors or seven prime factors counted with multiplicity, whichever you want to count.)
median – There’s no simple formula for it, but the potential answers come from a codomain of just-one-number, i.e. one parameter, i.e. one dimension.
other descriptive statistics – interquartile range, largest member of the set (max), 72nd percentile, trimean, 5%-winsorised mean, … and so on, are 1-dimensional answers.
integrals – Integrals don’t always evaluate to unidimensional, but they frequently do. “Area under a curve” has a unidimensional answer, even though the curve is infinite-dimensional. In statistics one uses marginalising integrals, which reduce the dimensionality by one. But you also see ∮’s that represent a sequence of ∫∫∫’s reducing to a size-type answer.
variability – Although wiggles are by no means linear, variance (2nd central moment of a distribution) measures a certain kind of wiggliness in a linearly ordered, unidimensional way.
autocorrelation – Another form of wiggliness, also characterised by just one number.
Conditional Value-at-Risk – This formula $\int_{0\%}^{10\%} \mathrm{something} \cdot d \, \mathrm{something}$ is a so-called “coherent risk measure”. It’s like the expected value of the lowest decile. Also known as expected tail loss. It’s used in financial mathematics and, like most integrals, it maps to one dimension (expected £ loss).
“the” temperature – Since air is made up of particles, and heat is to do with the motions of those particles, there are really something like 10^23 dynamical orbits that make a room warm or cold (not counting the sun’s rays). “The” temperature is some functional of those–like an average, but exactly what I don’t know.
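To make the example / non-example split concrete, here is a small Python sketch. The sample `data` and the helper `divisors` are my own illustrations: the first group of calls takes many numbers and returns one, while listing divisors takes one number and returns many.

```python
from statistics import median


def divisors(n):
    """Return every positive divisor of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


data = [3, 1, 4, 1, 5, 9, 2, 6]  # hypothetical sample

# Functionals: a whole sequence goes in, one number comes out.
print(len(data), sum(data), max(data), median(data))

# Not a functional: one number goes in, a whole list comes out.
print(divisors(136))  # [1, 2, 4, 8, 17, 34, 68, 136]

# Counting the divisors collapses that list back down to one number.
print(len(divisors(136)))  # 8
```

So "the divisors of n" is not a functional, but "how many divisors n has" is one.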

Functionals can potentially take a bunch of complicated stuff and say one concrete thing about it. For example I could take all the incomes of all the people in Manhattan, apply this functional:

$\text{average income of Manhattanites}$

and get the average income of Manhattan.

Obviously there is a huge amount of individual variation among Manhattan’s residents. However, by applying a functional I can get Just One Answer about which we can share a discussion. Complexity = reduced. Not eliminated, but collapsed.

I could apply other functionals to the population, like

count the number of trust fund babies (if “trust fund baby” can be defined)
calculate the fraction of artists (if “artist” can be defined)
calculate the “upper tail risk” (ETL integral from 90% to 100%, which average would include Nueva York’s several billionaires)

Each answer I am getting, despite the wide variation, is a simple, one-dimensional answer. That’s the point of a functional. You don’t have to forget the profundity or specificity of individual or group variation, but you can collapse all the data onto a single, manageable scale (for a time).
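Here is a rough sketch of those functionals in Python. The ten "residents" are invented, the trust-fund and artist flags are assumed to be already defined somehow, and the upper tail is simplified to a plain average over the top 10% of incomes.

```python
from statistics import mean

# Hypothetical residents: (income in dollars, has_trust_fund, is_artist).
residents = [
    (28_000, False, True),
    (41_000, False, False),
    (55_000, False, True),
    (62_000, False, False),
    (75_000, True, False),
    (90_000, False, False),
    (120_000, False, False),
    (250_000, True, False),
    (900_000, False, False),
    (5_000_000, True, False),
]

incomes = sorted(income for income, _, _ in residents)

# Each of these collapses the whole population onto one number.
average_income = mean(incomes)
trust_fund_count = sum(1 for _, has_trust_fund, _ in residents if has_trust_fund)
artist_fraction = sum(1 for _, _, is_artist in residents if is_artist) / len(residents)

# Crude upper-tail average: mean income of the top 10% of residents.
tail_size = max(1, len(incomes) // 10)
upper_tail_mean = mean(incomes[-tail_size:])

print(average_income, trust_fund_count, artist_fraction, upper_tail_mean)
```

The list of residents is still there, untouched; each functional just gives one number to talk about.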

The payoff

The sub-i sub-j pattern allows you to think about something both specifically and in general, at once.

Each individual is counted uniquely. The description of each individual (in terms of the parameter/s) is unique.
Yet there is a well-defined, actual generalisation to be made as well. (Or multiple generalisations if the codomain is multi-dimensional.) These are valid generalisations. If you combine together many such generalisations (median, 95th percentile, 5th percentile, interquartile range) then you can quickly get a decent description of the whole.

Kind of like how thinking with probability distributions can help you avoid stereotypes: you can understand the distinctions between

the mean 100m sprint time of all men is faster than the mean 100m sprint time of all women
the medians are rather close, perhaps identical
the top 10% of women run faster than the bottom 80% of men
the variance of male sprint times is greater than the variance of female sprint times
differences in higher moments, should they exist
the CVaR’s of the distributions are probably equivalent
conditional distributions (sub-divisions of sprint times) measured for old men; black women aged 30-42; 35-year-old Caribbean-born women of any race, of non-US nationality, who live in the state of Alabama
and so on.
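A sketch of how those distinctions can come apart, using two invented samples of 100m times. The numbers and the labels `group_a` and `group_b` are hypothetical and say nothing about real runners; the point is only that different functionals can disagree about "which group is faster".

```python
from statistics import mean, median, pvariance

# Hypothetical 100m times in seconds (lower is faster). Invented, not real data.
group_a = [10.5, 11.0, 11.5, 12.0, 12.5, 13.0, 14.0, 16.0, 18.0, 20.0]
group_b = [11.0, 11.5, 12.0, 12.5, 13.0, 13.5, 14.0, 14.5, 15.0, 15.5]


def describe(times):
    """Collapse one sample into several one-number summaries."""
    return {
        "mean": mean(times),
        "median": median(times),
        "variance": pvariance(times),
        "fastest": min(times),
        "slowest": max(times),
    }


summary_a = describe(group_a)
summary_b = describe(group_b)

# Share of group_b that beats the median runner of group_a.
b_beating_a_median = sum(1 for t in group_b if t < summary_a["median"]) / len(group_b)

print(summary_a)
print(summary_b)
print(b_beating_a_median)
```

In these made-up numbers group_a has the slower mean, the faster median, and the larger variance, so the question has no single answer until you say which functional you mean.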

It becomes harder to sustain sexism, racism, and to sustain stereotypes of all sorts. It becomes harder to entertain generalistic, simplistic, model-driven, data-less economic thinking.

For instance, the unemployment rate is the collapse/sum of ∀ lengths of individual unemployment spells: ∫ (length of unemp) • (# of people w/ that unemp length) = ∫ x • ƒ(x) dx, where x is a spell length and ƒ(x) is the number of people with a spell of that length.

Like the dynamic vapor pressure of a warm liquid in a closed container, where different molecules are pushing around in the gas and alternately returning to the soup. The total pressure looks like a constant, but that doesn’t mean the same molecules are gaseous–nor does it mean the same people are unemployed.

(So, for example, knowing that the unemployment rate is higher doesn’t tell you whether there are a few more long-term unemployed people, a lot more short-term unemployed people, or a mix.)
You can generalise about a group using different functionals. The average wealth (mean functional) of an African-American Estadounidense is lower than the average wealth of a German-American Estadounidense, but that doesn’t mean there aren’t wealthy AA’s (max functional) or poor GA’s (min functional).
You don’t have to collapse all the data into just one statistic.
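To see the unemployment point in numbers, here is a sketch with two invented economies. The spell lengths, the labour force of 100, and the 26-week cutoff for "long-term" are all assumptions of mine. Both economies show the same unemployment rate, yet the people behind the number look very different.

```python
from collections import Counter

# Hypothetical spell lengths in weeks, one entry per unemployed person.
economy_a = [2, 3, 4, 4, 5, 6, 3, 2, 4, 5]     # everyone out of work briefly
economy_b = [1, 1, 2, 1, 2, 52, 60, 1, 2, 78]  # mostly brief, a few very long
labour_force = 100  # assumed size, the same for both economies


def unemployment_rate(spells, labour_force):
    """Fraction of the labour force currently in an unemployment spell."""
    return len(spells) / labour_force


def total_weeks(spells):
    """Sum of (spell length) times (number of people with that length)."""
    counts = Counter(spells)
    return sum(length * people for length, people in counts.items())


def long_term_share(spells, cutoff=26):
    """Fraction of the unemployed whose spell has lasted at least `cutoff` weeks."""
    return sum(1 for s in spells if s >= cutoff) / len(spells)


for name, spells in [("A", economy_a), ("B", economy_b)]:
    print(name, unemployment_rate(spells, labour_force),
          total_weeks(spells), long_term_share(spells))
```

Same headline rate, different functionals underneath: that is why one statistic rarely settles the story.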

You can also collapse the data into groups, for example collapsing workers into groups based on their industry.

(here the vertical axis = number of Estadounidenses employed in a particular industry – so the collapse is done differently at each time point)
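Collapsing into groups is just counting with a label attached. A minimal sketch, assuming made-up records of (year, worker, industry); none of these rows are real data.

```python
from collections import Counter

# Hypothetical records: (year, worker id, industry). All values are invented.
records = [
    (2000, "w1", "manufacturing"),
    (2000, "w2", "manufacturing"),
    (2000, "w3", "retail"),
    (2010, "w1", "retail"),
    (2010, "w2", "manufacturing"),
    (2010, "w3", "health care"),
    (2010, "w4", "health care"),
]

# Collapse individuals into (year, industry) groups by counting them.
employment = Counter((year, industry) for year, _, industry in records)

for (year, industry), count in sorted(employment.items()):
    print(year, industry, count)
```

Every worker is still counted exactly once per year; only the bucket changes from one time point to the next.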

Various facts about Venn Diagrams, calculus, and measure theory constrain the possible logic of these situations. It becomes tempting to start talking about underlying models, variation along a dimension, and “the real causes” of things. Which is fun.

At the same time, it becomes harder to conceive overly simplistic statements like “Kentuckians are poorer than New Yorkers”. Which Kentuckians do you mean? And which New Yorkers? Are you saying the median Kentuckian is poorer than the median New Yorker? Or perhaps that the dollar cutoff for the bottom 70% of Kentuckians is lower than the cutoff for the bottom 50% of New Yorkers? I’m sorry, but there’s too much variation among KY’s and NY’s for the statement to make sense without a more specific functional mapping from the two domains of the people in the states onto a dollar figure.

ADDED: This still isn’t clear enough. A friend read this piece and gave me some helpful feedback. I think maybe what I need to do is explain what the sub-i, sub-j pattern protects against. It protects against making stupid generalisations.

To be clear: in mathematics, a generalisation is good. A general result applies very broadly, and, like the more specific cases, it’s true. Since I talk about both mathematical speech and regular speech here, this might be confusing. But: in mathematics, a generalisation is just as true as the original idea but just applies in more cases. Hence is more likely to apply to real life, more likely to connect to other ideas within mathematics, etc. But as everyone knows, people who “make generalisations” in regular speech are usually getting it wrong.

Here are some stupid generalisations I’ve found on the Web.

Newt Gingrich: “College students are lazy.”
Is that so? I bet that only some college students are lazy.

Maybe you could say something true like “The total number of hours studied divided by total number of students (a functional ℝ⁺^{# students}→ℝ⁺) is lower than it was a generation ago.” That’s true. But look at the quantiles, man! Are there still the same number of studious kids but only more slackers have enrolled? Or do 95% of kids study less? Is it at certain schools? Because I think U Chicago kids are still tearing their hair out and banging their heads against the wall.
Do heterodox economists straw-man mainstream economics?
I’m sure there are some who do and some who don’t.
The bad economy is keeping me unemployed.
That’s foul reasoning. A high general unemployment rate says nothing directly about your sector or your personal skills. It’s a spatial average. Anyway, you should look at the length of personal unemployment spells for
Conservatives say X. Liberals say Y. Libertarians think Z.
Probably not ∀ conservatives say X. Nor ∀ liberals say Y. Nor do ∀ libertarians think Z. Do 70% of liberals say Y? Now that I’m asking you to put numbers to the question, that should make you think about defining who is a liberal and measuring what they say. Not only listening to the other side, but quantifying what they say. Are you so sure that 99% of libertarians think Z now?
The United States needs to focus on creating high-tech jobs.
Are you actually just talking about opportunities for upper-middle-class people in Travis County, TX and Marin County, CA? Or does your idea really apply to Tuscaloosa, Flint, Plano, Des Moines, Bemidji, Twin Falls, Lawrence, Tempe, Provo, Cleveland, Shreveport, and Jacksonville?
Green jobs are the future!
For whom?
Alaskans are enslaved to oil companies.
Meat eaters, environmentalists, blacks, hipsters, … you can find something negative said about almost any group.
Without quantification or specificity, it will almost always be false. With quantification, one must become aware of the atoms that make up a whole–that the unique atoms may clump into natural subgroups; that variation may derive from other associations–that the true story of a group is always richer and more interesting than the imagined stereotypes and mental shorthand.
What’s wrong with the teenage mind? WSJ.
∃ a teenage mind?
French women eat rich food without getting fat. Book.
French parents are better than American parents. WSJ.
What is it about twenty-somethings? NY Times.

If you sub-i, sub-j these statements, you can come up with a more accurate and productive sentence that could move disagreeing parties forward in a conversation.
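Here is what sub-i, sub-j'ing the college-student claim might look like in Python. The weekly study hours are invented, and the cutoffs for "lazy" (under 10 hours) and "studious" (20 hours or more) are arbitrary assumptions of mine.

```python
from statistics import mean

# Hypothetical weekly study hours, one entry per student. Invented numbers.
hours_then = [10, 15, 20, 25, 30]
hours_now = [2, 3, 4, 5, 10, 15, 20, 25, 30]

LAZY_BELOW = 10       # assumed cutoff for "lazy"
STUDIOUS_FROM = 20    # assumed cutoff for "studious"

# The sweeping version of the claim: every single student is lazy.
all_lazy = all(h < LAZY_BELOW for h in hours_now)

# Quantified versions of the claim.
lazy_fraction = sum(1 for h in hours_now if h < LAZY_BELOW) / len(hours_now)
mean_then = mean(hours_then)
mean_now = mean(hours_now)
studious_then = sum(1 for h in hours_then if h >= STUDIOUS_FROM)
studious_now = sum(1 for h in hours_now if h >= STUDIOUS_FROM)

print(all_lazy, lazy_fraction, mean_then, mean_now, studious_then, studious_now)
```

In this toy data the mean dropped, the sweeping claim is false, and the number of studious students did not change at all: the extra enrollees moved the average.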

Unwarranted generalisations are like Star Trek: portraying an entire race as being defined by exactly one personality trait (“Klingons are warlike”, “Ferengi only care about money”). That sucks. The sub-i, sub-j way is more like Jack Kerouac’s On the Road: observing and experiencing individuals for who they are. That’s the way.

Neal Cassady, Allen Ginsberg, William S. Burroughs, old Jack Kerouac

If you want to make true generalisations–well, you’re totally allowed to use a functional. That means the generalisations you make are valid–limited, not overbearing, not reading too much into things, not railroading individuals who contradict your idea in service of your all-important thesis.

OK, maybe I’ve found it: a good explanation of what I’m trying to say. There are valid ways to generalise about groups and there are invalid ways. Invalid is making sweeping over-generalisations that aren’t true. Sub-i, sub-j generalisations are true to the subject while still moving beyond “Everyone is different”.

isomorphismes

shapes, figures, and forms.
