Marginal solution

November 27, 2013

Okey dokey. Better reveal the solution to the stat-geek marginalisation quiz. There were sixty-two votes.

The popular winner was the economics/marginal profits idea, with 31 votes. Plausible but wrong.

The second most popular was the “marginal interest” idea. Well… this is what the term has more or less drifted into meaning, because (almost) everybody has forgotten the true origin. So… wrong.

Nobody voted for “lost in the mists of time”, which proves you all care. How nice.

Only two people voted for EB Margin being the pseudonym of WS Gosset. This disappointed me, both because it is funny and because it was supposed to be a cunning false trail. WS Gosset in fact published his papers under the name of “Student”, which is why we have the “Student’s t test”.

So of course the correct answer was “other”. Sorry if that was an annoying tactic, but I think if I’d made the right answer one of the choices, it would have been too obvious. Amongst the six suggestions, two were for our amusement:

“I’d write the reason here but there’s not enough room in the margin”
“To marginalise those who don’t know”

and four were spot on or more or less right:

“Refers to margins of a contingency table”
“thought it was to do with averaging rows, with answer stuck in the margin”
“your are projecting the 2D pdf onto the “margin” of the plot”
“Sweeping the probability to the edge (=margin) of the paper?”

Sounds like the first two people knew, and the second two deduced the right answer. If you were one of those people, award yourself an extra biscuit at coffee time, and feel free to announce yourself.

Just to spell it out: as physicists, we nearly always think in abstract mathematical terms, so we think of “marginalisation” as a calculus problem – an integral. Even when thinking visually we picture a joint probability distribution as a smooth surface in three dimensions. But early statisticians were often concerned with tables of numbers, and worked on paper. Think of a joint frequency distribution as a grid of numbers in cells, with one row for each value of y. Then add up a row, and write the answer in the margin. When you have done this for all the rows, read down that margin, and – voila – the marginal distribution for y.
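For anyone who would rather see the paper-and-pencil recipe as code, here is a minimal sketch in Python. The table and the names (joint_counts, row_margin, normalise) are made up for illustration; the only assumption is that each row holds the counts for one value of y and each column one value of x.

```python
# Hypothetical joint frequency table (made-up numbers).
# Each row is one value of y; each column is one value of x.
joint_counts = [
    [2, 4, 6],  # y = 0
    [1, 3, 4],  # y = 1
]


def row_margin(table):
    """Add up each row: the numbers you would write in the right-hand margin."""
    return [sum(row) for row in table]


def normalise(counts):
    """Turn counts into relative frequencies that sum to one."""
    total = sum(counts)
    return [c / total for c in counts]


# Reading down the margin gives the marginal frequencies for y.
margin_y = row_margin(joint_counts)

# Dividing by the grand total turns them into a marginal distribution.
p_y = normalise(margin_y)

print(margin_y)
print(p_y)
```

Summing columns instead of rows would put the answers in the bottom margin and give the marginal distribution for x.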

Don’t start me on regression…

A quiz of marginal interest

November 20, 2013

Two things we know.

(1) Scientific terminology is burdened with the baggage of history, which now makes no logical sense. So… early type galaxies are the ones with late type stars? Errr… And which of these terms relates to a sequence in time? Neither. Right. Very helpful.

(2) When you have to teach something, you finally figure out things that have been bugging you for years.

(3) Nobody expects the … oh. Anyway. Often (1) and (2) combine to make a particularly thick fog.

For some time the term “marginalisation” had been nagging at me, but I ignored it because I had other stuff to get on with. I am referring to the term in statistics. You have a probability density function of two variables, f(x,y), but decide that y is “interesting” and x is “uninteresting”. You then integrate over x to get a PDF p(y) for y alone. This is known as “marginalising over x”.
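In symbols, and assuming f(x,y) is normalised over both variables, the operation looks like the first expression below. The second is the discrete version for a table of counts; the notation n_ij (the count for the i-th value of x and the j-th value of y) is my own shorthand, not anything standard.

```latex
p(y) = \int f(x, y)\, \mathrm{d}x ,
\qquad
p(y_j) = \frac{\sum_i n_{ij}}{\sum_{i,j} n_{ij}}
```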

So here is the quiz. My guess is only about seven people will want to take it, but I can’t resist it.

Rule (a) Andrew Liddle is not allowed because I already told him the answer in the pub. Rule (b) No Googling. Rule (c) Never talk about Stats Club.