Yes! Video games are the best way to explain basic topology.

You know those special levels in Super Mario Bros. where the screen doesn’t move with you — you’re just in a “room” and if you go off to the right you arrive back on the left?

That was just a convenience for the game programmers.

But think about this: what’s the difference between a straight line that meets back to its other end and you loop over and over and over the same spot while running forward — and a circle?

(Answer: there is no difference. If we wanted to imagine Mario being a 3-D person, we could: he’d be running around a cylinder. The room he’s jumping around in to get the coins would be maybe 2-3 shoulders wide and dug in a circle underneath the ground. Or the brick platform he’s running on would be equally wide and built as a cylinder up off the ground. We just don’t see that he’s constantly adjusting to bear a few degrees left as he’s running.)

If you drive off the north of the screen you end up at the south, and if you fly off the east you end up on the west.

At first I thought this just meant we were flying around an entire planet like The Little Prince’s moon or so. (The non-map visuals, meaning the main flying visuals, could go along with this story, since there’s ground below and sky above.)

But on further consideration this can’t be the case.

Think about a globe of the Earth: east and west are connected contiguously — but the North Pole is as far away from the South Pole as you can get.

Think about running away from your enemy to the northeast corner and disappearing very quickly off the north to the south, then disappearing just as quickly from east to west. What allows you to do this quick of a dodge?

Think about if the North Pole and the South Pole WERE equivalent. Picture the Earth and start “sucking the two towards each other”.
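Here is a minimal sketch of the wrap-around rule described above, assuming a hypothetical 320×240 screen (the size and the function name `wrap` are mine, not taken from any particular game). Both coordinates wrap independently, which is what makes the quick north-then-east dodge possible. On a globe only the east-west direction wraps this way.

```python
WIDTH, HEIGHT = 320, 240  # hypothetical screen size, chosen for illustration

def wrap(x, y):
    """Identify left with right and bottom with top (a flat torus)."""
    return x % WIDTH, y % HEIGHT

# Start near the north-east corner (y grows northward here) and leave by both edges.
x, y = 318, 238
x, y = wrap(x, y + 5)   # off the north edge, reappear at the south
x, y = wrap(x + 5, y)   # off the east edge, reappear at the west
print(x, y)             # 3 3
```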

Why bicontinuity is the right condition for topological equivalence (homeomorphism): if continuity of the inverse isn’t required, then a circle could be equivalent to a line (.99999 and 0 would be neighbours). Minute 8 or so.
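To make the .99999-and-0 remark concrete, here is a small sketch (the function name `to_circle` is my own) of the map that winds the half-open interval [0, 1) once around the unit circle. The map is continuous and one-to-one, yet going backwards tears the circle open: two points that are neighbours on the circle come from opposite ends of the interval.

```python
import math

def to_circle(t):
    """Send t in [0, 1) to a point on the unit circle."""
    angle = 2 * math.pi * t
    return math.cos(angle), math.sin(angle)

a, b = 0.0, 0.99999
gap_on_interval = abs(b - a)                            # almost 1: far apart
gap_on_circle = math.dist(to_circle(a), to_circle(b))   # tiny: neighbours
print(gap_on_interval, gap_on_circle)
```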

The reason is that the matrix of the exterior derivative is equivalent to the transpose of the matrix of the boundary operator. That fact has been known for some time, but its practical consequences have only been understood recently.

[S]uppose you know the boundary of each k-cell in a cell complex in terms of (k−1)-cells, i.e., the boundary operator. Then you also know the exterior derivative of all discrete differential forms (i.e., cochains). So, you know calculus. Smooth or discrete.
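A toy check of that claim, assuming the smallest interesting cell complex: one filled triangle with an orientation convention I picked for illustration. The exterior-derivative matrices are obtained purely by transposing the boundary matrices, and applying d twice gives zero.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# One filled triangle: vertices 0,1,2; edges e0=(0,1), e1=(1,2), e2=(0,2).
# Boundary of an edge (a,b) is b - a; rows are vertices, columns are edges.
boundary_1 = [[-1,  0, -1],
              [ 1, -1,  0],
              [ 0,  1,  1]]
# Boundary of the face (0,1,2) is e0 + e1 - e2; rows are edges, one column.
boundary_2 = [[1], [1], [-1]]

d0 = transpose(boundary_1)  # exterior derivative on 0-forms (vertex values)
d1 = transpose(boundary_2)  # exterior derivative on 1-forms (edge values)

f = [5, 7, 10]              # a 0-form: one number per vertex
df = matvec(d0, f)          # differences along each edge
ddf = matvec(d1, df)        # circulation around the face
print(df, ddf)              # [2, 3, 5] [0]
```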

I feel vindicated in several ways by the Netflix Engineering team’s recent blog post explaining what they did with the results of the Netflix Prize. What they wrote confirms what I’ve been saying about recommendations as well as my experience designing recommendation engines for clients, in several ways:

Fancy ML techniques don’t matter so much. The winning BellKor/Pragmatic Chaos teams implemented ensemble methods with something like 112 techniques smushed together. You know how many of those the Netflix team implemented? Exactly two: RBM’s and SVD.

If you’re a would-be internet entrepreneur and your idea relies on some ML but you can’t afford a quant to do the stuff for you, this is good news. Forget learning every cranny of research like Pseudo-Markovian Multibagged Quantile Dark Latent Forests! You can watch an hour-long video on OCW by Gilbert Strang which explains SVD and two hour-long Google Tech Talks by Geoff Hinton on RBM’s. RBM’s are basically a superior subset of neural network with a theoretical basis why it’s superior. SVD is a dimension reduction technique from linear algebra. (There are many Science / Nature papers on dimension reduction in biology; if you don’t have a licence there are paper-request fora on Reddit.)
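For a feel of what "SVD as dimension reduction" means, here is a minimal sketch using NumPy on a made-up 4×4 ratings table. The numbers are invented and this is not what Netflix runs; it only shows the truncation step.

```python
import numpy as np

# Hypothetical ratings: rows are viewers, columns are films (made-up numbers).
ratings = np.array([[5, 4, 1, 1],
                    [4, 5, 1, 2],
                    [1, 1, 5, 4],
                    [2, 1, 4, 5]], dtype=float)

U, s, Vt = np.linalg.svd(ratings, full_matrices=False)

k = 2  # keep only the two strongest "taste" directions
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Frobenius norm of what was thrown away by truncating
error = np.linalg.norm(ratings - approx)
print(np.round(s, 2), round(float(error), 2))
```

The error of the rank-k approximation equals the root of the sum of squares of the discarded singular values, so the list `s` tells you up front how much you lose for each choice of `k`.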

Not that I don’t love reading about awesome techniques, or that something other than SVD isn’t sometimes appropriate. (In fact using the right technique on the right portion of the problem is valuable.) What Netflix people are telling us is that, in terms of a Kaggleistic one-shot on the monolithic data set, the diminishing marginal improvements to accuracy from a mega-ensemble algo don’t count as useful knowledge.

Domain knowledge trumps statistical sophistication. This has always been the case in the recommendation engines I’ve done for clients. We spend most of our time trying to understand the space of your customers’ preferences — the cells, the topology, the metric, common-sense bounds, and so on. You can OO program these characteristics. And (see bottom) doing so seems to improve the ML result a lot.

Another reason you’re probably safe ignoring the bleeding edge of ML research is that most papers develop general techniques, test them on famous data sets, and don’t make use of domain-specific knowledge. You want a specific technique that’s going to work with your customers, not a no-free-lunch-but-optimal-according-to-X academic algorithm. Some Googlers did a sentiment-analysis paper on exactly this topic: all of the text analysis papers they had looked at chose not to optimise on specific characteristics (like keywords or text patterns) known to anyone familiar with restaurant-review data. They were able to achieve a superior solution to that particular problem without fancy new maths, only using common sense and exploration specific to their chosen domain (restaurant reviews).

What you measure matters more than what you squeeze out of the data. The reason I don’t like* Kaggle is that it’s all about squeezing more juice out of existing data. What Netflix has come to understand is that it’s more important to phrase the question differently. The one-to-five-star paradigm is not going to accurately assess their customers’ attitudes toward movies. The similarity space is more like Dr Hinton’s reference to a ten-dimensional library where neighbourhood relationships don’t just go along a Dewey Decimal line but also style, mood, season, director, actors, cinematography, and yes the “People like you” metric (“collaborative filtering”, a spangled bit of jargon).

For them the preferences evolve fairly quickly over time. If your users’ preferences also evolve over time: good luck, it may be quite hard.

John Wilder Tukey: “To statisticians, hubris should mean the kind of pride that fosters an inflated idea of one’s powers and thereby keeps one from being more than marginally helpful to others. … The feeling of “Give me (or more likely even, give my assistant) the data, and I will tell you what the real answer is!” is one we must all fight against again and again, and yet again.” via John D Cook

Relatedly, a friend of mine who’s doing a Ph.D. in complexity (modularity in Bayesian networks) has been reading the Kaggle fora from time to time. His observation of the Kaggle winners is that they usually win with gross assumptions about either the generating process or the underlying domain. Basically they limit the ML search using common sense and data exploration; that gives them a significant boost in performance (1−AUC).

* I admire @antgoldbloom for following through on his idea and I do think they have a positive impact on the world. Which is much better than the typical “Someone should make X, that would be a great business” or even worse but still typical: “I’ve been saying they should have that!” Still, I do hold to my one point of critique: there’s no back-and-forth in Kaggle’s optimisation.

The fact that most people don’t have perfect pitch (things sound the same in different keys) may be so that we can understand that, despite pitch differences in male/female adults’ speech and children’s speech, they are saying the same words.

“It’s as if we couldn’t tell the difference between red and blue, but we were highly sensitive to the-difference-between-red-and-orange and the-difference-between-blue-and-green.”

View this 3-manifold as an interval of concentric spheres where you have to imagine gluing the inner sphere to the outer sphere.

Near each point on a singular fiber, a regular fiber passes by some fixed number of times, the order of the singular fiber. In the picture above this number is 5 for both singular fibers.

Here they have order 2.

Here they have order 3.

Here they have order 1 and so they aren’t that special. A homeomorphism would make all the fibers appear as radial arcs, the S^1’s of the S^1 × S^2.

For the conference honoring the 60th birthday of Caroline Series (only a German wiki?!?), I was one of a handful asked to contribute pictures inspired by her work. First up is my contribution followed by a description. After that are a few more.

I don’t think I will ever be this awesome. Not only are the pictures way easier to understand than some scary symbols, but the text explains their meaning really clearly and in not-too-many words.

Pass the acid—I mean, the advanced mathematics—please. People thought Grigory Perelman was crazy for turning down a million-dollar prize and living like an ascetic. “Why should I jump for a million dollars, when I can control the vacuum space in between the quarks of the universe?” is my paraphrase of his reply. I don’t have a million dollars, nor do I understand all of this ring fiber link knot book page contact braid surgery stuff. But right now I’m honestly not sure which I would prefer: the imagination, or the dinero.

Thanks to Maxime (@2_43112609_1 on twitter) for the pointer.

Sometimes I like to spend an hour looking at something I barely understand. The inside of this guy’s mind has got to be so interesting, but it’s been shaped by geometry rather than words, so it’s very hard for him to express it. The geometry shaping it is also much less limited than the square space we hit baseballs in, so it’s hard to draw as well.

I can offer some help on grokking what he’s saying, but there’s simply no way to absorb this stuff quickly. That said, I wouldn’t mind being able to imagine the platonic forms inside Bill Thurston’s head.

GLOSSARY-LIKE DISCUSSION

Topology. You want to understand why identifying the left and right side of a wide rectangle (declaring left = right, so that when you leave the left side of the Mario screen you appear again on the right) is the same as cutting a long strip of paper and taping the two ends together.

(There’s a slight variation on that game that results in a famously weird space—the one-sided, single-edged Möbius strip).

Quotient spaces. You want to understand what it means to quotient a space. I can give a few examples. ℝ/ℤ would be the unit real interval [0,1), kind of a microcosm of the real numbers themselves. The Western chromatic musical scale quotients by the 12 semitones of the octave. It’s not that A4 is “equal” to A8, but it shares the same structural relationship as do E4 and E8.
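Both quotients can be sketched in a couple of lines. The function names are mine, and the note numbers assume the usual MIDI convention (A4 = 69, twelve numbers per octave), which the text above does not mention.

```python
import math

def mod_one(x):
    """Representative of x in ℝ/ℤ, taken from [0, 1)."""
    return x - math.floor(x)

def pitch_class(midi_note):
    """Representative of a note modulo the octave (12 semitones)."""
    return midi_note % 12

print(mod_one(2.25), mod_one(-0.75))       # both land on 0.25
print(pitch_class(69), pitch_class(117))   # A4 and A8 share class 9
```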

An orbifold is a manifold that’s been quotiented. Like if you took the plane and made an equivalence class of the vertical [0,1)’s with all the [1,2)’s and [2,3)’s and etc., you would be looking at an infinitely wide strip with all the verticality “wrapped up” in [0,1) — not gone, just wrapped up into one microcosm.

You could also think about Groundhog Day (the analogy doesn’t work precisely). He’s living through the same span of time over and over because it’s been quotiented along the time dimension (the result of the division is a length of one day).

Oh … equivalence classes are another thing you have to know about. I haven’t written about them yet. WVO Quine came up with a sensible definition of “what is 2” using the concept. And as Terence Tao wrote, when one uses a “noise-tolerant” definition — like if a lot of different ways of saying something can be taken to mean the same thing — that’s another example of an equivalence class.

Back to the music theory for a second—there are multiple ways you could set up equivalence classes.

octaves — it’s not as if all “G♯” notes sound the same — but when we talk about octaves it’s usually with reference to the same-sounding-ness of twice-as-fast frequencies

across instruments — the overtone series of a tuba playing C3 differs from the overtone series of a double-bass playing C3. Nor does blowing on your double-bass or plucking your tuba produce the note. But we equivalence-class these differences away and write their parts in mostly the same notation. (Not exactly the same since you never see the word pizzicato on a tuba score, etc.)

inversion — I can do an A major triad as AC♯E, C♯EA, EAC♯ … it’s a combinatorics thing; up to 3! orderings if you allow every permutation. No, they don’t all sound the same, but when I use the word “triad” I am equivalence-classing over the kind of sameness that they do have.

enharmonics — Sure, D♯♯ and F♭ sound the same — but conceptually they’re very different, and the notes around D♯♯ will be different than the notes around F♭.

slight errors — players of the cello or the voice know that pitch is a continuous variable; however, we might reasonably call 438 Hz = 440 Hz = A4.

transposition — Certainly composers choose the key of D (or, if they’re Stephen Sondheim, F♯) for a reason — but if a song isn’t within your vocal range you can always subtract or add a certain fixed pitch (in notes-space, not in Hz-space!) from every note and the piece will sound “the same” — not exactly the same, but it will recognisably be a pub song — I mean, the US national anthem
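Two of the equivalences in the list above (slight errors and transposition) are easy to sketch, assuming equal temperament, the common A4 = 440 Hz reference, and MIDI note numbers. All three are my assumptions, and the function names are made up.

```python
import math

A4_HZ = 440.0  # assumed concert-pitch reference

def nearest_note(freq_hz):
    """Round a frequency to the nearest equal-tempered MIDI note number."""
    return round(69 + 12 * math.log2(freq_hz / A4_HZ))

def transpose_hz(freq_hz, semitones):
    """Adding semitones in note-space multiplies the frequency in Hz-space."""
    return freq_hz * 2 ** (semitones / 12)

print(nearest_note(438.0), nearest_note(440.0))   # slight errors: same note, 69 69
print(nearest_note(transpose_hz(440.0, 12)))      # one octave up: 81
```

Transposition is a shift in note-space and a rescaling in Hz-space, which is why a fixed number of Hz can't simply be added to every note.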

If I say “Hand me that glass”, I don’t mean to reference the glass at a particular orientation, rotation, or place in the room—I mean to equivalence-class ∀ such configurations of the glass—they all mean “that glass”. And if I say “Hand me a glass” — “Which glass? This glass?” — “Any glass!” then I’m equivalence-classing ∀ glasses within a certain distance from you.

Hyperbolic geometry. In square space, four right angles ∟ add up to the whole shebang 360°. But in the logical abstract it needn’t be that way. What if “space” consisted of 3 right angles ∟, or 12? Something to think about.

Oh — and what if it took one number of azimuthal ∢ right-angles to make the whole pie round, and took a different number of planar right-angles to make that whole pie round? Yeah, that would be weird too.

Watch the 20-minute movie Not Knot, where they explain that links—knots made of several (closed/looped/circular) ropes rather than just one rope—biject uniquely to the complement of some hyperbolic geometrical space.

Since hyperbolic geometric spaces had already been explored a bit before the 1980’s, now everyone had a fun tool to unite concepts and ad-lib toward new ones. The new bijection opened up the gates to some easy logical shortcuts. I drew a picture of the way this kind of logic goes in talking about a clever way someone thought of to generate random normals with little computation.

But this is in general how mathematicians solve impossible-sounding problems. I use a little bit of logic in domain X, as long as it’s easy there. Then I use this equivalence that somebody figured out to port the stuff into domain Y. Then I do what’s easy in domain Y. Then I either go back to my original domain or maybe I use some more equivalences to do easy stuff in domain ℤ, ℚ, Linear, and so on, always only using “obvious” logic in the particular domain, and letting the equivalences keep me right as I convert the problem across domains. The “link”-to-hyperbolic-complement-space was one such. Other examples include Fourier-to-regular domain, polynomials-to-sequences, equivalences-across-NP-complete-problems, graphs-to-matrices, matrices-to-characters, Lie-groups-to-matrices, …
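One small instance of the port-solve-port-back pattern, sketched with NumPy's FFT. Multiplying polynomials means convolving their coefficient sequences, which is fiddly. In the Fourier domain the same operation is plain pointwise multiplication. The function name `poly_multiply` is mine, and the final rounding assumes integer coefficients.

```python
import numpy as np

def poly_multiply(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    n = len(p) + len(q) - 1           # number of coefficients in the product
    P = np.fft.rfft(p, n)             # port both inputs into the Fourier domain
    Q = np.fft.rfft(q, n)
    product = np.fft.irfft(P * Q, n)  # do the easy step there, then port back
    return [int(c) for c in np.rint(product)]  # assumes integer coefficients

# (1 + 2x)(1 + 3x) = 1 + 5x + 6x^2
print(poly_multiply([1, 2], [1, 3]))  # [1, 5, 6]
```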

Oops, just used another common maths word without defining it. Bijections are one-to-one mappings from the source domain onto the entire whole of the target domain. For example a continuous, strictly monotonic function from ℝ onto ℝ (like x³) uniquely assigns members ∈ℝ to other members ∈ℝ, in such a way that no value is reused and every value is used.

A strictly monotone function injects the source into the target, && if it also surjects the source onto the target (every value gets hit), it can be inverted. (By contrast, a non-monotonic, up-and-down-looking function re-uses values, so going in reverse you couldn’t tell which usage the 3 had come from.)
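On a finite set you can check these properties by brute force. The sketch below uses helper names I made up (`is_injective`, `is_surjective`, `invert`) and compares a strictly increasing function with an up-and-down one.

```python
def is_injective(f, domain):
    """No value is reused."""
    images = [f(x) for x in domain]
    return len(set(images)) == len(images)

def is_surjective(f, domain, target):
    """Every value in the target is used."""
    return {f(x) for x in domain} == set(target)

def invert(f, domain):
    """Build the inverse as a lookup table; only meaningful for an injective f."""
    return {f(x): x for x in domain}

def cube(x):       # strictly increasing
    return x ** 3

def square(x):     # goes down, then up
    return x ** 2

domain = range(-3, 4)
print(is_injective(cube, domain), is_injective(square, domain))  # True False
print(invert(cube, domain)[27])                                  # 3
```

With `square`, both -3 and 3 map to 9, so a lookup table built the same way would silently keep only one of them. That is the "which usage did it come from" problem.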

If ∃ a bijection between X and Y, then ∃ a correspondence between X and Y. When mathematicians are trying to speak casually, they will often say something like “You can’t comb a hedgehog” or “You can turn any 3-manifold into a 3-sphere”. “You can do” is their way of saying ∃ a bijecting function that relates the two: ƒ(X)=Y. (In topology the bijection is also required to be continuous in both directions, i.e. a homeomorphism; that is the sense meant here.) If ∄ such a bijection, then it’s impossible to put X and Y into correspondence: there’s no earthly or heavenly way in which these two things could be made to look alike. For example, maps must fail to correctly show the globe because ∄ a homeomorphism between a globe and a plane. (They also fail because of distortions; that would be asking for a conformal, area-preserving bijection instead of merely a bijection.) They also show how spaces-with-stuff-removed can biject to completely unexpected things. A punctured plane is equivalent to the surface of a cylinder, for instance. (?!?!) The punctured surface of a ball is equivalent to a (not-punctured) plane, for instance. (‽‽) Hey, I don’t make this stuff up, I’m just reporting the facts.
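The punctured-ball-surface-to-plane claim can be checked numerically. The standard map for it is stereographic projection, which the text above doesn't name, so treat the choice of map and the function names as mine. The removed point is the north pole of the unit sphere.

```python
import math

def sphere_to_plane(x, y, z):
    """Project from the north pole (0, 0, 1), which is the removed point."""
    return x / (1 - z), y / (1 - z)

def plane_to_sphere(u, v):
    """Inverse map: every point of the plane lands on the punctured sphere."""
    r2 = u * u + v * v
    return 2 * u / (r2 + 1), 2 * v / (r2 + 1), (r2 - 1) / (r2 + 1)

p = plane_to_sphere(3.0, -4.0)
print(p, sphere_to_plane(*p))
```

Going plane → sphere → plane returns the starting point (up to floating-point error), and the intermediate point has length 1, so it really sits on the sphere. The north pole itself has no image, which is exactly why the sphere has to be punctured.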

I guess in this talk he is showing different pictures of the associated geometry of various links.

Look up Hopf fibrations, one-point compactification, nilgeometry, solvgeometry, Lie groups (they’re groups, but continuous rather than discrete), Hopf circles, … on Wikipedia. Be forewarned: this may turn into a months-long reading project.

Complements. Not Knot talks confusingly on this topic (“it’s not empty space, it’s space that’s not even there” … I think that way of talking only makes sense to mathematicians).

As I said above, spaces-with-stuff-removed can be homeomorphic to something completely unexpected. If you remove a point from the plane you introduce cylindricity around that point. Kind of unexpected that poking a hole in a square space makes a circular space, but that’s logic for you: always pointing out that illogical-sounding things are in fact inescapably true.

The symbology for complements looks too similar to the symbology for quotients. Sorry, not my decision. ℝ∖ℚ = the irrationals ℚ∁. ℂ∖ℚ∁ = the curliness of √−1 without the ridiculously, insanely thick thickness of the continuum. A manageable space in which not all sequences converge. ℝ∖Transcendentals = Algebraics. Another eminently reasonable number system that does everything you’d want without the messy insanity.

ℚ∖{0} = all fractions, minus zero. This is a punctured thing. ℝ²∖{0} = the punctured plane. ℝ³∖{0} = the cubic solid we seem to live in (Newton’s rigid rods) minus a point in the center of the universe. I don’t know if ℝ³∖{0} bijects to a looped thing like ℝ²∖{0}.

The world∖Snoopy. Logically it’s equivalent to the punctured cubic thing I just described. Kind of boring, I thought removing Snoopy would be more devastating.

BTW, you can also adjoin things, like ℚ adjoin i = ℂ∖ℚ∁ mentioned above. I like this one if you can’t tell. ℝ adjoin ∞ is the one-point compactification of the line (as long as ∞ is defined to be ± ∞ so you can get there from the left or right).

Symmetries. The peace sign has a 3-way symmetry. Mirror images are 2-way symmetries. You could draw a flower with a 5-fold symmetry or a 12-fold symmetry and so on. The concept itself isn’t confusing, but the way Thurston and Not Knot talk fluidly, assuming without making explicit the implications of identification, quotienting by symmetry, topological gluing, point/line removal, and complementation together, is overwhelming.

The methods of topology, when applied to cultural analysis, provide a rigorous, yet unabashedly humble investigation of the nature of cultural relationships.

The existential quantifier ∃ of logic (the predicate calculus) and the image operation along a continuous function ƒ from topology turn out to be essentially the same operation: from a categorical point of view they are both adjoint functors.