Archive for the ‘math’ Category

Surreal canonical linear orders

Monday, June 6th, 2011

((Click here for a pdf of the paper.))

This is a math post to serve as an introduction to the paper linked above.

Many sets of numbers, like the integers, the rationals, or the reals, can be totally ordered, which means there is a binary relation < that satisfies these properties:

  1. For any two elements a, b, exactly one of these is true: a < b, a = b, or a > b.
  2. If a < b and b < c, then a < c.

To give a less obvious example, think of complex numbers a+bi. Normally we don’t think of them as being totally ordered, but we can say a+bi < c+di iff (a < c or a=c and b < d). Another example that everybody knows, but maybe hasn't thought of as a "total order," is alphabetical order, which tells you exactly how to sort any list of words, even if they're words you've never seen before (as long as you know the alphabet, and the order of that alphabet).

The set of rational numbers Q has an interesting property. Given any other countable set X with a total order on X, it’s possible to embed X into Q. In other words, there is a function f:XQ such that f is 1-1 and x < y ⇒ f(x) < f(y). To show why this is an interesting property, try to find a function f:QZ which does the same thing, where Z is the set of all integers.

Well, don’t try too hard, because it turns out that’s impossible. I’ll explain why. Suppose, for the sake of contradiction, that you have a 1-1 function f:QZ with x < y ⇒ f(x) < f(y). Then f(0) and f(1) must be integers. But there are infinitely many rational numbers in the interval (0,1), while there can only be finitely many numbers between f(0) and f(1), so such a function can’t exist.

The point is that, there’s something special about Q that really captures the full complexity of countable total orders. Georg Cantor had a name for the key property: he suggested we say an order is everywhere dense if it has the property that, for any x < y, there is another element z with x < z < y. In other words, between every two elements is another.

The question I’m exploring in the paper is this: Is there an analogue of Q, and of this “everywhere dense” property, for higher cardinalities? Using the axiom of choice, and the generalized continuum hypothesis, the answer is definitely yes, and the proofs are explained in the paper.

To give a preview of how to extend Q to higher cardinalities: Let S1 denote the set of binary strings indexed by any countable ordinal number, ordered lexicographically with the proviso that, if s is a binary prefix of t, then t < s when the first bit after s is 0; otherwise t > s. It’s easy to see how to embed the real [0,1] interval into S1 using binary notation, and we can also embed all of R into [0,1] by using any order-preserving bounded function, such as arctan(x)/(π/2). But the other direction is impossible: for example, the first uncountable ordinal, ω1, is part of S1, as the set of all 1-only strings, indexed by all the countable ordinals. However, ω1 can’t be embedded in R. If ω1 fit in R, then it could be embedded in [0,1]; let xi denote the distance between f(xi) and f(xi+1), and we end up with an uncountable set of positive reals whose sum is finite. To show this last feat is impossible, there must exist an n such that the number of elements xi > 1/n is uncountable; but then clearly the sum cannot be finite.

By the way, the name “surreal” comes in because the canonical orders are key subsets of the class of surreal numbers.

Crazy Cut Solutions

Tuesday, December 14th, 2010

Here are the solutions for the crazy cut puzzles I posted last week.  Did you solve them all?

Pizza Puzzle: Solution

Saturday, November 27th, 2010

This post explains solutions for the pizza puzzle. If you haven’t read that puzzle yet, you should try it first!

There are multiple ways to solve this one. For these solutions, we’ll treat the radius of the pizza as length 1. Even if it had a different radius, we could scale all the work in a solution accordingly, and the end result would be the same, since it’s a percentage of the total area.

First, let’s draw another hexagon inside the pizza, with vertices at the center of each pepperoni circle.

The bold hexagon is the one we already see on the pizza; the dashed hexagon is drawn with vertices on the center of each pepperoni circle.  We’ve also drawn new circles at the corner of each vertex of the bold hexagon. Since each triangle in the large hexagon is equilateral, and two of its sides are radii of the pizza circle, each edge of the bold hexagon has length 1. Each outside circle (the new circles) have radius half that, which means they each have area A = π r² = π/4 (that should read “pi over 4,” in case the “pi” isn’t showing up correctly for you). The total area of the outside circles is thus six times that, which is 3π/2.

Notice that the large bold hexagon and its circles are similar to the small dashed hexagon and its circles (the pepperoni circles). If we can find the ratio of areas between the large hexagon and the small hexagon, this will also tell us the ratio of areas between the outer circles and the inner circles. This is a general rule of geometry; as another example of this rule, see the next figure. Whenever shapes are resized together, the scale factor of their areas is the same for each shape – either all are doubled, or all are halved, etc.

If area(big M-shape) = 2×area(little M-shape), then area(big circle) = 2×area(little circle).

Let’s take a closer look at one of the equilateral triangles in the big hexagon, from the first figure. The bold and dashed lines here show how this is a close-up on the bold and dashed hexagons.

There are six copies of the wavy triangle in the large equilateral triangle, and two copies of it in the smaller equilateral triangle (lower-right).  So the area of the smaller equilateral triangle is exactly 1/3rd the area of the larger. This means the area of the pepperoni circles is also a third of the area of the outer circles. Hence the pepperoni area is exactly (1/3) · (3π/2) = π/2, which means they take up half the area of pizza!

I personally enjoy this first solution because it involves no knowledge of trigonometry. However, there is another way to find the answer that looks more explicitly at the angles involved. In the next figure, we see the same wavy triangle from the last drawing.

We could find an equation between r and R through either the Pythagorean theorem, or using the fact that

r/R = sin(π/6) = 1/2.

The altitude of the large equilateral triangle is sin(π/3) = √3 /2 = r+R = 3r, so r = √3/ 6. Notice that r is the radius of a pepperoni circle, so the area of each pepperoni is π/12. The area of six pepperoni circles is then π/2, confirming it’s half the area of the pizza.

A Pizza Puzzle

Wednesday, November 17th, 2010

A mathematical pizzeria sells pizzas with six slices, each slice containing a maximally-sized equilateral triangle, as shown in the figure. Their pepperoni pizza features one maximally-sized circle of pepperoni per slice. What fraction of the area is taken up by the pepperoni circles?

The pirate-catching puzzle

Sunday, May 2nd, 2010

This weekend I heard a good mathy puzzle from George Miller.  Here’s a version I found online, attributed to Howard Lederer:

You’re on a government ship, looking for a pirate ship.  You know that the pirate ship travels at a constant speed, and you know what that speed is.  Your ship can travel twice as fast as the pirate ship.  Moreover, you know that the pirate ship travels along a straight line, but you don’t know what that line is.  It’s very foggy, so foggy that you see nothing.  But then!  All of a sudden, and for just an instant, the fog clears enough to let you determine the exact position of the pirate ship.  Then, the fog closes in again and you remain (forever) in the thick fog.  Although you were able to determine the position of the pirate ship during that fog-free moment, you were not able to determine its direction.  How will you navigate your government ship so that you will capture the pirate ship?

If you wanted, you could give a convincing but math-lite answer to this.  But you can also do better, so I’m going to ask my own version of the question:

Using your answer to the above question, what is the longest amount of time that may pass from the instant you saw the pirate ship until you capture it?  Suppose the pirate ship travels at 1 km/minute, and you first see it 3 km away.  How many minutes, at most, until you capture it?

I’ll post my answer on Friday (in 5 days).

there is no random – only zuul!

Friday, May 1st, 2009

The curve described by a simple molecule of air or vapor is regulated in a manner just as certain as the planetary orbits; the only difference between them is that which comes from our ignorance.

Probability is relative, in part to this ignorance, in part to our knowledge.

-Pierre Simon de Laplace, in A Philosophical Essay on Probabilities

I’ve always been fascinated by the question of determinism. Is everything about the future already contained in the state of the world today?

Addressing this question, one has to come to terms with what it could mean for the world to be non-deterministic. Intuitively, it means that certain events are non-predictable – we can think of them as random. And here is where we have to confront the fact that any useful model of the world, even a probabilistic one, has to be formally deterministic.

Why? Because, in math, there is no random. When a mathematician or statistician studies random behavior, they consider functions whose input is thought of as random. And by thinking of an input as random, the intuition and practical applications follow.

For example, suppose you ask what the standard deviation is for the random process of choosing the number 5 a third of the time, and the number 17 the rest of the time. You can model this with a function f:[0,1]->{5,17} which maps [0,1/3) to 5 and [1/3,1] to 17. From here you can study the random behavior of this experiment to your heart’s content. The point is that the function f is completely deterministic, and there is nothing here to give us a single example of an actually random number.

So when it comes to a probabilistic model of the universe, we would still have to use a deterministic function. To include the idea of randomness, we could add an extra input, which we think of as unknown or external to the world. For a moment, pretend that the world moves forward in discrete time units. Let x be the state of the world, y an unknown input, and the formula

x’ = f(x,y)

gives the state of the world a moment later. The presence of y is the non-determinism.

The thing that’s easy to forget is that probability theory does not dictate the way the world will behave. Rather, it is nothing more than a way to do something with a balance of partial knowledge, and an awareness of our own ignorance.

Here’s a quick thought experiment to help illustrate the difference:

Choose a number between 1 and 1000. I’ll also choose a number in the same range. What’s the probability that they’re the same?

It’s a trick question. They either are the same, or they’re not. Of course, if you assume that we’re both equally likely to choose any of the 1000 numbers, then you can say the odds are 1 in 1000. But the point is that this model is entirely in your mind. Since it’s a one-off experiment, and we don’t really even know that these numbers are chosen in any manner we could call random, there’s no real basis for that probability model. Maybe I always choose 7 and you always choose 23, and the numbers were bound to be mismatched from the start. The point is that, while probability is incredibly useful, it is wise to keep in mind – as Laplace did – that it is a mitigation of the unknown.

One last thought experiment toward comprehending what a probabilistic model might mean:

Some physicists play with the idea of parallel universes, so let’s do something like that. Suppose I toss a coin (heads/tails = H/T) several times in a row, and write down the outcomes in order. If each outcome is random, then we can model that in at least two different ways. One way is to assume that there is an external source of randomness, and that every result pulls in information from that random source – this would be analogous to the extra input to the function f(x,y) as above. Another way is to assume that every time a choice is to be made, the universe forks into two parallel versions: one in which the coin lands H, and another where it lands T. There is never a decision to be made, so no extra information is needed. Which model is better? Is there a way to choose between these two models, if we were in that world?

Mathematically, there is no difference; perhaps we may perceive the ideas as different, but the resulting formal theories would be identical. Why? Try this out: write down every H/T possibility for any number of coin tosses. For 3 tosses, you would get HHH, HHT, HTH, HTT, THH, THT, TTH, TTT. Most of those outcomes look random. In fact, you wouldn’t be amazed at any of them. Do the same thing for 50 tosses in a row. 50 heads in a row would be surprising, but the number of outcomes besides that surprising one is orders of magnitudes larger than the total number of humans who have ever lived so far.

In other words, no matter what, most of the outcomes will look random. In fact, they have to, by any reasonable definition of looking random. This is simply because a probability is nothing more than the number of outcomes including a certain event divided by the total number of outcomes (implicitly assuming they are equally likely). So any very likely event, such as there being about as many tails as heads, will by definition include most of the H/T sequences you write down.

A random model looks the same as a parallel model.

So when I say there is no random (only zuul), what I really mean is: probability theory and all that we posit about randomness is just the study of deterministic functions where you really don’t know the input, but you pretend to know it’s distribution.