Catalan Numbers

What is a Catalan number?
They are a sequence of numbers that arise in various counting problems. The terms of the sequence can be calculated by the formula:
$C_n = \frac{1}{n+1}\binom{2n}{n}$, for $n \ge 0$.
Notice how the terms of the sequence grow rapidly:
Cn = 1, 1, 2, 5, 14, 42, 132, 429, 1430, …
There are other variations of the formula, all of them equivalent, including:
$C_n = \binom{2n}{n} - \binom{2n}{n+1}$.

Natural number proof
It is not instantly obvious from the first formula that the Catalan numbers are natural numbers. We shall see that they are by proving the previous formula. For $n \ge 1$:
$\binom{2n}{n+1} = \frac{(2n)!}{(n+1)!\,(n-1)!} = \frac{n}{n+1}\binom{2n}{n}$,
so
$\binom{2n}{n} - \binom{2n}{n+1} = \left(1 - \frac{n}{n+1}\right)\binom{2n}{n} = \frac{1}{n+1}\binom{2n}{n} = C_n$.
(For n = 0 the formula reads 1 - 0 = 1.) Having expressed Cn as the difference of two binomial coefficients, we have proved that it is an integer, and the first formula shows that it is positive, so it is in fact a natural number.

Catalan numbers in Pascal’s triangle
The Catalan sequence can be found in Pascal’s triangle:
(Figure: Pascal’s triangle.)
Looking at the numbers in the central column and the adjacent column, you will notice that the difference of these numbers produces the Catalan sequence.
Number in the central column - number in the adjacent column:
2 - 1 = 1
6 - 4 = 2
20 - 15 = 5
70 - 56 = 14
252 - 210 = 42

History of the Catalan numbers
These numbers were first discovered by Leonhard Euler in the 18th century while he was trying to count the ways a polygon with n+2 sides can be divided into n triangles without any of the dividing lines intersecting. They were later named after Eugène Catalan, who in 1838 defined the sequence and found a more elegant formula. He also worked on the polygon problem, but later found that the Catalan numbers also appear when counting the number of ways a group of n letters can be fitted into parentheses.
Eugène Charles Catalan, 1814-1894

Appearances of Catalan numbers
The Catalan numbers appear within combinatorial problems in mathematics.
A few of the main problems we will look at in closer detail are:
The Parenthesis problem
Rooted binary trees
The Polygon problem
The Grid problem

Parentheses
In 1838 Eugène Catalan solved the following problem: in how many different ways can n pairs of parentheses be arranged so that they “make sense”?
We shall say that a string of parentheses makes sense (or is valid) if it follows these rules:
i) There are an equal number of open and closed parentheses in the string.
ii) Counting from the left, the number of closed parentheses does not exceed the number of open parentheses at any point.

Parentheses Examples
By this definition, strings such as ( ( ) ) ( ) and ( ) ( ( ) ) are valid chains of parentheses. However, a string such as ( ) ( ) ) ( is not: at its fifth symbol the closed parentheses outnumber the open ones.
Catalan demonstrated that the number of possible ways of ordering n pairs of parentheses like this is precisely Cn, the nth Catalan number.

Parentheses (continued)
We can check this manually for small values of n:
n=0: * (the empty string). 1 way
n=1: ( ). 1 way
n=2: ( ( ) ), ( ) ( ). 2 ways
n=3: ( ( ( ) ) ), ( ( ) ) ( ), ( ( ) ( ) ), ( ) ( ( ) ), ( ) ( ) ( ). 5 ways
This table confirms Catalan’s result for C0 to C3. We shall now see a proof for the general case.

Proof
For clarity, throughout this proof we will let O denote an open parenthesis and let C denote a closed parenthesis.
The total number of different arrangements of n Os and n Cs is $\binom{2n}{n}$.
This, however, includes the invalid cases (such as OCOCCO), in which we are not interested. We must now calculate the number of invalid cases and subtract this number from $\binom{2n}{n}$ to find our answer.

Proof (continued)
Consider an invalid string of n pairs of parentheses. The first ‘fault’ is some C that is preceded by an equal number of Os and Cs, say k of each. Hence the ‘faulty’ C lies in the (2k+1)th position in the string of parentheses.
We can then take these first (2k+1) terms and switch each one, so that every O becomes a C and vice versa. Before the switch these terms held k Os and (k+1) Cs; afterwards they hold (k+1) Os and k Cs, while the rest of the string is unchanged.
Following this process we now have an arrangement of (n+1) Os and (n-1) Cs.
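The switching step just described can be tried out on small cases. The following is a minimal Python sketch, not part of the original presentation; the function names (arrangements, is_valid, flip_to_first_fault) are our own and purely illustrative. It switches the first 2k+1 symbols of every invalid string for n = 3 and counts the distinct results.

```python
from itertools import combinations
from math import comb


def arrangements(num_o, num_c):
    """Yield every string made of num_o 'O's and num_c 'C's."""
    length = num_o + num_c
    for o_positions in combinations(range(length), num_o):
        chars = ["C"] * length
        for i in o_positions:
            chars[i] = "O"
        yield "".join(chars)


def is_valid(s):
    """True if no initial part has more Cs than Os and the totals are equal."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "O" else -1
        if depth < 0:
            return False
    return depth == 0


def swap(s):
    """Turn every O into a C and vice versa."""
    return "".join("C" if ch == "O" else "O" for ch in s)


def flip_to_first_fault(s):
    """Switch the first 2k+1 symbols of an invalid string, up to the faulty C."""
    depth = 0
    for i, ch in enumerate(s):
        depth += 1 if ch == "O" else -1
        if depth < 0:  # the faulty C sits at index i, i.e. position 2k+1
            return swap(s[: i + 1]) + s[i + 1:]
    raise ValueError("string is valid, so it has no fault")


n = 3
invalid = [s for s in arrangements(n, n) if not is_valid(s)]
images = {flip_to_first_fault(s) for s in invalid}

print(flip_to_first_fault("OCOCCO"))  # COCOOO
print(len(invalid), len(images), comb(2 * n, n + 1))  # 15 15 15
```

For n = 3 the 15 invalid strings are sent to 15 different strings, each with 4 Os and 2 Cs. This is only a check of one small case, not a proof.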
Proof (continued)
Conversely, any arrangement of (n+1) Os and (n-1) Cs can be rewritten as an invalid sequence of n pairs of parentheses. We do this by noting the first time the Os outnumber the Cs by one and switching each term up to and including that point.
Therefore the number of invalid sequences of n pairs of parentheses is equal to the total number of arrangements of (n+1) Os and (n-1) Cs. This is equal to $\binom{2n}{n+1}$.
Subtracting this value from $\binom{2n}{n}$, we see that the number of valid ways of arranging n pairs of parentheses is equal to
$\binom{2n}{n} - \binom{2n}{n+1} = \frac{1}{n+1}\binom{2n}{n} = C_n$.

The Recurrence Relation
It is possible to determine Catalan numbers by expressing them in terms of previous values. Here we shall prove that
$C_{n+1} = \sum_{k=0}^{n} C_k C_{n-k}$, with $C_0 = 1$.
For example, we can calculate C4 using this recurrence relation. Assume we know that C3 = 5, C2 = 2, C1 = 1, C0 = 1. Then
C4 = C0C3 + C1C2 + C2C1 + C3C0 = 1*5 + 1*2 + 2*1 + 5*1 = 14.
We shall now see that the previous problem regarding parentheses satisfies this recurrence relation.

Recurrence Relation Proof
It is clear that in any valid string of parentheses, the first character must be an open parenthesis ‘(’. Later in the string, for validity, there must be a corresponding closed parenthesis ‘)’. Any further pairs of parentheses must lie either between the initial pair or after it.
If we wished to arrange (n+1) pairs of parentheses so that they make sense, we would place an initial pair and then a further n pairs of parentheses in the places marked A and B:
( A ) B
A and B must be valid strings of parentheses themselves, and clearly either can contain up to n pairs; however, if A contains a string of k pairs of parentheses, B must contain n-k pairs.

Proof (continued)
Hence the possibilities are:
A contains n pairs, B contains 0 pairs: CnC0 possibilities
A contains (n-1) pairs, B contains 1 pair: Cn-1C1 possibilities
…
A contains 0 pairs, B contains n pairs: C0Cn possibilities
Giving a total of CnC0 + Cn-1C1 + … + C0Cn possibilities.
Hence
$C_{n+1} = \sum_{k=0}^{n} C_k C_{n-k}$.

Rooted Binary Trees
A binary tree is a rooted tree in which each node, except for the end (shaded) nodes, has two descendants: the left child and the right child.
Problem: How many rooted binary trees can be made with n+1 end nodes?
(Figure: the rooted binary trees for n=0, n=1, n=2 and n=3.)

The root of the problem
(Figure: trees with their sequences of D’s and U’s; labels shown include n=1 and DUDU.)
The rules for these sequences of D’s and U’s are such that no initial part of the sequence has more U’s than D’s. This is exactly the same rule as for the O’s and C’s in the Parentheses examples.

Polygons
This problem involves the number of ways an (n+2)-sided polygon can be divided into n triangles by adding straight non-intersecting lines between the vertices.
E.g. in the case of n=2 there are clearly only two possible ways, so C2 = 2.

Further Polygons
n=3: C3 = 5
n=4: C4 = 14
n=5: C5 = 42

Connecting Polygons and Rooted Binary Tree
A divided polygon can be connected to a rooted binary tree in three steps.
Firstly, plant the root of the tree on one of the edges of the polygon.
Next, add a node to each triangle and link these nodes together with arcs; these correspond to the branch nodes of the binary tree, each of degree 3.
Finally, insert the end nodes (leaves) of degree 1; this completes the connection from the polygon to its equivalent rooted binary tree.

Grids
This problem looks at the number of ways to cross an n x n grid by a shortest route, starting in the bottom left corner and finishing in the top right, without crossing the diagonal line.
(Figure: the grid, with Start and Finish marked; labels read “Cannot make moves along these lines” and “Must only make moves along these lines”.)

A typical path
Here is an example of a route that might be taken:
(Figure: an example route, with moves labelled O and C.)
This is equivalent to
OCOOOCCOCOCC
( ) ( ( ( ) ) ( ) ( ) )

Summary
We have given a few examples of counting problems in which the Catalan numbers arise, but there are many other problems where they can be found. We have seen how the problems are related when each is viewed in a different way.
Recall the condition: ‘In a sequence of 2n items with n A’s and n B’s, no initial part of the sequence has more B’s than A’s.’ The number of sequences satisfying this condition is the nth Catalan number. We can make a new example…

Our Catalan Example
Suppose two groups of mathematics students each give a presentation, one on Catalan numbers (group A) and another on eclipses (group B). Assume an audience of 2n people votes for their favourite project, n choosing group A and the other n choosing group B. In how many different orders can the votes be counted so that the eclipses group is never ahead?
The answer is the nth Catalan number, Cn.
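The voting example can be checked by brute force for small audiences. Below is a minimal Python sketch, not part of the original presentation; the function names are our own and purely illustrative. It counts the vote orders directly and compares the result with the formula $\frac{1}{n+1}\binom{2n}{n}$, with the difference of two binomial coefficients, and with the recurrence relation.

```python
from itertools import combinations
from math import comb


def catalan_closed(n):
    """C_n = binom(2n, n) / (n + 1); the division is exact."""
    return comb(2 * n, n) // (n + 1)


def catalan_difference(n):
    """C_n as the difference of two binomial coefficients."""
    return comb(2 * n, n) - comb(2 * n, n + 1)


def catalan_recurrence(limit):
    """Return [C_0, ..., C_limit] using C_{m+1} = sum of C_k * C_{m-k}."""
    c = [1]
    for m in range(limit):
        c.append(sum(c[k] * c[m - k] for k in range(m + 1)))
    return c


def count_orders(n):
    """Brute force: orders of n A-votes and n B-votes with B never ahead."""
    total = 0
    for a_positions in combinations(range(2 * n), n):
        a_set = set(a_positions)
        lead = 0  # A-votes minus B-votes counted so far
        for i in range(2 * n):
            lead += 1 if i in a_set else -1
            if lead < 0:  # group B has moved ahead, so reject this order
                break
        else:
            total += 1
    return total


for n in range(7):
    print(n, catalan_closed(n), catalan_difference(n), count_orders(n))

print(catalan_recurrence(8))  # [1, 1, 2, 5, 14, 42, 132, 429, 1430]
```

The brute-force count examines every one of the $\binom{2n}{n}$ possible orders, so it is only practical for small n; here it serves as an independent check that the three formulas and the direct count agree.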