# Algorithms: Balancing

Balancing in algorithms refers to minimizing the complexity of an algorithm by
making sure that its constituent parts share the load efficiently. It is *not* a
technique for solving problems. Instead it helps us understand how an existing
solution may be optimized.

## The theory of balancing

Say there is a problem of size \(n\). The problem is such that it can be broken down into a sequence of smaller problems. There are many ways the problem can be broken down:

Solve the problem in 1 chunk: $$T(n) = f(n)$$

Or split the problem into 5 chunks, each of size \(n/5\): $$T(n) = f(5) + f(n/5)$$

Of course, the relations above are not unique. There are a multitude of ways that problems can be abstracted. But the question arises: what division of load is best?

Let’s assume we want to break down the problem into \(g(n)\) chunks. Then the size of the sub-problem becomes \(n/g(n)\). The time to solve the problem becomes: $$T(n) = f(g(n)) + f(n/g(n))$$

In big-O notation: $$O(T(n)) = O(f(g(n))) + O(f(n/g(n)))$$

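We need to minimize \(O(T(n))\). Notice that the sum will be dominated by whichever term is larger on the right-hand side. For an increasing \(f\), a larger \(g(n)\) grows the first term and shrinks the second, so making one term smaller only makes the other dominate. The bound is therefore smallest when \(f(g(n))\) and \(f(n/g(n))\) are within a constant factor of each other. Essentially, ignoring constant factors: $$f(g(n)) = f(n/g(n))$$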

Solving this for \(g(n)\) gives us the ideal size for partitioning the problem. For simplicity, assume that \(f(n) = n\), then: $$g(n) = {n \over g(n)}$$ $$\therefore g(n) = \sqrt{n}$$
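As a quick numerical check of the derivation above, here is a minimal Python sketch that brute-forces the best chunk count for the simple case \(f(n) = n\). The function names `total_cost` and `best_chunk_count` are made up for this illustration.

```python
import math


def total_cost(n: int, g: int) -> float:
    """Cost of splitting a size-n problem into g chunks when f(x) = x."""
    return g + n / g


def best_chunk_count(n: int) -> int:
    """Try every chunk count from 1 to n and return the cheapest one."""
    return min(range(1, n + 1), key=lambda g: total_cost(n, g))


# Compare the brute-force optimum with sqrt(n) for a few perfect squares.
for n in (100, 2_500, 10_000):
    print(n, best_chunk_count(n), math.isqrt(n))
```

For perfect squares the brute-force optimum coincides with \(\sqrt{n}\); for other values of \(n\) it lands on a neighbouring integer.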

## Example: The egg dropping problem

**Problem**: *Say you have 2 eggs and a building with \(n\) storeys. You
want to find the lowest storey from which a dropped egg will break.
What is the fastest way to figure it out?*

**Solution**: A trivial approach is to drop one egg from storeys
1 to \(n\), in order, until it breaks. This takes up to \(n\) attempts, which
is \(O(n)\). Good, but we can do better.

There are two eggs. Let’s drop the first egg every 5 storeys. If it breaks on the \(k^{th}\) attempt, we know that the ‘fatal’ storey is above \((k-1)\times 5\) and no higher than \(k \times 5\). The second egg is then dropped one storey at a time from \((k-1)\times 5 + 1\) to \(k \times 5 - 1\), which is fewer than 5 more drops. Therefore the total attempts would be at most: $${n \over 5} + 5$$

Which is less than \(n\) but in big-O notation has the same complexity: $$O({n \over 5} + 5) = O(n)$$

Using two eggs does reduce the total attempts, so let’s generalize the solution. Say the first egg is dropped every \(g(n)\) storeys, for at most \(n/g(n)\) attempts. Then, like before, the second egg will be dropped at most \(g(n)\) times. This gives the total attempts as: $${n \over g(n)} + g(n)$$

To minimize the total complexity (attempts) the two stages of the solution need to be equally partitioned so one stage does not dominate the other. $$\therefore {n \over g(n)} = g(n)$$ $$g(n) = \sqrt{n}$$

Thus if the first egg is dropped every \(\sqrt{n}\) storeys, it makes at most \(\sqrt{n}\) drops, and the second egg also makes at most \(\sqrt{n}\) drops, giving a total of: $$\sqrt{n} + \sqrt{n} = O(\sqrt{n}) \lt O(n)$$
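The two-stage strategy can be simulated to compare step sizes. The sketch below is only an illustration: `drops_needed` and `worst_case_drops` are hypothetical names, storeys are assumed to be numbered from 1, and `fatal = n + 1` is used to model an egg that never breaks.

```python
import math


def drops_needed(n: int, fatal: int, step: int) -> int:
    """Count the drops used when the first egg moves up `step` storeys at a time.

    The egg breaks from storey `fatal` and above; fatal = n + 1 means it
    survives every storey.
    """
    drops = 0
    safe = 0           # highest storey known to be safe
    broken_at = n + 1  # lowest storey known to break the egg, if any
    # Stage 1: first egg, coarse jumps.
    for floor in range(step, n + 1, step):
        drops += 1
        if floor >= fatal:
            broken_at = floor
            break
        safe = floor
    # Stage 2: walk the untested gap one storey at a time.
    for floor in range(safe + 1, broken_at):
        drops += 1
        if floor >= fatal:
            break
    return drops


def worst_case_drops(n: int, step: int) -> int:
    """Maximum number of drops over every possible fatal storey."""
    return max(drops_needed(n, fatal, step) for fatal in range(1, n + 2))


n = 100
for step in (1, 5, math.isqrt(n)):
    print(step, worst_case_drops(n, step))
```

For \(n = 100\) the worst case is 100 drops with a step of 1, 24 with a step of 5, and 19 with a step of \(\sqrt{n} = 10\).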

## Example: Graph colouring

**Problem:** *Colour a 3-Colourable graph in polynomial time with as few colours
as possible.*

**Solution:** A graph is said to be \(k\)-colourable if all vertices can be
assigned 1 of \(k\) colours without adjacent vertices having the same colour.
Deciding whether a graph is \(k\)-colourable is NP-complete for every
\(k \ge 3\) (1- and 2-colouring are easy). This means that no polynomial-time
algorithm for finding an optimal colouring is known, and none exists unless
P = NP. Colouring a 3-colourable graph with exactly 3 colours might be hard,
but we can attempt to use as few colours as possible in polynomial time.

One approach is called greedy colouring. We look at all vertices in a sequence. Each vertex is assigned the first “available” colour. A colour is “available” if it is not assigned to any of the vertex’s neighbours. A vertex has at most \(d\) neighbours in a graph of maximum degree \(d\), so at most \(d\) colours are ever unavailable and greedy colouring uses at most \(d+1\) colours.
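A minimal sketch of greedy colouring, assuming the graph is stored as a dictionary mapping each vertex to the set of its neighbours and colours are the integers 0, 1, 2, and so on. The name `greedy_colouring` and this representation are choices made for the example.

```python
from typing import Dict, Set


def greedy_colouring(graph: Dict[int, Set[int]]) -> Dict[int, int]:
    """Give each vertex the smallest colour not used by an already coloured neighbour."""
    colour: Dict[int, int] = {}
    for v in graph:  # vertices are visited in dictionary order
        used = {colour[u] for u in graph[v] if u in colour}
        c = 0
        while c in used:
            c += 1
        colour[v] = c
    return colour


# A cycle on 5 vertices: maximum degree 2, so at most 3 colours are used.
five_cycle = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
print(greedy_colouring(five_cycle))
```

The number of colours used depends on the order in which vertices are visited, but it never exceeds \(d+1\).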

The greedy approach, however, is not leveraging what we know about our graph: it is 3-colourable. In a valid 3-colouring a vertex takes one colour, so its neighbours share the other two, which means that every vertex’s neighbourhood is 2-colourable! 2-colouring is a simple problem. Essentially, do any traversal of the graph and give each newly reached vertex the opposite colour to the vertex it was reached from. We can use this to convert our problem into a sequence of 2- and greedy- colourings.
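The 2-colouring traversal can be sketched with a breadth-first search. This is one possible version, assuming the same dictionary-of-neighbour-sets representation; `two_colour` is a name invented here, and raising `ValueError` for a graph that is not 2-colourable is a design choice of the sketch.

```python
from collections import deque


def two_colour(graph):
    """Return {vertex: 0 or 1} for a 2-colourable graph, else raise ValueError."""
    colour = {}
    for start in graph:  # handle every connected component
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for u in graph[v]:
                if u not in colour:
                    colour[u] = 1 - colour[v]  # opposite colour to v
                    queue.append(u)
                elif colour[u] == colour[v]:
                    raise ValueError("graph is not 2-colourable")
    return colour


# A path on 4 vertices alternates between the two colours.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(two_colour(path))
```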

Here is how the new solution works. Let \(n\) be the number of vertices. While the graph has a vertex of degree at least \(g(n)\), pick such a vertex, 2-colour its neighbourhood with two fresh colours that are never used again, and delete the neighbourhood from the graph. Then greedily colour the remaining graph, again with fresh colours. The 2-colouring step will happen at most \(n/g(n)\) times (since we remove at least \(g(n)\) vertices each step), so it will use \(O(n/g(n))\) colours. After the 2-colouring steps, only vertices with degree less than \(g(n)\) remain, so greedy colouring takes \(O(g(n))\) colours. So the total number of colours will be: $$O({n \over g(n)}) + O(g(n))$$

Balancing both stages gives us: $${n \over g(n)} = g(n) \Rightarrow g(n) = \sqrt{n}$$

Therefore, 2-colouring the neighbourhoods of all vertices with degree at least
\(\sqrt{n}\) and greedy colouring the remaining vertices will take
\(O(\sqrt{n})\) colours. This is called *Wigderson’s Algorithm* after
(surprise!) Avi Wigderson.
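Putting the two stages together, the following is a rough sketch of the procedure described above, not a reference implementation. It assumes the input is a 3-colourable graph stored as a dictionary of neighbour sets, uses \(\lfloor\sqrt{n}\rfloor\) (at least 1) as the degree threshold, and the name `wigderson_colouring` is made up for the example.

```python
import math
from collections import deque


def wigderson_colouring(graph):
    """Colour a 3-colourable graph {vertex: set_of_neighbours}.

    Raises ValueError if some neighbourhood is not 2-colourable, which
    means the input was not 3-colourable.
    """
    remaining = {v: set(nbrs) for v, nbrs in graph.items()}  # working copy
    threshold = max(1, math.isqrt(len(graph)))  # g(n) = sqrt(n), at least 1
    colour = {}
    next_colour = 0  # smallest colour not used so far

    # Stage 1: 2-colour the neighbourhood of each high-degree vertex.
    while True:
        high = next((v for v in remaining if len(remaining[v]) >= threshold), None)
        if high is None:
            break
        nbrs = set(remaining[high])
        a, b = next_colour, next_colour + 1  # two fresh colours
        for start in nbrs:
            if start in colour:
                continue
            colour[start] = a
            queue = deque([start])
            while queue:
                v = queue.popleft()
                for u in remaining[v] & nbrs:  # stay inside the neighbourhood
                    if u not in colour:
                        colour[u] = b if colour[v] == a else a
                        queue.append(u)
                    elif colour[u] == colour[v]:
                        raise ValueError("input graph is not 3-colourable")
        next_colour += 2
        # Delete the coloured neighbourhood from the working graph.
        for v in nbrs:
            for u in remaining.pop(v):
                if u in remaining:
                    remaining[u].discard(v)

    # Stage 2: greedy colouring of what is left, using only fresh colours.
    for v in remaining:
        used = {colour[u] for u in remaining[v] if u in colour}
        c = next_colour
        while c in used:
            c += 1
        colour[v] = c
    return colour


# Octahedron: vertex v is adjacent to everything except itself and v + 3.
octahedron = {v: {u for u in range(6) if u not in (v, (v + 3) % 6)} for v in range(6)}
print(wigderson_colouring(octahedron))
```

The high-degree vertex itself is left in the graph with no remaining neighbours, so it is picked up by the greedy stage.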

Balancing may not apply to all approaches. Nonetheless it is a powerful tool for analysis of algorithms.

*This article was written from my notes of Dr. Jeremy
Spinrad’s excellent
lecture on balancing.*