Submodular optimization has been receiving a lot of attention in machine learning recently. It is a nice way to generalize convex optimization to sets, and it has the appealing property that a very simple greedy algorithm is: 1) pretty close to optimal, and 2) pretty much the best one can do.

If $\Omega$ is a set, a set function $f: 2^{\Omega} \rightarrow R$ is submodular if it satisfies one of the following equivalent conditions:

1. If $S, T \subseteq \Omega$, then $f(S) + f(T) \geq f(S \cup T) + f(S \cap T)$.

2. If $S \subseteq T$ and $x \in \Omega \setminus T$, then $f(T \cup \{x\}) - f(T) \leq f(S \cup \{x\}) - f(S)$.

These conditions capture the intuition of diminishing returns: adding an element to a smaller set helps at least as much as adding it to a larger one.
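To make this concrete, here is a small sketch (a toy example of my own, not from any library) that brute-force checks condition 2 for the function $f(S) = \sqrt{|S|}$, a concave function of the set's size, over a tiny universe:

```python
from itertools import chain, combinations
from math import sqrt, isclose

# Toy submodular function for illustration: a concave function of |S|.
def f(S):
    return sqrt(len(S))

universe = {1, 2, 3, 4}

def subsets(xs):
    xs = list(xs)
    return chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))

# Brute-force check of condition 2: for every S <= T and x outside T,
# the marginal gain of x with respect to S is at least the gain with respect to T.
for T in map(set, subsets(universe)):
    for S in map(set, subsets(T)):
        for x in universe - T:
            gain_S = f(S | {x}) - f(S)
            gain_T = f(T | {x}) - f(T)
            assert gain_S >= gain_T or isclose(gain_S, gain_T)
print("diminishing returns holds on this toy example")
```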

We call a submodular function monotone if $\forall S \subset T$, $f(S) \leq f(T)$. Monotone submodular maximization includes problems such as Max Cover: given a collection of sets and an integer $k$, find the $k$ sets whose union is largest. Max Cover is NP-hard. What’s very nice is that any monotone submodular maximization problem under a cardinality constraint can be approximated to within a factor of $1 - 1/e$ by a simple Greedy algorithm.
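For concreteness, here is a tiny sketch of the Max Cover objective; the collection and the names below are made up for illustration. The objective $f(S) = |\bigcup_{A \in S} A|$ is monotone (adding a set never shrinks the union) and submodular (a new set contributes fewer *new* elements once more is already covered):

```python
# A made-up collection of sets for a toy Max Cover instance.
collection = {
    "A": {1, 2, 3},
    "B": {3, 4},
    "C": {4, 5, 6},
    "D": {1, 6},
}

def coverage(chosen):
    """Number of elements covered by the union of the chosen set names."""
    covered = set()
    for name in chosen:
        covered |= collection[name]
    return len(covered)

print(coverage({"A"}), coverage({"A", "C"}))  # 3, then 6
```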

The Greedy algorithm works in iterations. In the first round, find the single element $x_1 \in \Omega$ that maximizes $f(\{x_1\})$. Next, given $x_1$, find $x_2$ to maximize $f(\{x_1, x_2\})$, and so on, until $k$ elements have been picked. It’s actually pretty easy to show by induction that this gives a $1 - 1/e$ approximation to the optimal solution for any fixed $k$.
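Here is a minimal sketch of that Greedy loop, run on the toy Max Cover instance from above (the `collection` and `coverage` definitions are repeated so the snippet runs on its own; the function names are my own):

```python
# Same made-up Max Cover instance as in the previous sketch.
collection = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6}, "D": {1, 6}}

def coverage(chosen):
    """Number of elements covered by the union of the chosen set names."""
    return len(set().union(*(collection[name] for name in chosen)))

def greedy(ground_set, f, k):
    """Greedily pick k elements, each time taking the largest marginal gain."""
    chosen = set()
    for _ in range(k):
        best = max(ground_set - chosen,
                   key=lambda x: f(chosen | {x}) - f(chosen))
        chosen.add(best)
    return chosen

print(greedy(set(collection), coverage, 2))  # picks A and C, covering all 6 elements
```

Each round costs one pass over the remaining elements, so the whole thing is $O(k \cdot |\Omega|)$ evaluations of $f$.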