## Determinantal Point Process.

Review of 3 papers from Kulesza and Taskar on DPP.

Determinantal Point Process$P(A \subseteq Y)=det(K_{A})$, where Y is a discrete set of points, and K is a kernel on Y.

The benefit of DPP is that its partition function can be computed in closed form, as well as marginals and conditionals. To sample from a DPP, we need to use the eigen-decomposition of K. It captures repulsion between points to enforce diversity.

The 3 papers of Taskar looks at a few variations on this theme:

1. Structured DPP. K can be decomposed as $K_{ij}=q(y_{i})q(y_{j})\phi(y_{i})^{t}\phi(y_{j})$, where we think of q as the (scalar) quality of a point and $\phi$ as a feature vector. The structure comes in where we can have $y=\{y_i\}$ and decompose $q(y)=\Pi q(y_i)$ and $\phi(y)=\sum \phi(y_i)$. Applied it to estimate pose (odd choice). Not sure the details.
2. k-DPP. The regular DPP can generate subsets of points of various sizes. k-DPP basically constrains it to generate exactly k points. Applied to find sets of diverse images. The learning is over a convex sum of different kernels (i.e. learn the weights on a training set).
3. Learning. Treat the quality $q(y)=exp(\theta^{t}f(y))$, and optimize over $\theta$ from training set. Applied to text summarization.