


A line of recent work has analyzed the behavior of the Expectation-Maximization (EM) algorithm in the well-specified setting, in which the population likelihood is locally strongly concave around its maximizing argument. Examples include suitably separated Gaussian mixture models and mixtures of linear regressions. We consider over-specified settings in which the number of fitted components is larger than the number of components in the true distribution. Such mis-specified settings can lead to singularity in the Fisher information matrix, and moreover, the maximum likelihood estimator based on $n$ i.i.d. samples in $d$ dimensions can have a nonstandard $\mathcal{O}((d/n)^{1/4})$ accuracy.
