
Semi-Supervised Gaussian Mixture Model Learning

Below are the complete steps of the EM algorithm from the paper.

1. Joint Log-Likelihood Function:

\[L(\Theta) = \sum_{x_i \in \mathcal{X}_u} \log \sum_{l=1}^L \alpha_l p(x_i | \theta_l) + \sum_{x_i \in \mathcal{X}_l} \log \sum_{l=1}^L \alpha_l p(x_i | \theta_l) \beta_{c_i | l}\]

where \(\Theta = \{\alpha_1, ..., \alpha_L, \theta_1, ..., \theta_L\}\) denotes the model parameters, subject to \(\sum_{l=1}^L \alpha_l = 1\), and \(p(x_i | \theta_l)\) is the probability density function of the \(l\)-th component.

\(\theta_l = \{\mu_l, \Sigma_l\}\) are the parameters of the \(l\)-th component, where \(\mu_l\) is the mean vector and \(\Sigma_l\) is the covariance matrix.

\(\beta_{c_i \mid l} = P(c_i \mid l) = P(c_i \mid x_i, m_i = l)\) is the posterior probability of class \(c_i\) given that \(x_i\) was generated by component \(l\) (with \(m_i\) denoting the component membership of \(x_i\)).
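As a concrete illustration, here is a minimal NumPy/SciPy sketch of this log-likelihood. The array layout (`X_u` unlabeled data, `X_l`/`y_l` labeled data and labels, `alpha`, `mu`, `Sigma`, `beta`) and the function name `joint_log_likelihood` are assumptions chosen for this post, not notation from the paper:

```python
import numpy as np
from scipy.stats import multivariate_normal

def joint_log_likelihood(X_u, X_l, y_l, alpha, mu, Sigma, beta):
    """Assumed shapes: X_u (N_u, d), X_l (N_l, d), y_l (N_l,) ints in {0..K-1},
    alpha (L,), mu (L, d), Sigma (L, d, d), beta (L, K) with beta[l, k] = P(k | l)."""
    L = len(alpha)
    # Component densities p(x_i | theta_l), stacked column-wise as (N, L).
    dens_u = np.stack([multivariate_normal.pdf(X_u, mu[l], Sigma[l]) for l in range(L)], axis=1)
    dens_l = np.stack([multivariate_normal.pdf(X_l, mu[l], Sigma[l]) for l in range(L)], axis=1)
    # Unlabeled term: sum_i log sum_l alpha_l p(x_i | theta_l).
    ll_u = np.log(dens_u @ alpha).sum()
    # Labeled term additionally weights each component by beta_{c_i | l}.
    ll_l = np.log((dens_l * beta[:, y_l].T) @ alpha).sum()
    return ll_u + ll_l
```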

2. E-Step (Expectation Step):

Compute the posterior probability that each data point belongs to each component. For unlabeled data \(x_i \in \mathcal{X}_u\), the posterior is:

\[P(l | x_i, \theta^t) = \frac{\alpha_l^t p(x_i | \theta_l^t)}{\sum_{j=1}^L \alpha_j^t p(x_i | \theta_j^t)}, \quad x_i \in \mathcal{X}_u\]

For labeled data \(x_i \in \mathcal{X}_l\), the posterior is:

\[P(l | x_i, c_i, \theta^t) = \frac{\alpha_l^t p(x_i | \theta_l^t) \beta_{c_i | l}^t}{\sum_{j=1}^L \alpha_j^t p(x_i | \theta_j^t) \beta_{c_i | j}^t}, \quad x_i \in \mathcal{X}_l\]
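A minimal sketch of both E-step posteriors, reusing the imports and array layout assumed in the log-likelihood sketch above; the helper name `e_step` is likewise hypothetical:

```python
def e_step(X_u, X_l, y_l, alpha, mu, Sigma, beta):
    """Return responsibilities R_u (N_u, L) and R_l (N_l, L)."""
    L = len(alpha)
    dens_u = np.stack([multivariate_normal.pdf(X_u, mu[l], Sigma[l]) for l in range(L)], axis=1)
    dens_l = np.stack([multivariate_normal.pdf(X_l, mu[l], Sigma[l]) for l in range(L)], axis=1)
    # Unlabeled: P(l | x_i) proportional to alpha_l p(x_i | theta_l).
    R_u = alpha * dens_u
    R_u /= R_u.sum(axis=1, keepdims=True)
    # Labeled: P(l | x_i, c_i) proportional to alpha_l p(x_i | theta_l) beta_{c_i | l}.
    R_l = alpha * dens_l * beta[:, y_l].T
    R_l /= R_l.sum(axis=1, keepdims=True)
    return R_u, R_l
```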

3. M-Step (Maximization Step):

Update the model parameters using the posteriors computed in the E-step:

\[\mu_l^{t+1} = \frac{1}{N \alpha_l^{t+1}} \left( \sum_{x_i \in \mathcal{X}_l} x_i P(l | x_i, c_i, \theta^t) + \sum_{x_i \in \mathcal{X}_u} x_i P(l | x_i, \theta^t) \right)\]

\[\Sigma_l^{t+1} = \frac{1}{N \alpha_l^{t+1}} \left( \sum_{x_i \in \mathcal{X}_l} M_{il}^t P(l | x_i, c_i, \theta^t) + \sum_{x_i \in \mathcal{X}_u} M_{il}^t P(l | x_i, \theta^t) \right)\]

\[\alpha_l^{t+1} = \frac{1}{N} \left( \sum_{x_i \in \mathcal{X}_l} P(l | x_i, c_i, \theta^t) + \sum_{x_i \in \mathcal{X}_u} P(l | x_i, \theta^t) \right)\]

\[\beta_{k|l}^{t+1} = \frac{\sum_{x_i \in \mathcal{X}_l, c_i = k} P(l | x_i, c_i, \theta^t)}{\sum_{x_i \in \mathcal{X}_l} P(l | x_i, c_i, \theta^t)}\]

where \(M_{il}^t \equiv (x_i - \mu_l^t)(x_i - \mu_l^t)^T\). Note that \(N \alpha_l^{t+1}\) is exactly the total posterior mass assigned to component \(l\), so \(\alpha_l^{t+1}\) is computed before the mean and covariance updates.
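A minimal sketch of the M-step, consuming the responsibilities `R_u`, `R_l` from the E-step sketch above. The argument `mu_old` stands for \(\mu_l^t\), so the covariance update matches \(M_{il}^t\) exactly; all names are illustrative:

```python
def m_step(X_u, X_l, y_l, R_u, R_l, mu_old, K):
    """Return updated (alpha, mu, Sigma, beta) for the next iteration."""
    N = len(X_u) + len(X_l)
    L, d = mu_old.shape
    # N * alpha_l^{t+1}: the total posterior mass assigned to component l.
    mass = R_u.sum(axis=0) + R_l.sum(axis=0)            # (L,)
    alpha = mass / N
    # Mean update: responsibility-weighted average over labeled + unlabeled data.
    mu = (R_l.T @ X_l + R_u.T @ X_u) / mass[:, None]    # (L, d)
    # Covariance update with M_il^t = (x_i - mu_l^t)(x_i - mu_l^t)^T,
    # i.e. centered on the previous means, matching the formulas above.
    Sigma = np.zeros((L, d, d))
    for l in range(L):
        dl = X_l - mu_old[l]
        du = X_u - mu_old[l]
        Sigma[l] = (dl.T @ (dl * R_l[:, [l]]) + du.T @ (du * R_u[:, [l]])) / mass[l]
    # beta_{k|l}: class frequencies within each component, labeled data only.
    onehot = np.eye(K)[y_l]                              # (N_l, K)
    beta = (R_l.T @ onehot) / R_l.sum(axis=0)[:, None]   # (L, K)
    return alpha, mu, Sigma, beta
```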

Summary:

The EM algorithm estimates the parameters by alternating between the E-step and the M-step. In the E-step, the posterior probability that each data point belongs to each component is computed from the current parameters; in the M-step, the parameters are re-estimated from those posteriors. The iteration continues until convergence or until a maximum number of iterations is reached.
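Putting it together, a minimal sketch of the outer loop wiring the hypothetical helpers above, stopping when the log-likelihood gain falls below a tolerance or the iteration budget is exhausted:

```python
def fit_em(X_u, X_l, y_l, alpha, mu, Sigma, beta, K, max_iter=100, tol=1e-6):
    prev_ll = -np.inf
    for _ in range(max_iter):
        # E-step: responsibilities under the current parameters theta^t.
        R_u, R_l = e_step(X_u, X_l, y_l, alpha, mu, Sigma, beta)
        # M-step: re-estimate parameters; the current means are passed as mu_l^t.
        alpha, mu, Sigma, beta = m_step(X_u, X_l, y_l, R_u, R_l, mu, K)
        ll = joint_log_likelihood(X_u, X_l, y_l, alpha, mu, Sigma, beta)
        # EM never decreases the likelihood, so a tiny gain signals convergence.
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return alpha, mu, Sigma, beta
```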
