Linear Discriminant Analysis (LDA) is a classic linear classification method. Note that machine learning also has Latent Dirichlet Allocation, a topic-modeling method used in NLP, which is likewise abbreviated LDA; take care to distinguish the two. Unlike the PCA dimensionality reduction discussed in the previous lecture, which maximizes variance, the basic idea of LDA is to project the data into a lower-dimensional space such that samples of the same class end up as close together as possible and samples of different classes as far apart as possible. LDA is therefore a supervised linear classification algorithm.
LDA: Principle and Derivation
The figure below illustrates the intuitive difference between PCA and LDA.
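The contrast can also be made concrete with a small numpy experiment (an illustrative sketch; the cluster shapes and locations below are made up): on two clusters that are elongated along one axis but separated along another, the maximum-variance (PCA) direction and the class-separating (LDA) direction point very differently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes elongated along x but separated along y
cov = [[5.0, 0.0], [0.0, 0.2]]
X0 = rng.multivariate_normal([0.0, 0.0], cov, size=200)
X1 = rng.multivariate_normal([0.0, 2.0], cov, size=200)
X = np.vstack([X0, X1])

# PCA direction: top eigenvector of the pooled covariance (maximum variance)
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc / len(X))
pca_dir = eigvecs[:, np.argmax(eigvals)]

# LDA direction: Sw^{-1} (u0 - u1), with Sw the within-class scatter
Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
lda_dir = np.linalg.inv(Sw) @ (X0.mean(axis=0) - X1.mean(axis=0))
lda_dir /= np.linalg.norm(lda_dir)

print("PCA direction:", pca_dir)  # should lie close to the high-variance x-axis
print("LDA direction:", lda_dir)  # should lie close to the class-separating y-axis
```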
For the two-class case, let $\mu_0, \mu_1$ and $\Sigma_0, \Sigma_1$ denote the means and covariance matrices of the two classes. Define the within-class scatter matrix $S_w = \Sigma_0 + \Sigma_1$ and the between-class scatter matrix $S_b = (\mu_0 - \mu_1)(\mu_0 - \mu_1)^T$. LDA seeks a projection direction $w$ that maximizes the Fisher criterion:

$$J(w) = \frac{w^T S_b w}{w^T S_w w}$$

Setting the derivative of $J(w)$ with respect to $w$ to zero yields the generalized eigenvalue problem $S_b w = \lambda S_w w$. Substituting $S_b = (\mu_0 - \mu_1)(\mu_0 - \mu_1)^T$ into this, and noting that $(\mu_0 - \mu_1)^T w$ is a scalar that only rescales $w$, we obtain:

$$w \propto S_w^{-1}(\mu_0 - \mu_1)$$

Considering the numerical stability of the matrix solution, we can compute the inverse of $S_w$ via its singular value decomposition:

$$S_w = U \Sigma V^T$$

Finally, taking the inverse gives $S_w^{-1} = V \Sigma^{-1} U^T$, from which $w$ follows.
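The SVD-based inversion can be checked numerically (a small sanity-check sketch; the matrix below is arbitrary, built to be symmetric positive-definite like a within-class scatter matrix):

```python
import numpy as np

rng = np.random.default_rng(1)

# Build a symmetric positive-definite matrix, as Sw would be
A = rng.standard_normal((4, 4))
Sw = A @ A.T + 4 * np.eye(4)

# np.linalg.svd returns U, the singular values S as a 1-D array, and Vh = V^T
U, S, Vh = np.linalg.svd(Sw)

# Inverse via SVD: Sw^{-1} = V diag(1/S) U^T
Sw_inv = Vh.T @ np.diag(1.0 / S) @ U.T

# Agrees with the direct inverse
print(np.allclose(Sw_inv, np.linalg.inv(Sw)))  # True
```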
Based on the derivation above, the complete LDA algorithm flow can be summarized as:

1. Group the data by class.
2. Compute the mean and covariance matrix of each class.
3. Compute the within-class scatter matrix $S_w$.
4. Compute the mean difference $\mu_0 - \mu_1$.
5. Compute the inverse of the within-class scatter matrix via SVD.
6. Compute $w = S_w^{-1}(\mu_0 - \mu_1)$.
7. Compute the projected data points $X' = Xw$.
Readers may also consider extending LDA to the multiclass case on their own; the derivation is not expanded here.
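As a hint for that extension: in the multiclass case, the single direction $w$ is replaced by the top eigenvectors of $S_w^{-1} S_b$, where $S_b$ sums the scatter of each class mean around the global mean. A minimal numpy sketch (illustrative only; the function name, class layout, and toy data are made up, not from the original text):

```python
import numpy as np

def lda_multiclass(X, y, n_components):
    """Return a projection matrix W (n_features x n_components) for multiclass LDA."""
    classes = np.unique(y)
    n_features = X.shape[1]
    mean_all = X.mean(axis=0)

    Sw = np.zeros((n_features, n_features))  # within-class scatter
    Sb = np.zeros((n_features, n_features))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        Sw += (Xc - mean_c).T @ (Xc - mean_c)
        diff = (mean_c - mean_all).reshape(-1, 1)
        Sb += len(Xc) * (diff @ diff.T)

    # Eigenvectors of Sw^{-1} Sb, sorted by decreasing eigenvalue
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs[:, order[:n_components]].real

# Toy example: three Gaussian classes in 3-D, projected down to 2-D
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(loc=m, scale=0.3, size=(50, 3))
               for m in ([0, 0, 0], [3, 0, 0], [0, 3, 0])])
y = np.repeat([0, 1, 2], 50)
W = lda_multiclass(X, y, n_components=2)
print(W.shape)  # (3, 2)
```

Note that with $C$ classes, $S_b$ has rank at most $C - 1$, so LDA can produce at most $C - 1$ useful discriminant directions.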
A Basic LDA Implementation
Following the LDA algorithm flow above, we can implement a simple LDA model based on numpy. The key steps include computing the per-class means and covariances, the within-class scatter matrix, and the SVD-based inversion. The implementation is as follows:
    import numpy as np

    class LDA():
        def __init__(self):
            # Projection direction, learned by fit()
            self.w = None

        def calculate_covariance_matrix(self, X, Y=None):
            # Compute the covariance matrix
            m = X.shape[0]
            X = X - np.mean(X, axis=0)
            Y = X if Y is None else Y - np.mean(Y, axis=0)
            return 1 / m * np.matmul(X.T, Y)

        # Project the data onto w
        def transform(self, X, y):
            self.fit(X, y)
            X_transform = X.dot(self.w)
            return X_transform

        # LDA fitting procedure
        def fit(self, X, y):
            # Split data by class
            X0 = X[y == 0]
            X1 = X[y == 1]

            # Covariance matrix of each class
            sigma0 = self.calculate_covariance_matrix(X0)
            sigma1 = self.calculate_covariance_matrix(X1)
            # Within-class scatter matrix
            Sw = sigma0 + sigma1

            # Class means and their difference
            u0, u1 = X0.mean(0), X1.mean(0)
            mean_diff = np.atleast_1d(u0 - u1)

            # SVD of the within-class scatter matrix
            # (np.linalg.svd returns the singular values S as a 1-D array and V as V^T)
            U, S, V = np.linalg.svd(Sw)
            # Inverse of the within-class scatter matrix via SVD
            Sw_ = np.dot(np.dot(V.T, np.linalg.pinv(np.diag(S))), U.T)
            # Compute w = Sw^{-1}(u0 - u1)
            self.w = Sw_.dot(mean_diff)

        # LDA classification
        def predict(self, X):
            # Classify by the sign of the projection onto w
            y_pred = []
            for sample in X:
                h = sample.dot(self.w)
                y = 1 * (h < 0)
                y_pred.append(y)
            return y_pred
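To see the pipeline end to end, here is a quick experiment on synthetic data (a self-contained sketch that repeats the fit logic in functional form; the cluster locations and scales are made up):

```python
import numpy as np

rng = np.random.default_rng(3)

# Two well-separated Gaussian classes in 2-D
X0 = rng.normal(loc=[-2, -2], scale=0.5, size=(100, 2))
X1 = rng.normal(loc=[2, 2], scale=0.5, size=(100, 2))
X = np.vstack([X0, X1])
y = np.repeat([0, 1], 100)

# Fit: w = Sw^{-1}(u0 - u1), with Sw the within-class scatter matrix
Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)
U, S, Vh = np.linalg.svd(Sw)
Sw_inv = Vh.T @ np.diag(1.0 / S) @ U.T
w = Sw_inv @ (X0.mean(axis=0) - X1.mean(axis=0))

# Predict: class 1 when the projection onto w is negative
y_pred = (X @ w < 0).astype(int)
accuracy = (y_pred == y).mean()
print(accuracy)  # should be close to 1.0 on such well-separated clusters
```

Note that the zero threshold in `predict` implicitly assumes the two classes sit on opposite sides of the origin along w; a more robust rule thresholds at the midpoint of the two projected class means.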
This article only presents the derivation and basic implementation of two-class LDA; readers can extend it to the multiclass case themselves, which we do not expand on here. scikit-learn provides LDA via the sklearn.discriminant_analysis.LinearDiscriminantAnalysis interface (older versions exposed it as sklearn.lda.LDA, which has since been removed); in practice you can simply call that interface.
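For example (assuming scikit-learn is installed; the toy data below is made up):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(loc=[-2, -2], scale=0.5, size=(100, 2)),
               rng.normal(loc=[2, 2], scale=0.5, size=(100, 2))])
y = np.repeat([0, 1], 100)

# Fit and score the built-in LDA classifier
clf = LinearDiscriminantAnalysis()
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy; should be close to 1.0 here
```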
References:

Zhou Zhihua. Machine Learning (机器学习).