OCSVM Study Notes
Preface
OCSVM (One-Class SVM) is a classic anomaly detection algorithm. Its underlying principle resembles that of the SVM, but where the SVM tackles binary classification, OCSVM, true to its name, models only a single class. This is exactly what anomaly detection calls for: we do not try to characterize the anomalous points themselves.
If you are not familiar with SVM, see: Support Vector Machine (SVM) Study Notes
OCSVM Overview
Our strategy is to map the data into the feature space corresponding to the kernel, and to separate them from the origin with maximum margin.
In other words, the basic idea of OCSVM is to map the data into the feature space corresponding to the kernel, construct a hyperplane separating the data from the origin, and maximize the distance from that hyperplane to the origin.
Analogously to the standard SVM, we can write down the objective function and constraints of the optimization problem:
$$
\begin{aligned}
&\min_{w \in F,\ \xi \in R^n,\ \rho \in R} \frac{1}{2}\Vert w\Vert^2 + \frac{1}{\nu n}\sum_{i=1}^n \xi_i - \rho \\
&\text{s.t.} \quad (w \cdot \Phi(x_i)) \geq \rho - \xi_i, \quad \xi_i \geq 0
\end{aligned}
$$
where $\xi_i$ are slack variables and $\nu \in (0, 1)$ controls the amount of slack; by the ν-property from the original paper, $\nu$ upper-bounds the fraction of training points treated as outliers and lower-bounds the fraction of support vectors. $F$ is the feature space induced by the kernel:
$$
k(x, y) = (\Phi(x) \cdot \Phi(y))
$$
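The feature map $\Phi$ never needs to be evaluated explicitly; only the kernel does. A common concrete choice, and the one used in the sklearn examples below, is the Gaussian (RBF) kernel:

$$
k(x, y) = \exp\left(-\gamma \Vert x - y \Vert^2\right)
$$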
With the objective and constraints in place, we proceed as with the standard SVM: write down the Lagrangian and derive the dual problem.
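Concretely, following the derivation in the original paper, introducing multipliers $\alpha_i, \beta_i \geq 0$ for the two constraints gives the Lagrangian

$$
L(w, \xi, \rho) = \frac{1}{2}\Vert w\Vert^2 + \frac{1}{\nu n}\sum_{i=1}^n \xi_i - \rho - \sum_{i=1}^n \alpha_i \left((w \cdot \Phi(x_i)) - \rho + \xi_i\right) - \sum_{i=1}^n \beta_i \xi_i
$$

Setting the derivatives with respect to $w$, $\xi_i$, and $\rho$ to zero yields

$$
w = \sum_{i=1}^n \alpha_i \Phi(x_i), \qquad \alpha_i = \frac{1}{\nu n} - \beta_i \leq \frac{1}{\nu n}, \qquad \sum_{i=1}^n \alpha_i = 1
$$

Substituting these back eliminates $w$, $\xi$, and $\rho$ and leaves the dual problem: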
$$
\begin{aligned}
&\min_{\alpha} \frac{1}{2} \sum_{ij} \alpha_i \alpha_j k(x_i, x_j) \\
&\text{s.t.} \quad 0 \leq \alpha_i \leq \frac{1}{\nu n}, \quad \sum_i \alpha_i = 1
\end{aligned}
$$
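To make the dual concrete, here is a minimal sketch that solves it numerically with `scipy.optimize.minimize` on toy data and recovers $\rho$ from a margin support vector (a point with $0 < \alpha_i < \frac{1}{\nu n}$, for which $\rho = \sum_j \alpha_j k(x_j, x_i)$). The data and parameter values are arbitrary and this is for illustration only; sklearn's implementation, shown below, is what you would use in practice.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))            # toy training data (hypothetical)
n, nu, gamma = len(X), 0.2, 0.5

# RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-gamma * sq_dists)

# Dual: min (1/2) a^T K a   s.t.   0 <= a_i <= 1/(nu*n),   sum_i a_i = 1
res = minimize(
    fun=lambda a: 0.5 * a @ K @ a,
    x0=np.full(n, 1.0 / n),             # feasible starting point
    jac=lambda a: K @ a,                # gradient of the quadratic objective
    bounds=[(0.0, 1.0 / (nu * n))] * n,
    constraints=[{"type": "eq", "fun": lambda a: a.sum() - 1.0}],
)
alpha = res.x

# Recover rho from the margin support vectors (alphas strictly inside the box);
# assumes at least one such point exists, which is typical but not guaranteed
margin = (alpha > 1e-6) & (alpha < 1.0 / (nu * n) - 1e-6)
rho = (K[margin] @ alpha).mean()

# f(x) = sgn(sum_i alpha_i k(x_i, x) - rho), here evaluated on the training points
print(np.sign(K @ alpha - rho))         # +1 inside the learned region, -1 outside
```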
Since $w = \sum_i \alpha_i \Phi(x_i)$, the decision function $f(x)$ can be written entirely in terms of the kernel:
$$
f(x) = \operatorname{sgn}\left((w \cdot \Phi(x)) - \rho\right) = \operatorname{sgn}\left(\sum_{i=1}^{n} \alpha_i k(x_i, x) - \rho\right)
$$
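A fitted sklearn `OneClassSVM` exposes exactly these pieces: `support_vectors_`, `dual_coef_` (the coefficients of the support vectors, in libsvm's scaling, so they need not sum to 1) and `intercept_` (which plays the role of $-\rho$), so $f(x)$ can be rebuilt by hand. A small sanity check on made-up data:

```python
import numpy as np
from sklearn.svm import OneClassSVM
from sklearn.metrics.pairwise import rbf_kernel

X = np.random.randn(50, 2)              # made-up data
clf = OneClassSVM(nu=0.1, kernel='rbf', gamma=0.5).fit(X)

# Rebuild f(x) before the sign: sum_i alpha_i k(x_i, x) - rho
K = rbf_kernel(X, clf.support_vectors_, gamma=0.5)
manual = K @ clf.dual_coef_.ravel() + clf.intercept_
assert np.allclose(manual, clf.decision_function(X))
print(np.sign(manual))                  # matches clf.predict(X)
```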
Algorithm Examples
We can use OCSVM with minimal effort through sklearn:
```python
>>> from sklearn.svm import OneClassSVM
>>> X = [[0], [0.44], [0.45], [0.46], [1]]
>>> clf = OneClassSVM(gamma='auto').fit(X)
>>> clf.predict(X)
array([-1,  1,  1,  1, -1])
>>> clf.score_samples(X)
array([1.7798..., 2.0547..., 2.0556..., 2.0561..., 1.7332...])
```
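The `nu` argument is the $\nu$ of the formulation above; by the ν-property it bounds from above the fraction of training points flagged as outliers. A quick empirical check on arbitrary synthetic data (the values here are illustrative, not from the sklearn docs):

```python
import numpy as np
from sklearn.svm import OneClassSVM

X = np.random.randn(1000, 2)            # arbitrary synthetic data
for nu in (0.05, 0.2, 0.5):
    # fraction of training points predicted as outliers (-1)
    frac = (OneClassSVM(nu=nu, gamma='auto').fit_predict(X) == -1).mean()
    print(f"nu={nu}: flagged fraction = {frac:.3f}")  # stays close to nu
```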
Below is a simple example from the sklearn documentation:
```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.font_manager
from sklearn import svm

# Grid on which to evaluate the decision function for plotting
xx, yy = np.meshgrid(np.linspace(-5, 5, 500), np.linspace(-5, 5, 500))

# Generate training data: two Gaussian blobs around (2, 2) and (-2, -2)
X = 0.3 * np.random.randn(100, 2)
X_train = np.r_[X + 2, X - 2]
# Generate some regular novel observations from the same distribution
X = 0.3 * np.random.randn(20, 2)
X_test = np.r_[X + 2, X - 2]
# Generate some abnormal novel observations
X_outliers = np.random.uniform(low=-4, high=4, size=(20, 2))

# Fit the model and predict on each set
clf = svm.OneClassSVM(nu=0.1, kernel='rbf', gamma=0.1)
clf.fit(X_train)
y_pred_train = clf.predict(X_train)
y_pred_test = clf.predict(X_test)
y_pred_outliers = clf.predict(X_outliers)
n_error_train = y_pred_train[y_pred_train == -1].size
n_error_test = y_pred_test[y_pred_test == -1].size
n_error_outlier = y_pred_outliers[y_pred_outliers == 1].size

# Evaluate the decision function on the grid to draw the learned frontier
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])
Z = Z.reshape(xx.shape)

plt.title("Novelty Detection")
plt.contourf(xx, yy, Z, levels=np.linspace(Z.min(), 0, 7), cmap=plt.cm.PuBu)
a = plt.contour(xx, yy, Z, levels=[0, Z.max()], colors='palevioletred')

s = 40
b1 = plt.scatter(X_train[:, 0], X_train[:, 1], c='white', s=s, edgecolors='k')
b2 = plt.scatter(X_test[:, 0], X_test[:, 1], c='blueviolet', s=s, edgecolors='k')
c = plt.scatter(X_outliers[:, 0], X_outliers[:, 1], c='gold', s=s, edgecolors='k')
plt.axis('tight')
plt.xlim((-5, 5))
plt.ylim((-5, 5))
plt.legend([a.collections[0], b1, b2, c],
           ["learned frontier", "training observations",
            "new regular observations", "new abnormal observations"],
           loc="upper left",
           prop=matplotlib.font_manager.FontProperties(size=11))
plt.xlabel(
    "error train: %d/200; errors novel regular: %d/40; "
    "errors novel abnormal: %d/20"
    % (n_error_train, n_error_test, n_error_outlier))
plt.show()
```
References
B. Schölkopf, R. Williamson, A. Smola, J. Shawe-Taylor, and J. Platt, “Support vector method for novelty detection,” in Advances in Neural Information Processing Systems, vol. 12. Denver, CO, USA: MIT Press, Aug./Sep. 1999, pp. 582–588.