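In multivariate statistics and data clustering, spectral clustering uses the spectrum of the data's similarity matrix (its eigenvalues, together with the corresponding eigenvectors) to reduce the dimensionality of the data, and then runs k-means in the reduced space. The similarity matrix measures how similar every pair of points in the dataset is.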
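To make the two steps concrete, here is a minimal sketch of the idea on toy data, not the implementation scikit-learn uses. The two Gaussian blobs, the Gaussian-kernel similarity, and the choice of the symmetric normalized Laplacian are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Toy data (assumed for illustration): two well-separated 2-D blobs of 20 points each.
points = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.3, size=(20, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.3, size=(20, 2)),
])

# Similarity matrix: Gaussian kernel on pairwise squared distances.
sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
affinity = np.exp(-sq_dists / 2.0)

# Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
degree = affinity.sum(axis=1)
d_inv_sqrt = 1.0 / np.sqrt(degree)
laplacian = np.eye(len(points)) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]

# The eigenvectors of the k smallest eigenvalues give the low-dimensional embedding.
n_clusters = 2
eigenvalues, eigenvectors = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
embedding = eigenvectors[:, :n_clusters]
# Scale every row to unit length before clustering.
embedding = embedding / np.linalg.norm(embedding, axis=1, keepdims=True)

# k-means in the reduced space.
toy_labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(embedding)
```

Each 40-dimensional row of the similarity matrix is replaced by a 2-dimensional row of the embedding, and k-means only ever sees the latter.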
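This example generates an image of connected circles and uses spectral clustering to separate them. In this setting, spectral clustering solves a normalized graph cut problem: the image is treated as a graph of connected voxels, and the algorithm amounts to choosing cuts that define regions while minimizing the ratio of the gradient along the cut to the volume of the region. Because the algorithm tries to balance the sizes of the regions, the segmentation fails if the circles have different sizes.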
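The preference for balanced regions can be seen on a tiny graph. The sketch below computes the normalized cut value, cut(A, B) / vol(A) + cut(A, B) / vol(B), for two partitions of a 6-node path graph with unit weights. The function name `normalized_cut` and the toy graph are illustrative assumptions, not part of scikit-learn.

```python
import numpy as np

def normalized_cut(weights, in_a):
    """Normalized cut value of the partition (A, complement of A); hypothetical helper."""
    in_a = np.asarray(in_a, dtype=bool)
    cut = weights[in_a][:, ~in_a].sum()   # total weight of edges crossing the cut
    degree = weights.sum(axis=1)          # weighted degree of every node
    return cut / degree[in_a].sum() + cut / degree[~in_a].sum()

# Path graph 0 - 1 - 2 - 3 - 4 - 5 with unit edge weights.
n_nodes = 6
weights = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes - 1):
    weights[i, i + 1] = weights[i + 1, i] = 1.0

# Both partitions cut exactly one edge, so only the balance differs.
balanced = normalized_cut(weights, [True, True, True, False, False, False])
lopsided = normalized_cut(weights, [True, False, False, False, False, False])
print(balanced, lopsided)
```

Both partitions cross a single edge, yet the 3/3 split scores lower than the 1/5 split. This is the balancing effect that works against circles of unequal size.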
Worked example
First, import image from scikit-learn's feature_extraction module and the spectral clustering function spectral_clustering from sklearn.cluster. Then define four circular regions in the image.
import numpy as np
import matplotlib.pyplot as plt

from sklearn.feature_extraction import image
from sklearn.cluster import spectral_clustering

l = 100
x, y = np.indices((l, l))

center1 = (28, 24)
center2 = (40, 50)
center3 = (67, 58)
center4 = (24, 70)

radius1, radius2, radius3, radius4 = 16, 14, 15, 14

circle1 = (x - center1[0]) ** 2 + (y - center1[1]) ** 2 < radius1 ** 2
circle2 = (x - center2[0]) ** 2 + (y - center2[1]) ** 2 < radius2 ** 2
circle3 = (x - center3[0]) ** 2 + (y - center3[1]) ** 2 < radius3 ** 2
circle4 = (x - center4[0]) ** 2 + (y - center4[1]) ** 2 < radius4 ** 2
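The four circle expressions differ only in their center and radius, so they can be folded into one helper. This is an optional restructuring rather than part of the original example, and `disk_mask` is a hypothetical name introduced here.

```python
import numpy as np

def disk_mask(shape, center, radius):
    """Boolean mask of the pixels strictly inside a disk (hypothetical helper)."""
    rows, cols = np.indices(shape)
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 < radius ** 2

side = 100
# (center, radius) pairs taken from the example above.
specs = [((28, 24), 16), ((40, 50), 14), ((67, 58), 15), ((24, 70), 14)]
disks = [disk_mask((side, side), center, radius) for center, radius in specs]

# Union of all disks; for boolean arrays this matches adding the masks with `+`.
foreground = np.logical_or.reduce(disks)
```

Because the comparison is strict, a pixel lying exactly at distance `radius` from the center is excluded from the disk.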
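Here we want to separate the four regions from one another, not from the background. We therefore use a mask that restricts the problem to the foreground.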
# 4 circles
img = circle1 + circle2 + circle3 + circle4

# We use a mask that limits to the foreground: the problem that we are
# interested in here is not separating the objects from the background,
# but separating them one from the other.
mask = img.astype(bool)

img = img.astype(float)
img += 1 + 0.2 * np.random.randn(*img.shape)

# Convert the image into a graph with the value of the gradient on the
# edges.
graph = image.img_to_graph(img, mask=mask)

# Take a decreasing function of the gradient. We make it only weakly
# dependent on the gradient, so the segmentation is close to a Voronoi
# partition.
graph.data = np.exp(-graph.data / graph.data.std())

# Force the solver to be arpack, since amg is numerically
# unstable on this example.
labels = spectral_clustering(graph, n_clusters=4, eigen_solver='arpack')

# Put the labels back on the image grid; -1 marks the background.
label_im = np.full(mask.shape, -1.)
label_im[mask] = labels

plt.matshow(img)
plt.matshow(label_im)
plt.show()
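To see what the graph conversion produces, it helps to run the same two calls on an image small enough to inspect. The 3x3 image and the threshold mask below are made up for illustration; only `image.img_to_graph` and the exponential rescaling come from the example.

```python
import numpy as np
from sklearn.feature_extraction import image

# Toy 3x3 image with values 1..9 (assumed for illustration).
toy_img = np.arange(1.0, 10.0).reshape(3, 3)
toy_mask = toy_img > 2   # keeps 7 of the 9 pixels

# Sparse graph with one node per masked pixel.
toy_graph = image.img_to_graph(toy_img, mask=toy_mask)
n_kept = int(toy_mask.sum())
print(toy_graph.shape, n_kept)

# Same rescaling as in the example: large gradients map to small affinities.
toy_affinity = np.exp(-toy_graph.data / toy_graph.data.std())
print(toy_affinity.min(), toy_affinity.max())
```

The graph is square with one row per pixel kept by the mask, which is why the returned labels can be written straight back with `label_im[mask] = labels`.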
The following code separates two circular regions.
# 2 circles
img = circle1 + circle2
mask = img.astype(bool)
img = img.astype(float)

img += 1 + 0.2 * np.random.randn(*img.shape)

graph = image.img_to_graph(img, mask=mask)
graph.data = np.exp(-graph.data / graph.data.std())

labels = spectral_clustering(graph, n_clusters=2, eigen_solver='arpack')
label_im = np.full(mask.shape, -1.)
label_im[mask] = labels

plt.matshow(img)
plt.matshow(label_im)
plt.show()
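The plots show the result by eye. One possible way to check a segmentation numerically is to compare it with a known labelling using the adjusted Rand index, which does not depend on how the cluster ids are numbered. In the sketch below, both the ground truth (two non-overlapping toy disks) and the "predicted" labels are made up for illustration. It does not report a result from the example above.

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score

# Toy ground truth (assumed): two disjoint disks on a 40x40 grid.
rows, cols = np.indices((40, 40))
left = (rows - 20) ** 2 + (cols - 12) ** 2 < 8 ** 2
right = (rows - 20) ** 2 + (cols - 28) ** 2 < 8 ** 2
toy_mask = left | right

# Labels restricted to the masked pixels: 0 for the left disk, 1 for the right.
truth = np.where(left, 0, 1)[toy_mask]

# A stand-in "prediction" with the same grouping but swapped label ids.
predicted = 1 - truth

score = adjusted_rand_score(truth, predicted)
print(score)
```

With real output, `predicted` would be the `labels` array returned by `spectral_clustering`, and `truth` would be built from the circle masks. Where circles overlap, the ground truth itself is a modelling choice.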