Hierarchical Clustering algorithm derived from the R package ‘amap’ [Amap].
The condensed distance matrix y can be computed with the pdist() function from scipy.spatial.distance (<http://docs.scipy.org/doc/scipy/reference/spatial.distance.html>).
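As a minimal sketch of what a condensed distance matrix looks like, the following uses SciPy's pdist() and squareform(). The sample points are made up for illustration and are not part of the original documentation.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Three illustrative points in R^2 (made-up data).
x = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])

# Condensed form: one entry per unordered pair, in the order
# d(0,1), d(0,2), d(1,2). For N points it holds N*(N-1)/2 values.
y = pdist(x, metric="euclidean")
print(y)

# squareform() expands the condensed vector into the full
# symmetric N x N matrix with zeros on the diagonal.
d = squareform(y)
print(d.shape)
```

The condensed vector is the form referred to as y in this section.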
Hierarchical Cluster.
Initialization.
Cuts the tree into several groups by specifying the cut height.
Performs hierarchical clustering on the condensed distance matrix y.
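The two operations described above (clustering from a condensed distance matrix, then cutting the tree at a given height) can be illustrated with SciPy's own hierarchy routines. This is a sketch that uses scipy.cluster.hierarchy as a stand-in, not the mlpy class documented here. The data, the complete-linkage method, and the cut height of 5.0 are assumptions chosen for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

# Two well-separated pairs of points (made-up data).
x = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])

# Step 1: cluster from the condensed distance matrix y.
y = pdist(x)
z = linkage(y, method="complete")

# Step 2: cut the tree at height 5.0. Merges that happen above
# this height are undone, so each pair stays in its own group.
labels = fcluster(z, t=5.0, criterion="distance")
print(labels)
```

Raising the cut height above the final merge distance would place all four points in a single group.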
[Amap] amap: Another Multidimensional Analysis Package, http://cran.r-project.org/web/packages/amap/index.html
Memory-saving Hierarchical Clustering derived from the R and Python package ‘fastcluster’ [fastcluster].
Memory-saving Hierarchical Cluster (Euclidean distance only).
This method needs O(NP) memory to cluster N points in R^P.
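To make the O(NP) bound concrete, the sketch below compares the number of stored values in a condensed distance matrix, N(N-1)/2, with the N*P values needed to hold the points themselves. The helper names are hypothetical, and the figures count values only; they say nothing about the library's actual allocations.

```python
def condensed_entries(n):
    """Number of pairwise distances for n points: n*(n-1)/2."""
    return n * (n - 1) // 2


def point_entries(n, p):
    """Number of coordinates for n points in R^p: n*p."""
    return n * p


n, p = 10000, 10
print(condensed_entries(n))  # 49995000 pairwise distances
print(point_entries(n, p))   # 100000 coordinates
```

For N much larger than P, storing only the points is far cheaper than storing every pairwise distance.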
Initialization.
Returns the hierarchical clustering encoded as a linkage matrix. See scipy.cluster.hierarchy.linkage.
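As a rough illustration of the linkage matrix format, the following calls scipy.cluster.hierarchy.linkage directly on a small made-up data set. The single-linkage method is an assumption chosen for the example, not necessarily what this class uses.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Four made-up points in R^2.
x = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])

# linkage() accepts the observations directly and computes
# Euclidean distances internally.
z = linkage(x, method="single", metric="euclidean")

# z has N-1 rows, one per merge. Each row holds:
#   columns 0 and 1: indices of the two clusters being merged
#   column 2: the distance at which they merge
#   column 3: the number of points in the new cluster
print(z.shape)
print(z)
```

The last row describes the final merge, so its fourth column equals the total number of points.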
Cuts the tree into several groups by specifying the cut height.
Performs hierarchical clustering.
[fastcluster] Fast hierarchical clustering routines for R and Python, http://cran.r-project.org/web/packages/fastcluster/index.html
k-means clustering.
Example:
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import mlpy
>>> np.random.seed(0)
>>> mean1, cov1, n1 = [1, 5], [[1,1],[1,2]], 200 # 200 points, mean=(1,5)
>>> x1 = np.random.multivariate_normal(mean1, cov1, n1)
>>> mean2, cov2, n2 = [2.5, 2.5], [[1,0],[0,1]], 300 # 300 points, mean=(2.5,2.5)
>>> x2 = np.random.multivariate_normal(mean2, cov2, n2)
>>> mean3, cov3, n3 = [5, 8], [[0.5,0],[0,0.5]], 200 # 200 points, mean=(5,8)
>>> x3 = np.random.multivariate_normal(mean3, cov3, n3)
>>> x = np.concatenate((x1, x2, x3), axis=0) # concatenate the samples
>>> cls, means, steps = mlpy.kmeans(x, k=3, plus=True)
>>> steps
13
>>> fig = plt.figure(1)
>>> plot1 = plt.scatter(x[:,0], x[:,1], c=cls, alpha=0.75)
>>> plot2 = plt.scatter(means[:,0], means[:,1], c=np.unique(cls), s=128, marker='d') # plot the means
>>> plt.show()