KMeans Clustering

#lannister/machinelearning

The k-means algorithm takes k as a parameter and divides n objects into k clusters, so that similarity within each cluster is high and similarity between clusters is low. The procedure is as follows:
1. Randomly select k points as the initial cluster centers;
2. Assign each remaining point to the nearest cluster, according to its distance from the cluster centers;
3. For each cluster, calculate the mean of all its points as the new cluster center;
4. Repeat steps 2 and 3 until the cluster centers no longer change.
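
Steps 2 and 3 can also be written as two formulas. This is only a sketch in notation that is assumed here and not taken from the original post: x^(i) is the i-th point, mu_k is the center of cluster k, c^(i) is the index of the cluster assigned to point i, and C_k is the set of points currently assigned to cluster k.

```latex
% Step 2 (assignment): each point goes to the cluster with the nearest center
c^{(i)} = \arg\min_{k \in \{1, \dots, K\}} \left\lVert x^{(i)} - \mu_k \right\rVert^2

% Step 3 (update): each center becomes the mean of the points assigned to it
\mu_k = \frac{1}{\lvert C_k \rvert} \sum_{i \in C_k} x^{(i)},
\qquad C_k = \{\, i : c^{(i)} = k \,\}
```

Minimizing the squared distance gives the same assignment as minimizing the distance itself, because the square root is monotonic.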

Clustering finds relations and connections in data that has no labels.
K-means is one of the most widely used clustering algorithms.

K-means for non-separated clusters (T-shirt sizing)

  • Find closest centroids

    K = size(centroids, 1);
    distance = zeros(K, 1);      % distance from the current example to each centroid
    idx = zeros(size(X, 1), 1);  % index of the closest centroid for each example

    for i = 1:size(X, 1)
        for k = 1:K
            % Euclidean distance between example i and centroid k
            distance(k) = sqrt(sum((X(i, :) - centroids(k, :)).^2));
        end
        [~, index] = min(distance);
        idx(i) = index;
    end


  • Compute Means

    [m, n] = size(X);
    centroids = zeros(K, n);

    for k = 1:K
        logic = idx == k;  % logical column marking the examples assigned to centroid k
        % sum(logic) is the number of examples assigned to kth centroid
        centroids(k, :) = 1 / sum(logic) * sum(X .* logic, 1);
    end


  • Randomly initialize cluster centroids

    centroids = zeros(K, size(X, 2));
    randidx = randperm(size(X, 1));
    % Take the first K examples as centroids
    centroids = X(randidx(1:K), :);

  • K-Means Clustering on Pixels

    % Run K-Means
    for i = 1:max_iters

        % Output progress
        fprintf('K-Means iteration %d/%d...\n', i, max_iters);
        if exist('OCTAVE_VERSION')
            fflush(stdout);
        end

        % For each example in X, assign it to the closest centroid
        idx = findClosestCentroids(X, centroids);

        % Given the memberships, compute new centroids
        centroids = computeCentroids(X, idx, K);
    end
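
The loop above operates on a matrix X with one row per example. For pixels, that presumably means reshaping the image so that each row holds one pixel's color values. The following is a minimal, self-contained sketch of that idea under stated assumptions: the image A is a random stand-in rather than a real picture, the values of K and max_iters are arbitrary, and the assignment and update steps are written inline instead of calling findClosestCentroids and computeCentroids, so the names A, D, members and A_compressed are illustrative only.

```octave
% Stand-in for an RGB image scaled to [0, 1] (hypothetical data, not from the post)
A = rand(6, 6, 3);
K = 4;            % number of colors to keep (arbitrary choice)
max_iters = 10;   % arbitrary choice

% One row per pixel, one column per color channel
X = reshape(A, [], 3);

% Random initialization: pick K distinct pixels as the initial centroids
randidx = randperm(size(X, 1));
centroids = X(randidx(1:K), :);

for it = 1:max_iters
    % Assignment step: squared distance from every pixel to every centroid
    D = zeros(size(X, 1), K);
    for k = 1:K
        D(:, k) = sum((X - centroids(k, :)).^2, 2);
    end
    [~, idx] = min(D, [], 2);

    % Update step: mean color of each cluster; empty clusters keep their centroid
    for k = 1:K
        members = (idx == k);
        if any(members)
            centroids(k, :) = mean(X(members, :), 1);
        end
    end
end

% Replace every pixel by the centroid of its last assigned cluster, restore the shape
A_compressed = reshape(centroids(idx, :), size(A));
```

After this runs, A_compressed has the same size as A but contains at most K distinct colors. The subtraction X - centroids(k, :) relies on broadcasting, which Octave and recent MATLAB releases support.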


