K-Mean Clustering

sets · ‎11-03-2019

Hello!

I have implemented the K-means clustering algorithm and I want to split the data into 5 groups. I have attached my required.png file and my code, but the code doesn't generate the desired output every time. I am initializing my centroids within the span of the generated data, and it is all random. I terminate my loop when there is no change between the previous and the currently calculated cluster centers.
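For readers without LabVIEW, the procedure described above can be sketched in plain Python. This is only a minimal illustration of the standard Lloyd iteration, not a translation of the attached VI: the function name `kmeans`, the `max_iter` safety cap and the handling of empty clusters are assumptions made for the sketch.

```python
import random

def kmeans(points, k, rng, max_iter=100):
    """Lloyd's algorithm on a list of equal-length tuples; returns (centers, labels)."""
    dims = range(len(points[0]))
    lo = [min(p[d] for p in points) for d in dims]
    hi = [max(p[d] for p in points) for d in dims]
    # Random initial centers drawn uniformly inside the span of the data.
    centers = [tuple(rng.uniform(lo[d], hi[d]) for d in dims) for _ in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k),
                      key=lambda j: sum((p[d] - centers[j][d]) ** 2 for d in dims))
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(m[d] for m in members) / len(members) for d in dims))
            else:
                new_centers.append(centers[j])  # empty cluster: keep the old center
        if new_centers == centers:  # no change between iterations: stop
            break
        centers = new_centers
    return centers, labels
```

Calling `kmeans(data, 5, random.Random())` repeatedly on the same data can return different groupings, which is the behaviour reported in this post.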

Rahulbala · ‎11-04-2019

Hello

This is the basic problem with K-means clustering: it lacks consistency and is not repeatable. We might get different outputs each time.

Then why is K-means clustering popular? The answer is simple: it is fast, and it is a common introduction to unsupervised learning courses. Check this.

I was curious to know what was happening with the clustering method, so I tried implementing the same thing. It works a bit differently and gives the desired output about 8 out of 10 times (I can't give an accurate figure).

At the first iteration, we have to make sure that the random centroids are taken only once.

I have attached the VIs (Please download OpenG Array toolkit if you are not using it). Try exploring different clustering methods.

-Rahul

Hit KUDOS for Thanks

sets · ‎11-23-2019

But the results are still not consistent. I have come up with a solution in which I iterate the whole process multiple times, check whether the resulting states are the same, and then converge towards a solution. Will share the code.
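The code for this approach was not posted in the thread, so the following is only a guess at the general idea, written in Python rather than LabVIEW. It swaps in a common variant: run k-means several times from different random starts and keep the run with the lowest within-cluster sum of squares, instead of comparing states between runs. The names `kmeans_1d`, `inertia` and `best_of`, and the 1D data, are assumptions made for the sketch.

```python
import random

def kmeans_1d(data, k, rng, max_iter=100):
    """One k-means run on 1D data from random starting centers."""
    centers = [rng.uniform(min(data), max(data)) for _ in range(k)]
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        # An empty group keeps its previous center.
        new = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new == centers:
            break
        centers = new
    return sorted(centers)

def inertia(data, centers):
    """Within-cluster sum of squared distances to the nearest center."""
    return sum(min((x - c) ** 2 for c in centers) for x in data)

def best_of(data, k, n_restarts=20, seed=0):
    """Repeat the whole process and keep the run with the lowest inertia."""
    rng = random.Random(seed)
    runs = [kmeans_1d(data, k, rng) for _ in range(n_restarts)]
    return min(runs, key=lambda c: inertia(data, c))
```

More restarts make it more likely that at least one run lands on the best grouping, at the cost of proportionally more computation.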

alexderjuengere · ‎11-23-2019

@Rahulbala wrote:

This is the basic problem with K-means clustering: it lacks consistency and is not repeatable. We might get different outputs each time.

this is only true if you use random values for initialisation.

@sets wrote:

I am initializing my centroids within the span of the generated data, and it is all random.

have you tried to reproduce your results using the same initial values?
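One way to try this outside LabVIEW is to draw the initial centroids from a seeded random generator. Everything after initialisation in k-means is deterministic, so identical starting values give identical results. The helper name `initial_centroids` and the 1D data below are assumptions made for this Python sketch.

```python
import random

def initial_centroids(data, k, seed):
    """Draw k starting centroids within the span of 1D data, reproducibly."""
    rng = random.Random(seed)  # private generator: same seed, same draws
    lo, hi = min(data), max(data)
    return [rng.uniform(lo, hi) for _ in range(k)]

data = [1.0, 2.0, 8.0, 9.0, 15.0]
first = initial_centroids(data, 5, seed=42)
again = initial_centroids(data, 5, seed=42)
print(first == again)  # True: the starting values are reproduced exactly
```

Recording the seed (or the initial centroid vector itself) alongside each result makes it possible to replay a good or a bad run later.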

Rahulbala · ‎11-23-2019

Traditionally, random initialisation is part of the K-means clustering algorithm. Fixing the initial values will definitely give you the same result every time. What do you mean by same initial values (for example, the same index values whatever data is given as input)?

We can make a better selection of initial values by using techniques like the naive sharding centroid initialisation algorithm. This helps make the initial values good enough for clustering.
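As a rough Python illustration of naive sharding as it is usually described: sort the rows by the sum of their attribute values, cut the sorted rows into k contiguous shards, and use each shard's column means as an initial centroid. The function name `naive_sharding` is an assumption, and the sketch expects at least k rows.

```python
def naive_sharding(points, k):
    """Deterministic initial centroids for a list of equal-length tuples."""
    # Sort rows by the sum of their attribute values (the composite value).
    ordered = sorted(points, key=sum)
    n, dims = len(ordered), len(ordered[0])
    centroids = []
    for i in range(k):
        # Contiguous, nearly equal shards of the sorted rows.
        shard = ordered[i * n // k:(i + 1) * n // k]
        # The column-wise mean of the shard becomes one initial centroid.
        centroids.append(
            tuple(sum(p[d] for p in shard) / len(shard) for d in range(dims)))
    return centroids

print(naive_sharding([(0, 0), (1, 1), (10, 10), (11, 11)], 2))
# [(0.5, 0.5), (10.5, 10.5)]
```

Because no random numbers are involved, the same data always produces the same starting centroids, so the clustering result is repeatable.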

-Rahul


alexderjuengere · ‎11-24-2019

@Rahulbala wrote:

Fixing the initial values will definitely give you the same result every time. What do you mean by same initial values (for example, the same index values whatever data is given as input)?

k-means always converges to a solution, or rather to a local minimum.

but the quality of this solution may differ dramatically from trial to trial, because the local minimum found need not be the global optimum.

this is because not every randomly picked starting point for a centroid will converge to the actual centroid.
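A small Python example makes this concrete. The 1D data set, the two starting vectors and the names `lloyd_1d` and `inertia` are invented for the illustration and are not taken from the attached VI. Both runs converge, but to very different sums of squared distances.

```python
def lloyd_1d(data, centers, max_iter=100):
    """k-means iterations on 1D data from the given starting centers."""
    k = len(centers)
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        # An empty group keeps its previous center.
        new = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new == centers:  # converged: centers no longer move
            break
        centers = new
    return centers

def inertia(data, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min((x - c) ** 2 for c in centers) for x in data)

data = [0, 1, 10, 11, 20, 21]
good = lloyd_1d(data, [0, 10, 20])  # one start near each natural group
poor = lloyd_1d(data, [0, 1, 15])   # two starts crowded into one group
print(good, inertia(data, good))    # [0.5, 10.5, 20.5] 1.5
print(poor, inertia(data, poor))    # [0.0, 1.0, 15.5] 101.0
```

In the second run the two centers that start at 0 and 1 each capture a single point and never move, so the third center is left to cover the remaining four points.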

here, the actual centroids are given in required.png ‏287 KB

I don't have the TSA Toolkit, so in

K-Mean Clustering.vi ‏40 KB

I had to change

to

furthermore, you should use a For-Loop here:

so, now we can easily look at the initial value vector and how it affects the found solution:

this instance converged in 3 steps to a not-so-good solution:

this instance converged in 3 steps to the optimal solution:

attached .vi is back-saved to LabVIEW 2010

Rahulbala · ‎11-24-2019

Great, but even now I am getting randomised outputs.

-Rahul


alexderjuengere · ‎11-24-2019

@Rahulbala wrote:

Great, but even now I am getting randomised outputs.

that's not the point.

sets already figured out on their own how to cope with this behaviour:

@sets wrote:

But the results are still not consistent. I have come up with a solution in which I iterate the whole process multiple times, check whether the resulting states are the same, and then converge towards a solution. Will share the code.

rcpacini · ‎09-11-2024

Removed OpenG & SubVI dependencies, added visualizations, & basic statistics.

This K-Means Clustering is for a 1D Data Set array with optional normalization (using the data set or an absolute min & max scale to output 0 to 1 or -1 to 1 respectively). The graph shows the data set clusters and the cursor shows the mean for each cluster.
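The normalization option might look roughly like this in Python. This is a sketch of ordinary min-max scaling, not a reading of the VI's block diagram: the function name `normalize`, its parameters, and the choice to make the scale source and the output range independent options are all assumptions.

```python
def normalize(data, lo=None, hi=None, symmetric=False):
    """Min-max scale 1D data to [0, 1], or to [-1, 1] if symmetric is True.

    lo and hi default to the data set's own minimum and maximum; pass
    explicit values to scale against an absolute range instead.
    """
    lo = min(data) if lo is None else lo
    hi = max(data) if hi is None else hi
    if hi == lo:
        raise ValueError("min and max must differ")
    unit = [(x - lo) / (hi - lo) for x in data]
    return [2 * u - 1 for u in unit] if symmetric else unit

print(normalize([2, 4, 6]))                  # [0.0, 0.5, 1.0]
print(normalize([2, 4, 6], symmetric=True))  # [-1.0, 0.0, 1.0]
print(normalize([5], lo=0, hi=10))           # [0.5]
```

Scaling against an absolute range keeps the output comparable between data sets, whereas scaling against the data set's own minimum and maximum always fills the whole output range.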

VI Snippet in LabVIEW 2018.
