E.g., clustering 1-D normally distributed data (10k samples) with k-means (6 clusters) results in clusters with very different numbers of assigned points (700 to 2400).
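For anyone who wants to reproduce this, here is a minimal sketch of the experiment. It assumes a plain Lloyd-style k-means written directly in NumPy; the function name <code>kmeans_1d</code>, the seeds, and the iteration cap are my own choices for illustration and do not come from the article. The exact counts depend on the random sample and the initialization, so none are quoted here.

```python
import numpy as np

def kmeans_1d(x, k, n_iter=100, seed=0):
    """Plain Lloyd iteration on 1-D data (illustrative helper, not a library API)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids with k distinct samples drawn from the data.
    centroids = rng.choice(x, size=k, replace=False)
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = np.argmin(np.abs(x[:, None] - centroids[None, :]), axis=1)
        # Update step: mean of each cluster; keep the old centroid if a cluster is empty.
        new_centroids = np.array([
            x[labels == j].mean() if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    # Final assignment against the last centroids.
    labels = np.argmin(np.abs(x[:, None] - centroids[None, :]), axis=1)
    return centroids, labels

# 10k samples from a standard normal distribution, 6 clusters (as in the comment above).
rng = np.random.default_rng(42)
samples = rng.normal(size=10_000)
centroids, labels = kmeans_1d(samples, k=6)

# Number of points assigned to each cluster.
sizes = np.bincount(labels, minlength=6)
print("centroids:", np.sort(centroids))
print("cluster sizes:", sizes[np.argsort(centroids)])
```

Printing the sizes in centroid order makes it easy to see whether the clusters in the tails of the distribution receive fewer points than the clusters near the mean.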
"Since data points are represented by the index of their closest centroid, commonly occurring data have low error, and rare data high error."
This contradicts the first quote:
If all clusters have the same number of points assigned (as the first quote states), then rarely occurring data is quantized with the same precision as frequently occurring data.
I am confused. I hope I am correct with my concerns here. <span style="font-size: smaller;" class="autosigned">— Preceding [[Wikipedia:Signatures|unsigned]] comment added by [[Special:Contributions/129.206.66.37|129.206.66.37]] ([[User talk:129.206.66.37|talk]]) 08:24, 2 August 2013 (UTC)</span><!-- Template:Unsigned IP --> <!--Autosigned by SineBot-->