I am working with latitude-longitude data.

My objective is to cluster the points based on the distance between them.

The distance (in km) between two points is computed as `=ACOS(SIN(lat1)*SIN(lat2)+COS(lat1)*COS(lat2)*COS(lon2-lon1))*6371`

How can I use k-means in R for this? Is there any way to override the distance calculation in that process?

Jun 21, 2018

## 1 answer to this question.

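K-means is based on variance minimization. The within-cluster sum of variances equals the sum of squared Euclidean distances to the cluster means, but this equivalence does not carry over to other distances, so you cannot simply plug a different distance function into k-means.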
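To make the point concrete, the quantity k-means minimizes can be written out as below. The notation is an assumption introduced here for illustration: $C_j$ is the set of points in cluster $j$, $\mu_j$ is its mean, and $k$ is the number of clusters.

```latex
\min_{C_1,\dots,C_k} \; \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2 ,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

The mean $\mu_j$ is the point that minimizes the inner sum for squared Euclidean distance. For another distance the mean generally is not the minimizer, so the "recompute the mean" step of k-means is no longer guaranteed to reduce the objective.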

If you want a k-means-like algorithm for other distances (where the mean is not an appropriate estimator), use k-medoids (PAM). In contrast to k-means, k-medoids converges with arbitrary distance functions.
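A minimal sketch of the k-medoids route in R, assuming the `cluster` package (it provides `pam()` and is typically installed with R) and coordinates given in degrees. The helper `great_circle_matrix()` and the sample points are hypothetical, made up for illustration; the helper applies the formula from the question after converting to radians.

```r
library(cluster)  # provides pam()

# Hypothetical helper: pairwise great-circle distances in km.
# `lat` and `lon` are assumed to be in degrees; sin/cos/acos need radians.
great_circle_matrix <- function(lat, lon, radius_km = 6371) {
  lat <- lat * pi / 180
  lon <- lon * pi / 180
  cos_angle <- outer(sin(lat), sin(lat)) +
    outer(cos(lat), cos(lat)) * cos(outer(lon, lon, "-"))
  # Rounding error can push values slightly outside [-1, 1], where acos() is NaN.
  cos_angle <- pmin(pmax(cos_angle, -1), 1)
  d <- acos(cos_angle) * radius_km
  diag(d) <- 0
  d
}

# Made-up sample data: three points near Paris, three near New York.
pts <- data.frame(
  lat = c(48.86, 48.80, 48.90, 40.71, 40.65, 40.80),
  lon = c(2.35, 2.30, 2.40, -74.01, -74.10, -73.95)
)

d <- great_circle_matrix(pts$lat, pts$lon)

# pam() accepts a precomputed dissimilarity, so any distance function works.
fit <- pam(as.dist(d), k = 2, diss = TRUE)

pts$cluster <- fit$clustering  # cluster label for each point
medoid_rows <- fit$id.med      # row indices of the points chosen as medoids
```

Because medoids are actual data points, every cluster center is a real location on the surface. The full distance matrix grows quadratically with the number of points, so this sketch suits small to medium data sets.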

For Manhattan distance, you can also use k-medians. The median is an appropriate estimator for the L1 norm (the median minimizes the sum of absolute differences; the mean minimizes the sum of squared differences).
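The estimator claim can be checked numerically with a small sketch in base R. The data vector and the candidate grid are made up for illustration.

```r
# Made-up one-dimensional sample with an outlier.
x <- c(1, 2, 3, 10, 49)

# Try every candidate center on a grid and record both costs.
candidates <- seq(0, 50, by = 0.5)
abs_cost <- sapply(candidates, function(m) sum(abs(x - m)))
sq_cost  <- sapply(candidates, function(m) sum((x - m)^2))

best_abs <- candidates[which.min(abs_cost)]  # minimizer of the L1 cost
best_sq  <- candidates[which.min(sq_cost)]   # minimizer of the squared cost

median(x)  # compare with best_abs
mean(x)    # compare with best_sq
```

The L1 minimizer coincides with the median and the squared-cost minimizer with the mean, which is why k-medians pairs with Manhattan distance and k-means with squared Euclidean distance.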

For your particular use case, you could also transform your data into 3D space, then use (squared) Euclidean distance and thus k-means. But your cluster centers will be somewhere underground!

answered Jun 21, 2018
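The 3D transformation mentioned in the answer can be sketched in base R as follows, assuming coordinates in degrees and a unit sphere. The helper `to_xyz()` and the sample points are hypothetical, made up for illustration. The straight-line (chord) distance between two points on the sphere grows monotonically with their great-circle distance, which is what makes Euclidean k-means reasonable here.

```r
# Hypothetical helper: latitude/longitude in degrees to unit vectors (x, y, z).
to_xyz <- function(lat, lon) {
  lat <- lat * pi / 180
  lon <- lon * pi / 180
  cbind(x = cos(lat) * cos(lon),
        y = cos(lat) * sin(lon),
        z = sin(lat))
}

# Made-up sample data: three points near Paris, three near New York.
pts <- data.frame(
  lat = c(48.86, 48.80, 48.90, 40.71, 40.65, 40.80),
  lon = c(2.35, 2.30, 2.40, -74.01, -74.10, -73.95)
)

xyz <- to_xyz(pts$lat, pts$lon)

set.seed(1)  # kmeans() starts from random centers
fit <- kmeans(xyz, centers = 2, nstart = 10)
pts$cluster <- fit$cluster

# Each center is a mean of unit vectors, so its length is below 1:
# it lies inside the sphere, i.e. "underground".
center_norm <- sqrt(rowSums(fit$centers^2))

# One way to report a center on the surface: project it back outward.
center_lat <- asin(fit$centers[, "z"] / center_norm) * 180 / pi
center_lon <- atan2(fit$centers[, "y"], fit$centers[, "x"]) * 180 / pi
```

Projecting the centers back to the surface is a choice made in this sketch, not part of k-means itself. Compared with k-medoids, this route avoids building a full pairwise distance matrix.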
