K-means is based on **variance minimization**. The sum-of-variances objective equals the *sum of squared Euclidean distances*, but this equivalence does **not** hold for other distances.

If you want a k-means-like algorithm for other distances (where the mean is not an appropriate estimator), use **k-medoids** (PAM). In contrast to k-means, k-medoids will converge with arbitrary distance functions, because each medoid is an actual data point!
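To make this concrete, here is a minimal PAM-style sketch (not an optimized implementation; the function name `k_medoids` and the toy data are my own). The key point is that both steps only ever *evaluate* the distance function, so any dissimilarity works:

```python
import random

def k_medoids(points, k, dist, max_iter=100, seed=0):
    """Minimal PAM-style k-medoids. Medoids are actual data points,
    so an arbitrary distance function can be used (no means computed)."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: within each cluster, pick the member that
        # minimizes the total distance to all other members.
        new_medoids = [
            min(members,
                key=lambda c: sum(dist(points[c], points[i]) for i in members))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break  # converged: medoids no longer change
        medoids = new_medoids
    return medoids, clusters

# Usage with Manhattan distance on two obvious blobs:
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
manhattan = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])
medoids, clusters = k_medoids(pts, k=2, dist=manhattan)
```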

For Manhattan distance, you can also use k-medians. The median is the appropriate estimator for the L1 norm: the median minimizes the sum of absolute differences, whereas the mean minimizes the sum of squared distances.
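A quick numeric check of that claim (the toy data here is my own example): with an outlier in the data, the median wins on the L1 objective and the mean wins on the L2 objective:

```python
data = [1, 2, 3, 10]  # the 10 is an outlier pulling the mean upward

def sum_abs(c):
    """L1 objective: total absolute deviation from center c."""
    return sum(abs(x - c) for x in data)

def sum_sq(c):
    """L2 objective: total squared deviation from center c."""
    return sum((x - c) ** 2 for x in data)

median, mean = 2.5, 4.0  # median and mean of the data above

print(sum_abs(median), sum_abs(mean))  # median is better for L1
print(sum_sq(median), sum_sq(mean))    # mean is better for L2
```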

For your particular use case, you could also transform your data into 3D space, then use (squared) Euclidean distance and thus k-means. But your cluster centers will be somewhere underground!
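Assuming the data are latitude/longitude coordinates on the Earth's surface, the transformation could look like this (a sketch; the function name and the spherical-Earth approximation are my assumptions):

```python
import math

def to_3d(lat_deg, lon_deg, radius=6371.0):
    """Map latitude/longitude (degrees) to 3D Cartesian coordinates
    on a sphere of the given radius (km, roughly Earth's)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return (x, y, z)

# The mean of two surface points lies strictly inside the sphere,
# i.e. "underground" -- which is why the k-means centers end up there:
a, b = to_3d(0.0, 0.0), to_3d(0.0, 90.0)
center = tuple((u + v) / 2 for u, v in zip(a, b))
depth = 6371.0 - math.hypot(*center)  # positive: below the surface
```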