I'm trying to plot the distance graph for a given value of min-points. I'm specifically looking for the
knee and the corresponding epsilon value. I've been trying to use sklearn for this, but I can't
find any function that returns these values. Can someone help me?
Apr 10, 2018 in Python 4,821 views

## 2 answers to this question.

Hi there. Instead of sklearn, you could use this function to get what you want:

```python
import math

import numpy as np
import pandas as pd


def k_distances(X, n=None, dist_func=None):
    """Return the sorted array of k-distances.

    X - DataFrame or ndarray with one observation per row
    n - number of neighbors that are included in returned distances (default number of attributes + 1)
    dist_func - function to compute the distance between two observations in X (default euclidean function)
    """
    if isinstance(X, pd.DataFrame):
        X = X.values
    # Default follows the docstring: number of attributes + 1.
    if n is None:
        n = X.shape[1] + 1

    if dist_func is None:
        # Euclidean distance: square root of the sum of squared differences between attributes.
        dist_func = lambda x, y: math.sqrt(np.sum((x - y) ** 2))

    size = len(X)
    # One row per ordered pair (i, j) of observations.
    distances = pd.DataFrame({
        "i": [idx // size for idx in range(size * size)],
        "j": [idx % size for idx in range(size * size)],
        "d": [dist_func(x, y) for x in X for y in X],
    })
    # Sort each observation's distances. Position 0 is the observation itself
    # (distance 0), so position n is the distance to its n-th nearest neighbor.
    return np.sort([
        group["d"].sort_values().iloc[n]
        for _, group in distances.groupby("i")
    ])
```

X should be a pandas.DataFrame or a numpy.ndarray. n is the number of neighbors in the d-neighborhood; you have to choose this number yourself. By default it is the number of attributes + 1.

To plot the distances in Python, use matplotlib:

```python
import matplotlib.pyplot as plt

# X, n and dist_func are your own data and settings.
d = k_distances(X, n, dist_func)
plt.plot(d)
plt.ylabel("k-distances")
plt.grid(True)
plt.show()
```
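
The plot lets you read the knee by eye, but it does not return a value. As a rough sketch (not part of the answer above), one simple heuristic is to scale the sorted curve to the unit square and take the point that lies farthest below the straight line joining its first and last points. `estimate_knee` is a hypothetical helper name, and the heuristic assumes an ascending curve that stays flat and then rises sharply, so treat the result as a starting guess for epsilon and check it against the plot.

```python
import numpy as np


def estimate_knee(sorted_distances):
    """Return (index, value) of the point farthest below the chord that
    joins the first and last points of an ascending k-distance curve."""
    d = np.asarray(sorted_distances, dtype=float)
    n = d.size
    if n < 3:
        raise ValueError("need at least three distances")
    span = d[-1] - d[0]
    if span == 0:
        raise ValueError("all distances are equal, so there is no knee")

    # Scale both axes to [0, 1] so that neither axis dominates.
    x = np.arange(n) / (n - 1)
    y = (d - d[0]) / span

    # After scaling, the chord is the line y = x. For a curve that stays
    # below it, the gap x - y is proportional to the distance from the chord.
    gap = x - y
    idx = int(np.argmax(gap))
    return idx, float(d[idx])


# Toy curve: six small distances followed by a sharp rise.
idx, eps = estimate_knee([0.10, 0.11, 0.12, 0.13, 0.14, 0.15, 0.50, 1.20])
print(idx, eps)  # 5 0.15
```

With real data you would pass the array returned by the k-distance function and use the returned value as a candidate epsilon.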
answered Apr 10, 2018 by

selected Oct 12, 2018 by Omkar

You probably want to use the matrix operations provided by numpy to speed up your distance matrix calculation.

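```python
import matplotlib.pyplot as plt
import numpy as np


def k_distances2(x, k):
    # Squared Euclidean distances via |a - b|^2 = |a|^2 + |b|^2 - 2 a.b
    sq = np.sum(x**2, axis=1)
    p = -2 * x.dot(x.T) + sq[np.newaxis, :] + sq[:, np.newaxis]
    # Rounding can leave tiny negative values; clip them before the root.
    p = np.sqrt(np.maximum(p, 0))
    p.sort(axis=1)
    # Column 0 is each point's distance to itself (0), so skip it and
    # keep the distances to the k nearest neighbors.
    p = p[:, 1:k + 1]
    pm = np.sort(p.flatten())
    return p, pm


# X must be a numpy.ndarray here (use X.values for a DataFrame).
m, m2 = k_distances2(X, 2)
plt.plot(m2)
plt.ylabel("k-distances")
plt.grid(True)
plt.show()
```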
answered Oct 12, 2018 by
I want to measure the distance of an object and show it graphically using a Raspberry Pi.
