DBSCAN algorithm and clustering algorithm for data mining

Question

How do you implement DBSCAN algorithm on categorical data (mushroom data set)?

And what is a one pass clustering algorithm?

Dev · Answer 1 · Mar 4, 2022

You can use any distance function with DBSCAN without making any changes. Because the indexing will be more difficult, the complexity will most likely be O(n2).

DBSCAN, on the other hand, only computes distances, compares them to a threshold, and counts things if you look closely. This is a significant strength of it; all you need to do is design a distance function and thresholds to apply it to a variety of data types.
DBSCAN is a pairwise distance-based algorithm, hence I doubt there is a one-pass variant. Some of these computations can be pruned (this is where the index comes in), but you still need to compare each item to each other, so it's O(n log n) and not one-pass.

The original k-means algorithm, I believe, was a one-pass algorithm. Your first means consist of the first k things. You choose the nearest mean for each new object and update it (incrementally) with the new object for each new object. This was a "one-pass" process as long as you didn't iterate over your data set again. (However, the outcome will be worse than lloyd-style k-means.)