What is the best distance measure to use when clustering?
A. The choice of distance measure depends on the application area and on the kind of similarity one wants to detect. For example, if the expression measurements of one gene are three times those of another gene in every sample, the two genes are considered distant under the Euclidean distance metric but close under a correlation-based distance, because the correlation coefficient considers only the pattern of change and not the magnitude. Manhattan distance is more robust against outliers than Euclidean distance, because it sums absolute differences rather than squared ones. Euclidean distance is the usual choice when items should be grouped by how close their actual values are.
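The gene example above can be checked numerically. The following is a minimal sketch in Python with NumPy, not a recommended pipeline: the expression values are invented for illustration, and the helper names (`euclidean`, `manhattan`, `correlation_distance`) are hypothetical. Correlation distance is taken here as 1 minus the Pearson correlation coefficient, which is one common convention.

```python
import numpy as np

def euclidean(x, y):
    # Square root of the sum of squared differences.
    return float(np.sqrt(np.sum((x - y) ** 2)))

def manhattan(x, y):
    # Sum of absolute differences.
    return float(np.sum(np.abs(x - y)))

def correlation_distance(x, y):
    # Assumed convention: 1 - Pearson correlation coefficient.
    r = np.corrcoef(x, y)[0, 1]
    return float(1.0 - r)

# Made-up expression values for two genes across five samples;
# gene_b is exactly three times gene_a in every sample.
gene_a = np.array([1.0, 2.0, 4.0, 3.0, 5.0])
gene_b = 3.0 * gene_a

d_euc = euclidean(gene_a, gene_b)             # large: magnitudes differ
d_cor = correlation_distance(gene_a, gene_b)  # about 0: same change pattern

# Outlier sensitivity: two profiles differ by 1 in every sample,
# then one sample is replaced by an outlier that differs by 10.
base = np.zeros(5)
shifted = np.ones(5)
with_outlier = shifted.copy()
with_outlier[0] = 10.0

# How much each distance grows once the outlier is introduced.
euc_growth = euclidean(base, with_outlier) / euclidean(base, shifted)
man_growth = manhattan(base, with_outlier) / manhattan(base, shifted)

print(f"Euclidean distance:   {d_euc:.3f}")
print(f"Correlation distance: {d_cor:.3f}")
print(f"Growth from outlier, Euclidean: {euc_growth:.2f}x")
print(f"Growth from outlier, Manhattan: {man_growth:.2f}x")
```

In this toy case the two genes are far apart in Euclidean terms yet have a correlation distance of essentially zero, and a single outlying sample inflates the Euclidean distance more than the Manhattan distance.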