Clustering high-dimensional data

mskh Member Posts: 13 Learner I
edited March 2019 in Help
Hi,
I am trying to use DBSCAN to detect outliers in my data set, but it is difficult to set the parameters (epsilon, min points). Does anyone have an idea how to solve this problem? Would it be possible to use two clustering algorithms, each of which considers only a subset of the data set's attributes, and then detect outliers based on the results of both algorithms?
Thanks
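
For the epsilon question above, one commonly used heuristic is the sorted k-distance curve: compute each point's distance to its k-th nearest neighbour, sort those distances, and look for the "knee" where they jump. The sketch below is plain Python rather than a RapidMiner process. The function name `k_distances`, the toy points, and the choice of k equal to min points are all assumptions made for illustration, not something taken from this thread.

```python
import math

def k_distances(points, k):
    """Distance from each point to its k-th nearest neighbour, sorted descending."""
    if not 1 <= k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, nearest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result, reverse=True)

# Toy data (made up): two tight groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
          (10.0, 0.0)]

min_points = 3  # assumed value; tie k to whatever min points you intend to use
curve = k_distances(points, k=min_points)
for rank, d in enumerate(curve):
    print(rank, round(d, 3))
```

On this toy data the first value is far larger than the rest, so an epsilon just above the flat part of the curve would leave the isolated point as noise. On real data the knee is usually less sharp, and picking it remains a judgment call.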

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    DBSCAN is definitely one of those algorithms where you need to have domain expertise to set the parameters properly to get good results. You might want to try a simpler clustering algorithm first like k-means or hierarchical.
    If you want to try to use two clustering algorithms based on different attributes, you'll need to multiply/split your dataset and feed one set of attributes to the first algorithm and a different set of attributes to the second algorithm, get the assigned clusters, and then join the two datasets back together again to compare.
    Brian T.
    Lindon Ventures
    Data Science Consulting from Certified RapidMiner Experts
  • mskh Member Posts: 13 Learner I
    Thank you so much, @Telcontar120.
    I am a beginner in RapidMiner. Will the results be different if I use two clustering algorithms, each on a different subset of the attributes, instead of one clustering algorithm on all of those attributes?
    Thank you
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    It is not possible to give a definitive answer without seeing the data, but in general you should not expect to get the same results when you use different subsets of attributes or different clustering algorithms. I described that approach only because you mentioned in your earlier comment that you wanted to try it.
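
The split, cluster, join, and compare workflow described in the answers can be sketched outside RapidMiner as follows. This is a minimal illustration in plain Python, not a RapidMiner process: the `dbscan` function is a simplified hand-written version of the algorithm, and the record IDs, attribute values, `eps`, and `min_points` are all made up for the example.

```python
import math

NOISE = -1

def dbscan(rows, eps, min_points):
    """Minimal DBSCAN sketch: one label per row, NOISE (-1) for outliers."""
    labels = [None] * len(rows)

    def neighbours(i):
        # Includes the point itself, as in the usual DBSCAN definition.
        return [j for j, other in enumerate(rows) if math.dist(rows[i], other) <= eps]

    cluster = -1
    for i in range(len(rows)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_points:
            labels[i] = NOISE
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_points:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels

# Hypothetical records: id -> (a, b, c, d).
records = {
    "r1": (1.0, 1.0, 10.0, 10.0),
    "r2": (1.1, 1.0, 10.1, 10.0),
    "r3": (1.0, 1.1, 10.0, 10.1),
    "r4": (1.1, 1.1, 10.1, 10.1),
    "r5": (9.0, 9.0, 10.05, 10.05),  # unusual in (a, b) only
    "r6": (1.05, 1.05, 30.0, 30.0),  # unusual in (c, d) only
    "r7": (9.5, 0.0, 0.0, 30.0),     # unusual in both
}
ids = list(records)

# Split: one attribute subset per clustering run.
subset_a = [records[i][:2] for i in ids]
subset_b = [records[i][2:] for i in ids]

# Cluster each subset separately, keeping the record id with each label.
labels_a = dict(zip(ids, dbscan(subset_a, eps=0.5, min_points=3)))
labels_b = dict(zip(ids, dbscan(subset_b, eps=0.5, min_points=3)))

# Join the two results on the record id and compare.
joined = {i: (labels_a[i], labels_b[i]) for i in ids}
noise_in_either = [i for i, (a, b) in joined.items() if NOISE in (a, b)]
noise_in_both = [i for i, (a, b) in joined.items() if a == NOISE and b == NOISE]

# For comparison: a single run on all four attributes.
labels_full = dict(zip(ids, dbscan([records[i] for i in ids], eps=0.5, min_points=3)))
noise_full = [i for i in ids if labels_full[i] == NOISE]

print("either:", noise_in_either)
print("both:", noise_in_both)
print("full:", noise_full)
```

Whether a record counts as an outlier when it is noise in either run or only when it is noise in both is a design choice, and the two rules give different lists even on this toy data. The single run on all attributes is included only for comparison; as noted in the answer above, it need not agree with the per-subset results on real data.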