Class DBSCAN

  • All Implemented Interfaces:
    java.io.Serializable, java.lang.Cloneable, Clusterer, CapabilitiesHandler, OptionHandler, RevisionHandler, TechnicalInformationHandler

    public class DBSCAN
    extends AbstractClusterer
    implements OptionHandler, TechnicalInformationHandler
    Basic implementation of the DBSCAN clustering algorithm. It should *not* be used as a reference for runtime benchmarks, because more sophisticated implementations exist. Clustering of new instances is not supported. More info:

    Martin Ester, Hans-Peter Kriegel, Joerg Sander, Xiaowei Xu: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In: Second International Conference on Knowledge Discovery and Data Mining, 226-231, 1996.

    BibTeX:

     @inproceedings{Ester1996,
        author = {Martin Ester and Hans-Peter Kriegel and Joerg Sander and Xiaowei Xu},
        booktitle = {Second International Conference on Knowledge Discovery and Data Mining},
        editor = {Evangelos Simoudis and Jiawei Han and Usama M. Fayyad},
        pages = {226-231},
        publisher = {AAAI Press},
        title = {A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise},
        year = {1996}
     }
     

    Valid options are:

     -E <double>
      epsilon (default = 0.9)
     -M <int>
      minPoints (default = 6)
     -I <String>
      index (database) used for DBSCAN (default = weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase)
     -D <String>
      distance-type (default = weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclideanDataObject)
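To make the roles of epsilon and minPoints concrete, here is a minimal plain-Java sketch of the DBSCAN procedure described in the cited paper. It is not the WEKA implementation: the class name `DbscanSketch`, the `NOISE` and `UNVISITED` markers, and the use of raw `double[]` points with an unnormalized Euclidean distance are all assumptions made for illustration. The neighbourhood query is a simple linear scan, which is the idea a sequential database suggests.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Illustrative plain-Java DBSCAN; NOT the WEKA implementation. */
final class DbscanSketch {
    static final int UNVISITED = -2; // hypothetical marker: point not processed yet
    static final int NOISE = -1;     // hypothetical marker: point belongs to no cluster

    /** Returns one label per point: a cluster index (0, 1, ...) or NOISE. */
    static int[] cluster(double[][] points, double epsilon, int minPoints) {
        int[] labels = new int[points.length];
        Arrays.fill(labels, UNVISITED);
        int nextCluster = 0;
        for (int i = 0; i < points.length; i++) {
            if (labels[i] != UNVISITED) continue;
            List<Integer> seeds = neighbours(points, i, epsilon);
            if (seeds.size() < minPoints) {
                labels[i] = NOISE; // not a core point; may become a border point later
                continue;
            }
            labels[i] = nextCluster;
            Deque<Integer> queue = new ArrayDeque<>(seeds);
            while (!queue.isEmpty()) {
                int j = queue.poll();
                if (labels[j] == NOISE) labels[j] = nextCluster; // border point
                if (labels[j] != UNVISITED) continue;
                labels[j] = nextCluster;
                List<Integer> reachable = neighbours(points, j, epsilon);
                // Only core points expand the cluster further.
                if (reachable.size() >= minPoints) queue.addAll(reachable);
            }
            nextCluster++;
        }
        return labels;
    }

    /** Linear scan: indices of all points within epsilon of points[i], including i itself. */
    static List<Integer> neighbours(double[][] points, int i, double epsilon) {
        List<Integer> result = new ArrayList<>();
        for (int j = 0; j < points.length; j++) {
            if (distance(points[i], points[j]) <= epsilon) result.add(j);
        }
        return result;
    }

    /** Plain Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }
}
```

Calling `DbscanSketch.cluster(points, 0.5, 3)` on two tight groups of four points plus one distant point puts each group in its own cluster and marks the distant point as noise. A larger epsilon merges nearby groups, and a larger minPoints turns sparse groups into noise.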
    Version:
    $Revision: 9434 $
    Author:
    Matthias Schubert (schubert@dbs.ifi.lmu.de), Zhanna Melnikova-Albrecht (melnikov@cip.ifi.lmu.de), Rainer Holzmann (holzmann@cip.ifi.lmu.de)
    See Also:
    Serialized Form
    • Constructor Detail

      • DBSCAN

        public DBSCAN()
    • Method Detail

      • buildClusterer

        public void buildClusterer​(Instances instances)
                            throws java.lang.Exception
        Generates a clustering via DBSCAN.
        Specified by:
        buildClusterer in interface Clusterer
        Specified by:
        buildClusterer in class AbstractClusterer
        Parameters:
        instances - The instances that need to be clustered
        Throws:
        java.lang.Exception - If clustering was not successful
      • clusterInstance

        public int clusterInstance​(Instance instance)
                            throws java.lang.Exception
        Assigns the given instance to a cluster.
        Specified by:
        clusterInstance in interface Clusterer
        Overrides:
        clusterInstance in class AbstractClusterer
        Parameters:
        instance - The instance to be assigned to a cluster
        Returns:
        int The number of the assigned cluster
        Throws:
        java.lang.Exception - If instance could not be clustered successfully
      • numberOfClusters

        public int numberOfClusters()
                             throws java.lang.Exception
        Returns the number of clusters.
        Specified by:
        numberOfClusters in interface Clusterer
        Specified by:
        numberOfClusters in class AbstractClusterer
        Returns:
        int The number of clusters generated for a training dataset.
        Throws:
        java.lang.Exception - if number of clusters could not be returned successfully
      • listOptions

        public java.util.Enumeration listOptions()
        Returns an enumeration of all the available options.
        Specified by:
        listOptions in interface OptionHandler
        Returns:
        Enumeration An enumeration of all available options.
      • setOptions

        public void setOptions​(java.lang.String[] options)
                        throws java.lang.Exception
        Sets the OptionHandler's options using the given list. All options will be set (or reset) during this call (i.e. incremental setting of options is not possible).

        Valid options are:

         -E <double>
          epsilon (default = 0.9)
         -M <int>
          minPoints (default = 6)
         -I <String>
          index (database) used for DBSCAN (default = weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase)
         -D <String>
          distance-type (default = weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclideanDataObject)
        Specified by:
        setOptions in interface OptionHandler
        Parameters:
        options - The list of options as an array of strings
        Throws:
        java.lang.Exception - If an option is not supported
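The "set (or reset)" wording means that any flag missing from the array falls back to its default, so a second call does not build on the first. The following sketch imitates that behaviour with a hypothetical stand-in class, `OptionResetSketch`. It is not the WEKA parser, and its field names and error type are assumptions; only the flags and default values are taken from the list above.

```java
/** Hypothetical option holder that imitates the documented "set or reset" behaviour. */
final class OptionResetSketch {
    double epsilon;
    int minPoints;
    String databaseType;
    String distanceType;

    OptionResetSketch() {
        setOptions(new String[0]); // start from the documented defaults
    }

    /** Applies the flags present in options; every absent flag is reset to its default. */
    void setOptions(String[] options) {
        // Reset first, so that earlier calls leave no trace.
        epsilon = 0.9;
        minPoints = 6;
        databaseType = "weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase";
        distanceType = "weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclideanDataObject";

        // Flags are expected as "-X value" pairs; a trailing flag without a value is ignored here.
        for (int i = 0; i + 1 < options.length; i += 2) {
            String value = options[i + 1];
            switch (options[i]) {
                case "-E": epsilon = Double.parseDouble(value); break;
                case "-M": minPoints = Integer.parseInt(value); break;
                case "-I": databaseType = value; break;
                case "-D": distanceType = value; break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
    }
}
```

With this sketch, calling `setOptions(new String[] {"-E", "0.5"})` followed by `setOptions(new String[] {"-M", "10"})` leaves epsilon at 0.9 rather than 0.5. To change one setting while keeping the others, pass the complete option list each time.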
      • getOptions

        public java.lang.String[] getOptions()
        Gets the current option settings for the OptionHandler.
        Specified by:
        getOptions in interface OptionHandler
        Returns:
        String[] The list of current option settings as an array of strings
      • databaseForName

        public Database databaseForName​(java.lang.String database_Type,
                                        Instances instances)
        Returns a new instance of the specified database class
        Parameters:
        database_Type - String of the specified database
        instances - Instances that were delivered from WEKA
        Returns:
        Database The newly constructed Database
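Because the database type is passed as a string and the documented defaults are fully qualified class names, a method like this one plausibly resolves the name to a class at run time. Whether WEKA does so by plain reflection is an assumption. The sketch below shows only the general technique, with a hypothetical helper `ForNameSketch.newInstance` and a JDK class standing in for the database type.

```java
import java.lang.reflect.Constructor;

/** Illustrative helper: builds an object from a class name and one constructor argument. */
final class ForNameSketch {
    /**
     * Loads className, checks that it is a subtype of expectedType, and calls its
     * public constructor that takes a single parameter of type argType.
     */
    static <T> T newInstance(String className, Class<T> expectedType,
                             Class<?> argType, Object arg) {
        try {
            Class<? extends T> cls = Class.forName(className).asSubclass(expectedType);
            Constructor<? extends T> ctor = cls.getConstructor(argType);
            return ctor.newInstance(arg);
        } catch (ReflectiveOperationException e) {
            // Covers unknown classes, missing constructors, and failing constructors.
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }
}
```

For example, `ForNameSketch.newInstance("java.util.ArrayList", java.util.List.class, java.util.Collection.class, java.util.Arrays.asList(1, 2, 3))` yields an `ArrayList` holding three elements. In the setting of this class, the string would be a database class name and the constructor argument the `Instances` object, but the exact constructor signature is not stated here.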
      • dataObjectForName

        public DataObject dataObjectForName​(java.lang.String database_distanceType,
                                            Instance instance,
                                            java.lang.String key,
                                            Database database)
        Returns a new instance of the specified DataObject (distance-type) class
        Parameters:
        database_distanceType - String of the specified distance-type
        instance - The original instance to be held by this DataObject
        key - Key for this DataObject
        database - Link to the database
        Returns:
        DataObject The newly constructed DataObject
      • setMinPoints

        public void setMinPoints​(int minPoints)
        Sets a new value for minPoints
        Parameters:
        minPoints - MinPoints
      • setEpsilon

        public void setEpsilon​(double epsilon)
        Sets a new value for epsilon
        Parameters:
        epsilon - Epsilon
      • getEpsilon

        public double getEpsilon()
        Returns the value of epsilon
        Returns:
        double Epsilon
      • getMinPoints

        public int getMinPoints()
        Returns the value of minPoints
        Returns:
        int MinPoints
      • getDatabase_distanceType

        public java.lang.String getDatabase_distanceType()
        Returns the distance-type
        Returns:
        String Distance-type
      • getDatabase_Type

        public java.lang.String getDatabase_Type()
        Returns the type of the used index (database)
        Returns:
        String Index-type
      • setDatabase_distanceType

        public void setDatabase_distanceType​(java.lang.String database_distanceType)
        Sets a new distance-type
        Parameters:
        database_distanceType - The new distance-type
      • setDatabase_Type

        public void setDatabase_Type​(java.lang.String database_Type)
        Sets a new database-type
        Parameters:
        database_Type - The new database-type
      • epsilonTipText

        public java.lang.String epsilonTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • minPointsTipText

        public java.lang.String minPointsTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • database_TypeTipText

        public java.lang.String database_TypeTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • database_distanceTypeTipText

        public java.lang.String database_distanceTypeTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • globalInfo

        public java.lang.String globalInfo()
        Returns a string describing this DataMining-Algorithm
        Returns:
        String Information for the gui-explorer
      • getTechnicalInformation

        public TechnicalInformation getTechnicalInformation()
        Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
        Specified by:
        getTechnicalInformation in interface TechnicalInformationHandler
        Returns:
        the technical information about this class
      • toString

        public java.lang.String toString()
        Returns a description of the clusterer
        Overrides:
        toString in class java.lang.Object
        Returns:
        a string representation of the clusterer
      • main

        public static void main​(java.lang.String[] args)
        Main method for testing DBSCAN
        Parameters:
        args - Valid parameters are: 'E' epsilon (default = 0.9); 'M' minPoints (default = 6); 'I' index-type (default = weka.clusterers.forOPTICSAndDBScan.Databases.SequentialDatabase); 'D' distance-type (default = weka.clusterers.forOPTICSAndDBScan.DataObjects.EuclideanDataObject).