{"id":1289,"date":"2017-07-03T23:34:34","date_gmt":"2017-07-03T23:34:34","guid":{"rendered":"http:\/\/intelligentonlinetools.com\/blog\/?p=1289"},"modified":"2017-07-07T10:36:14","modified_gmt":"2017-07-07T10:36:14","slug":"algorithms-metrics-and-online-tool-for-clustering","status":"publish","type":"post","link":"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/","title":{"rendered":"Algorithms, Metrics and Online Tool for Clustering"},"content":{"rendered":"<p>One of the key techniques of exploratory data mining is <b>clustering<\/b> \u2013 separating instances into distinct groups based on some measure of similarity. [1] In this post we will review how to cluster data and how to evaluate and visualize the results using the online <a href=http:\/\/intelligentonlinetools.com\/cgi-bin\/analytics\/ml.cgi target=_blank>ML Sandbox<\/a> tool from this website. This tool lets you run some machine learning algorithms without coding, setup or installation. The following components will be explored:<\/p>\n<p><strong>Clustering Algorithms<\/strong><br \/>\n<strong>K-means Clustering Algorithm<\/strong> &#8211; a well-known algorithm whose idea goes back to 1957. [2] The algorithm takes the number of clusters and the data as input. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. [2] 
Shown below are the results of K-means clustering of the Iris dataset (only 2 dimensions shown) and of the S1 dataset (see the Datasets section for more details).<\/p>\n<p><figure id=\"attachment_1321\" aria-describedby=\"caption-attachment-1321\" style=\"width: 290px\" class=\"wp-caption alignnone\"><img data-attachment-id=\"1321\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/kmeans-clustering-iris\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/kmeans-clustering-iris.png\" data-orig-size=\"852,812\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"kmeans clustering iris\" data-image-description=\"&lt;p&gt;k-means clustering Iris dataset&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;k-means clustering Iris dataset&lt;\/p&gt;\n\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/kmeans-clustering-iris-300x286.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/kmeans-clustering-iris.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/kmeans-clustering-iris-300x286.png\" alt=\"\" width=\"300\" height=\"286\" class=\"size-medium wp-image-1321\" srcset=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/kmeans-clustering-iris-300x286.png 300w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/kmeans-clustering-iris-768x732.png 768w, 
http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/kmeans-clustering-iris.png 852w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><figcaption id=\"caption-attachment-1321\" class=\"wp-caption-text\">k-means clustering Iris dataset<\/figcaption><\/figure><br \/>\nFig 1. K-means clustering of Iris dataset<\/p>\n<p><img data-attachment-id=\"1322\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/kmeans-clustering-s1-dataset\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/Kmeans-CLustering-S1-dataset-.png\" data-orig-size=\"922,648\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Kmeans CLustering S1 dataset\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/Kmeans-CLustering-S1-dataset--300x211.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/Kmeans-CLustering-S1-dataset-.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/Kmeans-CLustering-S1-dataset--300x211.png\" alt=\"\" width=\"300\" height=\"211\" class=\"alignnone size-medium wp-image-1322\" srcset=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/Kmeans-CLustering-S1-dataset--300x211.png 300w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/Kmeans-CLustering-S1-dataset--768x540.png 768w, 
http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/Kmeans-CLustering-S1-dataset-.png 922w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><br \/>\nFig 2. K-means clustering of S1 dataset<\/p>\n<p><strong>Affinity Propagation<\/strong> &#8211; In statistics and data mining, affinity propagation (AP) is a clustering algorithm based on the concept of &#8220;message passing&#8221; between data points. Unlike clustering algorithms such as k-means or k-medoids, affinity propagation does not require the number of clusters to be determined or estimated before running the algorithm. Similar to k-medoids, affinity propagation finds &#8220;exemplars&#8221;, members of the input set that are representative of clusters. [3]<\/p>\n<p><strong>Hierarchical clustering (HC)<\/strong> &#8211; (also called hierarchical cluster analysis or HCA) is a method of cluster analysis which seeks to build a hierarchy of clusters. Strategies for hierarchical clustering generally fall into two types:<\/p>\n<ul>\n<li><strong>Agglomerative:<\/strong> This is a &#8220;bottom up&#8221; approach: each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.<\/li>\n<li><strong>Divisive:<\/strong> This is a &#8220;top down&#8221; approach: all observations start in one cluster, and splits are performed recursively as one moves down the hierarchy.<\/li>\n<\/ul>\n<p>In general, the merges and splits are determined in a greedy manner. The results of hierarchical clustering are usually presented in a dendrogram. [4]<\/p>\n<p><strong>BIRCH algorithm<\/strong> &#8211; In the 1990s considerable effort was put into improving the performance of existing algorithms. 
Among them is BIRCH (Zhang et al., 1996). [5]<\/p>\n<p>BIRCH (balanced iterative reducing and clustering using hierarchies) is an unsupervised data mining algorithm used to perform hierarchical clustering over particularly large data-sets. An advantage of BIRCH is its ability to incrementally and dynamically cluster incoming, multi-dimensional metric data points in an attempt to produce the best quality clustering for a given set of resources (memory and time constraints). In most cases, BIRCH only requires a single scan of the database. [6]<\/p>\n<p><strong>Performance metrics for clustering algorithms<\/strong><\/p>\n<p><strong>Silhouette<\/strong> refers to a method of interpretation and validation of consistency within clusters of data. The technique provides a succinct graphical representation of how well each object lies within its cluster. It was first described by Peter J. Rousseeuw in a paper published in 1987.<\/p>\n<p>The silhouette value is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation). The silhouette ranges from -1 to 1, where a high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters. If most objects have a high value, then the clustering configuration is appropriate. 
If many points have a low or negative value, then the clustering configuration may have too many or too few clusters.<br \/>\nThe silhouette can be calculated with any distance metric, such as the Euclidean distance or the Manhattan distance. [7]<\/p>\n<p>Here is the Python source code that calculates the silhouette value for k-means clustering:<\/p>\n<pre><code>\r\nfrom sklearn import cluster\r\nfrom sklearn import metrics\r\nimport numpy as np\r\n\r\n# Number of clusters to look for\r\nk = 2\r\n# Six two-dimensional sample points\r\ndata = np.array([[1, 2],\r\n                 [5, 8],\r\n                 [1.5, 1.8],\r\n                 [8, 8],\r\n                 [1, 0.6],\r\n                 [9, 11]])\r\n\r\n# Fit k-means (n_init and random_state are set here for repeatable runs)\r\nkmeans = cluster.KMeans(n_clusters=k, n_init=10, random_state=0)\r\nkmeans.fit(data)\r\n\r\nlabels = kmeans.labels_\r\ncentroids = kmeans.cluster_centers_\r\n\r\nprint(\"Cluster id labels for inputted data\")\r\nprint(labels)\r\nprint(\"Centroids data\")\r\nprint(centroids)\r\n\r\n# score() returns the negative of the k-means objective (inertia)\r\nprint(\"\\nScore (opposite of the K-means objective, i.e. the negative sum of squared distances of samples to their closest cluster center):\")\r\nprint(kmeans.score(data))\r\n\r\n# Mean silhouette coefficient over all samples, in the range [-1, 1]\r\nsilhouette = metrics.silhouette_score(data, labels, metric='euclidean')\r\n\r\nprint(\"Silhouette_score: \")\r\nprint(silhouette)\r\n<\/code><\/pre>\n<p><strong>Score<\/strong> &#8211; the opposite (negative) of the value of the K-means objective on the data, where the objective is the sum of squared distances of samples to their closest cluster center.<\/p>\n<p>A large distance corresponds to a big variety in the data samples and occurs when the number of data samples is significantly higher than the number of clusters. Conversely, if all data samples were the same, you would always get a zero distance regardless of the number of clusters. 
[8]<\/p>\n<p><strong>Cophenetic correlation<\/strong> &#8211; In statistics, and especially in biostatistics, cophenetic correlation (more precisely, the cophenetic correlation coefficient) is a measure of how faithfully a dendrogram preserves the pairwise distances between the original unmodeled data points. Although it has been most widely applied in the field of biostatistics (typically to assess cluster-based models of DNA sequences, or other taxonomic models), it can also be used in other fields of inquiry where raw data tend to occur in clumps, or clusters. This coefficient has also been proposed for use as a test for nested clusters. [9]<\/p>\n<p><strong>Datasets<\/strong><br \/>\nThe following two datasets will be used:<br \/>\n<strong>The Iris flower data set or Fisher&#8217;s Iris data set<\/strong> is a well-known multivariate data set with N = 150 and k = 3. It consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). [10]<\/p>\n<p><strong>S1<\/strong> &#8211; synthetic 2-d data with N = 5000 vectors and k = 15 Gaussian clusters. [11]<\/p>\n<p><strong>Experiments<\/strong><br \/>\nClustering was performed using the <a href=http:\/\/intelligentonlinetools.com\/cgi-bin\/analytics\/ml.cgi target=_blank>ML Sandbox<\/a> tool with the clustering algorithms and datasets described above. 
Screenshots of the clustering results from the tool are presented here (Fig 1-6; Fig 1 and 2 are shown above).<\/p>\n<p><img data-attachment-id=\"1316\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/ap-clustering-iris\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/AP-Clustering-iris.png\" data-orig-size=\"841,647\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"AP Clustering iris\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/AP-Clustering-iris-300x231.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/AP-Clustering-iris.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/AP-Clustering-iris-300x231.png\" alt=\"\" width=\"300\" height=\"231\" class=\"alignnone size-medium wp-image-1316\" srcset=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/AP-Clustering-iris-300x231.png 300w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/AP-Clustering-iris-768x591.png 768w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/AP-Clustering-iris.png 841w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><br \/>\nFig 3. 
AP clustering of Iris dataset<\/p>\n<p><img data-attachment-id=\"1319\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/clustering-ap-iris-data-results\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/CLustering-AP-iris-data-results.png\" data-orig-size=\"1118,233\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"CLustering AP iris data results\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/CLustering-AP-iris-data-results-300x63.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/CLustering-AP-iris-data-results-1024x213.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/CLustering-AP-iris-data-results-300x63.png\" alt=\"\" width=\"300\" height=\"63\" class=\"alignnone size-medium wp-image-1319\" srcset=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/CLustering-AP-iris-data-results-300x63.png 300w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/CLustering-AP-iris-data-results-768x160.png 768w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/CLustering-AP-iris-data-results-1024x213.png 1024w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/CLustering-AP-iris-data-results-940x198.png 940w, 
http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/CLustering-AP-iris-data-results.png 1118w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><br \/>\nFig 4.  AP clustering results of Iris dataset<\/p>\n<p><img data-attachment-id=\"1328\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/hc-clustering-iris\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-Iris.png\" data-orig-size=\"1068,861\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"HC Clustering Iris\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-Iris-300x242.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-Iris-1024x826.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-Iris-300x242.png\" alt=\"\" width=\"300\" height=\"242\" class=\"alignnone size-medium wp-image-1328\" srcset=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-Iris-300x242.png 300w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-Iris-768x619.png 768w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-Iris-1024x826.png 1024w, 
http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-Iris.png 1068w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><br \/>\nFig 5. HC Clustering Iris dataset<\/p>\n<p><img data-attachment-id=\"1329\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/hc-clustering-s1-dataset\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-S1-dataset.png\" data-orig-size=\"1086,881\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"HC Clustering S1 dataset\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-S1-dataset-300x243.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-S1-dataset-1024x831.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-S1-dataset-300x243.png\" alt=\"\" width=\"300\" height=\"243\" class=\"alignnone size-medium wp-image-1329\" srcset=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-S1-dataset-300x243.png 300w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-S1-dataset-768x623.png 768w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-S1-dataset-1024x831.png 1024w, 
http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/HC-Clustering-S1-dataset.png 1086w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><br \/>\nFig 6. HC clustering S1 dataset<\/p>\n<p>Below is the summary of the above clustering experiments.<\/p>\n<table>\n<tr>\n<td><\/td>\n<td colspan=2>Kmeans (sklearn.cluster)<\/td>\n<td>AP (sklearn.cluster)<\/td>\n<td>HC (scipy.cluster)<\/td>\n<td>Birch (sklearn.cluster)<\/td>\n<\/tr>\n<tr>\n<td><\/td>\n<td>Score <font size=-2>(opposite of the sum of squared distances of samples to their closest cluster center)<\/font><\/td>\n<td>Silhouette_score<\/td>\n<td>Silhouette_score<\/td>\n<td>Cophenetic correlation coefficient<\/td>\n<td>Silhouette_score<\/td>\n<\/tr>\n<tr>\n<td>Iris dataset (N = 150, 4 dimensions)<\/td>\n<td>-78.85<\/td>\n<td>0.55<\/td>\n<td>0.52<\/td>\n<td>0.87<\/td>\n<td>0.50<\/td>\n<\/tr>\n<tr>\n<td>S1 dataset (N = 5000, 2 dimensions)<\/td>\n<td>-8.92e+12<\/td>\n<td>0.71<\/td>\n<td>*<\/td>\n<td>0.69<\/td>\n<td>0.71<\/td>\n<\/tr>\n<\/table>\n<p>*AP did not work well on the S1 dataset (although it worked well on the Iris dataset). AffinityPropagation has optional parameters that may resolve this; most likely the preference parameter needs to be adjusted. Currently the tool does not allow changing it.<\/p>\n<p>From the documentation [12]: preference is an optional parameter that can be array-like of shape (n_samples,) or a float. It sets the preferences for each point: points with larger values of preferences are more likely to be chosen as exemplars. The number of exemplars, i.e. of clusters, is influenced by the input preferences value. If the preferences are not passed as arguments, they will be set to the median of the input similarities.<\/p>\n<p><a href=http:\/\/intelligentonlinetools.com\/cgi-bin\/analytics\/ml.cgi target=_blank><strong>ML Sandbox<\/strong><\/a><br \/>\nThe above tool was used for clustering data. You just need to select an algorithm, enter your data and click run. 
Below are detailed instructions for clustering.<br \/>\n<strong>How to use the ML Sandbox<\/strong><br \/>\n1. Open URL: <a href=http:\/\/intelligentonlinetools.com\/cgi-bin\/analytics\/ml.cgi target=_blank>ML Sandbox<\/a><br \/>\n2. Select Clustering method<\/p>\n<p><img data-attachment-id=\"1318\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/clustering-how-to-use-tool-step1\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step1.png\" data-orig-size=\"675,318\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"clustering &#8211; how to use tool step1\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step1-300x141.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step1.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step1-300x141.png\" alt=\"\" width=\"300\" height=\"141\" class=\"alignnone size-medium wp-image-1318\" srcset=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step1-300x141.png 300w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step1.png 675w\" sizes=\"(max-width: 300px) 100vw, 300px\" 
\/><\/p>\n<p>3. Enter data (you can use the default small dataset, or copy and paste your own dataset or a dataset from other sites, such as Iris or S1; see the links in the References section)<br \/>\n4. Click Run Now<br \/>\n<img data-attachment-id=\"1317\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/clustering-how-to-use-tool-step-2\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step-2.png\" data-orig-size=\"1041,690\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"clustering &#8211; how to use tool step 2\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step-2-300x199.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step-2-1024x679.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step-2-300x199.png\" alt=\"\" width=\"300\" height=\"199\" class=\"alignnone size-medium wp-image-1317\" srcset=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step-2-300x199.png 300w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step-2-768x509.png 768w, 
http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step-2-1024x679.png 1024w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step-2.png 1041w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><br \/>\n5. Click View Run Results<br \/>\n6. If you do not see results, click the refresh button at the top left corner. Depending on the data set and algorithm, you might need to wait a minute or two and click refresh again.<br \/>\n<img data-attachment-id=\"1320\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/clustering-how-to-use-tool-step3\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step3.png\" data-orig-size=\"726,469\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"clustering how to use tool step3\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step3-300x194.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step3.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step3-300x194.png\" alt=\"\" width=\"300\" height=\"194\" class=\"alignnone size-medium wp-image-1320\" 
srcset=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step3-300x194.png 300w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/clustering-how-to-use-tool-step3.png 726w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/p>\n<p><strong>Conclusion<\/strong><br \/>\nWe looked at different clustering methods, performance metrics and visualization of clustering results for different datasets. All of this can be done within the online tool <a href=http:\/\/intelligentonlinetools.com\/cgi-bin\/analytics\/ml.cgi target=_blank>ML Sandbox<\/a>. Feel free to play with this tool and your data to explore your datasets. Also feel free to provide any feedback or suggestions.<\/p>\n<p><strong>References<\/strong><br \/>\n1. <a href=https:\/\/blog.biolab.si\/2015\/12\/02\/hierarchical-clustering-a-simple-explanation\/>Hierarchical Clustering: A Simple Explanation<\/a><br \/>\n2. <a href=https:\/\/en.wikipedia.org\/wiki\/K-means_clustering target=\"_blank\">k-means clustering<\/a><br \/>\n3. <a href=https:\/\/en.wikipedia.org\/wiki\/Affinity_propagation target=\"_blank\">Affinity propagation<\/a><br \/>\n4. <a href=https:\/\/en.wikipedia.org\/wiki\/Hierarchical_clustering target=\"_blank\">Hierarchical clustering<\/a><br \/>\n5. <a href=https:\/\/en.wikipedia.org\/wiki\/Cluster_analysis target=\"_blank\">Cluster analysis<\/a><br \/>\n6. <a href=https:\/\/en.wikipedia.org\/wiki\/BIRCH target=\"_blank\">BIRCH<\/a><br \/>\n7. <a href=https:\/\/en.wikipedia.org\/wiki\/Silhouette_(clustering) target=\"_blank\">Silhouette (clustering)<\/a><br \/>\n8. <a href=https:\/\/stackoverflow.com\/questions\/32370543\/understanding-score-returned-by-scikit-learn-kmeans target=_blank>Understanding score returned by scikit-learn KMeans<\/a><br \/>\n9. <a href=https:\/\/en.wikipedia.org\/wiki\/Cophenetic_correlation target=_blank>Cophenetic correlation<\/a><br \/>\n10. 
<a href=https:\/\/en.wikipedia.org\/wiki\/Iris_flower_data_set target=_blank>Iris flower data set<\/a><br \/>\n11. <a href=https:\/\/cs.joensuu.fi\/sipu\/datasets\/ target=_blank>Clustering benchmark datasets<\/a><br \/>\n12. <a href=http:\/\/scikit-learn.org\/stable\/modules\/generated\/sklearn.cluster.AffinityPropagation.html#sklearn.cluster.AffinityPropagation  target=_blank>AffinityPropagation<\/a> <\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the key techniques of exploratory data mining is clustering \u2013 separating instances into distinct groups based on some measure of similarity. [1] In this post we will review how we can do clustering, evaluate and visualize results using online ML Sandbox tool from this website. This tool allows to run some machine learning &#8230; <a title=\"Algorithms, Metrics and Online Tool for Clustering\" class=\"read-more\" href=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":[]},"categories":[2,6,9,10],"tags":[21,19,20,18],"jetpack_publicize_connections":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Algorithms, Metrics and Online Tool for Clustering - Machine Learning Applications<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/\" \/>\n<meta property=\"og:locale\" 
content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Algorithms, Metrics and Online Tool for Clustering - Machine Learning Applications\" \/>\n<meta property=\"og:description\" content=\"One of the key techniques of exploratory data mining is clustering \u2013 separating instances into distinct groups based on some measure of similarity. [1] In this post we will review how we can do clustering, evaluate and visualize results using online ML Sandbox tool from this website. This tool allows to run some machine learning ... Read more\" \/>\n<meta property=\"og:url\" content=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/\" \/>\n<meta property=\"og:site_name\" content=\"Machine Learning Applications\" \/>\n<meta property=\"article:published_time\" content=\"2017-07-03T23:34:34+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2017-07-07T10:36:14+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/kmeans-clustering-iris-300x286.png\" \/>\n<meta name=\"author\" content=\"owygs156\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"owygs156\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/\",\"url\":\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/\",\"name\":\"Algorithms, Metrics and Online Tool for Clustering - Machine Learning Applications\",\"isPartOf\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#website\"},\"datePublished\":\"2017-07-03T23:34:34+00:00\",\"dateModified\":\"2017-07-07T10:36:14+00:00\",\"author\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478\"},\"breadcrumb\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"http:\/\/intelligentonlinetools.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Algorithms, Metrics and Online Tool for Clustering\"}]},{\"@type\":\"WebSite\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#website\",\"url\":\"http:\/\/intelligentonlinetools.com\/blog\/\",\"name\":\"Machine Learning Applications\",\"description\":\"Artificial intelligence, data mining and machine learning for building web based tools and 
services.\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"http:\/\/intelligentonlinetools.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478\",\"name\":\"owygs156\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g\",\"contentUrl\":\"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g\",\"caption\":\"owygs156\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Algorithms, Metrics and Online Tool for Clustering - Machine Learning Applications","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/","og_locale":"en_US","og_type":"article","og_title":"Algorithms, Metrics and Online Tool for Clustering - Machine Learning Applications","og_description":"One of the key techniques of exploratory data mining is clustering \u2013 separating instances into distinct groups based on some measure of similarity. [1] In this post we will review how we can do clustering, evaluate and visualize results using online ML Sandbox tool from this website. This tool allows to run some machine learning ... 
Read more","og_url":"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/","og_site_name":"Machine Learning Applications","article_published_time":"2017-07-03T23:34:34+00:00","article_modified_time":"2017-07-07T10:36:14+00:00","og_image":[{"url":"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/kmeans-clustering-iris-300x286.png"}],"author":"owygs156","twitter_card":"summary_large_image","twitter_misc":{"Written by":"owygs156","Est. reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/","url":"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/","name":"Algorithms, Metrics and Online Tool for Clustering - Machine Learning Applications","isPartOf":{"@id":"http:\/\/intelligentonlinetools.com\/blog\/#website"},"datePublished":"2017-07-03T23:34:34+00:00","dateModified":"2017-07-07T10:36:14+00:00","author":{"@id":"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478"},"breadcrumb":{"@id":"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/"]}]},{"@type":"BreadcrumbList","@id":"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"http:\/\/intelligentonlinetools.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Algorithms, Metrics and Online Tool for 
Clustering"}]},{"@type":"WebSite","@id":"http:\/\/intelligentonlinetools.com\/blog\/#website","url":"http:\/\/intelligentonlinetools.com\/blog\/","name":"Machine Learning Applications","description":"Artificial intelligence, data mining and machine learning for building web based tools and services.","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"http:\/\/intelligentonlinetools.com\/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478","name":"owygs156","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/image\/","url":"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g","contentUrl":"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g","caption":"owygs156"}}]}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p7h1IJ-kN","jetpack-related-posts":[{"id":450,"url":"http:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/","url_meta":{"origin":1289,"position":0},"title":"Bio-Inspired Optimization for Text Mining-2","date":"August 3, 2016","format":false,"excerpt":"Numerical One Dimensional Example In the previous code Bio-Inspired Optimization for Text Mining-1 Motivation we implemented source code for optimization some function using bio-inspired algorithm. Now we need to put actual function for clustering. 
In clustering we want to group our clusters in such way that the distance from each\u2026","rel":"","context":"In &quot;Data Mining&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":426,"url":"http:\/\/intelligentonlinetools.com\/blog\/2016\/07\/29\/bio-inspired-optimization-for-text-mining-1\/","url_meta":{"origin":1289,"position":1},"title":"Bio-Inspired Optimization for Text Mining-1","date":"July 29, 2016","format":false,"excerpt":"Motivation Optimization problem studies maximizing or minimizing some function y=f(x) with some range of choices available for x. Biologically inspired (bio-inspired) algorithms for optimization problems are now widely used. A few examples of such optimization are: particle swarm optimization (PSO) that is based on the swarming behavior of fish and\u2026","rel":"","context":"In &quot;Data Mining&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":256,"url":"http:\/\/intelligentonlinetools.com\/blog\/2016\/06\/05\/using-python-for-mining-data-from-twitter-visualization-and-other-enchancements\/","url_meta":{"origin":1289,"position":2},"title":"Using Python for Data Visualization of Clustering Results","date":"June 5, 2016","format":false,"excerpt":"In one of the previous post http:\/\/intelligentonlinetools.com\/blog\/2016\/05\/28\/using-python-for-mining-data-from-twitter\/ python source code for mining Twitter data was implemented. Clustering was applied to put tweets in different groups using bag of words representation for the text. The results of clustering were obtained via numerical matrix. 
Now we will look at visualization of clustering\u2026","rel":"","context":"In &quot;Artificial Intelligence&quot;","img":{"alt_text":"Data Visualization for Clustering Results","src":"https:\/\/i0.wp.com\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2016\/06\/data-visualization1-300x220.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":227,"url":"http:\/\/intelligentonlinetools.com\/blog\/2016\/05\/28\/using-python-for-mining-data-from-twitter\/","url_meta":{"origin":1289,"position":3},"title":"Using Python for Mining Data From Twitter","date":"May 28, 2016","format":false,"excerpt":"Twitter is increasingly being used for business or personal purposes. With Twitter API there is also an opportunity to do data mining of data (tweets) and find interesting information. In this post we will take a look how to get data from Twitter, prepare data for analysis and then do\u2026","rel":"","context":"In &quot;Artificial Intelligence&quot;","img":{"alt_text":"Frequency of Hashtags","src":"https:\/\/i0.wp.com\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2016\/05\/Frequency-of-Hashtags-300x171.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":521,"url":"http:\/\/intelligentonlinetools.com\/blog\/2016\/08\/26\/bio-inspired-optimization-for-text-mining-4\/","url_meta":{"origin":1289,"position":4},"title":"Bio-Inspired Optimization for Text Mining-4","date":"August 26, 2016","format":false,"excerpt":"Clustering Text Data In previous post Bio-Inspired Optimization was applied for clustering of numerical data. In this post text data will be used for clustering. So python source code will be modified for clustering of text data. 
This data will be initialized in the beginning of this python script with\u2026","rel":"","context":"In &quot;Machine Learning&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":71,"url":"http:\/\/intelligentonlinetools.com\/blog\/2016\/02\/13\/5-ways-of-web-user-modeling\/","url_meta":{"origin":1289,"position":5},"title":"5 Ways of Web User Modeling","date":"February 13, 2016","format":false,"excerpt":"Web user modeling can be done in many different ways. Below will be described some of them and will be provided the links to the resources. One of the way of modeling web user behavior is to use Markov chains. This theory allows us to find the probability of clicking\u2026","rel":"","context":"Similar post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/posts\/1289"}],"collection":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/comments?post=1289"}],"version-history":[{"count":46,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/posts\/1289\/revisions"}],"predecessor-version":[{"id":2323,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/posts\/1289\/revisions\/2323"}],"wp:attachment":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/media?parent=1289"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/categories?post=1289"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/tags?post=1289"}],"curies":[{"name":"wp","href":"https:
\/\/api.w.org\/{rel}","templated":true}]}}