{"id":450,"date":"2016-08-03T01:53:43","date_gmt":"2016-08-03T01:53:43","guid":{"rendered":"http:\/\/intelligentonlinetools.com\/blog\/?p=450"},"modified":"2016-08-08T00:44:24","modified_gmt":"2016-08-08T00:44:24","slug":"bio-inspired-optimization-for-text-mining-2","status":"publish","type":"post","link":"http:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/","title":{"rendered":"Bio-Inspired Optimization for Text Mining-2"},"content":{"rendered":"<p><strong>Numerical One Dimensional Example<\/strong><br \/>\nIn the previous post <a href=http:\/\/intelligentonlinetools.com\/blog\/2016\/07\/29\/bio-inspired-optimization-for-text-mining-1\/>Bio-Inspired Optimization for Text Mining-1 Motivation<\/a> we implemented source code for optimizing a function using a bio-inspired algorithm. Now we need to supply an actual objective function for clustering. In clustering we want to group the data points in such a way that the distance from each data point to its centroid is minimal.<br \/>\nHere is what Wikipedia says about clustering as an optimization problem:<\/p>\n<p>In centroid-based clustering, clusters are represented by a central vector, which may not necessarily be a member of the data set. When the number of clusters is fixed to k, k-means clustering gives a formal definition as an optimization problem: find the k cluster centers and assign the objects to the nearest cluster center, such that the squared distances from the cluster are minimized.<\/p>\n<p>The optimization problem itself is known to be NP-hard, and thus the common approach is to search only for approximate solutions. A particularly well known approximative method is Lloyd&#8217;s algorithm, often actually referred to as &#8220;k-means algorithm&#8221;. It does however only find a local optimum, and is commonly run multiple times with different random initializations. 
Variations of k-means often include such optimizations as choosing the best of multiple runs, but also restricting the centroids to members of the data set (k-medoids), choosing medians (k-medians clustering), choosing the initial centers less randomly (K-means++) or allowing a fuzzy cluster assignment (Fuzzy c-means). [1]<\/p>\n<p>Based on the above, our function calculates the total sum of squared distances from each data point to its centroid, where the centroid of a data point is the nearest one. We do this inside the function <em>evaluate<\/em>. For each data point the script iterates through the centroids (for loop: for c in cand) and keeps track of the minimal squared distance. After the loop is done it adds that distance to the total fitness (fit variable).<\/p>\n<p>Below is the code for clustering one dimensional data. The data is specified in the array data.<br \/>\nThe function <em>generate<\/em> defines how many clusters we want to get. The example uses 2, set through the variable nr_inputs.<br \/>\nThe Bounder is set to the range 0 to 10, which is based on the data: the maximum value in the data is 8.<\/p>\n<pre><code>\r\n# -*- coding: utf-8 -*-\r\n\r\n# Clustering for one dimensional data\r\n## http:\/\/pythonhosted.org\/inspyred\/examples.html#ant-colony-optimization\r\n## https:\/\/aarongarrett.github.io\/inspyred\/reference.html#benchmarks-benchmark-optimization-functions\r\n\r\nfrom time import time\r\nfrom random import Random\r\nimport inspyred\r\n\r\n\r\ndata = [4,5,5,8,8,8]\r\n\r\ndef my_observer(population, num_generations, num_evaluations, args):\r\n    # print the best individual of the current population\r\n    best = max(population)\r\n    print('{0:6} -- {1} : {2}'.format(num_generations,\r\n                                      best.fitness,\r\n                                      str(best.candidate)))\r\n\r\ndef generate(random, args):\r\n    # nr_inputs is the number of clusters (one centroid per cluster)\r\n    nr_inputs = 2\r\n    return [random.uniform(0, 2) for _ in range(nr_inputs)]\r\n\r\ndef evaluate(candidates, args):\r\n    fitness = []\r\n    for cand in candidates:\r\n        fit = 0\r\n        for d in range(len(data)):\r\n            # squared distance from data[d] to its nearest centroid\r\n            distance = 10000\r\n            for c in cand:\r\n                temp = (data[d] - c)**2\r\n                if temp < distance:\r\n                    distance = temp\r\n            fit = fit + distance\r\n        fitness.append(fit)\r\n    return fitness\r\n\r\n\r\ndef main(prng=None, display=False):\r\n    if prng is None:\r\n        prng = Random()\r\n        prng.seed(time())\r\n\r\n    ea = inspyred.swarm.PSO(prng)\r\n    ea.observer = my_observer\r\n    ea.terminator = inspyred.ec.terminators.evaluation_termination\r\n    ea.topology = inspyred.swarm.topologies.ring_topology\r\n    final_pop = ea.evolve(generator=generate,\r\n                          evaluator=evaluate,\r\n                          pop_size=8,\r\n                          bounder=inspyred.ec.Bounder(0, 10),\r\n                          maximize=False,\r\n                          max_evaluations=2000,\r\n                          neighborhood_size=3)\r\n\r\n\r\nif __name__ == '__main__':\r\n    main(display=True)\r\n<\/code><\/pre>\n<p>Output result (best fitness, followed by the two centroids):<\/p>\n<pre><code>\r\n 0.6666666666666666 : [8.000000006943864, 4.666666665568784]\r\n<\/code><\/pre>\n<p>Thus we applied a bio-inspired optimization algorithm to a clustering problem. In the next post we will extend the source code to multidimensional data.<\/p>\n<p><strong>References<\/strong><br \/>\n1. <a href=https:\/\/en.wikipedia.org\/wiki\/Cluster_analysis target=\"_blank\">Centroid-based clustering<\/a><br \/>\n2. <a href=http:\/\/intelligentonlinetools.com\/blog\/2016\/07\/29\/bio-inspired-optimization-for-text-mining-1\/>Bio-Inspired Optimization for Text Mining-1 Motivation<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Numerical One Dimensional Example In the previous code Bio-Inspired Optimization for Text Mining-1 Motivation we implemented source code for optimization some function using bio-inspired algorithm. Now we need to put actual function for clustering. 
In clustering we want to group our clusters in such way that the distance from each data to its centroid was &#8230; <a title=\"Bio-Inspired Optimization for Text Mining-2\" class=\"read-more\" href=\"http:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":[]},"categories":[2,9,12,10],"tags":[],"jetpack_publicize_connections":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Bio-Inspired Optimization for Text Mining-2 - Machine Learning Applications<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Bio-Inspired Optimization for Text Mining-2 - Machine Learning Applications\" \/>\n<meta property=\"og:description\" content=\"Numerical One Dimensional Example In the previous code Bio-Inspired Optimization for Text Mining-1 Motivation we implemented source code for optimization some function using bio-inspired algorithm. Now we need to put actual function for clustering. In clustering we want to group our clusters in such way that the distance from each data to its centroid was ... 
Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/\" \/>\n<meta property=\"og:site_name\" content=\"Machine Learning Applications\" \/>\n<meta property=\"article:published_time\" content=\"2016-08-03T01:53:43+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2016-08-08T00:44:24+00:00\" \/>\n<meta name=\"author\" content=\"owygs156\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"owygs156\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/\",\"url\":\"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/\",\"name\":\"Bio-Inspired Optimization for Text Mining-2 - Machine Learning 
Applications\",\"isPartOf\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#website\"},\"datePublished\":\"2016-08-03T01:53:43+00:00\",\"dateModified\":\"2016-08-08T00:44:24+00:00\",\"author\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478\"},\"breadcrumb\":{\"@id\":\"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"http:\/\/intelligentonlinetools.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Bio-Inspired Optimization for Text Mining-2\"}]},{\"@type\":\"WebSite\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#website\",\"url\":\"http:\/\/intelligentonlinetools.com\/blog\/\",\"name\":\"Machine Learning Applications\",\"description\":\"Artificial intelligence, data mining and machine learning for building web based tools and services.\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"http:\/\/intelligentonlinetools.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478\",\"name\":\"owygs156\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g\",\"contentUrl\":\"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g\",\"caption\":\"owygs156\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Bio-Inspired Optimization for Text Mining-2 - Machine Learning Applications","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/","og_locale":"en_US","og_type":"article","og_title":"Bio-Inspired Optimization for Text Mining-2 - Machine Learning Applications","og_description":"Numerical One Dimensional Example In the previous code Bio-Inspired Optimization for Text Mining-1 Motivation we implemented source code for optimization some function using bio-inspired algorithm. Now we need to put actual function for clustering. In clustering we want to group our clusters in such way that the distance from each data to its centroid was ... Read more","og_url":"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/","og_site_name":"Machine Learning Applications","article_published_time":"2016-08-03T01:53:43+00:00","article_modified_time":"2016-08-08T00:44:24+00:00","author":"owygs156","twitter_card":"summary_large_image","twitter_misc":{"Written by":"owygs156","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/","url":"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/","name":"Bio-Inspired Optimization for Text Mining-2 - Machine Learning Applications","isPartOf":{"@id":"http:\/\/intelligentonlinetools.com\/blog\/#website"},"datePublished":"2016-08-03T01:53:43+00:00","dateModified":"2016-08-08T00:44:24+00:00","author":{"@id":"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478"},"breadcrumb":{"@id":"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/intelligentonlinetools.com\/blog\/2016\/08\/03\/bio-inspired-optimization-for-text-mining-2\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"http:\/\/intelligentonlinetools.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Bio-Inspired Optimization for Text Mining-2"}]},{"@type":"WebSite","@id":"http:\/\/intelligentonlinetools.com\/blog\/#website","url":"http:\/\/intelligentonlinetools.com\/blog\/","name":"Machine Learning Applications","description":"Artificial intelligence, data mining and machine learning for building web based tools and services.","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"http:\/\/intelligentonlinetools.com\/blog\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478","name":"owygs156","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/image\/","url":"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g","contentUrl":"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g","caption":"owygs156"}}]}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p7h1IJ-7g","jetpack-related-posts":[{"id":521,"url":"http:\/\/intelligentonlinetools.com\/blog\/2016\/08\/26\/bio-inspired-optimization-for-text-mining-4\/","url_meta":{"origin":450,"position":0},"title":"Bio-Inspired Optimization for Text Mining-4","date":"August 26, 2016","format":false,"excerpt":"Clustering Text Data In previous post Bio-Inspired Optimization was applied for clustering of numerical data. In this post text data will be used for clustering. So python source code will be modified for clustering of text data. This data will be initialized in the beginning of this python script with\u2026","rel":"","context":"In &quot;Machine Learning&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":498,"url":"http:\/\/intelligentonlinetools.com\/blog\/2016\/08\/13\/bio-inspired-optimization-for-text-mining-3\/","url_meta":{"origin":450,"position":1},"title":"Bio-Inspired Optimization for Text Mining-3","date":"August 13, 2016","format":false,"excerpt":"Clustering Numerical Multidimensional Data In this post we will implement Bio Inspired Optimization for clustering multidimensional data. We will use two dimensional data array \"data\" however the code can be used for any reasonable size of array. To do this parameter num_dimensions should be set to data array dimension. 
We\u2026","rel":"","context":"In &quot;Data Mining&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":426,"url":"http:\/\/intelligentonlinetools.com\/blog\/2016\/07\/29\/bio-inspired-optimization-for-text-mining-1\/","url_meta":{"origin":450,"position":2},"title":"Bio-Inspired Optimization for Text Mining-1","date":"July 29, 2016","format":false,"excerpt":"Motivation Optimization problem studies maximizing or minimizing some function y=f(x) with some range of choices available for x. Biologically inspired (bio-inspired) algorithms for optimization problems are now widely used. A few examples of such optimization are: particle swarm optimization (PSO) that is based on the swarming behavior of fish and\u2026","rel":"","context":"In &quot;Data Mining&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1289,"url":"http:\/\/intelligentonlinetools.com\/blog\/2017\/07\/03\/algorithms-metrics-and-online-tool-for-clustering\/","url_meta":{"origin":450,"position":3},"title":"Algorithms, Metrics and Online Tool for Clustering","date":"July 3, 2017","format":false,"excerpt":"One of the key techniques of exploratory data mining is clustering \u2013 separating instances into distinct groups based on some measure of similarity. [1] In this post we will review how we can do clustering, evaluate and visualize results using online ML Sandbox tool from this website. 
This tool allows\u2026","rel":"","context":"In &quot;Data Mining&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/07\/kmeans-clustering-iris-300x286.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":256,"url":"http:\/\/intelligentonlinetools.com\/blog\/2016\/06\/05\/using-python-for-mining-data-from-twitter-visualization-and-other-enchancements\/","url_meta":{"origin":450,"position":4},"title":"Using Python for Data Visualization of Clustering Results","date":"June 5, 2016","format":false,"excerpt":"In one of the previous post http:\/\/intelligentonlinetools.com\/blog\/2016\/05\/28\/using-python-for-mining-data-from-twitter\/ python source code for mining Twitter data was implemented. Clustering was applied to put tweets in different groups using bag of words representation for the text. The results of clustering were obtained via numerical matrix. Now we will look at visualization of clustering\u2026","rel":"","context":"In &quot;Artificial Intelligence&quot;","img":{"alt_text":"Data Visualization for Clustering Results","src":"https:\/\/i0.wp.com\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2016\/06\/data-visualization1-300x220.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":227,"url":"http:\/\/intelligentonlinetools.com\/blog\/2016\/05\/28\/using-python-for-mining-data-from-twitter\/","url_meta":{"origin":450,"position":5},"title":"Using Python for Mining Data From Twitter","date":"May 28, 2016","format":false,"excerpt":"Twitter is increasingly being used for business or personal purposes. With Twitter API there is also an opportunity to do data mining of data (tweets) and find interesting information. 
In this post we will take a look how to get data from Twitter, prepare data for analysis and then do\u2026","rel":"","context":"In &quot;Artificial Intelligence&quot;","img":{"alt_text":"Frequency of Hashtags","src":"https:\/\/i0.wp.com\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2016\/05\/Frequency-of-Hashtags-300x171.png?resize=350%2C200","width":350,"height":200},"classes":[]}],"_links":{"self":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/posts\/450"}],"collection":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/comments?post=450"}],"version-history":[{"count":16,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/posts\/450\/revisions"}],"predecessor-version":[{"id":496,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/posts\/450\/revisions\/496"}],"wp:attachment":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/media?parent=450"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/categories?post=450"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/tags?post=450"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}