{"id":1470,"date":"2017-11-13T00:39:03","date_gmt":"2017-11-13T00:39:03","guid":{"rendered":"http:\/\/intelligentonlinetools.com\/blog\/?p=1470"},"modified":"2017-11-26T14:36:32","modified_gmt":"2017-11-26T14:36:32","slug":"getting-data-driven-insights-from-blog-data-analysis-with-feature-selection","status":"publish","type":"post","link":"http:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/","title":{"rendered":"Getting Data-Driven Insights from Blog Data Analysis with Feature Selection"},"content":{"rendered":"<p><b>Machine learning<\/b> algorithms are widely used across business: in object recognition, marketing analytics, and many other applications that analyze data for useful insights. In this post, one machine learning technique is applied to the <b>analysis of blog post data<\/b> to identify the features that are most significant for key metrics such as page views.<\/p>\n<p>This post walks through a simple example that shows how to use <b>feature selection<\/b> with Python code. It also includes instructions for quickly running the feature selection algorithm online (no sign-up is needed).<\/p>\n<p><strong>Feature Selection<\/strong><br \/>\nIn machine learning and statistics, <b>feature selection<\/b>, also known as variable selection, attribute selection or variable subset selection, is the process of selecting a subset of relevant features (variables, predictors) for use in model construction [1]. Using feature selection, we can identify the most influential variables for our metrics. 
<\/p>\n<p><strong>The Problem &#8211; Blog Data and the Goal<\/strong><br \/>\nFor each post you can have, for example, the following <strong>independent variables<\/strong>, usually denoted <strong>X<\/strong>:<\/p>\n<ol>\n<li>Number of words in the post<\/li>\n<li>Post category (or group or topic)<\/li>\n<li>Type of post (for example: list of resources, description of algorithms)<\/li>\n<li>Year when the post was published<\/li>\n<\/ol>\n<p>The list can go on.<br \/>\nEach post also has some metrics, or <strong>dependent variables<\/strong>, denoted by <strong>Y<\/strong>. For example:<\/p>\n<ol>\n<li>Number of views<\/li>\n<li>Time on page<\/li>\n<li>Revenue amount ($) associated with the page views<\/li>\n<\/ol>\n<p>The goal is to identify how X affects Y, or to predict Y based on X. Knowing the most significant X can provide insights into what actions need to be taken to improve Y.<br \/>\nIn this post we will use <strong>feature selection<\/strong> from the Python scikit-learn library. This technique lets us rank the features based on their influence on Y.<\/p>\n<p><strong>Example with Simple Dataset<\/strong><br \/>\nFirst let&#8217;s look at the artificial dataset below. It is small and has only a few columns, so you can see some correlation between X and Y even without running the algorithm. 
This lets us check the algorithm&#8217;s results and confirm that it is running correctly.<\/p>\n<pre><code>\r\nX1\tX2\tY\r\nred\t1\t100\r\nred\t2\t99\r\nred\t1\t85\r\nred\t2\t100\r\nred\t1\t79\r\nred\t2\t100\r\nred\t1\t100\r\nred\t1\t85\r\nred\t2\t100\r\nred\t1\t79\r\nblue\t2\t22\r\nblue\t1\t20\r\nblue\t2\t21\r\nblue\t1\t13\r\nblue\t2\t10\r\nblue\t1\t22\r\nblue\t2\t20\r\nblue\t1\t21\r\nblue\t2\t13\r\nblue\t1\t10\r\nblue\t1\t22\r\nblue\t2\t20\r\nblue\t1\t21\r\nblue\t2\t13\r\nblue\t1\t10\r\nblue\t2\t22\r\nblue\t1\t20\r\nblue\t2\t21\r\nblue\t1\t13\r\ngreen\t2\t10\r\ngreen\t1\t22\r\ngreen\t2\t20\r\ngreen\t1\t21\r\ngreen\t2\t13\r\ngreen\t1\t10\r\ngreen\t2\t22\r\ngreen\t1\t20\r\ngreen\t1\t13\r\ngreen\t2\t22\r\ngreen\t1\t20\r\ngreen\t2\t21\r\ngreen\t1\t13\r\ngreen\t2\t10\r\n<\/code><\/pre>\n<p><strong>Categorical Data<\/strong><br \/>\nYou can see from the above data that our example has categorical data (column X1), which requires special treatment when we use the scikit-learn library. Fortunately, pandas provides the function <em><strong>get_dummies(dataframe)<\/strong><\/em>, which converts categorical variables to numerical ones using <strong>one-hot encoding<\/strong>. After conversion, instead of one column with the values <em>blue<\/em>, <em>green<\/em> and <em>red<\/em>, we get three columns, one per color, each holding 0 or 1. 
Below is the dataset with new columns:<\/p>\n<pre><code>\r\nN   X2  X1_blue  X1_green  X1_red    Y\r\n0    1      0.0       0.0     1.0  100\r\n1    2      0.0       0.0     1.0   99\r\n2    1      0.0       0.0     1.0   85\r\n3    2      0.0       0.0     1.0  100\r\n4    1      0.0       0.0     1.0   79\r\n5    2      0.0       0.0     1.0  100\r\n6    1      0.0       0.0     1.0  100\r\n7    1      0.0       0.0     1.0   85\r\n8    2      0.0       0.0     1.0  100\r\n9    1      0.0       0.0     1.0   79\r\n10   2      1.0       0.0     0.0   22\r\n11   1      1.0       0.0     0.0   20\r\n12   2      1.0       0.0     0.0   21\r\n13   1      1.0       0.0     0.0   13\r\n14   2      1.0       0.0     0.0   10\r\n15   1      1.0       0.0     0.0   22\r\n16   2      1.0       0.0     0.0   20\r\n17   1      1.0       0.0     0.0   21\r\n18   2      1.0       0.0     0.0   13\r\n19   1      1.0       0.0     0.0   10\r\n20   1      1.0       0.0     0.0   22\r\n21   2      1.0       0.0     0.0   20\r\n22   1      1.0       0.0     0.0   21\r\n23   2      1.0       0.0     0.0   13\r\n24   1      1.0       0.0     0.0   10\r\n25   2      1.0       0.0     0.0   22\r\n26   1      1.0       0.0     0.0   20\r\n27   2      1.0       0.0     0.0   21\r\n28   1      1.0       0.0     0.0   13\r\n29   2      0.0       1.0     0.0   10\r\n30   1      0.0       1.0     0.0   22\r\n31   2      0.0       1.0     0.0   20\r\n32   1      0.0       1.0     0.0   21\r\n33   2      0.0       1.0     0.0   13\r\n34   1      0.0       1.0     0.0   10\r\n35   2      0.0       1.0     0.0   22\r\n36   1      0.0       1.0     0.0   20\r\n37   1      0.0       1.0     0.0   13\r\n38   2      0.0       1.0     0.0   22\r\n39   1      0.0       1.0     0.0   20\r\n40   2      0.0       1.0     0.0   21\r\n41   1      0.0       1.0     0.0   13\r\n42   2      0.0       1.0     0.0   10\r\n<\/code><\/pre>\n<p>If you run python script (provided in this post) you will get feature 
scores like those below.<br \/>\nColumns:<br \/>\nX2  X1_blue  X1_green  X1_red<br \/>\nscores:<br \/>\n[  0.925   5.949   4.502  33.   ]<\/p>\n<p>This shows that the column for the color red is the most significant, which makes sense if you look at the data.<\/p>\n<p><strong>How to Run Script<\/strong><br \/>\nTo run the script, put your data in a CSV file and update the filename location in the script.<br \/>\nAdditionally, the dependent variable Y needs to be in the rightmost column and labeled &#8216;Y&#8217;.<br \/>\nThe script uses the option &#8216;all&#8217; for the number of features, but you can change it to a specific number if needed.<\/p>\n<p><strong>Example with Dataset from Blog<\/strong><br \/>\nNow we can move to the actual dataset from this blog. It took a little time to prepare the data, but that is only needed the first time. Going forward I am planning to record data regularly after I create a post, or at least on a weekly basis. Here are the fields that I used:<\/p>\n<ol>\n<li>Number of words in the post &#8211; this is provided by the blog<\/li>\n<li>Category or group or topic &#8211; added manually<\/li>\n<li>Type of post &#8211; I used a few groups for this<\/li>\n<li>Number of views &#8211; taken from Google Analytics<\/li>\n<\/ol>\n<p>For this first run I used data from just the 19 top posts.<\/p>\n<p><strong>Results<\/strong><br \/>\nBelow you can view the results. They show word count as significant, which could be expected; however, I would think that the score should be lower. One likely reason is that the chi-squared score grows with the magnitude of the feature values, so a raw count such as word count is not directly comparable with the 0/1 columns. 
The results also show a higher score for posts with text and code than for posts with mostly code (Type_textcode 10.9 vs Type_code 5.0).<\/p>\n<p>Feature         Score<br \/>\nWordsCount\t2541.55769<br \/>\nGroup_DecisionTree\t18<br \/>\nGroup_datamining\t18<br \/>\nGroup_machinelearning\t18<br \/>\nGroup_spreadsheet\t18<br \/>\nGroup_TSCNN\t17<br \/>\nGroup_python\t16<br \/>\nGroup_TextMining\t12.25<br \/>\nType_textcode\t10.88888889<br \/>\nGroup_API\t10.66666667<br \/>\nGroup_Visualization\t9.566666667<br \/>\nGroup_neuralnetwork\t5.333333333<br \/>\nType_code\t5.025641026<\/p>\n<p><strong>Running Online<\/strong><br \/>\nIf you do not want to work with the Python code, you can run feature selection online at <a href=\"http:\/\/intelligentonlinetools.com\/cgi-bin\/analytics\/ml.cgi\" target=\"_balnk\">ML Sandbox<\/a>.<br \/>\nAll you need to do is enter data into the data field. Here are the instructions:<\/p>\n<ol>\n<li>Go to <a href=\"http:\/\/intelligentonlinetools.com\/cgi-bin\/analytics\/ml.cgi\" target=\"_balnk\">ML Sandbox<\/a><\/li>\n<li>Select <em>Feature Extraction<\/em> next <em>Other<\/em><\/li>\n<li>Enter data (the first row should contain headers) OR click &#8220;<em>Load Default Values<\/em>&#8221; to load the example data from this post. 
See the screenshot below.<\/li>\n<li>Click &#8220;<em>Run Now<\/em>&#8221;.<\/li>\n<li>Click &#8220;<em>View Run Results<\/em>&#8221;.<\/li>\n<li>If you do not see the results yet, wait a minute or so and click &#8220;<em>Refresh Page<\/em>&#8221;.<\/li>\n<\/ol>\n<p>Note: your dependent variable Y should be in the rightmost column and should have the header <em><strong>Y<\/strong><\/em>. Also, do not use spaces in the values (headers and data).<\/p>\n<img data-attachment-id=\"1482\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/feature_selection\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/11\/feature_selection.png\" data-orig-size=\"1088,918\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"feature_selection\" data-image-description=\"&lt;p&gt;Running Feature Selection Online&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;Running Feature Selection Online&lt;\/p&gt;\n\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/11\/feature_selection-300x253.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/11\/feature_selection-1024x864.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/11\/feature_selection-300x253.png\" alt=\"\" width=\"600\" height=\"506\" class=\"size-medium wp-image-1482\" 
srcset=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/11\/feature_selection-300x253.png 300w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/11\/feature_selection-768x648.png 768w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/11\/feature_selection-1024x864.png 1024w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/11\/feature_selection.png 1088w\" sizes=\"(max-width: 600px) 100vw, 600px\" \/>\n<p><strong>Conclusion<\/strong><br \/>\nIn this post we looked at how one <strong>machine learning<\/strong> technique, <strong>feature selection<\/strong>, can be applied to the analysis of blog post data to identify significant features that can help choose better actions. We also looked at how to do this when one or more columns are <strong>categorical<\/strong>. The source code is provided in this post and was tested on a simple example with categorical and numerical columns. Alternatively, you can run the same algorithm <strong>online<\/strong> at <a href=\"http:\/\/intelligentonlinetools.com\/cgi-bin\/analytics\/ml.cgi\" target=\"_balnk\">ML Sandbox<\/a>.<\/p>\n<p>Do you run any analysis on blog data? What method do you use, and how do you pull data from your blog? Feel free to submit any comments or suggestions.<\/p>\n<p><strong>References<\/strong><br \/>\n1. <a href=\"https:\/\/en.wikipedia.org\/wiki\/Feature_selection\" target=\"_blank\">Feature Selection<\/a> Wikipedia<br \/>\n2. 
<a href=\"https:\/\/machinelearningmastery.com\/feature-selection-machine-learning-python\/\" target=\"_blank\">Feature Selection For Machine Learning in Python<\/a><\/p>\n<pre><code>\r\n# -*- coding: utf-8 -*-\r\n\r\n# Feature Extraction with Univariate Statistical Tests\r\nimport pandas\r\nimport numpy\r\nfrom sklearn.feature_selection import SelectKBest\r\nfrom sklearn.feature_selection import chi2\r\n\r\n\r\n# Path to the input CSV file; update it for your machine\r\nfilename = \"C:\\\\Users\\\\Owner\\\\data.csv\"\r\ndataframe = pandas.read_csv(filename)\r\n\r\n# One-hot encode categorical columns (dtype=float gives 0.0\/1.0 values)\r\ndataframe = pandas.get_dummies(dataframe, dtype=float)\r\n# Move the dependent variable Y to the last column\r\ncols = dataframe.columns.tolist()\r\ncols.insert(len(dataframe.columns)-1, cols.pop(cols.index('Y')))\r\ndataframe = dataframe.reindex(columns=cols)\r\n\r\nprint(dataframe)\r\nprint(len(dataframe.columns))\r\n\r\n\r\n# Split into features X (all columns but the last) and target Y (last column)\r\narray = dataframe.values\r\nX = array[:,0:len(dataframe.columns)-1]\r\nY = array[:,len(dataframe.columns)-1]\r\nprint(\"--X----\")\r\nprint(X)\r\nprint(\"--Y----\")\r\nprint(Y)\r\n# feature extraction: score every feature with the chi-squared test\r\ntest = SelectKBest(score_func=chi2, k=\"all\")\r\nfit = test.fit(X, Y)\r\n# summarize scores\r\nnumpy.set_printoptions(precision=3)\r\nprint(\"scores:\")\r\nprint(fit.scores_)\r\n\r\n# print each feature name next to its score\r\nfor i in range(len(fit.scores_)):\r\n    print(str(dataframe.columns.values[i]) + \"    \" + str(fit.scores_[i]))\r\n# keep the selected features (all of them when k=\"all\")\r\nfeatures = fit.transform(X)\r\n\r\nprint(list(dataframe))\r\n\r\nnumpy.set_printoptions(threshold=numpy.inf)\r\nprint(\"features\")\r\nprint(features)\r\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Machine learning algorithms are widely used in every business &#8211; object recognition, marketing analytics, analyzing data in numerous applications to get useful insights. In this post one of machine learning techniques is applied to analysis of blog post data to predict significant features for key metrics such as page views. 
You will see in this &#8230; <a title=\"Getting Data-Driven Insights from Blog Data Analysis with Feature Selection\" class=\"read-more\" href=\"http:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":[]},"categories":[2,9,10],"tags":[31,18,27],"jetpack_publicize_connections":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Getting Data-Driven Insights from Blog Data Analysis with Feature Selection - Machine Learning Applications<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Getting Data-Driven Insights from Blog Data Analysis with Feature Selection - Machine Learning Applications\" \/>\n<meta property=\"og:description\" content=\"Machine learning algorithms are widely used in every business &#8211; object recognition, marketing analytics, analyzing data in numerous applications to get useful insights. In this post one of machine learning techniques is applied to analysis of blog post data to predict significant features for key metrics such as page views. You will see in this ... 
Read more\" \/>\n<meta property=\"og:url\" content=\"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/\" \/>\n<meta property=\"og:site_name\" content=\"Machine Learning Applications\" \/>\n<meta property=\"article:published_time\" content=\"2017-11-13T00:39:03+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2017-11-26T14:36:32+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/11\/feature_selection-300x253.png\" \/>\n<meta name=\"author\" content=\"owygs156\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"owygs156\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/\",\"url\":\"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/\",\"name\":\"Getting Data-Driven Insights from Blog Data Analysis with Feature Selection - Machine Learning 
Applications\",\"isPartOf\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#website\"},\"datePublished\":\"2017-11-13T00:39:03+00:00\",\"dateModified\":\"2017-11-26T14:36:32+00:00\",\"author\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478\"},\"breadcrumb\":{\"@id\":\"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"http:\/\/intelligentonlinetools.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Getting Data-Driven Insights from Blog Data Analysis with Feature Selection\"}]},{\"@type\":\"WebSite\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#website\",\"url\":\"http:\/\/intelligentonlinetools.com\/blog\/\",\"name\":\"Machine Learning Applications\",\"description\":\"Artificial intelligence, data mining and machine learning for building web based tools and services.\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"http:\/\/intelligentonlinetools.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478\",\"name\":\"owygs156\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g\",\"contentUrl\":\"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g\",\"caption\":\"owygs156\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Getting Data-Driven Insights from Blog Data Analysis with Feature Selection - Machine Learning Applications","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/","og_locale":"en_US","og_type":"article","og_title":"Getting Data-Driven Insights from Blog Data Analysis with Feature Selection - Machine Learning Applications","og_description":"Machine learning algorithms are widely used in every business &#8211; object recognition, marketing analytics, analyzing data in numerous applications to get useful insights. In this post one of machine learning techniques is applied to analysis of blog post data to predict significant features for key metrics such as page views. You will see in this ... 
Read more","og_url":"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/","og_site_name":"Machine Learning Applications","article_published_time":"2017-11-13T00:39:03+00:00","article_modified_time":"2017-11-26T14:36:32+00:00","og_image":[{"url":"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/11\/feature_selection-300x253.png"}],"author":"owygs156","twitter_card":"summary_large_image","twitter_misc":{"Written by":"owygs156","Est. reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/","url":"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/","name":"Getting Data-Driven Insights from Blog Data Analysis with Feature Selection - Machine Learning 
Applications","isPartOf":{"@id":"http:\/\/intelligentonlinetools.com\/blog\/#website"},"datePublished":"2017-11-13T00:39:03+00:00","dateModified":"2017-11-26T14:36:32+00:00","author":{"@id":"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478"},"breadcrumb":{"@id":"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/intelligentonlinetools.com\/blog\/2017\/11\/13\/getting-data-driven-insights-from-blog-data-analysis-with-feature-selection\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"http:\/\/intelligentonlinetools.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Getting Data-Driven Insights from Blog Data Analysis with Feature Selection"}]},{"@type":"WebSite","@id":"http:\/\/intelligentonlinetools.com\/blog\/#website","url":"http:\/\/intelligentonlinetools.com\/blog\/","name":"Machine Learning Applications","description":"Artificial intelligence, data mining and machine learning for building web based tools and services.","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"http:\/\/intelligentonlinetools.com\/blog\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Person","@id":"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478","name":"owygs156","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/image\/","url":"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g","contentUrl":"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g","caption":"owygs156"}}]}},"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p7h1IJ-nI","jetpack-related-posts":[{"id":2167,"url":"http:\/\/intelligentonlinetools.com\/blog\/2018\/07\/28\/inferring-causes-effects-daily-data\/","url_meta":{"origin":1470,"position":0},"title":"Inferring Causes and Effects from Daily Data","date":"July 28, 2018","format":false,"excerpt":"Doing different activities we often are interesting how they impact each other. For example, if we visit different links on Internet, we might want to know how this action impacts our motivation for doing some specific things. In other words we are interesting in inferring importance of causes for effects\u2026","rel":"","context":"In &quot;Data Mining&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":1516,"url":"http:\/\/intelligentonlinetools.com\/blog\/2017\/11\/23\/regression-and-classification-decision-trees-building-with-python-and-running-online\/","url_meta":{"origin":1470,"position":1},"title":"Regression and Classification Decision Trees &#8211; Building with Python and Running Online","date":"November 23, 2017","format":false,"excerpt":"According to survey [1] Decision Trees constitute one of the 10 most popular data mining algorithms. Decision trees used in data mining are of two main types: Classification tree analysis is when the predicted outcome is the class to which the data belongs. 
Regression tree analysis is when the predicted\u2026","rel":"","context":"In &quot;Data Mining&quot;","img":{"alt_text":"","src":"https:\/\/i0.wp.com\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/11\/decision_tree_11_2017-300x283.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":966,"url":"http:\/\/intelligentonlinetools.com\/blog\/2017\/02\/18\/building-decision-trees-in-python\/","url_meta":{"origin":1470,"position":2},"title":"Building Decision Trees in Python","date":"February 18, 2017","format":false,"excerpt":"A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm. Decision trees are commonly used in operations research, specifically in decision analysis, to\u2026","rel":"","context":"In &quot;Artificial Intelligence&quot;","img":{"alt_text":"Decision Tree","src":"https:\/\/i0.wp.com\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2017\/02\/dt_post1_N_CTQ_Cost_regr1-2-use-this-300x103.png?resize=350%2C200","width":350,"height":200},"classes":[]},{"id":2253,"url":"http:\/\/intelligentonlinetools.com\/blog\/2018\/09\/06\/ml-applications\/","url_meta":{"origin":1470,"position":3},"title":"Everyday Examples of Machine Learning Applications","date":"September 6, 2018","format":false,"excerpt":"Artificial Intelligence and Machine Learning applications is one of the most hottest topics in the industry today. Robots, self driving cars, intelligent chatbots and many other innovations are coming to our work and life. 
In this post we will look at few machine learning less known applications that were covered\u2026","rel":"","context":"In &quot;Machine learning applications&quot;","img":{"alt_text":"Topic modeling with textacy","src":"https:\/\/i0.wp.com\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/09\/Topic-modeling-with-textacy-e1536508581929.png?resize=350%2C200&ssl=1","width":350,"height":200},"classes":[]},{"id":1446,"url":"http:\/\/intelligentonlinetools.com\/blog\/2017\/11\/06\/10-new-top-resources-on-machine-learning-from-around-the-web\/","url_meta":{"origin":1470,"position":4},"title":"10 New Top Resources on Machine Learning from Around the Web","date":"November 6, 2017","format":false,"excerpt":"For this post I put new and most interesting machine learning resources that I recently found on the web. This is the list of useful resources in such areas like stock market forecasting, text mining, deep learning, neural networks and getting data from Twitter. Hope you enjoy the reading. 1.\u2026","rel":"","context":"In &quot;Machine Learning&quot;","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2194,"url":"http:\/\/intelligentonlinetools.com\/blog\/2018\/08\/11\/applied-machine-learning-classification-for-decision-making\/","url_meta":{"origin":1470,"position":5},"title":"Applied Machine Learning Classification for Decision Making","date":"August 11, 2018","format":false,"excerpt":"Making the good decision is the challenge that we often have. So, in this post, we will look at how applied machine learning classification can be used for the process of decision making. The simple and quick approach to make decision is follow our past experience of similar situations. 
Usually\u2026","rel":"","context":"In &quot;Artificial Intelligence&quot;","img":{"alt_text":"Decision Making","src":"https:\/\/i0.wp.com\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/08\/signs-1172211_640-e1534384498570.jpg?resize=350%2C200","width":350,"height":200},"classes":[]}],"_links":{"self":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/posts\/1470"}],"collection":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/comments?post=1470"}],"version-history":[{"count":39,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/posts\/1470\/revisions"}],"predecessor-version":[{"id":2165,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/posts\/1470\/revisions\/2165"}],"wp:attachment":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/media?parent=1470"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/categories?post=1470"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/tags?post=1470"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}