Integrating Sentiment Analysis API Python Django into Web Application

In this post we will learn how to use sentiment analysis with API python from paralleldots.com. We will look at running this API from python environment on laptop and also in web application environment with python Django on pythonanywhere hosting site.

In the one of previous post we set python Django project for chatbot. Here we will add file to this environment. Setting the chatbot files from previous project is not necessary. We just need folder structure.

Thus in this post we will reuse and extend some python Django knowledge that we got in the previous post. We will learn how to pass parameters from user form to server and back to user form, how to serve images, how to have logic block in web template.

ParrallelDots [1] provides several machine learning APIs such as named entity recognition (NER), intent identification, text classification and sentiment analysis. In this post we will explore sentiment analysis API and how it can be deployed on web server with Diango python.

Running Text Analysis API Locally

First we need install the library:
pip install paralleldots

We need also obtain key. It is free and no credit card required.

Now we run code as below

import paralleldots
paralleldots.set_api_key("XXXXXXXXXXX")

# for single sentence
text="the day is very nice"
lang_code="en"
response=paralleldots.sentiment(text,lang_code)

print(response)
print (response['sentiment'])
print (response['code'])
print (response['probabilities']['positive'])
print (response['probabilities']['negative'])
print (response['probabilities']['neutral'])


# for multiple sentence as array
text=["the day is very nice,the day is very good,this is the best day"]
response=paralleldots.batch_sentiment(text)
print(response)

Output:
{'probabilities': {'negative': 0.001, 'neutral': 0.002, 'positive': 0.997}, 'sentiment': 'positive', 'code': 200}
positive
200
0.997
0.001
0.002
{'sentiment': [{'negative': 0.0, 'neutral': 0.001, 'positive': 0.999}], 'code': 200}

This is very simple. Now we will deploy on web hosting site with python Django.

Deploying API on Web Hosting Site

Here we will build web form. Using this web form user can enter some text which will be passed to semantic analysis API. The result of analysis will be passed back to user and image will be shown based on result of sentiment analysis.
First we need install paralleldots library. To install the paralleldots module for Python 3.6, we’d run this in a Bash console (not in a Python one): [2]
pip3.6 install –user paralleldots

Note it is two dashes before user.
Now create or update the following files:

views.py

In this file we are getting user input from web form and sending it to API. Based on sentiment output from API we select image filename.

from django.shortcuts import render

import paralleldots

def do_sentiment_analysis(request):
    user_sent=""
    user_input=""
    fname="na"
    if request.POST:
       user_input=request.POST.get('user_input', '')
       lang_code="en"
       paralleldots.set_api_key("XXXXXXXXX")
       user_response=paralleldots.sentiment(user_input,lang_code)
       user_sent=user_response['sentiment']

       if (user_sent == 'neutral'):
             fname=  "emoticon-1634586_640.png"
       elif (user_sent == 'negative'):
             fname = "emoticon-1634515_640.png"
       elif (user_sent == 'positive'):
             fname = "smiley-163510_640.jpg"
       else:
             fname="na"

    return render(request, 'my_template_img.html', {'resp': user_sent, 'fname':fname, 'user_input':user_input})

my_template_img.html
Create new file my_template_img.html This file will have web input form for user to enter some text. We have also if statement here because we do not want display image when the form is just opened and no submission is done.

<html>
<form method="post">
    {% csrf_token %}

    <textarea rows=10 cols=50 name="user_input">{{user_input}}</textarea>
    <br>
    <label>{{resp}}</label>
    <br>
    <button type="submit">Submit</button>


  {% if "_640" in fname %}
     <img src="/media/{{fname}}" width="140px" height="100px">
  {% endif %}
</form>
</html>

Media folder
In the media folder download images to represent negative, neutral and positive. We can find images on pixabay site.

So the folder can look like this. Note if we use different file names we will need adjust the code.

 emoticon-1634515_640.png
 emoticon-1634586_640.png
 smiley-163510_640.jpg
 		
 

urls.py
This file is located under /home/username/projectname/projectname. Add import line to this file and also include pattern for do_sentiment_analysis:

from views import do_sentiment_analysis

urlpatterns = [
 
url(r'^press_my_buttons/$', press_my_buttons),
url(r'^do_sentiment_analysis/$', do_sentiment_analysis),

]

settings.py

This file is also located under /home/username/projectname/projectname
Make sure it has the following

STATIC_URL = '/static/'

MEDIA_ROOT = u'/home/username/projectname/media'
MEDIA_URL = '/media/'

STATIC_ROOT = u'/home/username/projectname/static'
STATIC_URL = '/static/'

Now when all is set, just access link. In case we use pythonanywhere it will be: http://username.pythonanywhere.com/do_sentiment_analysis/

Enter some text into text box and click Submit. We will see the output of API for sentiment analysis result and image based on this sentiment. Below are some screenshots.

Conclusion
We integrated machine learning sentiment analysis API from parallelDots into our python Diango web environment. We built web user input form that can send data to this API and receive output from API to show it to user. While building this we learned some Django things:
how to pass parameters from user form to server and back to user form,
how to serve images,
how to have logic block in web template.
We can build now different web applications that would use API service from ParallelDots. And we are able now integrate emotion analysis from text into our website.

References

ParallelDots
Installing New Modules
Handling Media Files in Django
Django Book
How to Create a Chatbot with ChatBot Open Source and Deploy It on the Web – Here we set project folder that we use in this post



Applied Machine Learning Classification for Decision Making

Making the good decision is the challenge that we often have. So, in this post, we will look at how applied machine learning classification can be used for the process of decision making.

The simple and quick approach to make decision is follow our past experience of similar situations. Usually we use compiling a list of pros and cons, asking someone for help or searching on the web. According to [1] we have two systems in our brain: logical and intuitive system:

“With every decision you take, every judgement you make, there is a battle in your mind – a battle between intuition and logic.”

“Most of the beliefs or opinions you have come from an automatic response. But then your logical mind invents a reason why you think or believe something.”

Most of the time our intuitive system is working efficiently, taking charge of all the thousands of decisions we make each day. But our intuitive system can be biased in many ways. For example it can be biased toward the latest unsuccessful outcome.

Besides this our memory can not remember a lot of information so we can not use efficiently all our past information and experience. That’s why people created tools like decision matrix that can help to improve decision making. In the next section we will look at techniques that facilitates using our rational decision making.

Decision Making

Decision Matrix

More advanced approach for making decision is score each possible option. In this approach we score each option against some criteria or feature. For example for the candidate product that we need to buy we are looking at price, quality, service and safety features. This approach results in creating decision matrix for analysis of possible options.

Below is an example of decision matrix [3] for choosing strategy for building some software project. Here we score 4 options based on the time to build and cost. The score is on the scale 0 (worst) – 100 (best). After we score each cell for time and cost rows, we can get sum of scores, rank and then make our choice (see last 3 rows)

Decision Matrix Example
Decision Matrix Example

There are also other, similar to decision matrix tools: belief decision matrix[3], Pugh Matrix[4].
These tools allow do comparison analysis of available options vs features and this enforces our mind logically evaluate and rank as much as possible pros and cons based on some numerical metrics.

However there are some limitations also. Incorrect selection criteria will obviously lead to the wrong conclusion. Poorly defined criteria can have multiple interpretations. For example too low can mean different things. With many rows or columns it becomes labor intensive to fill out the matrix.

Machine Learning Approach for Making Decision

Machine learning techniques can also help us improve decision making and even solve some of the above limitations. For example with Feature Engineering we can evaluate what criteria are important for decision.

In machine learning making decision can be viewed as assigning or predicting correct label (for example buy, not buy) based on data for the item features. In the field of machine learning or AI this is known as classification problem.

Classification algorithms learn correct decisions from data. Below is the example of training data that we input to machine learning classification algorithm. (Xij represent some numerical values)

Classification problem

Our options (decisions) are now represented by class label (most right column), criteria are represented by features. So we now switched columns with rows. Using training data like above we train classifier and then use it to choose the class (make decision) for new data.

There are different classification algorithms such as decision tree, SVM, Naive Bayes, neural network classification. In this post we will look at classification with neural network.
We will use Keras neural network with 2 dense layers.

Per Keras documentation[5] Dense layer implements the operation:
output = activation(dot(input, kernel) + bias)
where activation is the element-wise activation function passed as the activation argument,
kernel is a weights matrix created by the layer,
and bias is a bias vector created by the layer (only applicable if use_bias is True).

So we can see that the dense layer is performing similar math that we were doing in decision matrix.

Python Source Code for Neural Network Classification Algorithms

Finally, below is the python source code for classification problem. To test this code we will use iris dataset. This dataset has 3 classes and 4 features. Our task here to make the classifier able to assign correct label class.

# -*- coding: utf-8 -*-

from keras.utils import to_categorical
from sklearn import datasets

iris = datasets.load_iris()

# Create feature matrix
X = iris.data
print (X)

# Create target vector
y = iris.target
y = to_categorical(y, num_classes=3)
print (y)

from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split( X, y, test_size=0.33, random_state=42)

print (x_train)
print (y_train)

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(32, input_shape=(4,), activation='relu',  name='L1'))
model.add(Dense(3, activation='softmax', name='L2'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

print('Model Summary:')
print(model.summary())

model.fit(x_train, y_train, verbose=2, batch_size=10, epochs=100)
output = model.evaluate(x_test, y_test)

print('Final test loss: {:4f}'.format(output[0]))
print('Final test accuracy: {:4f}'.format(output[1]))


Below are results of neural network run.

Results of neural network run
Results of neural network run

As we can see our classifier is able to make correct decisions with 98% accuracy.

Thus we investigated different approaches for making decision. We saw how machine learning can be applied to this too. Specifically we looked at neural network classification algorithm for selecting correct label.
I would love to hear what types of decision making tools do you use for making decisions? Also feel free to provide feedback or suggestions.

References
1. How do we really make decisions?
2. What Is a Decision Matrix? Definition and Examples
3. Decision matrix From Wikipedia, the free encyclopedia
4. The Systems Engineering Tool Box
5. Keras Documentation



Inferring Causes and Effects from Daily Data

Doing different activities we often are interesting how they impact each other. For example, if we visit different links on Internet, we might want to know how this action impacts our motivation for doing some specific things. In other words we are interesting in inferring importance of causes for effects from our daily activities data.

In this post we will look at few ways to detect relationships between actions and results using machine learning algorithms and python.

Our data example will be artificial dataset consisting of 2 columns: URL and Y.
URL is our action and we want to know how it impacts on Y. URL can be link0, link1, link2 wich means links visited, and Y can be 0 or 1, 0 means we did not got motivated, and 1 means we got motivated.

The first thing we do hot-encoding link0, link1, link3 in 0,1 and we will get 3 columns as below.

Sample of data after one hot encoding
Sample of data after one hot encoding

So we have now 3 features, each for each URL. Here is the code how to do hot-encoding to prepare our data for cause and effect analysis.

filename = "C:\\Users\\drm\\data.csv"
dataframe = pandas.read_csv(filename)

dataframe=pandas.get_dummies(dataframe)
cols = dataframe.columns.tolist()
cols.insert(len(dataframe.columns)-1, cols.pop(cols.index('Y')))
dataframe = dataframe.reindex(columns= cols)

print (len(dataframe.columns))

#output
#4 

Now we can apply feature extraction algorithm. It allows us select features according to the k highest scores.

# feature extraction
test = SelectKBest(score_func=chi2, k="all")
fit = test.fit(X, Y)
# summarize scores
numpy.set_printoptions(precision=3)
print ("scores:")
print(fit.scores_)

for i in range (len(fit.scores_)):
    print ( str(dataframe.columns.values[i]) + "    " + str(fit.scores_[i]))
features = fit.transform(X)

print (list(dataframe))

numpy.set_printoptions(threshold=numpy.inf)


scores:
[11.475  0.142 15.527]
URL_link0    11.475409836065575
URL_link1    0.14227166276346598
URL_link2    15.526957539965377
['URL_link0', 'URL_link1', 'URL_link2', 'Y']


Another algorithm that we can use is <strong>ExtraTreesClassifier</strong> from python machine learning library sklearn.


from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel

clf = ExtraTreesClassifier()
clf = clf.fit(X, Y)
print (clf.feature_importances_)  
model = SelectFromModel(clf, prefit=True)
X_new = model.transform(X)
print (X_new.shape)      

#output
#[0.424 0.041 0.536]
#(150, 2)


The above two machine learning algorithms helped us to estimate the importance of our features (or actions) for our Y variable. In both cases URL_link2 got highest score.

There exist other methods. I would love to hear what methods do you use and for what datasets and/or problems. Also feel free to provide feedback or comments or any questions.

References
1. Feature Selection For Machine Learning in Python
2. sklearn.ensemble.ExtraTreesClassifier

Below is python full source code

# -*- coding: utf-8 -*-
import pandas
import numpy
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import chi2


filename = "C:\\Users\\drm\\data.csv"
dataframe = pandas.read_csv(filename)

dataframe=pandas.get_dummies(dataframe)
cols = dataframe.columns.tolist()
cols.insert(len(dataframe.columns)-1, cols.pop(cols.index('Y')))
dataframe = dataframe.reindex(columns= cols)

print (dataframe)
print (len(dataframe.columns))


array = dataframe.values
X = array[:,0:len(dataframe.columns)-1]  
Y = array[:,len(dataframe.columns)-1]   
print ("--X----")
print (X)
print ("--Y----")
print (Y)



# feature extraction
test = SelectKBest(score_func=chi2, k="all")
fit = test.fit(X, Y)
# summarize scores
numpy.set_printoptions(precision=3)
print ("scores:")
print(fit.scores_)

for i in range (len(fit.scores_)):
    print ( str(dataframe.columns.values[i]) + "    " + str(fit.scores_[i]))
features = fit.transform(X)

print (list(dataframe))

numpy.set_printoptions(threshold=numpy.inf)
print ("features")
print(features)


from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel

clf = ExtraTreesClassifier()
clf = clf.fit(X, Y)
print ("feature_importances")
print (clf.feature_importances_)  
model = SelectFromModel(clf, prefit=True)
X_new = model.transform(X)
print (X_new.shape) 



Machine Learning for Correlation Data Analysis Between Food and Mood

Can sweet food affect our mood? A friend of mine was interesting if some of his minor mood changes are caused by sugar intake from sweets like cookies. He collected and provided records and in this post we will use correlation data analysis with python pandas dataframes to check the connection between food and mood. We will create python script for this task.


food and mood

Also online free service available at Online Machine Learning Algorithms where you can plug your data or play with this example data. (Select Time Series Correlation). The links to the code and data are provided below and in the references:

Dataset from Correlation Data Analysis Between Food and Mood
Source Code for Machine Learning Correlation Data Analysis Between Food and Mood

Connection Between Eating and Mental Health

From internet resources we can confirm that relationship between how we feel and what we eat exists.[1] Sweet food is not recommended to eat as fluctuations in blood sugar cause mood swings, lack of energy [2]. The information about chocolate is however contradictory. Chocolate affects us both negatively and positively.[3] But chocolate has also sugar.
What if we eat only small amount of sweets and not each day – is there still any connection and how strong is it? The machine learning data analysis can help us to investigate this.

The Problem

So in this post we will estimate correlation between sweet food and mood based on provided daily data.
Correlation means association – more precisely it is a measure of the extent to which two variables are related. [4]

Data

The dataset has two columns, X and Y where:
X is how much sweet food was taken on daily basis, on the scale 0 – 1 , 0 is nothing, 1 means a max value.
Y is variation of mood from optimal state, on the scale 0 – 1 , 0 – means no variations or no defects, 1 means a max value.

Approach

If we calculate correlation between 2 columns of daily data we will get something around 0. However this would not show whole picture. Because the effect of the food might take action in a few days. The good or bad feeling can also stay for few days after the event that caused this feeling.
So we would need to take average data for several days for both X (looking back) and Y (looking forward). Here is the diagram that explains how data will be aggregated:

Changing the data - averaging
Changing the data – averaging

And here is how we can do this in the program:
1.for each day take average X data for last N days and take average Y data for M next days.
2.create a pandas dataframe which has now new moving averages for X and Y.
3.calculate correlation between new X and Y data

What should be N and M? We will use different values – from 1 to 14. And we will check what is the highest value for correlation.

Here is the python code to use pandas dataframe for calculating averages:

def get_data (df_pandas,k,z):
    
    x = np.zeros(df_pandas.shape[0]) 
    y = np.zeros(df_pandas.shape[0])
       
    new_df = pd.DataFrame() #creates a new dataframe that's empty
    for index, row in df_pandas.iterrows():
       
        x[index]=df_pandas.loc[index-k:index,'X'].mean()
     
        y[index]=df_pandas.loc[index:index+z,'Y'].mean()
    
    new_df=pd.concat([pd.DataFrame(x),pd.DataFrame(y)], "columns")
    new_df.columns = ['X', 'Y']
   
    return new_df    

Correlation Data Analysis

For calculating correlation we use also pandas dataframe. Here is the code snipped for this:

for i in range (1,n):
    for j in range (1,m):
   
       data=get_data(df, i, j)
       corr_df.loc[i, j] = data['X'].corr(data['Y'])

print ("corr_df")       
print (corr_df)  

pandas.DataFrame.corr by default is calculating pearson correlation coefficient – it is the measure of the strength of the linear relationship between two variables. In our code we use this default option. [8]

Results

After calculating correlation coefficients we output data in the table format and plot results on heatmap using seaborn module. Below is the data output and the plot. The max value of correlation for each column is highlighted in yellow in the data table. Input data and full source code are available at [5],[6].

Correlation data
Correlation data between sweet food (taken in n days) and mood (in next m days)
Correlation data between sweet food (taken in N days)  and mood in the following averaged M days,
Correlation data between sweet food (taken in N days) and mood in the following M days, averaged

Conclusion

We performed correlation analysis between eating sweet food and mental health. And we confirmed that in our data example there is a moderate correlation (0.4). This correlation is showing up when we use moving averaging for 5 or 6 days. This corresponds with observation that swing mood may appear in several days, not on the same or next day after eating sweet food.

We also learned how we can estimate correlation between two time series variables X, Y.

Feel free to experiment with your data or this example of correlation data analysis using this link Online Machine Learning Algorithms Use “load default values” to run this example. Below is the screenshot from this online tool.

Online calculation correlation and building heatmap
Online calculation correlation and building heatmap

References
1. Our Moods, Our Foods The messy relationship between how we feel and what we eat
2. Can food affect your mood? By Cynthia Ramnarace, upwave.com
3. The Effects Of Chocolate On The Emotions
4. Correlation
5. Dataset from Correlation Data Analysis Between Food and Mood
6.Source Code for Machine Learning Correlation Data Analysis Between Food and Mood
7.Calculating Correlations of Forex Currency Pairs in Python
8.pandas.DataFrame.corr



Visualization of Viterbi Path for Hidden Markov Models

Hidden Markov Models and Trellis Diagram

The Viterbi path is the most likely sequence of hidden states that produce a sequence of observed events. We can calculate Viterbi path using Viterbi algorithm. The focus of this post will be how to visualize Viterbi path for HMM (Hidden Markov Models) using trellis diagram. Finding Viterbi path provides the answer to decoding question in HMM – given observations and states transitions find most likely states.

In many papers and texts we can find trellis diagrams are utilized to solve or visualize the above decoding problem. Trellis structures can be represented with the help of table that have number of rows equal to number of states and number of columns equal to number of observations.

When we move from one state in one column to another state in adjacent column we use transition probability matrix.

The likelihood of the state sequence given the observation sequence can be found by simply following the path in the trellis diagram, multiplying the observation and transition likelihoods.

Below are decoding examples of calculating probabilities in Viterbi algorithm.

Trellis Diagram Step2
Trellis Diagram Step2

Trellis Diagram Step3
Trellis Diagram Step3
Visualization of Viterbi’s path through trellis diagram

Source: Wikipedia An example of HMM[1]

Animated examples like above looks good but what if you want to plug your data and see results?

Calculating Viterbi Path Online

Here is how you can calculate Viterbi algorithm and visualize Viterbi path using the online tool from this blog site. The site is using code from tensorflow_hmm provided by Zach Dwiel. [2] The code actually has two implementations – one is using numpy array and another is using tensorflow library.

I added visualization Viterbi path through trellis table. With this addition we can understand more and easy how the path was calculated. And also we can track the path through different states and observations. Below is the sample of output for the table with calculated Viterbi path. The Viterbi path is represented by gray cells, states: 0,0,1

Viterbi Path Visualization
Viterbi Path Visualization

Below are the steps how to use Online Machine Learning Algorithms tool:

1. Access Online Machine Learning Algorithms and select HMM as shown on below screenshot. HMM highlighted in yellow.

Online Machine Learning Algorithms Tool Step1
Online Machine Learning Algorithms Tool Step1

2. Input the data that you want to run. Or click Load Default Values to run with preconfigured test example. Use this button to see input data format.

Online Machine Learning Algorithms Tool Step1
Online Machine Learning Algorithms Tool Step1

3. Click Run now.
4. Click View Run Results.

5. Click Refresh Page button on this new page , you maybe will need click few times untill data output show up. Usually it takes less than 1 min, but it will depend how much data you need to process.
Scroll to the bottom page to see calculations.

Online Machine Learning Algorithms Tool Step3
Online Machine Learning Algorithms Tool Step3

Hope you will find useful online tool for Visualization of Viterbi Path for Hidden Markov Models and will give try to run HMM for your data or default data. Feel free to post in the comments box questions or suggestions.

References
1. An example of HMM
2. tensorflow_hmm
3. Online Machine Learning Algorithms Tool