{"id":1995,"date":"2018-04-17T23:53:45","date_gmt":"2018-04-17T23:53:45","guid":{"rendered":"http:\/\/intelligentonlinetools.com\/blog\/?p=1995"},"modified":"2018-04-20T23:43:25","modified_gmt":"2018-04-20T23:43:25","slug":"lstm-neural-network-training-techniques-tuning-hyperparameters","status":"publish","type":"post","link":"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/","title":{"rendered":"LSTM Neural Network Training &#8211; Few Useful Techniques for Tuning Hyperparameters and Saving Time"},"content":{"rendered":"<p>Neural networks are among the most widely used machine learning techniques.[1] But training a neural network and tuning its multiple hyperparameters takes time. I was recently building an LSTM neural network for the post <a href=http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/03\/machine-learning-stock-market-prediction-lstm-keras\/  target='_blank'>Machine Learning Stock Market Prediction with LSTM Keras <\/a> and learned some tricks that can save time.  In this post you will find some techniques that helped me train neural networks more efficiently.<\/p>\n<h2>1.  Adjusting the Graph To See All Details<\/h2>\n<p>Sometimes the validation loss reaches very high values, which hides the other details on the chart.  I added a few lines of code to cap high values so that everything is visible on the chart. 
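As a side note, the same capping can be written in one call with NumPy's clip. Below is a minimal, self-contained sketch; the val_loss list here is made-up data standing in for history.history['val_loss'], and T=25 matches the threshold used in the loop below.

```python
import numpy as np

# Made-up validation-loss history; in the real script this would be
# history.history['val_loss'] after model.fit().
val_loss = [120.0, 30.2, 9.7, 3.1, 1.4]

T = 25  # cap threshold, same value as in the loop below
# np.clip with a_min=None only caps from above: values > T become T,
# values <= T are left unchanged.
history_val_loss = np.clip(val_loss, None, T).tolist()

print(history_val_loss)  # [25.0, 25.0, 9.7, 3.1, 1.4]
```

Another option, if you only care about the plot itself, is to leave the data untouched and limit the axis with plt.ylim(0, T).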
<\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nimport matplotlib.pyplot as plt\r\nimport matplotlib.ticker as mtick\r\n\r\nT=25\r\nhistory_val_loss=[]\r\n\r\nfor x in history.history['val_loss']:\r\n      if x &gt;= T:\r\n             history_val_loss.append (T)\r\n      else:\r\n             history_val_loss.append( x )\r\n\r\nplt.figure(6)\r\nplt.plot(history.history['loss'])\r\nplt.plot(history_val_loss)\r\nplt.title('model loss adjusted')\r\nplt.ylabel('loss')\r\nplt.xlabel('epoch')\r\nplt.legend(['train', 'test'], loc='upper left')\r\n<\/pre>\n<p>Below is an example of such charts. The left graph shows no details except the high-value point because of the scale. Note that the graphs were obtained from different tests.<br \/>\n<figure id=\"attachment_2007\" aria-describedby=\"caption-attachment-2007\" style=\"width: 670px\" class=\"wp-caption alignnone\"><img data-attachment-id=\"2007\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/lstm-nn-training-value-loss-with-high-number-and-adjusted\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-with-High-Number-and-adjusted-e1524097124417.png\" data-orig-size=\"680,308\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"LSTM NN Training Value Loss Charts with High Number and Adjusted\" data-image-description=\"&lt;p&gt;LSTM NN Training Value Loss Charts with High Number and Adjusted&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;LSTM 
NN Training Value Loss Charts with High Number and Adjusted&lt;\/p&gt;\n\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-with-High-Number-and-adjusted-300x136.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-with-High-Number-and-adjusted-e1524097124417.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-with-High-Number-and-adjusted-e1524097124417.png\" alt=\"LSTM NN Training Value Loss Charts with High Number and Adjusted\" width=\"680\" height=\"308\" class=\"size-full wp-image-2007\" \/><figcaption id=\"caption-attachment-2007\" class=\"wp-caption-text\">LSTM NN Training Value Loss Charts with High Number and Adjusted<\/figcaption><\/figure><\/p>\n<h2>2.  Early Stopping<\/h2>\n<p>Early stopping saves time by halting training once the monitored quantity has stopped improving.  Here is how it can be coded:<\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nfrom keras.callbacks import EarlyStopping\r\n\r\nearlystop = EarlyStopping(monitor='val_loss', min_delta=0.0001, patience=80,  verbose=1, mode='min')\r\ncallbacks_list = [earlystop]\r\n\r\nhistory=model.fit (x_train, y_train, batch_size =1, nb_epoch =1000, shuffle = False, validation_split=0.15, callbacks=callbacks_list)\r\n<\/pre>\n<p>Here is what the arguments mean, per the Keras documentation [2].<\/p>\n<p>min_delta: minimum change in the monitored quantity to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement.<br \/>\npatience: number of epochs with no improvement after which training will be stopped.<br \/>\nverbose: verbosity mode.<br \/>\nmode: one of {auto, min, max}. 
In min mode, training will stop when the quantity monitored has stopped decreasing; in max mode it will stop when the quantity monitored has stopped increasing; in auto mode, the direction is automatically inferred from the name of the monitored quantity.<\/p>\n<h2>3.  Weight Regularization<\/h2>\n<p>A weight regularizer can be used to keep the neural net weights small. Here is an example (in this case the L1L2 regularizer is applied to the bias term): <\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nfrom keras.regularizers import L1L2\r\nmodel.add (LSTM ( 400,  activation = 'relu', inner_activation = 'hard_sigmoid' , bias_regularizer=L1L2(l1=0.01, l2=0.01),  input_shape =(len(cols), 1), return_sequences = False ))\r\n<\/pre>\n<p>Below are charts showing the impact of the weight regularizer on the loss value:<br \/>\n<figure id=\"attachment_2002\" aria-describedby=\"caption-attachment-2002\" style=\"width: 418px\" class=\"wp-caption alignnone\"><img data-attachment-id=\"2002\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/lstm-nn-training-value-loss-without-weigh-regularization\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-without-weigh-regularization.png\" data-orig-size=\"428,334\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"LSTM NN Training Value Loss without weigh regularization\" data-image-description=\"&lt;p&gt;LSTM NN Training Value Loss without weigh regularization&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;LSTM NN 
Training Value Loss without weigh regularization&lt;\/p&gt;\n\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-without-weigh-regularization-300x234.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-without-weigh-regularization.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-without-weigh-regularization.png\" alt=\"LSTM NN Training Value Loss without weight regularization\" width=\"428\" height=\"334\" class=\"size-full wp-image-2002\" srcset=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-without-weigh-regularization.png 428w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-without-weigh-regularization-300x234.png 300w\" sizes=\"(max-width: 428px) 100vw, 428px\" \/><figcaption id=\"caption-attachment-2002\" class=\"wp-caption-text\">LSTM NN Training Value Loss without weight regularization<\/figcaption><\/figure><\/p>\n<figure id=\"attachment_2001\" aria-describedby=\"caption-attachment-2001\" style=\"width: 467px\" class=\"wp-caption alignnone\"><img data-attachment-id=\"2001\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/lstm-nn-training-value-loss-with-weight-regularization\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-with-weight-regularization.png\" data-orig-size=\"477,376\" data-comments-opened=\"1\" 
data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"LSTM NN Training Value Loss with weight regularization\" data-image-description=\"&lt;p&gt;LSTM NN Training Value Loss without weigh regularization&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;LSTM NN Training Value Loss without weigh regularization&lt;\/p&gt;\n\" data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-with-weight-regularization-300x236.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-with-weight-regularization.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-with-weight-regularization.png\" alt=\"LSTM NN Training Value Loss with weight regularization\" width=\"477\" height=\"376\" class=\"size-full wp-image-2001\" srcset=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-with-weight-regularization.png 477w, http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-with-weight-regularization-300x236.png 300w\" sizes=\"(max-width: 477px) 100vw, 477px\" \/><figcaption id=\"caption-attachment-2001\" class=\"wp-caption-text\">LSTM NN Training Value Loss with weight regularization<\/figcaption><\/figure>\n<p>Without weight regularization, the validation loss climbs higher during training.<\/p>\n<h2> 4. Optimizer <\/h2>\n<p>Keras allows the use of different optimizers.  
I used the Adam optimizer, which is widely used. Note that the configured optimizer instance (not the string 'adam', which would create a default Adam) must be passed to compile:<\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nadam=optimizers.Adam(lr=0.01, beta_1=0.91, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=True)\r\nmodel.compile (loss =&quot;mean_squared_error&quot; , optimizer = adam) \r\n<\/pre>\n<p>I found that beta_1=0.89 performed better than the suggested 0.91 or other tested values. <\/p>\n<h2>5. Rolling Window Size <\/h2>\n<p>The rolling window size (in case we use one) can also impact performance: a window that is too small or too large drives the validation loss higher. Below are charts for different window sizes (N=4, 8, 16, 18, from left to right). In this case the optimal value was 16, which resulted in 81% accuracy.<\/p>\n<figure id=\"attachment_2005\" aria-describedby=\"caption-attachment-2005\" style=\"width: 690px\" class=\"wp-caption alignnone\"><img data-attachment-id=\"2005\" data-permalink=\"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/lstm-nn-loss-charts\/#main\" data-orig-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Loss-Charts-e1524096629401.png\" data-orig-size=\"700,159\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"LSTM Neural Net  Loss Charts with Different N\" data-image-description=\"&lt;p&gt;LSTM Neural Net  Loss Charts with Different N&lt;\/p&gt;\n\" data-image-caption=\"&lt;p&gt;LSTM Neural Net  Loss Charts with Different N&lt;\/p&gt;\n\" 
data-medium-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Loss-Charts-300x68.png\" data-large-file=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Loss-Charts-e1524096629401.png\" decoding=\"async\" loading=\"lazy\" src=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Loss-Charts-e1524096629401.png\" alt=\"LSTM Neural Net Loss Charts with Different N\" width=\"700\" height=\"159\" class=\"size-full wp-image-2005\" \/><figcaption id=\"caption-attachment-2005\" class=\"wp-caption-text\">LSTM Neural Net  Loss Charts with Different N<\/figcaption><\/figure>\n<p>I hope you enjoyed this post on different techniques for tuning hyperparameters.  If you have any tips or anything else to add, please leave a comment below.<\/p>\n<p>Below is the full source code:<\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\nimport numpy as np\r\nimport pandas as pd\r\nfrom sklearn import preprocessing\r\n\r\nimport matplotlib.pyplot as plt\r\nimport matplotlib.ticker as mtick\r\n\r\nfrom keras.regularizers import L1L2\r\n\r\nfname=&quot;C:\\\\Users\\\\stock data\\\\GM.csv&quot;\r\ndata_csv = pd.read_csv (fname)\r\n\r\n# how much data we will use\r\n# (should not be more than the dataset length)\r\ndata_to_use= 150\r\n\r\n# number of training samples\r\n# (should be less than data_to_use)\r\ntrain_end =120\r\n\r\n\r\ntotal_data=len(data_csv)\r\n\r\n# most recent data is at the end,\r\n# so we need an offset\r\nstart=total_data - data_to_use\r\n\r\n\r\nyt = data_csv.iloc [start:total_data ,4]    # Close price\r\nyt_ = yt.shift (-1)   \r\n\r\nprint (yt_)\r\n\r\ndata = pd.concat ([yt, yt_], axis =1)\r\ndata.
columns = ['yt', 'yt_']\r\n\r\n\r\nN=16    \r\ncols =['yt']\r\nfor i in range (N):\r\n  \r\n    data['yt'+str(i)] = list(yt.shift(i+1))\r\n    cols.append ('yt'+str(i))\r\n    \r\ndata = data.dropna()\r\ndata_original = data\r\ndata=data.diff()\r\ndata = data.dropna()\r\n    \r\n    \r\n# target variable - closed price\r\n# after shifting\r\ny = data ['yt_']\r\nx = data [cols]\r\n\r\n   \r\nscaler_x = preprocessing.MinMaxScaler ( feature_range =( -1, 1))\r\nx = np. array (x).reshape ((len( x) ,len(cols)))\r\nx = scaler_x.fit_transform (x)\r\n\r\nscaler_y = preprocessing. MinMaxScaler ( feature_range =( -1, 1))\r\ny = np.array (y).reshape ((len( y), 1))\r\ny = scaler_y.fit_transform (y)\r\n\r\n    \r\nx_train = x [0: train_end,]\r\nx_test = x[ train_end +1:len(x),]    \r\ny_train = y [0: train_end] \r\ny_test = y[ train_end +1:len(y)]  \r\n\r\nx_train = x_train.reshape (x_train. shape + (1,)) \r\nx_test = x_test.reshape (x_test. shape + (1,))\r\n\r\nfrom keras.models import Sequential\r\nfrom keras.layers.core import Dense\r\nfrom keras.layers.recurrent import LSTM\r\nfrom keras.layers import  Dropout\r\nfrom keras import optimizers\r\n\r\nfrom numpy.random import seed\r\nseed(1)\r\nfrom tensorflow import set_random_seed\r\nset_random_seed(2)\r\n\r\nfrom keras import regularizers\r\n\r\nfrom keras.callbacks import EarlyStopping\r\n\r\n\r\nearlystop = EarlyStopping(monitor='val_loss', min_delta=0.0001, patience=80,  verbose=1, mode='min')\r\ncallbacks_list = [earlystop]\r\n\r\nmodel = Sequential ()\r\nmodel.add (LSTM ( 400,  activation = 'relu', inner_activation = 'hard_sigmoid' , bias_regularizer=L1L2(l1=0.01, l2=0.01),  input_shape =(len(cols), 1), return_sequences = False ))\r\nmodel.add(Dropout(0.3))\r\nmodel.add (Dense (output_dim =1, activation = 'linear', activity_regularizer=regularizers.l1(0.01)))\r\nadam=optimizers.Adam(lr=0.01, beta_1=0.89, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=True)\r\nmodel.compile (loss =&quot;mean_squared_error&quot; , 
optimizer = adam) \r\nhistory=model.fit (x_train, y_train, batch_size =1, nb_epoch =1000, shuffle = False, validation_split=0.15, callbacks=callbacks_list)\r\n\r\n\r\ny_train_back=scaler_y.inverse_transform (np. array (y_train). reshape ((len( y_train), 1)))\r\nplt.figure(1)\r\nplt.plot (y_train_back)\r\n\r\n\r\nfmt = '%.1f'\r\ntick = mtick.FormatStrFormatter(fmt)\r\nax = plt.axes()\r\nax.yaxis.set_major_formatter(tick)\r\nprint (model.summary())\r\n\r\nprint(history.history.keys())\r\n\r\nT=25\r\nhistory_val_loss=[]\r\n\r\nfor x in history.history['val_loss']:\r\n      if x &gt;= T:\r\n             history_val_loss.append (T)\r\n      else:\r\n             history_val_loss.append( x )\r\n\r\n\r\nplt.figure(2)\r\nplt.plot(history.history['loss'])\r\nplt.plot(history.history['val_loss'])\r\nplt.title('model loss')\r\nplt.ylabel('loss')\r\nplt.xlabel('epoch')\r\nplt.legend(['train', 'test'], loc='upper left')\r\nfmt = '%.1f'\r\ntick = mtick.FormatStrFormatter(fmt)\r\nax = plt.axes()\r\nax.yaxis.set_major_formatter(tick)\r\n\r\n\r\n\r\nplt.figure(6)\r\nplt.plot(history.history['loss'])\r\nplt.plot(history_val_loss)\r\nplt.title('model loss adjusted')\r\nplt.ylabel('loss')\r\nplt.xlabel('epoch')\r\nplt.legend(['train', 'test'], loc='upper left')\r\n\r\n\r\nscore_train = model.evaluate (x_train, y_train, batch_size =1)\r\nscore_test = model.evaluate (x_test, y_test, batch_size =1)\r\nprint (&quot; in train MSE = &quot;, round( score_train ,4)) \r\nprint (&quot; in test MSE = &quot;, score_test )\r\n\r\npred1 = model.predict (x_test) \r\npred1 = scaler_y.inverse_transform (np. array (pred1). 
reshape ((len( pred1), 1)))\r\n \r\nprediction_data = pred1[-1]     \r\nmodel.summary()\r\nprint (&quot;Inputs: {}&quot;.format(model.input_shape))\r\nprint (&quot;Outputs: {}&quot;.format(model.output_shape))\r\nprint (&quot;Actual input: {}&quot;.format(x_test.shape))\r\nprint (&quot;Actual output: {}&quot;.format(y_test.shape))\r\n\r\nprint (&quot;prediction data:&quot;)\r\nprint (prediction_data)\r\n\r\ny_test = scaler_y.inverse_transform (np. array (y_test). reshape ((len( y_test), 1)))\r\nprint (&quot;y_test:&quot;)\r\nprint (y_test)\r\n\r\nact_data = np.array([row[0] for row in y_test])\r\n\r\nfmt = '%.1f'\r\ntick = mtick.FormatStrFormatter(fmt)\r\nax = plt.axes()\r\nax.yaxis.set_major_formatter(tick)\r\n\r\nplt.figure(3)\r\nplt.plot( y_test, label=&quot;actual&quot;)\r\nplt.plot(pred1, label=&quot;predictions&quot;)\r\n\r\nprint (&quot;act_data:&quot;)\r\nprint (act_data)\r\n\r\nprint (&quot;pred1:&quot;)\r\nprint (pred1)\r\n\r\nplt.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05),\r\n          fancybox=True, shadow=True, ncol=2)\r\n\r\n\r\nfmt = '$%.1f'\r\ntick = mtick.FormatStrFormatter(fmt)\r\nax = plt.axes()\r\nax.yaxis.set_major_formatter(tick)\r\n\r\ndef moving_test_window_preds(n_future_preds):\r\n\r\n    ''' n_future_preds - Represents the number of future predictions we want to make\r\n                         This coincides with the number of windows that we will move forward\r\n                         on the test data\r\n    '''\r\n    preds_moving = []                                    # Store the prediction made on each test window\r\n    moving_test_window = [x_test[0,:].tolist()]          # First test window\r\n    moving_test_window = np.array(moving_test_window)    \r\n   \r\n    for i in range(n_future_preds):\r\n      \r\n      \r\n        preds_one_step = model.predict(moving_test_window) \r\n        preds_moving.append(preds_one_step[0,0]) \r\n                       \r\n        preds_one_step = preds_one_step.reshape(1,1,1) \r\n 
       moving_test_window = np.concatenate((moving_test_window[:,1:,:], preds_one_step), axis=1) # new moving test window, where the first element from the window has been removed and the prediction  has been appended to the end\r\n        \r\n\r\n    print (&quot;pred moving before scaling:&quot;)\r\n    print (preds_moving)\r\n                                         \r\n    preds_moving = scaler_y.inverse_transform((np.array(preds_moving)).reshape(-1, 1))\r\n    \r\n    print (&quot;pred moving after scaling:&quot;)\r\n    print (preds_moving)\r\n    return preds_moving\r\n    \r\nprint (&quot;do moving test predictions for next 22 days:&quot;)    \r\npreds_moving = moving_test_window_preds(22)\r\n\r\n\r\ncount_correct=0\r\nerror =0\r\nfor i in range (len(y_test)):\r\n    error=error + ((y_test[i]-preds_moving[i])**2) \/ y_test[i]\r\n\r\n \r\n    if y_test[i] &gt;=0 and preds_moving[i] &gt;=0 :\r\n        count_correct=count_correct+1\r\n    if y_test[i] &lt; 0 and preds_moving[i] &lt; 0 :\r\n        count_correct=count_correct+1\r\n\r\naccuracy_in_change =  count_correct \/ (len(y_test) )\r\n\r\nplt.figure(4)\r\nplt.title(&quot;Forecast vs Actual, (data is differenced)&quot;)          \r\nplt.plot(preds_moving, label=&quot;predictions&quot;)\r\nplt.plot(y_test, label=&quot;actual&quot;)\r\nplt.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05),\r\n          fancybox=True, shadow=True, ncol=2)\r\n\r\n\r\nprint (&quot;accuracy_in_change:&quot;)\r\nprint (accuracy_in_change)\r\n\r\nind=data_original.index.values[0] + data_original.shape[0] -len(y_test)-1\r\nprev_starting_price = data_original.loc[ind,&quot;yt_&quot;]\r\npreds_moving_before_diff =  [0 for x in range(len(preds_moving))]\r\n\r\nfor i in range (len(preds_moving)):\r\n    if (i==0):\r\n        preds_moving_before_diff[i]=prev_starting_price + preds_moving[i]\r\n    else:\r\n        preds_moving_before_diff[i]=preds_moving_before_diff[i-1]+preds_moving[i]\r\n\r\n\r\ny_test_before_diff = [0 for x in 
range(len(y_test))]\r\n\r\nfor i in range (len(y_test)):\r\n    if (i==0):\r\n        y_test_before_diff[i]=prev_starting_price + y_test[i]\r\n    else:\r\n        y_test_before_diff[i]=y_test_before_diff[i-1]+y_test[i]\r\n\r\n\r\nplt.figure(5)\r\nplt.title(&quot;Forecast vs Actual (non differenced data)&quot;)\r\nplt.plot(preds_moving_before_diff, label=&quot;predictions&quot;)\r\nplt.plot(y_test_before_diff, label=&quot;actual&quot;)\r\nplt.legend(loc='upper center', bbox_to_anchor=(0.5, -0.05),\r\n          fancybox=True, shadow=True, ncol=2)\r\nplt.show()\r\n\r\n<\/pre>\n<p><strong>References<\/strong><br \/>\n1. <a href=\"https:\/\/dzone.com\/articles\/using-graknai-to-enhance-neural-network-models-for\" target=\"_blank\">Enhancing Neural Network Models for Knowledge Base Completion<\/a><br \/>\n2. <a href=https:\/\/keras.io\/callbacks\/ target=\"_blank\">Usage of callbacks<\/a><br \/>\n3. <a href=https:\/\/medium.com\/making-sense-of-data\/time-series-next-value-prediction-using-regression-over-a-rolling-window-228f0acae363 target=\"_blank\">Rolling Window Regression: a Simple Approach for Time Series Next value Predictions<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Neural networks are among the most widely used machine learning techniques.[1] But neural network training and tuning multiple hyper-parameters takes time. I was recently building LSTM neural network for prediction for this post Machine Learning Stock Market Prediction with LSTM Keras and I learned some tricks that can save time. 
In this post you will &#8230; <a title=\"LSTM Neural Network Training &#8211; Few Useful Techniques for Tuning Hyperparameters and Saving Time\" class=\"read-more\" href=\"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":[]},"categories":[9,10,60],"tags":[48,63,18,27,16],"jetpack_publicize_connections":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>LSTM Neural Network Training - Few Useful Techniques for Tuning Hyperparameters and Saving Time - Machine Learning Applications<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"LSTM Neural Network Training - Few Useful Techniques for Tuning Hyperparameters and Saving Time - Machine Learning Applications\" \/>\n<meta property=\"og:description\" content=\"Neural networks are among the most widely used machine learning techniques.[1] But neural network training and tuning multiple hyper-parameters takes time. I was recently building LSTM neural network for prediction for this post Machine Learning Stock Market Prediction with LSTM Keras and I learned some tricks that can save time. In this post you will ... 
Read more\" \/>\n<meta property=\"og:url\" content=\"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/\" \/>\n<meta property=\"og:site_name\" content=\"Machine Learning Applications\" \/>\n<meta property=\"article:published_time\" content=\"2018-04-17T23:53:45+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2018-04-20T23:43:25+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/intelligentonlinetools.com\/blog\/wp-content\/uploads\/2018\/04\/LSTM-NN-Training-Value-Loss-with-High-Number-and-adjusted-e1524097124417.png\" \/>\n<meta name=\"author\" content=\"owygs156\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"owygs156\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/\",\"url\":\"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/\",\"name\":\"LSTM Neural Network Training - Few Useful Techniques for Tuning Hyperparameters and Saving Time - Machine Learning 
Applications\",\"isPartOf\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#website\"},\"datePublished\":\"2018-04-17T23:53:45+00:00\",\"dateModified\":\"2018-04-20T23:43:25+00:00\",\"author\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478\"},\"breadcrumb\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"http:\/\/intelligentonlinetools.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"LSTM Neural Network Training &#8211; Few Useful Techniques for Tuning Hyperparameters and Saving Time\"}]},{\"@type\":\"WebSite\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#website\",\"url\":\"http:\/\/intelligentonlinetools.com\/blog\/\",\"name\":\"Machine Learning Applications\",\"description\":\"Artificial intelligence, data mining and machine learning for building web based tools and services.\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"http:\/\/intelligentonlinetools.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required 
name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/7a886dc5eb9758369af2f6d2cb342478\",\"name\":\"owygs156\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g\",\"contentUrl\":\"http:\/\/2.gravatar.com\/avatar\/b351def598609cb4c0b5bca26497c7e5?s=96&d=mm&r=g\",\"caption\":\"owygs156\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"LSTM Neural Network Training - Few Useful Techniques for Tuning Hyperparameters and Saving Time - Machine Learning Applications","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"http:\/\/intelligentonlinetools.com\/blog\/2018\/04\/17\/lstm-neural-network-training-techniques-tuning-hyperparameters\/","og_locale":"en_US","og_type":"article","og_title":"LSTM Neural Network Training - Few Useful Techniques for Tuning Hyperparameters and Saving Time - Machine Learning Applications","og_description":"Neural networks are among the most widely used machine learning techniques.[1] But neural network training and tuning multiple hyper-parameters takes time. I was recently building LSTM neural network for prediction for this post Machine Learning Stock Market Prediction with LSTM Keras and I learned some tricks that can save time. In this post you will ... 