{"id":2364,"date":"2018-11-03T17:47:47","date_gmt":"2018-11-03T17:47:47","guid":{"rendered":"http:\/\/intelligentonlinetools.com\/blog\/?page_id=2364"},"modified":"2018-11-07T01:15:47","modified_gmt":"2018-11-07T01:15:47","slug":"rl-dyna-q-run-planning-rl","status":"publish","type":"page","link":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/","title":{"rendered":"Reinforcement Learning Dyna-Q Run Planning"},"content":{"rendered":"<p>This is the python source code of run_planning_RL.py for post <a href=\"http:\/\/intelligentonlinetools.com\/blog\/2018\/10\/28\/reinforcement-learning-example-planning-using-q-learning-dyna\/\" target=\"_blank\">Reinforcement Learning Example for Planning Tasks Using Q Learning and Dyna-Q<\/a><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\n&quot;&quot;&quot;\r\nSimplest model-based RL, Dyna-Q.\r\nRewards 3, 10 are specified in env script in the following\r\nThe first goal is the goal to achieve - for example number of units to complete\r\nGoal_completion_criteria_and_rewards = [\r\n                        [2,3],\r\n                        [3,10]\r\n                        ]              \r\n\r\nThis script is the main part which controls the update method of this example.\r\nThe RL is in RL_brain.py.\r\nagent = &quot;RANDOM_AGENT&quot;   or &quot;&quot;\r\nactions ML, RL\r\ngoals  ML project,   RL project  \r\n\r\nstate positions\r\nnumber of hours(steps) for each goal   [4,8]  example\r\ninitial [0,0]    for each episode\r\n\r\n&quot;&quot;&quot;\r\n\r\nfrom planning_env import Maze\r\nfrom RL_brain import QLearningTable, EnvModel\r\n\r\n\r\noutput_data=[]\r\nindexes=[]\r\ndef update():\r\n   \r\n    counter=0\r\n    sum=0\r\n  \r\n    for episode in range(2000):    \r\n        \r\n        env.reset()\r\n        print (&quot;episode=&quot; + str(episode))\r\n      \r\n        s_position=[0,0]    \r\n       \r\n        while True:\r\n            \r\n            a = 
RL.choose_action(str(s_position))\r\n\r\n            s_next_position, r, done, comp_results = env.step(a)\r\n\r\n            # direct RL update from the real transition\r\n            RL.learn(str(s_position), a, r, str(s_next_position), done)\r\n\r\n            # remember the real transition in the learned model\r\n            env_model.store_transition(str(s_position), a, r, s_next_position)\r\n\r\n            for n in range(10):     # learn 10 more times using the env_model\r\n                ms, ma = env_model.sample_s_a()  # ms in here is a str\r\n                mr, ms_ = env_model.get_r_s_(ms, ma)\r\n                # NOTE: done is the flag of the real step above, not of the sampled transition\r\n                RL.learn(ms, ma, mr, str(ms_), done)\r\n\r\n            s_position = s_next_position.copy()\r\n\r\n            if done:\r\n                sum=sum+r\r\n                # record the mean final reward once per full block of 50 episodes\r\n                if (episode + 1) % 50 == 0:\r\n                    output_data.append(sum \/ 50)\r\n                    sum=0\r\n                    indexes.append(episode)\r\n                    counter=counter+1\r\n                    env_model.get_env()\r\n\r\n                break\r\n\r\n    print('episodes over')\r\n\r\n\r\nif __name__ == &quot;__main__&quot;:\r\n    env = Maze()\r\n    #RL = QLearningTable(actions=list(range(env.n_actions)), agent = &quot;RANDOM_AGENT&quot;)\r\n    RL = QLearningTable(actions=list(range(env.n_actions)))\r\n    env_model = EnvModel(actions=list(range(env.n_actions)))\r\n\r\n    update()\r\n\r\n    env_model.get_env()\r\n\r\n    import matplotlib.pyplot as plt\r\n\r\n    plt.plot(indexes, output_data, label='RL')\r\n    plt.legend()  # show the 'RL' label set above\r\n    plt.show()\r\n\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>This is the python source code of run_planning_RL.py for post Reinforcement Learning Example for Planning Tasks Using Q Learning and 
Dyna-Q<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"jetpack_post_was_ever_published":false},"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Reinforcement Learning Dyna-Q Run Planning - Machine Learning Applications<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Reinforcement Learning Dyna-Q Run Planning - Machine Learning Applications\" \/>\n<meta property=\"og:description\" content=\"This is the python source code of run_planning_RL.py for post Reinforcement Learning Example for Planning Tasks Using Q Learning and Dyna-Q\" \/>\n<meta property=\"og:url\" content=\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/\" \/>\n<meta property=\"og:site_name\" content=\"Machine Learning Applications\" \/>\n<meta property=\"article:modified_time\" content=\"2018-11-07T01:15:47+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"2 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/\",\"url\":\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/\",\"name\":\"Reinforcement Learning Dyna-Q Run Planning - Machine Learning Applications\",\"isPartOf\":{\"@id\":\"https:\/\/intelligentonlinetools.com\/blog\/#website\"},\"datePublished\":\"2018-11-03T17:47:47+00:00\",\"dateModified\":\"2018-11-07T01:15:47+00:00\",\"breadcrumb\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/intelligentonlinetools.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Reinforcement Learning Dyna-Q Run Planning\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/intelligentonlinetools.com\/blog\/#website\",\"url\":\"https:\/\/intelligentonlinetools.com\/blog\/\",\"name\":\"Machine Learning Applications\",\"description\":\"Artificial intelligence, data mining and machine learning for building web based tools and services.\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/intelligentonlinetools.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Reinforcement Learning Dyna-Q Run Planning - Machine Learning Applications","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/","og_locale":"en_US","og_type":"article","og_title":"Reinforcement Learning Dyna-Q Run Planning - Machine Learning Applications","og_description":"This is the python source code of run_planning_RL.py for post Reinforcement Learning Example for Planning Tasks Using Q Learning and Dyna-Q","og_url":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/","og_site_name":"Machine Learning Applications","article_modified_time":"2018-11-07T01:15:47+00:00","twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/","url":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/","name":"Reinforcement Learning Dyna-Q Run Planning - Machine Learning Applications","isPartOf":{"@id":"https:\/\/intelligentonlinetools.com\/blog\/#website"},"datePublished":"2018-11-03T17:47:47+00:00","dateModified":"2018-11-07T01:15:47+00:00","breadcrumb":{"@id":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/"]}]},{"@type":"BreadcrumbList","@id":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/intelligentonlinetools.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Reinforcement Learning Dyna-Q Run 
Planning"}]},{"@type":"WebSite","@id":"https:\/\/intelligentonlinetools.com\/blog\/#website","url":"https:\/\/intelligentonlinetools.com\/blog\/","name":"Machine Learning Applications","description":"Artificial intelligence, data mining and machine learning for building web based tools and services.","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/intelligentonlinetools.com\/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"}]}},"jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/P7h1IJ-C8","jetpack-related-posts":[{"id":2368,"url":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/","url_meta":{"origin":2364,"position":0},"title":"Reinforcement Learning Dyna-Q Planning Environment","date":"November 3, 2018","format":false,"excerpt":"This is the python source code of planning_env.py for post Reinforcement Learning Example for Planning Tasks Using Q Learning and Dyna-Q","rel":"","context":"Similar post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2501,"url":"http:\/\/intelligentonlinetools.com\/blog\/reinforcement-learning-dqn-run-planning\/","url_meta":{"origin":2364,"position":1},"title":"Reinforcement Learning DQN Run Planning","date":"January 5, 2019","format":false,"excerpt":"This is the python source code of run_planning_RL_DQN.py for post Reinforcement Learning Python DQN Application for Resource Allocation","rel":"","context":"Similar post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2366,"url":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q\/","url_meta":{"origin":2364,"position":2},"title":"Reinforcement Learning Dyna-Q","date":"November 3, 2018","format":false,"excerpt":"This is the python source code of RL_brain.py for post Reinforcement Learning Example for Planning Tasks Using Q Learning and Dyna-Q","rel":"","context":"Similar 
post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2495,"url":"http:\/\/intelligentonlinetools.com\/blog\/reinforcement-learning-dqn-planning-environment\/","url_meta":{"origin":2364,"position":3},"title":"Reinforcement Learning DQN Planning Environment","date":"January 5, 2019","format":false,"excerpt":"This is the python source code of planning_envDQN.py for post Reinforcement Learning Python DQN Application for Resource Allocation","rel":"","context":"Similar post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2499,"url":"http:\/\/intelligentonlinetools.com\/blog\/reinforcement-learning-dqn\/","url_meta":{"origin":2364,"position":4},"title":"Reinforcement Learning DQN","date":"January 5, 2019","format":false,"excerpt":"This is the python source code of RL_brainDQN.py for post Reinforcement Learning Python DQN Application for Resource Allocation","rel":"","context":"Similar post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2419,"url":"http:\/\/intelligentonlinetools.com\/blog\/building-chatbot-can-act-robot-mentor\/","url_meta":{"origin":2364,"position":5},"title":"Building Chatbot that Can Act as Robot-Mentor &#8211; Online Project","date":"December 5, 2018","format":false,"excerpt":"This is the main page that will be dedicated to building chatbot described below. Feel free to provide comments, suggestions, feedback. 
Project Description The goal of this project is build a chatbot that can act as robot-mentor to help users to achieve or exceed goals and aspirations, to keep focus\u2026","rel":"","context":"Similar post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/pages\/2364"}],"collection":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/comments?post=2364"}],"version-history":[{"count":5,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/pages\/2364\/revisions"}],"predecessor-version":[{"id":2388,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/pages\/2364\/revisions\/2388"}],"wp:attachment":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/media?parent=2364"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}