{"id":2368,"date":"2018-11-03T17:52:42","date_gmt":"2018-11-03T17:52:42","guid":{"rendered":"http:\/\/intelligentonlinetools.com\/blog\/?page_id=2368"},"modified":"2018-11-07T01:14:12","modified_gmt":"2018-11-07T01:14:12","slug":"rl-dyna-q-planning-env","status":"publish","type":"page","link":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/","title":{"rendered":"Reinforcement Learning Dyna-Q Planning Environment"},"content":{"rendered":"<p>This is the Python source code of planning_env.py for the post <a href=\"http:\/\/intelligentonlinetools.com\/blog\/2018\/10\/28\/reinforcement-learning-example-planning-using-q-learning-dyna\/\" target=\"_blank\">Reinforcement Learning Example for Planning Tasks Using Q Learning and Dyna-Q<\/a><\/p>\n<pre class=\"brush: python; title: ; notranslate\" title=\"\">\r\n\r\n&quot;&quot;&quot;\r\nReinforcement learning example.\r\n\r\nThis script is the environment part of this example. The RL logic is in RL_brain.py.\r\n\r\n&quot;&quot;&quot;\r\n\r\n\r\nimport numpy as np\r\nnp.random.seed(1)\r\n\r\n\r\nACTIONS = ['ML', 'RL']\r\n\r\nNumber_of_steps=5\r\nNumber_of_goals = 2\r\n\r\nCurr_goals_position = [0,0] \r\n\r\nACTION_Goal_Matrix = np.array([\r\n                        [0, 0],\r\n                        [1, 1]\r\n                     \r\n                        ])\r\n\r\n\r\n# Each row is [steps needed to complete the goal, reward for completing it].\r\nGoal_completion_criteria_and_rewards = [\r\n                        [2,3],\r\n                        [3,10]\r\n                        ]                        \r\n\r\nclass Maze(object):\r\n    def __init__(self, actions = ACTIONS, init_pos=Curr_goals_position):\r\n        super(Maze, self).__init__()\r\n        self.action_space = actions\r\n        self.n_actions = len(self.action_space)\r\n      \r\n        \r\n        if init_pos is None:\r\n            # None cannot be indexed, so build a fresh zeroed list instead.\r\n            init_pos = [0] * Number_of_goals\r\n        self.position=init_pos\r\n        self.reward = 0\r\n        self.position_number = 0\r\n     \r\n    def 
reset(self):\r\n        # Clear the per-goal step counters in place and restart the episode.\r\n        for z in range(Number_of_goals):\r\n            self.position[z] = 0\r\n        self.position_number = 0\r\n        self.reward = 0\r\n\r\n    def is_completed(self, ind):\r\n        # A goal is complete once its step count reaches the required threshold.\r\n        return self.position[ind] &gt;= Goal_completion_criteria_and_rewards[ind][0]\r\n\r\n    def step(self, action_id):\r\n        completions = 0\r\n\r\n        # Spend one step on the goal selected by the action.\r\n        self.position[action_id] += 1\r\n        self.position_number += 1\r\n\r\n        done = False\r\n        if self.position_number &gt;= Number_of_steps:\r\n            # The episode ends after Number_of_steps steps; rewards are paid only then.\r\n            done = True\r\n            for g in range(Number_of_goals):\r\n                if self.is_completed(g):\r\n                    self.reward += Goal_completion_criteria_and_rewards[g][1]\r\n                    completions += 1\r\n\r\n        return self.position, self.reward, done, completions\r\n\r\n<\/pre>\n","protected":false},"excerpt":{"rendered":"<p>This is the python source code of planning_env.py for post Reinforcement Learning Example for Planning Tasks Using Q Learning and Dyna-Q<\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"jetpack_post_was_ever_published":false},"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Reinforcement Learning Dyna-Q Planning Environment - Machine Learning Applications<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/\" \/>\n<meta 
property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Reinforcement Learning Dyna-Q Planning Environment - Machine Learning Applications\" \/>\n<meta property=\"og:description\" content=\"This is the python source code of planning_env.py for post Reinforcement Learning Example for Planning Tasks Using Q Learning and Dyna-Q\" \/>\n<meta property=\"og:url\" content=\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/\" \/>\n<meta property=\"og:site_name\" content=\"Machine Learning Applications\" \/>\n<meta property=\"article:modified_time\" content=\"2018-11-07T01:14:12+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data1\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/\",\"url\":\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/\",\"name\":\"Reinforcement Learning Dyna-Q Planning Environment - Machine Learning 
Applications\",\"isPartOf\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#website\"},\"datePublished\":\"2018-11-03T17:52:42+00:00\",\"dateModified\":\"2018-11-07T01:14:12+00:00\",\"breadcrumb\":{\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"http:\/\/intelligentonlinetools.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Reinforcement Learning Dyna-Q Planning Environment\"}]},{\"@type\":\"WebSite\",\"@id\":\"http:\/\/intelligentonlinetools.com\/blog\/#website\",\"url\":\"http:\/\/intelligentonlinetools.com\/blog\/\",\"name\":\"Machine Learning Applications\",\"description\":\"Artificial intelligence, data mining and machine learning for building web based tools and services.\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"http:\/\/intelligentonlinetools.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Reinforcement Learning Dyna-Q Planning Environment - Machine Learning Applications","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/","og_locale":"en_US","og_type":"article","og_title":"Reinforcement Learning Dyna-Q Planning Environment - Machine Learning Applications","og_description":"This is the python source code of planning_env.py for post Reinforcement Learning Example for Planning Tasks Using Q Learning and Dyna-Q","og_url":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/","og_site_name":"Machine Learning Applications","article_modified_time":"2018-11-07T01:14:12+00:00","twitter_card":"summary_large_image","twitter_misc":{"Est. reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/","url":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/","name":"Reinforcement Learning Dyna-Q Planning Environment - Machine Learning Applications","isPartOf":{"@id":"http:\/\/intelligentonlinetools.com\/blog\/#website"},"datePublished":"2018-11-03T17:52:42+00:00","dateModified":"2018-11-07T01:14:12+00:00","breadcrumb":{"@id":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/"]}]},{"@type":"BreadcrumbList","@id":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-planning-env\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"http:\/\/intelligentonlinetools.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Reinforcement Learning Dyna-Q Planning 
Environment"}]},{"@type":"WebSite","@id":"http:\/\/intelligentonlinetools.com\/blog\/#website","url":"http:\/\/intelligentonlinetools.com\/blog\/","name":"Machine Learning Applications","description":"Artificial intelligence, data mining and machine learning for building web based tools and services.","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"http:\/\/intelligentonlinetools.com\/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"}]}},"jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/P7h1IJ-Cc","jetpack-related-posts":[{"id":2495,"url":"http:\/\/intelligentonlinetools.com\/blog\/reinforcement-learning-dqn-planning-environment\/","url_meta":{"origin":2368,"position":0},"title":"Reinforcement Learning DQN Planning Environment","date":"January 5, 2019","format":false,"excerpt":"This is the python source code of planning_envDQN.py for post Reinforcement Learning Python DQN Application for Resource Allocation","rel":"","context":"Similar post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2366,"url":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q\/","url_meta":{"origin":2368,"position":1},"title":"Reinforcement Learning Dyna-Q","date":"November 3, 2018","format":false,"excerpt":"This is the python source code of RL_brain.py for post Reinforcement Learning Example for Planning Tasks Using Q Learning and Dyna-Q","rel":"","context":"Similar post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2364,"url":"http:\/\/intelligentonlinetools.com\/blog\/rl-dyna-q-run-planning-rl\/","url_meta":{"origin":2368,"position":2},"title":"Reinforcement Learning Dyna-Q Run Planning","date":"November 3, 2018","format":false,"excerpt":"This is the python source code of run_planning_RL.py for post Reinforcement Learning Example for Planning Tasks Using Q Learning and Dyna-Q","rel":"","context":"Similar 
post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2501,"url":"http:\/\/intelligentonlinetools.com\/blog\/reinforcement-learning-dqn-run-planning\/","url_meta":{"origin":2368,"position":3},"title":"Reinforcement Learning DQN Run Planning","date":"January 5, 2019","format":false,"excerpt":"This is the python source code of run_planning_RL_DQN.py for post Reinforcement Learning Python DQN Application for Resource Allocation","rel":"","context":"Similar post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2499,"url":"http:\/\/intelligentonlinetools.com\/blog\/reinforcement-learning-dqn\/","url_meta":{"origin":2368,"position":4},"title":"Reinforcement Learning DQN","date":"January 5, 2019","format":false,"excerpt":"This is the python source code of RL_brainDQN.py for post Reinforcement Learning Python DQN Application for Resource Allocation","rel":"","context":"Similar post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]},{"id":2419,"url":"http:\/\/intelligentonlinetools.com\/blog\/building-chatbot-can-act-robot-mentor\/","url_meta":{"origin":2368,"position":5},"title":"Building Chatbot that Can Act as Robot-Mentor &#8211; Online Project","date":"December 5, 2018","format":false,"excerpt":"This is the main page that will be dedicated to building chatbot described below. Feel free to provide comments, suggestions, feedback. 
Project Description The goal of this project is build a chatbot that can act as robot-mentor to help users to achieve or exceed goals and aspirations, to keep focus\u2026","rel":"","context":"Similar post","img":{"alt_text":"","src":"","width":0,"height":0},"classes":[]}],"_links":{"self":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/pages\/2368"}],"collection":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/comments?post=2368"}],"version-history":[{"count":6,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/pages\/2368\/revisions"}],"predecessor-version":[{"id":2390,"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/pages\/2368\/revisions\/2390"}],"wp:attachment":[{"href":"http:\/\/intelligentonlinetools.com\/blog\/wp-json\/wp\/v2\/media?parent=2368"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}