1/5/2024
Pycharm educational courses

Consider the scenario of teaching a dog new tricks. The dog doesn't understand our language, so we can't tell him what to do. Instead, we emulate a situation (or a cue), and the dog tries to respond in many different ways. If the dog's response is the desired one, we reward him with snacks. Now guess what: the next time the dog is exposed to the same situation, he executes a similar action with even more enthusiasm, in expectation of more food. That's like learning "what to do" from positive experiences. Similarly, dogs tend to learn what not to do when faced with negative experiences.

That's exactly how Reinforcement Learning works in a broader sense:

- Your dog is an "agent" that is exposed to the environment. The environment could be your house, with you in it.
- The situations the agent encounters are analogous to a state. An example of a state could be your dog standing in your living room while you say a specific word in a certain tone.
- Our agents react by performing an action to transition from one "state" to another "state". Your dog goes from standing to sitting, for example.
- After the transition, they may receive a reward or penalty in return. You give them a treat! Or a "No" as a penalty.
- The policy is the strategy of choosing an action given a state in expectation of better outcomes.

Reinforcement Learning lies on the spectrum between Supervised Learning and Unsupervised Learning.
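To make the agent, state, action, reward, and policy vocabulary concrete, here is a minimal Python sketch of the dog-training loop. Everything in it is an assumption made for illustration: the cue and action names, the +1/-1 rewards, the learning rate, and the epsilon-greedy policy are hypothetical choices, not something the post specifies. It is also deliberately simplified: each cue is treated as a one-step situation, so the transition from one state to the next is not modeled.

```python
import random

# Hypothetical toy setup: these names are illustrative, not from any library.
CUES = ["sit", "shake"]              # states: the cue the dog hears
ACTIONS = ["sit", "shake", "bark"]   # actions the dog can try


def reward(cue, action):
    """Treat (+1) when the response matches the cue, a "No" (-1) otherwise."""
    return 1.0 if action == cue else -1.0


def choose_action(values, cue, epsilon, rng):
    """Epsilon-greedy policy: usually take the best-known action, sometimes explore."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: values[(cue, a)])


def train(episodes=500, alpha=0.5, epsilon=0.2, seed=0):
    """Learn a value estimate for every (cue, action) pair from rewards alone."""
    rng = random.Random(seed)
    values = {(c, a): 0.0 for c in CUES for a in ACTIONS}
    for _ in range(episodes):
        cue = rng.choice(CUES)                          # the environment presents a state
        action = choose_action(values, cue, epsilon, rng)
        r = reward(cue, action)                         # treat or "No"
        # Nudge the estimate toward the reward just received.
        values[(cue, action)] += alpha * (r - values[(cue, action)])
    return values


def greedy_policy(values):
    """The learned strategy: the highest-valued action for each cue."""
    return {c: max(ACTIONS, key=lambda a: values[(c, a)]) for c in CUES}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The dog is never told which response is correct. It only sees the treat or the "No" after acting, and the value table gradually comes to favor the rewarded response for each cue.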