Example domains: chess, cart-pole, car-on-the-hill (jeep) -> what is the state, what is the action?

Value of a state: the value of a state is defined as the sum of the reinforcements received when starting in that state and following some fixed policy to a terminal state.

Value function: the value function is a mapping from states to state values. For the current learner policy $\pi$, we can define the so-called value function, which maps a state $s$ to the total amount of reward the learner expects to receive in the future when following the policy starting from that state:

$V^{\pi}(s) = E_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t} \;\middle|\; s_0 = s \right]$

where:
$E_{\pi}$ -> expectation (under the current policy $\pi$) of the sum of future rewards $r_t$,
$\gamma \in (0, 1]$ -> discount rate, which makes rewards received further in the future contribute less to the state value.
The reward equals 0 while the task is not fulfilled, and 1 otherwise.

This is a stochastic, sequential decision problem (a cycle: decide action -> situation changes -> decide action -> ...).
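As a minimal sketch of this definition, the snippet below estimates $V^{\pi}(s)$ by Monte Carlo sampling on a hypothetical toy chain MDP (not from the source): states $0 \dots N-1$, a terminal goal state $N-1$ yielding reward 1, reward 0 everywhere else, matching the 0/1 reward scheme above. The chain size, discount rate, and the fixed stochastic policy are illustrative assumptions.

```python
import random

N = 6        # number of states in the toy chain (hypothetical)
GAMMA = 0.9  # discount rate gamma in (0, 1] (assumed value)

def policy(state):
    """A fixed stochastic policy pi: step right with prob. 0.8, left with 0.2."""
    return +1 if random.random() < 0.8 else -1

def episode(start):
    """Run one episode from `start`; return the discounted sum of rewards."""
    state, ret, discount = start, 0.0, 1.0
    while state != N - 1:                        # N-1 is the terminal state
        state = max(0, state + policy(state))    # stay within the chain
        reward = 1.0 if state == N - 1 else 0.0  # reward 1 only on success
        ret += discount * reward                 # accumulate gamma^t * r_t
        discount *= GAMMA
    return ret

def estimate_value(start, n_episodes=5000):
    """V^pi(start) ~ average discounted return over sampled episodes."""
    return sum(episode(start) for _ in range(n_episodes)) / n_episodes

for s in range(N - 1):
    print(f"V(state {s}) ~ {estimate_value(s):.3f}")
```

States closer to the goal print higher estimates, illustrating how $\gamma$ makes distant rewards worth less in the state value.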