Example domains: chess, cart-pole, car-on-the-hill (jeep) -> what is the state, what is the action?

Value of a state: the value of a state is defined as the sum of the reinforcements received when starting in that state and following some fixed policy to a terminal state.

Value function: the value function is a mapping from states to state values. For the current learner policy $\pi$, we can define the so-called value function, which maps a state $s$ to the total amount of reward the learner expects to receive in the future when following the policy starting from that state:

$V^{\pi}(s) = E_{\pi}\left[ \sum_{t=0}^{\infty} \gamma^{t} r_{t} \;\middle|\; s_0 = s \right]$

where:
$E_{\pi}$ -> expectation (under the current policy $\pi$) of the sum of future rewards $r_t$,
$\gamma \in (0, 1]$ -> discount rate, which makes rewards received further in the future contribute less to the state value.
The reward equals 0 while the task is not fulfilled, and 1 otherwise.

This is a stochastic, sequential decision problem (a cycle: decide action -> situation changes -> decide action -> ...).
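As a minimal sketch of this definition, the snippet below estimates $V^{\pi}(s)$ by Monte Carlo sampling on a hypothetical toy chain MDP (not from the source): states $0 \dots N-1$, a terminal goal state $N-1$ yielding reward 1, reward 0 everywhere else, matching the 0/1 reward scheme above. The chain size, discount rate, and the fixed stochastic policy are illustrative assumptions.

```python
import random

N = 6        # number of states in the toy chain (hypothetical)
GAMMA = 0.9  # discount rate gamma in (0, 1] (assumed value)

def policy(state):
    """A fixed stochastic policy pi: step right with prob. 0.8, left with 0.2."""
    return +1 if random.random() < 0.8 else -1

def episode(start):
    """Run one episode from `start`; return the discounted sum of rewards."""
    state, ret, discount = start, 0.0, 1.0
    while state != N - 1:                        # N-1 is the terminal state
        state = max(0, state + policy(state))    # stay within the chain
        reward = 1.0 if state == N - 1 else 0.0  # reward 1 only on success
        ret += discount * reward                 # accumulate gamma^t * r_t
        discount *= GAMMA
    return ret

def estimate_value(start, n_episodes=5000):
    """V^pi(start) ~ average discounted return over sampled episodes."""
    return sum(episode(start) for _ in range(n_episodes)) / n_episodes

for s in range(N - 1):
    print(f"V(state {s}) ~ {estimate_value(s):.3f}")
```

States closer to the goal print higher estimates, illustrating how $\gamma$ makes distant rewards worth less in the state value.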