Life : Partially Observable Markov Decision Process
- a set of states : S = {S1, S2, S3, ....., Sn}
- a set of actions : A = {a1, a2, a3, ....., an}
- a set of observations : O = {O1, O2, O3,....On}
- a set of transition probabilities : T(Si, a, Sj) = P(Sj Si, a)
- a set of observation probabilities : O(Zi, a, Sj) = P(Zi Sj, a)
- a set of rewards : R : S * A -> R
- a discount factor : gamma
- an initial belief : P0(s)
A Partially Observable Markov Decision Process (POMDP) is one way of solving decision problems in AI. In a decision problem, the right choice of an action is to be done when your observations about the world are not perfect, and the effect of your action is not completely certain. You dont know where you are in the game of life and you even dont know in which direction is the goal. The actions you perform are probabilistic and you get reward/ penalty for your actions. The only way of knowing about the effect of the actions is to do them.
The real problem of solving POMDP in AI is NP-hard, which simply means they are unsolvable in reasonable amout of time with the current algorithms. And so is life...!!