When we say path-planning problem it usually implies to find a shortest path ...


The shortest path problem is concerned with finding the shortest
path from a specified origin to a specified destination in a given
network while minimizing the total cost associated with the path.


But what about the longest path problem?

While shortest path problem has a unique solution, the longest path problem
has many solutions
--------------------------------------------------------------------------------------------------
done above


random
100 units of fuel => none out of 10,000,000 trials
only one out of 100,000,000 => 
manhattan-distance-from-start-point-of-final-points 48 & 
max-manhattan-distance-from-start-so-far reached 51

reinforce
100 units of fuel => 7 out of 300 epocs agent returns to base
=> path only moving around starting point


\noindent
The above mentioned task of exploring a two-dimensional grid-world gives us such an environment. Assuming the agent is given $q$ units of energy, the number of paths that has the maximum distance from the starting point when measured in Hamming distance is:
\begin{eqnarray}
4 \hspace{.074in}  \sum_{r=0}^{q} \left( \hspace{.04in} {}_q C _r \hspace{.04in} \right).
\end{eqnarray}
Quite a large number if $q$ is sufficiently large. Then our task is: {\it ``How we find the recurrent network which makes an agent take a different routes from those many options at each time of trials?''}\\

Then what about if origin and destination coincide?