When we say path-planning problem it usually implies to find a shortest path ... The shortest path problem is concerned with finding the shortest path from a specified origin to a specified destination in a given network while minimizing the total cost associated with the path. But what about the longest path problem? While shortest path problem has a unique solution, the longest path problem has many solutions -------------------------------------------------------------------------------------------------- done above random 100 units of fuel => none out of 10,000,000 trials only one out of 100,000,000 => manhattan-distance-from-start-point-of-final-points 48 & max-manhattan-distance-from-start-so-far reached 51 reinforce 100 units of fuel => 7 out of 300 epocs agent returns to base => path only moving around starting point \noindent The above mentioned task of exploring a two-dimensional grid-world gives us such an environment. Assuming the agent is given $q$ units of energy, the number of paths that has the maximum distance from the starting point when measured in Hamming distance is: \begin{eqnarray} 4 \hspace{.074in} \sum_{r=0}^{q} \left( \hspace{.04in} {}_q C _r \hspace{.04in} \right). \end{eqnarray} Quite a large number if $q$ is sufficiently large. Then our task is: {\it ``How we find the recurrent network which makes an agent take a different routes from those many options at each time of trials?''}\\ Then what about if origin and destination coincide?