## Abstract

Animals use a combination of egocentric navigation driven by the internal integration of environmental cues, interspersed with geocentric course correction and reorientation. These processes are accompanied by uncertainty in sensory acquisition of information, planning and execution. Inspired by observations of dung beetle navigational strategies that show switching between geocentric and egocentric strategies, we consider the question of optimal reorientation rates for the navigation of an agent moving along a preferred direction in the presence of multiple sources of noise. We address this using a model that takes the form of a correlated random walk at short time scales that is punctuated by reorientation events leading to a biased random walks at long time scales. This allows us to identify optimal alternation schemes and characterize their robustness in the context of noisy sensory acquisition as well as performance errors linked with variations in environmental conditions and agent–environment interactions.

## 1. Introduction

Navigation in complex uncertain environments requires information about environmental cues (landmarks) along with the ability to memorize and execute intended plans based on these cues. It is thus often accompanied by several cognitive demanding activities, such as multi-sensory acquisition and integration, locomotion planning and motor control. As organisms have finite cognitive and computational resources, they must therefore multitask and develop optimal schemes to dynamically allocate resources to different tasks [1–6]. Similar demands are also placed on their artificial analogues, autonomous vehicles such as robots, self-driving cars and even deep-space craft [7,8]. Typical strategies for navigation involve a combination of egocentric and geocentric schemes. In the first, the organism uses information that it has acquired somehow to move along a path without any realtime feedback. In the second, the organism constantly probes its location relative to environmental cues and adjusts its strategy in real time. The first requires perfect memory and can be efficient, but can succeed only in a constant environment; the second requires continuous course corrections and is often slow, but can cope with fluctuating environments. Organisms switch between these strategies to navigate accurately in a variable environment, and understanding the switching strategies of organisms in their natural environments is a basic problem in neuroethology.

Dung beetles provide a well-studied example of this switching behaviour during navigation [9]. Foraging beetles look for nutrient-rich dung, and then attempt to roll a dung ball along a straight path radially away from the pile [1] with the aim of providing food for their brood. Beetles acquire navigation information using a variety of long-range cues before initiating a roll, and then push the dung ball while walking backwards with their hind legs in contact with the dung. However, they stop intermittently and get atop the ball and walk on it to reorient themselves before continuing to roll the dung ball. The reorientation behaviour of the beetle is visually mediated and triggered both by active and passive deviations off-course and by visual cues [10] that include the position of the Sun, the Moon and associated light polarization patterns. While the beetle continues to obtain navigational information during a roll, the lack of a constant deviation angle triggering a reorientation behaviour suggests that this event has uncertainty [10]. Not having to rely on terrestrial landmarks allows the beetle to travel arbitrarily far away from its starting point without large directional errors [11], thus making the nature of the navigational task fundamentally different from that of homing insects. Furthermore, the switching behaviour between runs and reorientations is a function of the environment; the more uncertain the environment as characterized by its physical roughness, the more frequent the reorientations [10]. Stopping to reorient after each step makes for slow progress, but guarantees a straighter route. Alternatively, not stopping at all avoids the loss of time associated with reorientation, but leads to trajectories that deviate significantly from the intended bearing. This leads to a natural question of the optimal rate of switching between the egocentric strategy and the geocentric one. In this paper, we introduce a model for the movement of the beetle, or a navigational agent, that is associated with paths that are a characteristic of correlated diffusion on short time scales, and biased diffusion on long time scales [12]. This allows us to pose and solve an optimization problem for the most efficient switching strategy and characterize its robustness in the presence of noisy sensory acquisition and in rough environments, with relevance to questions that go beyond the original motivation for the problem.

## 2. Mathematical model

Inspired by the beetle, we consider an agent that performs a random walk interspersed by reorientation events, in which its heading direction is reset. We expect that owing to finite cognitive and attention capacity, information gathered while rolling the ball leads to a bigger acquisition error relative to that when it is reorienting, i.e. the detected orientation is *θ*_{i}+*ϵ*; we assume that this dynamic acquisition error, *ϵ*, is drawn from a uniform distribution of headings [0,*ϵ*_{d}], where *ϵ*_{d}∈[0,*π*]. Between reorientation events, we assume that the random walk is correlated orientationally, with a mean corresponding to the current orientation and a uniform distribution of angular errors in the range [0,*θ**], where *θ**∈[0,*π*]. The agent can acquire (visual) sensory information in two modes: when it is rolling the ball, or when it reorients. Memory degradation, execution errors and environmental uncertainties translate to larger accumulated errors, i.e. large *θ**, so that the agent must stop to reorient more often, consistent with observations of beetles [10]. This suggests that reorientation is triggered by a threshold deviation from the preferred heading, which we denote by *θ*_{a}, where *θ*_{a}∈[0,*π*]. However, reorientation events do not occur at the same angular deviation from the original bearing [10], and therefore requires the addition of further triggering mechanisms. Indeed, beetles detect gradual deviation better than abrupt ones [10]. This suggests that beetles do not pay attention to navigational signals at all times owing to finite cognitive resources; when paying attention to navigational cues, motor control suffers and vice versa. To quantify this, we define the attention span, *τ*, where *τ*∈[0,1] is the fraction of time during movement when the agent pays more attention to navigational cues, with a distribution of turning angles drawn from a uniform distribution [0,*Aθ**] (with *A*>1), while over the fraction of time 1−*τ*, the distribution of angles is in the range [0,*θ**]. If the agent is inattentive to navigation cues during a step, the standard deviation of the distribution of angles is *θ**, else it increases by a factor *A*>1 and becomes *Aθ** over the attentive steps (1−*τ*)*N*. In other words, when paying attention to the navigational cues, the agent pays less attention to its motor control and the resulting trajectory is characterized by a random walk with a wider turning angle distribution. These sources of stochasticity in sensing, attention and movement as the agent interacts with the environment are depicted in figure 1.

The agent is assumed to start at the origin and walk in a straight line along a given radial direction *θ*_{0}. At each time step *i* the agent moves in a random direction *θ*_{i} (figure 1*a*), relative to its current heading, i.e. *θ*_{i}=*θ*_{i−1}+*δθ*, where for simplicity *δθ* is drawn from a uniform distribution [0,*θ**] [12] (choosing from a Gaussian distribution does not change any of results qualitatively). The agent stops at certain intervals to reorient, i.e. it resets *θ*_{i}=*θ*_{0}, which we assume takes time, given that the agent has to take a bearing (corresponding to the beetle reorienting on the dung ball). Then, after *N* steps that include *N*_{r} reorientations, the end–to–end vector of the agent is

To characterize the competition between accuracy and speed associated with the task of moving as far from the origin as quickly as possible, we define a simple cost function that penalizes the deviation from a straight line and also penalizes the time spent for reorientation:
*n*=*N*/*N*_{r}, or equivalently, the frequency of reorientation, is an optimum. Before proceeding to find this, we point out that this problem has some similarities to a recent class of problems named ‘search with reset’ [14–17], but differs qualitatively in that here we consider diffusion in orientational rather than rectilinear space, while coupling reorientation to translational motion, making an analytic approach difficult except in the simplest of situations.

## 3. Analysis of model and results

### 3.1. Regular reorientation

We start with the simplest variant of the question of optimal switching, assuming that the heading direction is reset to zero every *n* steps, with *N*=*nN*_{r}, complete inattention to navigational cues between reorientation events, i.e. *τ*=0, so that there is no error amplification, i.e. *A*=1, and finally there is no acquisition error, i.e. *ϵ*_{d}=0. Later, we will include the additional stochasticity associated with randomness in the choice of reorientation intervals, and errors in acquisition, variable attention span and accompanying error amplification.

With these assumptions, the agent’s path is a correlated random walk (CRW) in between reorientation events, and a biased random walk at the long time scale (BRW) [12], as illustrated in figure 2*a*,*b*. The path always starts with a step along *θ*_{0}, followed by an *n*−1 steps CRW, resulting in a mean square end-to-end distance *θ*^{CRW}. These two quantities define a sector accessible to the agent between reorientation events with central angle 2*θ*^{CRW} and radius *c*). Then, following [11]:
*n* is the number of steps before a reorientation, and *R*_{∥CRW}〉 is the mean distance travelled along the directional bias and *θ*_{CRW} is the mean deviation from the intended direction.

On longer time scale, the entire random walk may be treated as a BRW with individual steps given by the CRWs (figure 2*c*). The end-to-end distance of a BRW is defined as *N*_{r} term to

Using (3.1) and (3.3) in the above relationship allows us to calculate

In figure 3, we show the dependence of the cost given by (3.7) on the frequency of reorientation *n*. When the width of the turning angle distribution *θ**=0, the agent travels along a perfectly straight line and the cost function *f* decreases monotonically with *n* (black solid curve). For a uniform turning angle distribution, *θ**≠0, *f* is optimal for a particular *n*(*θ** (coloured solid curves). We confirm these analytical results using simulations, as shown in figure 3*a*.

To study how the optimal reorientation frequency, *n*^{opt} depends on *θ**, we consider the cost function in the asymptotic limit *n*, yields
*W* is the Lambert W function (the solution of *b,* we plot *n*^{opt}(*θ**) versus *θ** derived from our simulations, and see that they agree well with our simple theoretical prediction. Furthermore, figure 3*a* shows that as *θ** increases it becomes more crucial to be close to the optimal reorientation interval, as small deviations in *n* result in a sharp increase of the cost.

### 3.2. Dynamic reorientation

Having understood this simple scenario, we now go back to address the complexities associated with noise in acquisition, planning and execution, in addition to the noise in the turning angle distribution *θ**. To address this, we use numerical simulations with the acquisition error *ϵ*_{d}≠0, the error amplification associated with attention *A*>1, and ask how the agent must optimize the attention span *τ*≠1 and the threshold activation angle for reorientation *θ*_{a} to minimize the cost function defined in equation (2.1). We use the covariance matrix adaptation algorithm (CMA–ES) [18] to determine the optimal strategy, where given a set (*θ**,*A*,*ϵ*_{d}), we determine the optimal set (*θ*_{a},*τ*). The CMA-ES is a stochastic derivative-free optimization method for nonlinear or non-convex continuous optimization problems. By incrementally increasing the probability of previously successful candidate solutions, we iteratively perform the following three steps: (i) sample p new sets of (*θ*_{a},*τ*) following the distribution with the updated mean and covariance of the cost function *f* (with step size built in), (ii) evaluate *f* and re-order the sampled solutions based on their fitness, and (iii) update the internal state variables (*θ*_{a},*τ*), including the mean, the isotropic and anisotropic evolution path, the covariance matrix, and the step size, based on the q best out of p solutions, until we have converged to the minimal cost that will yield *τ*=*τ*(*θ**,*A*,*ϵ*_{d}), *θ*_{a}=*θ*_{a}(*θ**,*A*,*ϵ*_{d}).

The results of our optimization are summarized in figure 4. We see that both the optimal attention span *τ* and optimal activation angle *θ*_{a} decrease monotonically with increase in the attention error amplification, *A*, consistent with our intuition. However, the mean reorientation interval, 〈*n*〉, (extracted retrospectively from the simulations), is relatively independent of *A* (figure 4*b*). When the variance of the turning angles *θ** increases, we see two regimes; for large *A*, *τ* is large and *θ*_{a} increases with *θ**. For small values of *A* , there is no clear trend. Finally, *θ*_{a} and *τ* increase monotonically with *ϵ*_{d}.

## 4. Discussion

Inspired by the behavioural strategy of navigational movement of the dung beetle, we have posed and solved an optimization problem of geocentric navigation interspersed by egocentric cue integration. In the simplest setting, we find that the optimal reorientation interval is inversely proportional to the environmental noise and is invariant to sensory acquisition noise. In more complex settings, our study highlights the variations in the optimal navigation strategy that balances accuracy, speed and effort. Our study might well be generalizable more broadly to animal navigation, and perhaps even artificial vehicular navigational situations, all of which use egocentric and geocentric cues with varying attention to create optimal strategies that balance accuracy and efficiency in the presence of noise, and additional constraints such as finite cognitive bandwidth. A natural next step would be to test the theory with animals, e.g. dung beetles (using established experimental methods [19]), as well as autonomous robotic agents.

## Data accessibility

Supporting data for this study can be accessed at http://datadryad.org/resource/doi:10.5061/dryad.cc3rb.

## Authors' contributions

L.M. conceived research, O.P. and L.M. designed research, O.P. performed research, O.P. and L.M. analysed data, and O.P. and L.M. wrote the paper.

## Funding

This work was supported by funding from the MacArthur Foundation and the Kavli Institute for NanoBio Science and Technology.

## Competing interests

We have no competing interests.

## Acknowledgements

We thank Elisabetta Matsumoto for a discussion on the asymptotics of the Lambert W function, and Prof. M. Dacke and the Mahadevan laboratory for discussions and comments.

- Received February 24, 2016.
- Accepted June 27, 2016.

- © 2016 The Authors.

Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.