## Abstract

The recently introduced generalized Hamiltonian–Real (GHR) calculus comprises, for the first time, the product and chain rules that make it a powerful tool for quaternion-based optimization and adaptive signal processing. In this paper, we introduce novel dual relationships between the GHR calculus and multivariate real calculus, in order to provide a new, simpler proof of the GHR derivative rules. This further reinforces the theoretical foundation of the GHR calculus and provides a convenient methodology for generic extensions of real- and complex-valued learning algorithms to the quaternion domain.

## 1. Introduction

Quaternions were introduced by William Hamilton in 1843 as an associative but non-commutative algebra over the reals, and have been used in many fields, such as physics, computer graphics and signal processing.

Because of the non-commutativity of the quaternion product, definitions of the quaternion derivative are very different from those of real and complex derivatives. For example, Sudbery [1] establishes that only linear quaternion functions fulfil the requirements of the traditional derivative definition. However, such a derivative is too stringent for practical optimization problems, in which the cost functions are often real-valued and therefore non-analytic. In order to relax the derivative condition, the recently introduced Hamiltonian–Real (HR) calculus [2] deals with both analytic and non-analytic quaternion functions within a unified framework. However, the HR calculus does not comprise effective derivative rules (product, chain) for dealing with complicated derivation operations in practical problems. By exploiting quaternion rotation, the recently introduced generalized HR (GHR) calculus builds upon the HR calculus to provide the powerful product and chain derivative rules that facilitate the calculation of the quaternion gradient and Hessian in quaternion optimization problems [3]. In particular, we note that the product rule is a distinctive feature of the functional calculi. However, the original proof of the product and chain rules for the GHR calculus is quite involved and difficult for non-experts to follow.

The aim of this paper is to further elucidate the main ideas of the GHR calculus, and to give simpler proofs of the main results in [3]. The new proof is based on the duality between the GHR calculus and multivariate real calculus. One of the advantages of such an approach is that it allows for a direct treatment of quaternion-valued functions, without an intermediate transition to real functions. Based on the so-introduced relationships, we illustrate the effectiveness of such an approach by providing a dual version of the widely linear quaternion least mean square (WL-QLMS) adaptive learning algorithm [4,5], and show that it produces the same output as the primal WL-QLMS, but with a reduced computational complexity.

## 2. Preliminaries

In the following, we state some basic definitions of the GHR calculus [3,6,7].

### Definition 2.1 (Quaternion rotation [8])

For any quaternion *q*, the quaternion rotation is defined as

$$q^{\mu} \triangleq \mu q \mu^{-1},$$

where *μ* is any non-zero quaternion.

### Definition 2.2 (Real-differentiability [1])

A function *f*(*q*)=*f*_{a}(*q*)+*if*_{b}(*q*)+*jf*_{c}(*q*)+ *kf*_{d}(*q*) is called real differentiable if its real-valued components *f*_{a}(*q*), *f*_{b}(*q*), *f*_{c}(*q*) and *f*_{d}(*q*) are differentiable functions with respect to four real components *q*_{a},*q*_{b},*q*_{c} and *q*_{d} of a quaternion variable *q*=*q*_{a}+*iq*_{b}+*jq*_{c}+*kq*_{d}.

### Definition 2.3 (The GHR derivatives [3])

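If *f*: ℍ→ℍ is real-differentiable, then the GHR derivatives of *f* with respect to *q*^{μ} and *q*^{μ*} (*μ*≠0, *μ*∈ℍ) are defined as

$$\frac{\partial f}{\partial q^{\mu}} = \frac{1}{4}\left(\frac{\partial f}{\partial q_a} - \frac{\partial f}{\partial q_b}\, i^{\mu} - \frac{\partial f}{\partial q_c}\, j^{\mu} - \frac{\partial f}{\partial q_d}\, k^{\mu}\right)$$

and

$$\frac{\partial f}{\partial q^{\mu *}} = \frac{1}{4}\left(\frac{\partial f}{\partial q_a} + \frac{\partial f}{\partial q_b}\, i^{\mu} + \frac{\partial f}{\partial q_c}\, j^{\mu} + \frac{\partial f}{\partial q_d}\, k^{\mu}\right),$$

where *q*=*q*_{a}+*iq*_{b}+*jq*_{c}+*kq*_{d}, *q*_{a},*q*_{b},*q*_{c},*q*_{d}∈ℝ, and ∂*f*/∂*q*_{a}, ∂*f*/∂*q*_{b}, ∂*f*/∂*q*_{c} and ∂*f*/∂*q*_{d} are the partial derivatives of *f* with respect to *q*_{a}, *q*_{b}, *q*_{c} and *q*_{d}, whereas the set {1,*i*^{μ},*j*^{μ},*k*^{μ}} is an orthogonal basis of ℍ.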
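The GHR derivatives can be checked numerically. The following is a minimal sketch, assuming quaternions are stored as length-4 NumPy arrays (*a*, *b*, *c*, *d*) and that the four real partial derivatives are estimated by central differences; the helper names `qmul`, `qconj`, `rotate` and `ghr_derivative` are hypothetical and not part of any library. It evaluates three standard cases: ∂*q*/∂*q*=1, ∂*q**/∂*q*=−1/2 and ∂|*q*|²/∂*q*=*q**/2.

```python
import numpy as np

ONE, I, J, K = np.eye(4)  # basis quaternions 1, i, j, k as (a, b, c, d) arrays


def qmul(p, q):
    """Hamilton product of two quaternions stored as length-4 arrays."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    ])


def qconj(q):
    """Quaternion conjugate."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])


def rotate(q, mu):
    """Quaternion rotation q^mu = mu q mu^{-1} for non-zero mu."""
    mu_inv = qconj(mu) / np.dot(mu, mu)
    return qmul(qmul(mu, q), mu_inv)


def ghr_derivative(f, q, mu=ONE, conjugate=False, h=1e-6):
    """Finite-difference estimate of df/dq^mu (df/dq^{mu*} if conjugate)."""
    # Central differences for the partials with respect to q_a, q_b, q_c, q_d.
    partials = [(f(q + h * e) - f(q - h * e)) / (2.0 * h) for e in np.eye(4)]
    sign = 1.0 if conjugate else -1.0
    total = partials[0]
    for df_dl, unit in zip(partials[1:], (I, J, K)):
        # Each partial derivative multiplies the rotated basis element from the left.
        total = total + sign * qmul(df_dl, rotate(unit, mu))
    return total / 4.0


q0 = np.array([0.3, -1.2, 0.7, 2.0])
d_identity = ghr_derivative(lambda q: q, q0)               # f(q) = q
d_conj = ghr_derivative(qconj, q0)                         # f(q) = q*
d_norm = ghr_derivative(lambda q: qmul(q, qconj(q)), q0)   # f(q) = |q|^2
```

The last case is a real-valued, hence non-analytic, function, which is the type of cost function the GHR calculus is designed to handle.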

### Lemma 2.4 (Duality of quaternions and real vectors [9,10])

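*For any quaternion* *q*=*q*_{a}+*iq*_{b}+*jq*_{c}+*kq*_{d}, *where* *q*_{l}∈ℝ, *l*∈{*a*,*b*,*c*,*d*}, *the relationship between the augmented quaternion vector* $(q, q^{i}, q^{j}, q^{k})^{T}$ *and its quadrivariate real vector counterpart* $(q_a, q_b, q_c, q_d)^{T}$ *is given by*

$$\begin{pmatrix} q \\ q^{i} \\ q^{j} \\ q^{k} \end{pmatrix} = \mathbf{A} \begin{pmatrix} q_a \\ q_b \\ q_c \\ q_d \end{pmatrix}, \qquad \begin{pmatrix} q_a \\ q_b \\ q_c \\ q_d \end{pmatrix} = \frac{1}{4}\mathbf{A}^{H} \begin{pmatrix} q \\ q^{i} \\ q^{j} \\ q^{k} \end{pmatrix},$$

*where*

$$\mathbf{A} = \begin{pmatrix} 1 & i & j & k \\ 1 & i & -j & -k \\ 1 & -i & j & -k \\ 1 & -i & -j & k \end{pmatrix}$$

*and* (⋅)^{H} *denotes the quaternion conjugate transpose*.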
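The duality in lemma 2.4 can be illustrated numerically. The sketch below assumes the standard sign pattern of the involutions *q*^{i}, *q*^{j}, *q*^{k} for the rows of **A**, stores quaternions as length-4 NumPy arrays, and uses hypothetical helper names (`real_to_augmented`, `augmented_to_real`). It builds the augmented vector both from **A** and directly from the rotation μ*q*μ^{−1}, and then recovers the real components through **A**^{H}/4.

```python
import numpy as np

ONE, I, J, K = np.eye(4)  # basis quaternions 1, i, j, k as (a, b, c, d) arrays


def qmul(p, q):
    """Hamilton product of two quaternions stored as length-4 arrays."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    ])


def qconj(q):
    """Quaternion conjugate."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])


def rotate(q, mu):
    """Quaternion rotation q^mu = mu q mu^{-1} for non-zero mu."""
    return qmul(qmul(mu, q), qconj(mu) / np.dot(mu, mu))


# The 4x4 matrix A; every entry is itself a quaternion.
A = [[ONE,  I,  J,  K],
     [ONE,  I, -J, -K],
     [ONE, -I,  J, -K],
     [ONE, -I, -J,  K]]


def real_to_augmented(r):
    """(q, q^i, q^j, q^k) = A (q_a, q_b, q_c, q_d); r holds four real numbers."""
    return np.array([sum(A[m][l] * r[l] for l in range(4)) for m in range(4)])


def augmented_to_real(aug):
    """(q_a, q_b, q_c, q_d) = (1/4) A^H (q, q^i, q^j, q^k)."""
    rows = []
    for l in range(4):
        # Entry (l, m) of A^H is the conjugate of entry (m, l) of A.
        rows.append(sum(qmul(qconj(A[m][l]), aug[m]) for m in range(4)) / 4.0)
    return np.array(rows)


q = np.array([1.0, 2.0, 3.0, 4.0])
aug = real_to_augmented(q)
by_rotation = np.array([rotate(q, mu) for mu in (ONE, I, J, K)])
recovered = augmented_to_real(aug)  # row l is the quaternion q_l + 0i + 0j + 0k
```

Each row of `recovered` is a quaternion whose real part is one component of *q* and whose imaginary parts vanish, which reflects **A**^{H}**A**=4**I**.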

### Lemma 2.5 (Duality of quaternion gradients and real gradient vectors [7])

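*For a real-differentiable function* *f*: ℍ→ℝ, *the relation between the quadrivariate real gradient vector* $(\partial f/\partial q_a, \partial f/\partial q_b, \partial f/\partial q_c, \partial f/\partial q_d)^{T}$ *and the augmented quaternion gradient vector* $(\partial f/\partial q^{*}, \partial f/\partial q^{i*}, \partial f/\partial q^{j*}, \partial f/\partial q^{k*})^{T}$ *is given by*

$$\begin{pmatrix} \partial f/\partial q_a \\ \partial f/\partial q_b \\ \partial f/\partial q_c \\ \partial f/\partial q_d \end{pmatrix} = \mathbf{A}^{H} \begin{pmatrix} \partial f/\partial q^{*} \\ \partial f/\partial q^{i*} \\ \partial f/\partial q^{j*} \\ \partial f/\partial q^{k*} \end{pmatrix}.$$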

### Lemma 2.6 (Duality of quaternion and real Jacobian matrices)

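*For a real-differentiable function* *f*: ℍ→ℍ, *the relation between the quadrivariate real Jacobian matrix* $\partial(f_a,f_b,f_c,f_d)/\partial(q_a,q_b,q_c,q_d)$ *and the augmented quaternion Jacobian matrix* $\partial(f,f^{i},f^{j},f^{k})/\partial(q^{\mu},q^{\mu i},q^{\mu j},q^{\mu k})$ *is given by*

$$\frac{\partial(f,f^{i},f^{j},f^{k})}{\partial(q^{\mu},q^{\mu i},q^{\mu j},q^{\mu k})} = \mathbf{A}\,\frac{\partial(f_a,f_b,f_c,f_d)}{\partial(q_a,q_b,q_c,q_d)}\,(\mathbf{A}^{\mu})^{-1},$$

*where*

$$\mathbf{A}^{\mu} = \mu\mathbf{A}\mu^{-1}, \qquad (\mathbf{A}^{\mu})^{-1} = \frac{1}{4}(\mathbf{A}^{\mu})^{H},$$

$$\frac{\partial(f_a,f_b,f_c,f_d)}{\partial(q_a,q_b,q_c,q_d)} = \left(\frac{\partial f_m}{\partial q_l}\right)_{m,l\in\{a,b,c,d\}}, \qquad \frac{\partial(f,f^{i},f^{j},f^{k})}{\partial(q^{\mu},q^{\mu i},q^{\mu j},q^{\mu k})} = \left(\frac{\partial f^{\nu}}{\partial q^{\mu\eta}}\right)_{\nu,\eta\in\{1,i,j,k\}},$$

*with* *f*(*q*)=*f*_{a}(*q*)+*if*_{b}(*q*)+*jf*_{c}(*q*)+*kf*_{d}(*q*), *q*=*q*_{a}+*iq*_{b}+*jq*_{c}+*kq*_{d} *and* *f*_{l}, *q*_{l}∈ℝ, *l*∈{*a*,*b*,*c*,*d*}.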

### Proof.

The proof follows directly from definition 2.3. ▪

### Lemma 2.7 (Duality of quaternion and real Hessian matrices [7])

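*For a real-differentiable function* *f*: ℍ→ℝ, *the relation between the quadrivariate real Hessian matrix* $\mathbf{H}_{rr}$ *and the augmented quaternion Hessian matrix* $\mathbf{H}_{qq}$ *is given by*

$$\mathbf{H}_{rr} = \mathbf{A}^{H}\,\mathbf{H}_{qq}\,\mathbf{A},$$

*where*

$$\mathbf{H}_{rr} = \left(\frac{\partial^{2} f}{\partial q_l\,\partial q_m}\right)_{l,m\in\{a,b,c,d\}}, \qquad \mathbf{H}_{qq} = \left(\frac{\partial}{\partial q^{\nu}}\left(\frac{\partial f}{\partial q^{\mu *}}\right)\right)_{\mu,\nu\in\{1,i,j,k\}}.$$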

## 3. Main results

### Theorem 3.1 (Chain rule of GHR derivatives)

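*Let* $S\subseteq\mathbb{H}$ *and let* $g: S\to\mathbb{H}$ *be real-differentiable at an interior point q of the set S. Let* $T\subseteq\mathbb{H}$ *be such that g(q)∈T for all q∈S. Assume that* $f: T\to\mathbb{H}$ *is real-differentiable at an interior point g(q)∈T. Then, the composite function h(q)=f(g(q)) satisfies the following chain rule*

$$\frac{\partial h}{\partial q^{\mu}} = \sum_{\nu\in\{1,i,j,k\}} \frac{\partial f}{\partial g^{\nu}}\,\frac{\partial g^{\nu}}{\partial q^{\mu}},$$

*where* $g^{\nu}=\nu g\nu^{-1}$ *and ∂f/∂g*^{ν}*=∂f(g)/∂g*^{ν} *(ν=1,i,j,k) are the GHR derivatives.*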

### Proof.

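Using the chain rule of multivariate real calculus, we have

$$\frac{\partial(h_a,h_b,h_c,h_d)}{\partial(q_a,q_b,q_c,q_d)} = \frac{\partial(f_a,f_b,f_c,f_d)}{\partial(g_a,g_b,g_c,g_d)}\,\frac{\partial(g_a,g_b,g_c,g_d)}{\partial(q_a,q_b,q_c,q_d)}.$$

Multiplying both sides from the left by **A** and from the right by (**A**^{μ})^{−1}, inserting **A**^{−1}**A** between the two factors on the right-hand side and using lemma 2.6, we have

$$\frac{\partial(h,h^{i},h^{j},h^{k})}{\partial(q^{\mu},q^{\mu i},q^{\mu j},q^{\mu k})} = \frac{\partial(f,f^{i},f^{j},f^{k})}{\partial(g,g^{i},g^{j},g^{k})}\,\frac{\partial(g,g^{i},g^{j},g^{k})}{\partial(q^{\mu},q^{\mu i},q^{\mu j},q^{\mu k})}.$$

The entry in the first row and first column of this matrix equation is the chain rule of theorem 3.1. ▪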
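The chain rule can be sanity-checked numerically. The sketch below assumes the form ∂*h*/∂*q*^{μ}=Σ_{ν∈{1,i,j,k}} (∂*f*/∂*g*^{ν})(∂*g*^{ν}/∂*q*^{μ}) and compares both sides for one arbitrarily chosen pair of functions, *f*(*g*)=*g*² and *g*(*q*)=*cq**+*q*. All helper names, the constant `C` and the test point are hypothetical, and the partial derivatives are estimated by central differences.

```python
import numpy as np

ONE, I, J, K = np.eye(4)  # basis quaternions 1, i, j, k as (a, b, c, d) arrays


def qmul(p, q):
    """Hamilton product of two quaternions stored as length-4 arrays."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    ])


def qconj(q):
    """Quaternion conjugate."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])


def rotate(q, mu):
    """Quaternion rotation q^mu = mu q mu^{-1} for non-zero mu."""
    return qmul(qmul(mu, q), qconj(mu) / np.dot(mu, mu))


def ghr_derivative(f, q, mu=ONE, h=1e-6):
    """Finite-difference estimate of the GHR derivative df/dq^mu."""
    partials = [(f(q + h * e) - f(q - h * e)) / (2.0 * h) for e in np.eye(4)]
    total = partials[0]
    for df_dl, unit in zip(partials[1:], (I, J, K)):
        total = total - qmul(df_dl, rotate(unit, mu))
    return total / 4.0


C = np.array([0.5, -0.3, 0.8, 0.1])  # arbitrary constant quaternion


def g(q):
    return qmul(C, qconj(q)) + q  # inner function g(q) = c q* + q


def f(p):
    return qmul(p, p)  # outer function f(g) = g^2


q0 = np.array([0.3, -1.2, 0.7, 2.0])
mu = np.array([1.0, 2.0, -1.0, 0.5])  # any non-zero quaternion
g0 = g(q0)

lhs = ghr_derivative(lambda q: f(g(q)), q0, mu)
rhs = sum(
    qmul(ghr_derivative(f, g0, nu),
         ghr_derivative(lambda q, nu=nu: rotate(g(q), nu), q0, mu))
    for nu in (ONE, I, J, K)
)
```

Note that the order of the two factors in each summand matters, because the quaternion product does not commute.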

### Theorem 3.2 (Product rule of GHR derivatives)

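*If the functions* $f, g: \mathbb{H}\to\mathbb{H}$ *are real-differentiable, then the derivative of their product fg satisfies the following product rule*

$$\frac{\partial (fg)}{\partial q^{\mu}} = f\,\frac{\partial g}{\partial q^{\mu}} + \frac{\partial f}{\partial q^{g\mu}}\,g,$$

*where* *∂f/∂q*^{gμ} *can be obtained by replacing μ with gμ in definition* 2.3.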

### Proof.

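Denote *h*(*q*)=*f*(*q*)*g*(*q*), *f*=*f*_{a}+*if*_{b}+*jf*_{c}+*kf*_{d} and *g*=*g*_{a}+*ig*_{b}+*jg*_{c}+*kg*_{d}. Then,

$$\frac{\partial(h_a,h_b,h_c,h_d)}{\partial(q_a,q_b,q_c,q_d)} = \mathbf{F}\,\frac{\partial(g_a,g_b,g_c,g_d)}{\partial(q_a,q_b,q_c,q_d)} + \mathbf{G}\,\frac{\partial(f_a,f_b,f_c,f_d)}{\partial(q_a,q_b,q_c,q_d)},$$

where **F** and **G** are the real 4×4 matrices that represent left multiplication by *f* and right multiplication by *g*, respectively, so that $(h_a,h_b,h_c,h_d)^{T}=\mathbf{F}(g_a,g_b,g_c,g_d)^{T}=\mathbf{G}(f_a,f_b,f_c,f_d)^{T}$. Multiplying both sides from the left by **A** and from the right by (**A**^{μ})^{−1} and using lemma 2.6, we have

$$\left(\frac{\partial(h,h^{i},h^{j},h^{k})}{\partial(q^{\mu},q^{\mu i},q^{\mu j},q^{\mu k})}\right)_{1} = f\left(\frac{\partial(g,g^{i},g^{j},g^{k})}{\partial(q^{\mu},q^{\mu i},q^{\mu j},q^{\mu k})}\right)_{1} + \left(\frac{\partial f}{\partial q_a}g,\ \frac{\partial f}{\partial q_b}g,\ \frac{\partial f}{\partial q_c}g,\ \frac{\partial f}{\partial q_d}g\right)(\mathbf{A}^{\mu})^{-1},$$

where (**X**)_{1} denotes the first row of the matrix **X**, and we have used (**AF**)_{1}=*f*(**A**)_{1} and (**AG**)_{1}=(**A**)_{1}*g*. The first element of the above vector equation finally yields

$$\frac{\partial (fg)}{\partial q^{\mu}} = f\,\frac{\partial g}{\partial q^{\mu}} + \frac{1}{4}\left(\frac{\partial f}{\partial q_a} - \frac{\partial f}{\partial q_b}\, i^{g\mu} - \frac{\partial f}{\partial q_c}\, j^{g\mu} - \frac{\partial f}{\partial q_d}\, k^{g\mu}\right) g = f\,\frac{\partial g}{\partial q^{\mu}} + \frac{\partial f}{\partial q^{g\mu}}\,g,$$

because $g\,\nu^{\mu}=\nu^{g\mu}g$ for ν∈{*i*,*j*,*k*}. ▪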
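The product rule can also be checked numerically. The following sketch assumes the form ∂(*fg*)/∂*q*^{μ}=*f* ∂*g*/∂*q*^{μ}+(∂*f*/∂*q*^{gμ})*g* and evaluates both sides for the arbitrarily chosen functions *f*(*q*)=*q*² and *g*(*q*)=*q**+*c*. The helper names, the constant `C` and the test point are hypothetical, and the partial derivatives are estimated by central differences.

```python
import numpy as np

ONE, I, J, K = np.eye(4)  # basis quaternions 1, i, j, k as (a, b, c, d) arrays


def qmul(p, q):
    """Hamilton product of two quaternions stored as length-4 arrays."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    ])


def qconj(q):
    """Quaternion conjugate."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])


def rotate(q, mu):
    """Quaternion rotation q^mu = mu q mu^{-1} for non-zero mu."""
    return qmul(qmul(mu, q), qconj(mu) / np.dot(mu, mu))


def ghr_derivative(f, q, mu=ONE, h=1e-6):
    """Finite-difference estimate of the GHR derivative df/dq^mu."""
    partials = [(f(q + h * e) - f(q - h * e)) / (2.0 * h) for e in np.eye(4)]
    total = partials[0]
    for df_dl, unit in zip(partials[1:], (I, J, K)):
        total = total - qmul(df_dl, rotate(unit, mu))
    return total / 4.0


C = np.array([0.5, -0.3, 0.8, 0.1])  # arbitrary constant quaternion


def f(q):
    return qmul(q, q)  # f(q) = q^2


def g(q):
    return qconj(q) + C  # g(q) = q* + c, non-zero at the test point below


q0 = np.array([0.3, -1.2, 0.7, 2.0])
mu = np.array([1.0, 2.0, -1.0, 0.5])  # any non-zero quaternion

lhs = ghr_derivative(lambda q: qmul(f(q), g(q)), q0, mu)
# The second term differentiates f with respect to q^{g mu}, not q^{mu}.
g_mu = qmul(g(q0), mu)
rhs = qmul(f(q0), ghr_derivative(g, q0, mu)) + qmul(ghr_derivative(f, q0, g_mu), g(q0))
```

The rotation by *g*μ in the second term is what distinguishes this rule from the product rule of real calculus; it requires *g*(*q*) to be non-zero at the point of evaluation.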

## 4. Application example

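The WL-QLMS algorithm is based on the quaternion widely linear model *y*(*n*)=**w**^{T}(*n*)**q**(*n*), where **w**(*n*) is the weight vector, **q**(*n*) is the augmented quaternion input vector and *e*(*n*)=*d*(*n*)−*y*(*n*) is the error between the desired signal *d*(*n*) and the filter output *y*(*n*). The weight update of WL-QLMS is then given by [3]

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \alpha\, e(n)\, \mathbf{q}^{*}(n),$$

where *α*>0 is the step size. Similar to the scalar duality in lemma 2.4, the duality between the augmented quaternion vector **q**(*n*) and its dual quadrivariate real vector **r**(*n*) is given by

$$\mathbf{q}(n) = \mathbf{J}\,\mathbf{r}(n),$$

where **J**=**A**⊗**I**_{N} denotes the Kronecker product of **A** (cf. (2.3)) and the *N*×*N* identity matrix **I**_{N}. The filter output *y*(*n*) can now be rewritten as

$$y(n) = \mathbf{w}^{T}(n)\,\mathbf{J}\,\mathbf{r}(n).$$

Transposing the weight update, multiplying both sides from the right by **J** and noting that **J**^{H}**J**=4**I**_{4N}, we have

$$\mathbf{w}^{T}(n+1)\,\mathbf{J} = \mathbf{w}^{T}(n)\,\mathbf{J} + 4\alpha\, e(n)\, \mathbf{r}^{T}(n).$$

Replacing **w**^{T}(*n*)**J** with **v**^{T}(*n*), the dual version of the WL-QLMS (D-WL-QLMS) algorithm becomes

$$y(n) = \mathbf{v}^{T}(n)\,\mathbf{r}(n), \tag{4.4}$$

with *e*(*n*)=*d*(*n*)−*y*(*n*), and

$$\mathbf{v}(n+1) = \mathbf{v}(n) + \alpha\, e(n)\, \mathbf{r}(n), \tag{4.6}$$

where the constant factor 4 has been absorbed into the step size *α*.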
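The equivalence of the two algorithms can be illustrated with a short simulation. This is a minimal sketch under several assumptions that are not taken from the paper: quaternion vectors are stored as NumPy arrays whose last axis holds (*a*, *b*, *c*, *d*), the input and the system `w_true` generating *d*(*n*) are synthetic Gaussian data, both filters start from zero weights, and the dual filter uses the step size 4*α* explicitly instead of absorbing the factor. All function names are hypothetical.

```python
import numpy as np


def qmul(p, q):
    """Hamilton product, broadcast over leading axes; last axis is (a, b, c, d)."""
    a1, b1, c1, d1 = np.moveaxis(p, -1, 0)
    a2, b2, c2, d2 = np.moveaxis(q, -1, 0)
    return np.stack([
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    ], axis=-1)


def qconj(q):
    """Quaternion conjugate, applied along the last axis."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])


def augment(x):
    """x of shape (N, 4) -> augmented vector (x, x^i, x^j, x^k) of shape (4N, 4)."""
    signs = np.array([[1, 1, 1, 1], [1, 1, -1, -1],
                      [1, -1, 1, -1], [1, -1, -1, 1]], dtype=float)
    return np.concatenate([x * s for s in signs])


def real_dual(x):
    """x of shape (N, 4) -> real vector (x_a, x_b, x_c, x_d) of shape (4N,)."""
    return x.T.reshape(-1)


def run_both(num_steps=200, N=3, alpha=0.005, seed=0):
    """Run WL-QLMS and D-WL-QLMS on the same data and return both outputs."""
    rng = np.random.default_rng(seed)
    w_true = rng.standard_normal((4 * N, 4))  # synthetic system (assumption)
    w = np.zeros((4 * N, 4))                  # WL-QLMS weights
    v = np.zeros((4 * N, 4))                  # D-WL-QLMS weights (quaternion-valued)
    y_primal, y_dual = [], []
    for _ in range(num_steps):
        x = rng.standard_normal((N, 4))
        q, r = augment(x), real_dual(x)
        d = qmul(w_true, q).sum(axis=0)       # desired signal d(n)

        # WL-QLMS: y = w^T q, w <- w + alpha * e * conj(q)
        y = qmul(w, q).sum(axis=0)
        w = w + alpha * qmul(d - y, qconj(q))
        y_primal.append(y)

        # D-WL-QLMS: y = v^T r with real r, v <- v + 4 * alpha * e * r
        y2 = (v * r[:, None]).sum(axis=0)
        v = v + 4.0 * alpha * (d - y2) * r[:, None]
        y_dual.append(y2)
    return np.array(y_primal), np.array(y_dual)


y_primal, y_dual = run_both()
```

In the dual filter every product involves the real vector **r**(*n*), so each weight update is a real scaling of the quaternion error rather than a full quaternion multiplication.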

### Remark 4.1

From (4.4), we can see that the D-WL-QLMS and WL-QLMS have the same filter output. Therefore, the D-WL-QLMS has the same performance as the WL-QLMS, which is better than that of the strictly linear QLMS and the real LMS (RLMS) [5,11] when dealing with non-circular inputs. Table 1 shows that the D-WL-QLMS also has a lower computational complexity than the WL-QLMS, owing to the use of the real-valued input vector **r**(*n*) in the calculations. This is similar to the operation of the RLMS; however, unlike in the RLMS, the weight vector **v**(*n*) of the D-WL-QLMS is quaternion-valued.

### Remark 4.2

Note that if we start from *y*(*n*)=**w**^{H}(*n*)**q**(*n*), then an alternative form of the WL-QLMS is given by **w**(*n*+1)=**w**(*n*)+*α* **q**(*n*)*e**(*n*) [3]. Denoting **v**^{T}(*n*)=**w**^{H}(*n*)**J**, the filter output becomes *y*(*n*)=**w**^{H}(*n*)**J****r**(*n*)=**v**^{T}(*n*)**r**(*n*), that is, the same as in (4.4). Consequently, the proposed D-WL-QLMS keeps the same form of the update rule as in (4.6), and has the same computational complexity as that shown in table 1.

## 5. Conclusion

We have provided a new proof for the GHR calculus through the duality relationships between the GHR calculus and multivariate real calculus. These results complement the original proof given in [3], are easier to understand and are physically meaningful, and thus provide additional insights into the operation of the GHR calculus. An application example in adaptive learning theory demonstrates the advantages of the proposed approach.

## Authors' contributions

D.X. designed the study, carried out the analysis and drafted the manuscript; H.G. checked the analysis results and helped with the WL-QLMS algorithm; D.P.M. conceived the study and helped with the analysis and drafting the manuscript. All authors gave final approval for publication.

## Competing interests

The authors have no competing interests.

## Funding

D.X. was supported by the National Natural Science Foundation of China (no. 61301202), by the Fundamental Research Funds for the Central Universities (no. 2412016KJ014) and by Science and Technology Development Planning of Jilin Province. D.P.M. was partially supported by EPSRC grant no. EP/H026266/1.

## Acknowledgements

The authors thank the anonymous reviewers for their very helpful and insightful suggestions, and their expert advice on how to improve the proof of the two theorems.

- Received March 23, 2016.
- Accepted August 8, 2016.

- © 2016 The Authors.

Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.