Chapter 2 of my Dissertation

2023 Notes: Another exercise in converting troff into QTML. I used vim to replace dollar signs with .eqni tags for the inline equations. When an inline equation took two lines of source, this method swapped the opening and closing tags. I manually cleaned things up, but I probably missed a couple, and likely messed up some paragraph breaks. Chapter 1, by the way, is here.

CHAPTER 2: EQUATIONS OF MOTION

The Hertz-Vector Equations

The six components of the $E$ and $B$ fields are redundant as regards the underlying mathematics of the electromagnetic field. Nisbet [1] has shown that with the use of the Hertz vectors and their associated gauge transformations it is possible to describe the electromagnetic field in terms of only two components of the six possible Hertz vector components, even in the source regions. In a subsequent paper [2] he develops a similar approach for fields in inhomogeneous, anisotropic media. A literature search finds that little note has been taken of these very fundamental techniques which have the potential of simplifying many problems in electrodynamics. In the hope of popularizing this knowledge, the Hertz vector formulation is here re-derived for the isotropic case with some steps and commentary added, before we state the specific gauge condition to be used for this research. We start with Maxwell's equations in Gaussian cgs units:

(2.1)

\nabla \times E + \frac{1}{c} \partial_{t} B = 0,

(2.2)

\nabla \cdot B = 0,

(2.3)

\nabla \times H - \frac{1}{c} \partial_{t} D = 4 π J,

(2.4)

\nabla \cdot D = 4 π ρ,

and with the constitutive relations

(2.5)

D = ε E, B = μ H .

Note that in this formulation these constitutive relations only take care of the linear polarization. The nonlinear polarization is contained in $ρ$ and $J$. We have four sets of coupled partial differential equations to solve for the six scalar components of the electric and magnetic fields. Since the fields are not independent, we can reduce the number of components to be solved for by describing the fields in terms of potentials. For example, the vector and scalar potentials can be defined inside an inhomogeneous medium in the same fashion as in free space:

(2.6)

E = - \nabla φ - \frac{1}{c} \partial_{t} A

and

(2.7)

B = \nabla \times A .

These identically satisfy (2.1) and (2.2). We now have only four components to find and two vector equations for them to satisfy. Unfortunately, the two remaining Maxwell equations lead to the coupled equations:

(2.8)

\nabla \times \frac{1}{μ} \nabla \times A + \frac{ε}{c} \partial_{t} \nabla φ + \frac{ε}{c^{2}} \partial_{t t} A = 4 π J,

(2.9)

- \nabla \cdot ε \nabla φ - \frac{1}{c} \nabla \cdot ε \partial_{t} A = 4 π ρ .

The main advantage of potentials is that they are not unique. If we set

(2.10)

φ = φ_{0} - \frac{1}{c} \partial_{t} χ,

(2.11)

A = A_{0} + \nabla χ,

$E$ and $B$ are unaffected. This allows various gauge transformations to be used to simplify the equations for a particular problem. For example, the gauge transformations allow us to set

(2.12)

\frac{1}{c} \partial_{t} φ + \nabla \cdot ε A = 0 .

This separates the scalar and vector equations in the same way that the Lorentz condition does in free space. The equations of motion are now

(2.13)

\nabla \times (\frac{1}{μ} \nabla \times A) + \frac{ε}{c^{2}} \partial_{t t} A - ε \nabla (\nabla \cdot ε A) = 4 π J,

(2.14)

- \nabla \cdot (ε \nabla φ) + \frac{1}{c^{2}} \partial_{t t} φ = 4 π ρ .

If the permittivity or permeability varies in space, however, the equations for the components of $A$ are coupled. Also, the mixed second partial derivatives with respect to $z$ and the transverse coordinates rule out an explicit eigenvalue-eigenfunction formulation for finding the bound modes. One important idea behind using the Hertz vector formulation is that it allows a much larger selection of gauge transformations to choose from than $A$ and $φ$ . This is accomplished by expressing the source terms as well as the fields in terms of potentials. To start with, we define stream potentials $Q_{e}$ and $Q_{m}$ such that

(2.15)

\nabla \cdot Q_{e} = - ρ,

and

(2.16)

\frac{1}{c} \partial_{t} Q_{e} + \nabla \times (\frac{1}{μ} Q_{m}) = J .

These potentials fit in Maxwell's equations in the same positions as the polarization and magnetization densities. If there are no free charges, they could well be those densities. Even without free charges, $Q_{e}$ and $Q_{m}$ could differ from $P$ and $M$ because we allow the following gauge transformations which leave the physical sources $ρ$ and $J$ unaffected:

(2.17)

Q_{e} = Q_{e}^{0} + \nabla \times \frac{1}{μ} G,

(2.18)

Q_{m} = Q_{m}^{0} - \frac{1}{c} \partial_{t} G - μ \nabla g .
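
As a quick check of (2.17) and (2.18), assuming $μ$ does not depend on time, the added terms drop out of the source definitions (2.15) and (2.16) because the divergence of a curl and the curl of a gradient both vanish:

```latex
\nabla \cdot (\nabla \times \frac{1}{μ} G) = 0,

\frac{1}{c} \partial_{t} (\nabla \times \frac{1}{μ} G) + \nabla \times \frac{1}{μ} (- \frac{1}{c} \partial_{t} G - μ \nabla g) = - \nabla \times \nabla g = 0 .
```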

Next, we define stream potentials $R_{e}$ and $R_{m}$ for the magnetic monopole charge and current densities. Of course there are no such (known) things as magnetic monopoles so the right-hand sides of our definitions are set to zero:

(2.19)

\nabla \cdot R_{m} = 0,

and

(2.20)

- \frac{1}{c} \partial_{t} R_{m} + \nabla \times \frac{1}{ε} R_{e} = 0 .

The (un)physical sources are invariant when the following gauge transformations are made:

(2.21)

R_{e} = R_{e}^{0} - \frac{1}{c} \partial_{t} L - ε \nabla l,

(2.22)

R_{m} = R_{m}^{0} - \nabla \times \frac{1}{ε} L .

When these two potentials are used, $φ$ and $A$ must become the potentials for $E - \frac{4 π}{ε} R_{e}$ and $B + 4 π R_{m}$. That is,

(2.23)

\nabla \times A = B + 4 π R_{m}, - \nabla φ - \frac{1}{c} \partial_{t} A = E - 4 π \frac{1}{ε} R_{e} .

The Hertz vectors are potentials which use these stream potentials instead of their derivatives as their source terms. This is done by defining $Π_{e}$ and $Π_{m}$ through the equations:

(2.24)

φ = - \nabla \cdot ε Π_{e},

(2.25)

A = \frac{1}{c} \partial_{t} Π_{e} + \frac{1}{ε} \nabla \times Π_{m} .
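
Substituting (2.24) and (2.25) into the left-hand side of the gauge condition (2.12), and assuming $ε$ does not depend on time, gives:

```latex
\frac{1}{c} \partial_{t} φ + \nabla \cdot ε A = - \frac{1}{c} \partial_{t} \nabla \cdot ε Π_{e} + \frac{1}{c} \nabla \cdot ε \partial_{t} Π_{e} + \nabla \cdot (\nabla \times Π_{m}) = 0 .
```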

This choice automatically satisfies our variant of the Lorentz condition. To find the physical fields in terms of the Hertz vectors, we plug into the definitions of the vector and scalar potentials:

(2.26)

E = \frac{4 π}{ε} R_{e}^{0} - \nabla φ - \frac{1}{c} \partial_{t} A = \frac{4 π}{ε} R_{e}^{0} + \nabla (\nabla \cdot ε Π_{e}) - \frac{1}{c^{2}} \partial_{t t} Π_{e} - \frac{1}{c ε} \nabla \times \partial_{t} Π_{m} .

This can be simplified by letting $l = - \frac{1}{4 π} \nabla \cdot ε Π_{e}$ in the gauge transformation for $R_{e}$ . This gives

(2.27)

E = \frac{4 π}{ε} R_{e} - \frac{1}{c^{2}} \partial_{t t} Π_{e} - \frac{1}{c ε} \nabla \times \partial_{t} Π_{m} .

And for $B$ :

(2.28)

B = - 4 π R_{m} + \nabla \times A = - 4 π R_{m} + \nabla \times \frac{1}{c} \partial_{t} Π_{e} + \nabla \times \frac{1}{ε} \nabla \times Π_{m} .

To obtain equations of motion for the Hertz vectors, we first plug into (2.4):

(2.29)

\nabla \cdot D = \nabla \cdot ε E = \nabla \cdot (4 π R_{e} - \frac{ε}{c^{2}} \partial_{t t} Π_{e} - \nabla \times \frac{1}{c} \partial_{t} Π_{m}) = - 4 π \nabla \cdot Q_{e} .

We can remove the divergence at the price of introducing a curl:

(2.30)

\frac{ε}{c^{2}} \partial_{t t} Π_{e} = 4 π (Q_{e} + R_{e}) + \nabla \times T,

where $T$ is an as yet arbitrary vector. From (2.3) we get

(2.31)

\nabla \times \frac{B}{μ} - \frac{ε}{c} \partial_{t} E = \frac{4 π}{c} \partial_{t} Q_{e} + 4 π \nabla \times \frac{1}{μ} Q_{m}

(2.32)

= \nabla \times \frac{1}{μ} (- 4 π R_{m} + \nabla \times \frac{1}{c} \partial_{t} Π_{e} + \nabla \times \frac{1}{ε} \nabla \times Π_{m}) - \frac{1}{c} \partial_{t} (4 π R_{e} - \frac{ε}{c^{2}} \partial_{t t} Π_{e} - \nabla \times \frac{1}{c} \partial_{t} Π_{m}) .

Substituting our relation for $\frac{ε}{c^{2}} \partial_{t t} Π_{e}$ we have:

(2.33)

= \nabla \times \frac{1}{μ} (- 4 π R_{m} + \nabla \times \frac{1}{c} \partial_{t} Π_{e} + \nabla \times \frac{1}{ε} \nabla \times Π_{m})

- (\frac{4 π}{c} \partial_{t} R_{e} - \frac{4 π}{c} \partial_{t} R_{e} - \frac{4 π}{c} \partial_{t} Q_{e} - \nabla \times \frac{1}{c} \partial_{t} T - \nabla \times \frac{1}{c^{2}} \partial_{t t} Π_{m}) .

To get rid of $Π_{e}$ , we let $T = - \frac{1}{μ} \nabla \times Π_{e}$ and cancel terms to get

(2.34)

\nabla \times \frac{1}{μ} (- 4 π R_{m} + \nabla \times \frac{1}{ε} \nabla \times Π_{m} + \frac{μ}{c^{2}} \partial_{t t} Π_{m}) = 4 π \nabla \times \frac{1}{μ} Q_{m},

(2.35)

\frac{μ}{c^{2}} \partial_{t t} Π_{m} + \nabla \times \frac{1}{ε} \nabla \times Π_{m} = 4 π (Q_{m} + R_{m}),

and the equation for $Π_{e}$ becomes

(2.36)

\frac{ε}{c^{2}} \partial_{t t} Π_{e} + \nabla \times \frac{1}{μ} \nabla \times Π_{e} = 4 π (Q_{e} + R_{e}) .

We could have $μ$ times the gradient of an arbitrary scalar tacked onto the end of the equation for $Π_{m}$, but this is taken into account by the available gauge transformation of $Q_{m}$. Finally, we can use these equations of motion to get symmetric equations for $D$ and $H$:

(2.37)

D = ε E = 4 π R_{e} - \frac{ε}{c^{2}} \partial_{t t} Π_{e} - \nabla \times \frac{1}{c} \partial_{t} Π_{m} = - 4 π Q_{e} - \nabla \times \frac{1}{c} \partial_{t} Π_{m} + \nabla \times \frac{1}{μ} \nabla \times Π_{e},

(2.38)

H = \frac{B}{μ} = - \frac{4 π}{μ} R_{m} + \frac{1}{μ} \nabla \times \frac{1}{c} \partial_{t} Π_{e} + \frac{1}{μ} \nabla \times \frac{1}{ε} \nabla \times Π_{m} = \frac{4 π}{μ} Q_{m} - \frac{1}{c^{2}} \partial_{t t} Π_{m} + \frac{1}{μ} \nabla \times \frac{1}{c} \partial_{t} Π_{e} .

The next step is to find the available gauge transformations for the Hertz vectors. The first two gauge transformations follow easily from the definition of the Hertz vectors in terms of the scalar and vector potentials (2.24, 2.25). The scalar potential is obviously unaffected when $Π_{e}$ undergoes the transformation $Π_{e} = Π_{e}^{0} + ε^{- 1} \nabla \times Γ$. In order to keep $A$ constant we must make the transformation $Π_{m} = Π_{m}^{0} - \frac{1}{c} \partial_{t} Γ$. Setting $Π_{e}^{0}, Q_{e}^{0},$ and $R_{e}^{0}$ to zero and substituting into (2.36) gives

(2.39)

\frac{1}{c^{2}} \partial_{t t} \nabla \times Γ + \nabla \times \frac{1}{μ} \nabla \times \frac{1}{ε} \nabla \times Γ = 4 π (\nabla \times \frac{1}{μ} G - \frac{1}{c} \partial_{t} L - ε \nabla l) .

Since this must hold true even without the introduction of stream potentials for magnetic monopoles, and $L$ and $l$ transform the monopole potentials, we set them to zero. Thus $G$ must satisfy:

(2.40)

4 π G = \frac{μ}{c^{2}} \partial_{t t} Γ + \nabla \times \frac{1}{ε} \nabla \times Γ - μ \nabla ξ .

The last term is the as yet arbitrary gradient that is allowed when a curl is set to zero. The remaining obvious gauge transformation is that of adding the gradient of a scalar to $Π_{m}$, which leaves the vector potential $A$ unaffected ($Π_{m} = Π_{m}^{0} - \nabla γ$). Applying both transformations, setting the monopole stream potentials to zero and plugging into the equation of motion for $Π_{m}$ gives

(2.41)

\frac{1}{c^{2}} \partial_{t t} (- \frac{μ}{c} \partial_{t} Γ - μ \nabla γ) - \nabla \times \frac{1}{ε} \nabla \times \frac{1}{c} \partial_{t} Γ = - 4 π \frac{1}{c} \partial_{t} G - 4 π μ \nabla g .

Substitute back in the equation for $G$ :

(2.42)

- \frac{1}{c^{2}} \partial_{t t} μ \nabla γ = \frac{1}{c} \partial_{t} μ \nabla ξ - 4 π μ \nabla g .

(2.43)

4 π g = \frac{1}{c^{2}} \partial_{t t} γ + \frac{1}{c} \partial_{t} ξ .

The gauge functions for the Hertz vectors resulting from the gauge transformations for the magnetic monopole potentials are less obvious by themselves since both the Hertz vector gauge functions and the magnetic monopole gauge functions appear in the redefinition of $A$ and $φ$ (2.23). However, from the abundance of symmetries encountered so far (in $E$ and $H$, $B$ and $D$, $Q_{e}$ and $R_{m}$, $Q_{m}$ and $R_{e}$, and $Π_{e}$ and $Π_{m}$), it is possible to guess the Hertz vector gauge transformations resulting from the transformations for $R_{e}$ and $R_{m}$ from those resulting from the transformations of $Q_{e}$ and $Q_{m}$. So here is the complete set of available gauge transformations; the remaining verifications, if desired, are left to the reader:

(2.44)

Π_{e} = Π_{e}^{0} + \frac{1}{ε} \nabla \times Γ - \frac{1}{c} \partial_{t} Λ - \nabla λ,

(2.45)

Π_{m} = Π_{m}^{0} - \frac{1}{c} \partial_{t} Γ - \nabla γ - \frac{1}{μ} \nabla \times Λ,

if the gauge functions are related by the following relations:

(2.46)

4 π g = \frac{1}{c^{2}} \partial_{t t} γ + \frac{1}{c} \partial_{t} ξ,

(2.47)

4 π l = \frac{1}{c^{2}} \partial_{t t} λ + \frac{1}{c} \partial_{t} ζ,

(2.48)

4 π G = μ \frac{1}{c^{2}} \partial_{t t} Γ + \nabla \times \frac{1}{ε} \nabla \times Γ - μ \nabla ξ,

(2.49)

4 π L = ε \frac{1}{c^{2}} \partial_{t t} Λ + \nabla \times \frac{1}{μ} \nabla \times Λ - ε \nabla ζ,

where $ξ$ and $ζ$ are any arbitrary functions. Nisbet claims that this set of gauge transformations allows us to set all but any two components to zero. In actual practice this is accomplished by setting all but two of the Hertz vector components to zero at a particular time. Then the stream potentials are transformed in order to make the time derivatives of the zeroed-out components equal to zero in the equations of motion for the Hertz vectors. Nisbet and later Mohsen [2-4] then describe various special cases in which this formalism produces simplified equations. The emphasis is on situations in which one can obtain two uncoupled scalar partial differential equations. Such situations include materials which vary in only one direction, spherically symmetric media, and materials which vary in two dimensions while the field does not vary in the third dimension. Uncoupled scalar equations can be obtained by setting all but one component each of $Π_{e}$ and $Π_{m}$ to be zero. For media which vary in the $z$ direction this is the $z$ component. For spherically symmetric media this is the $r$ direction. For media which vary in $x$ and $y$, while the field does not vary in $z$, this is the $z$ component.

The original research begins with the choice of gauges. For the case of guided optical devices it is desirable to have equations of motion which have no cross derivatives with respect to the propagation direction and either of the transverse directions. Such equations allow the solution of bound modes to be an explicit eigenvalue equation in which the propagation constant is proportional to the eigenvalues of a transverse operator. Also, such a formulation simplifies the task of formulating a vector beam propagation method since the task (for no nonlinearity) is reduced to approximating the exponential of a transverse operator. This is, of course, the case for the scalar beam propagation method. For the purposes of this work we represent all non-linear phenomena through the polarization density $P$. Thus, we can set $Q_{e}^{0} = P, Q_{m}^{0} = R_{e}^{0} = R_{m}^{0} = 0$. The equations of motion for the Hertz vectors become

(2.50)

\frac{ε}{c^{2}} \partial_{t t} Π_{e} + \nabla \times \frac{1}{μ} \nabla \times Π_{e} = 4 π (P + \nabla \times \frac{1}{μ} G - \frac{1}{c} \partial_{t} L - ε \nabla l),

(2.51)

\frac{μ}{c^{2}} \partial_{t t} Π_{m} + \nabla \times \frac{1}{ε} \nabla \times Π_{m} = 4 π (- \frac{1}{c} \partial_{t} G - μ \nabla g - \nabla \times \frac{1}{ε} L) .

For a waveguide device pointing in the $z$ direction, we can get the desired form by setting $Π_{e x}$ and $Π_{e y}$ to be the non-zero components. We can keep $Π_{m}$ zero if we set

(2.52)

G = - \nabla \times \frac{1}{ε} a - μ \nabla b

(2.53)

g = \frac{1}{c} \partial_{t} b

(2.54)

L = \frac{1}{c} \partial_{t} a .

Here $a$ and $b$ are arbitrary. As it turns out, the trivial choice of setting $G, g,$ and $L$ to zero is quite sufficient. This leaves $l$ as a free parameter with which to make $Π_{e z}$ zero. Expanding what is left into components gives:

(2.55)

\frac{ε}{c^{2}} \partial_{t t} Π_{e x} + \partial_{y} \frac{1}{μ} (\partial_{x} Π_{e y} - \partial_{y} Π_{e x}) - \partial_{z} \frac{1}{μ} (\partial_{z} Π_{e x} - \partial_{x} Π_{e z}) = 4 π P_{x} - 4 π ε \partial_{x} l,

(2.56)

\frac{ε}{c^{2}} \partial_{t t} Π_{e y} + \partial_{z} \frac{1}{μ} (\partial_{y} Π_{e z} - \partial_{z} Π_{e y}) - \partial_{x} \frac{1}{μ} (\partial_{x} Π_{e y} - \partial_{y} Π_{e x}) = 4 π P_{y} - 4 π ε \partial_{y} l,

(2.57)

\frac{ε}{c^{2}} \partial_{t t} Π_{e z} + \partial_{x} \frac{1}{μ} (\partial_{z} Π_{e x} - \partial_{x} Π_{e z}) - \partial_{y} \frac{1}{μ} (\partial_{y} Π_{e z} - \partial_{z} Π_{e y}) = 4 π P_{z} - 4 π ε \partial_{z} l .

Setting $Π_{e z}$ zero in the last equation leaves

(2.58)

\partial_{x} \frac{1}{μ} \partial_{z} Π_{e x} + \partial_{y} \frac{1}{μ} \partial_{z} Π_{e y} = 4 π P_{z} - 4 π ε \partial_{z} l .

For a waveguide in the $z$ direction, $ε$ and $μ$ are not dependent on $z$ . So we can commute them with the $\partial_{z}$ operator to get

(2.59)

l = l_{1} - \frac{1}{4 π ε} (\partial_{x} \frac{1}{μ} Π_{e x} + \partial_{y} \frac{1}{μ} Π_{e y}),

where

(2.60)

\partial_{z} l_{1} = \frac{P_{z}}{ε} .
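
As a check, substituting (2.59) and (2.60) into the right-hand side of (2.58), with $ε$ and $μ$ independent of $z$, recovers the left-hand side:

```latex
4 π P_{z} - 4 π ε \partial_{z} l = 4 π P_{z} - 4 π ε \partial_{z} l_{1} + \partial_{z} (\partial_{x} \frac{1}{μ} Π_{e x} + \partial_{y} \frac{1}{μ} Π_{e y}) = \partial_{x} \frac{1}{μ} \partial_{z} Π_{e x} + \partial_{y} \frac{1}{μ} \partial_{z} Π_{e y} .
```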

Substituting this value of $l$ into (2.55) and (2.56) provides our equations of motion for the remaining components.

(2.61)

\frac{ε}{c^{2}} \partial_{t t} Π_{e x} + \partial_{y} \frac{1}{μ} \partial_{x} Π_{e y} - \partial_{y} \frac{1}{μ} \partial_{y} Π_{e x} - \frac{1}{μ} \partial_{z z} Π_{e x}

= 4 π P_{x} - 4 π ε \partial_{x} l_{1} + ε \partial_{x} \frac{1}{ε} (\partial_{x} \frac{1}{μ} Π_{e x} + \partial_{y} \frac{1}{μ} Π_{e y}),

(2.62)

\frac{ε}{c^{2}} \partial_{t t} Π_{e y} - \frac{1}{μ} \partial_{z z} Π_{e y} - \partial_{x} \frac{1}{μ} \partial_{x} Π_{e y} + \partial_{x} \frac{1}{μ} \partial_{y} Π_{e x}

= 4 π P_{y} - 4 π ε \partial_{y} l_{1} + ε \partial_{y} \frac{1}{ε} (\partial_{x} \frac{1}{μ} Π_{e x} + \partial_{y} \frac{1}{μ} Π_{e y}) .

We can write this as

(2.63)

\frac{μ ε}{c^{2}} \partial_{t t} Π - \partial_{z z} Π = W Π + 4 π q,

where

(2.64)

Π = \left[ \begin{array}{c} Π_{e x} \\ Π_{e y} \end{array} \right], q = \left[ \begin{array}{c} μ P_{x} - μ ε \partial_{x} l_{1} \\ μ P_{y} - μ ε \partial_{y} l_{1} \end{array} \right],

and

(2.65)

W = \left[ \begin{array}{cc} μ ε \partial_{x} \frac{1}{ε} \partial_{x} \frac{1}{μ} + μ \partial_{y} \frac{1}{μ} \partial_{y} & μ ε \partial_{x} \frac{1}{ε} \partial_{y} \frac{1}{μ} - μ \partial_{y} \frac{1}{μ} \partial_{x} \\ - μ \partial_{x} \frac{1}{μ} \partial_{y} + μ ε \partial_{y} \frac{1}{ε} \partial_{x} \frac{1}{μ} & μ \partial_{x} \frac{1}{μ} \partial_{x} + μ ε \partial_{y} \frac{1}{ε} \partial_{y} \frac{1}{μ} \end{array} \right] .

Note that when $ε$ and $μ$ are constant in space, $W$ becomes $\nabla_{T}^{2}$. In the case where $μ$ is constant we can use the fact that $ε \partial_{x} \frac{1}{ε} = \partial_{x} - \frac{(\partial_{x} ε)}{ε}$ to get

(2.66)

W = \left[ \begin{array}{cc} \partial_{x x} + \partial_{y y} - \frac{(\partial_{x} ε)}{ε} \partial_{x} & - \frac{(\partial_{x} ε)}{ε} \partial_{y} \\ - \frac{(\partial_{y} ε)}{ε} \partial_{x} & \partial_{x x} + \partial_{y y} - \frac{(\partial_{y} ε)}{ε} \partial_{y} \end{array} \right] .
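
To make the structure of the constant-$μ$ operator in (2.66) concrete, here is a minimal sketch that applies it on a uniform grid. The function name `apply_w`, the list-of-lists storage, and the use of second-order central differences are all assumptions made for illustration; this is not the discretization used by the programs described in this work.

```python
def apply_w(pi_x, pi_y, eps, h):
    """Apply the constant-mu operator W of (2.66) at interior grid points.

    pi_x, pi_y and eps are 2-D lists indexed [i][j] with x = i*h, y = j*h.
    Returns (w_x, w_y); entries on the grid boundary are left at zero.
    """
    nx, ny = len(eps), len(eps[0])
    w_x = [[0.0] * ny for _ in range(nx)]
    w_y = [[0.0] * ny for _ in range(nx)]

    def dx(f, i, j):
        # Central difference in x.
        return (f[i + 1][j] - f[i - 1][j]) / (2.0 * h)

    def dy(f, i, j):
        # Central difference in y.
        return (f[i][j + 1] - f[i][j - 1]) / (2.0 * h)

    def lap(f, i, j):
        # Five-point transverse Laplacian.
        return (f[i + 1][j] + f[i - 1][j] + f[i][j + 1] + f[i][j - 1]
                - 4.0 * f[i][j]) / (h * h)

    for i in range(1, nx - 1):
        for j in range(1, ny - 1):
            # Both rows of W act on the same combination d_x Pi_x + d_y Pi_y.
            div = dx(pi_x, i, j) + dy(pi_y, i, j)
            w_x[i][j] = lap(pi_x, i, j) - dx(eps, i, j) / eps[i][j] * div
            w_y[i][j] = lap(pi_y, i, j) - dy(eps, i, j) / eps[i][j] * div
    return w_x, w_y
```

With a constant `eps` the coupling terms vanish and each component sees only the transverse Laplacian, which is the scalar limit noted above.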

Throughout the rest of this work, it will be assumed that $μ$ is constant. Under this set of conditions, it is now useful to expand the equation for getting the electric field from the Hertz vectors.

(2.67)

E = \frac{4 π}{ε} R_{e} - \frac{1}{c^{2}} \partial_{t t} Π_{e} = \frac{4 π}{ε} (- ε \nabla l) - \frac{1}{c^{2}} \partial_{t t} Π_{e} .

We can substitute in the relations for $l$ and write the result in matrix form.

(2.68)

\left[ \begin{array}{c} E_{x} \\ E_{y} \\ E_{z} \end{array} \right] = \frac{1}{μ ε} \left[ \begin{array}{cc} (\partial_{x} - \frac{(\partial_{x} ε)}{ε}) \partial_{x} & (\partial_{x} - \frac{(\partial_{x} ε)}{ε}) \partial_{y} \\ (\partial_{y} - \frac{(\partial_{y} ε)}{ε}) \partial_{x} & (\partial_{y} - \frac{(\partial_{y} ε)}{ε}) \partial_{y} \\ \partial_{x} \partial_{z} & \partial_{y} \partial_{z} \end{array} \right] \left[ \begin{array}{c} Π_{x} \\ Π_{y} \end{array} \right] - \frac{1}{c^{2}} \partial_{t t} \left[ \begin{array}{c} Π_{x} \\ Π_{y} \\ 0 \end{array} \right] - 4 π \left[ \begin{array}{c} \partial_{x} l_{1} \\ \partial_{y} l_{1} \\ \frac{P_{z}}{ε} \end{array} \right] .

It turns out that by ignoring the terms involving the derivatives of $ε$, we almost regain the scalar approximation. To show this, we first expand each component of the electric field given this approximation:

(2.69)

E_{x} = \frac{1}{μ ε} (\partial_{x x} Π_{x} + \partial_{x} \partial_{y} Π_{y}) - \frac{1}{c^{2}} \partial_{t t} Π_{x} - 4 π \partial_{x} l_{1},

(2.70)

E_{y} = \frac{1}{μ ε} (\partial_{x} \partial_{y} Π_{x} + \partial_{y y} Π_{y}) - \frac{1}{c^{2}} \partial_{t t} Π_{y} - 4 π \partial_{y} l_{1},

(2.71)

E_{z} = \frac{1}{μ ε} (\partial_{z} \partial_{x} Π_{x} + \partial_{z} \partial_{y} Π_{y}) - 4 π \frac{P_{z}}{ε} .

Next we take the divergence of this approximation to the electric field making the further approximation of once again commuting the transverse derivatives with $1 / ε$ :

(2.72)

\nabla \cdot E \approx \frac{1}{μ ε} (\partial_{x x} \partial_{x} Π_{x} + \partial_{x x} \partial_{y} Π_{y}) - \frac{1}{c^{2}} \partial_{t t} \partial_{x} Π_{x} - 4 π \partial_{x x} l_{1}

+ \frac{1}{μ ε} (\partial_{y y} \partial_{x} Π_{x} + \partial_{y y} \partial_{y} Π_{y}) - \frac{1}{c^{2}} \partial_{t t} \partial_{y} Π_{y} - 4 π \partial_{y y} l_{1}

+ \frac{1}{μ ε} (\partial_{z z} \partial_{x} Π_{x} + \partial_{z z} \partial_{y} Π_{y}) - 4 π \partial_{z} \frac{P_{z}}{ε}

To take care of the $z$ derivatives of the Hertz field we make use of the equation of motion (2.63), with $\partial_{t t} \to - ω^{2}$ for a field oscillating at frequency $ω$, once again making the approximation that the transverse derivatives commute with $1 / ε$:

(2.73)

\frac{1}{μ ε} \partial_{z z} \partial_{x} Π_{x} \approx \frac{1}{μ ε} (- \partial_{x x} - \partial_{y y} - \frac{μ ε ω^{2}}{c^{2}}) \partial_{x} Π_{x} - \frac{4 π}{ε} \partial_{x} P_{x} + 4 π \partial_{x x} l_{1},

(2.74)

\frac{1}{μ ε} \partial_{z z} \partial_{y} Π_{y} \approx \frac{1}{μ ε} (- \partial_{x x} - \partial_{y y} - \frac{μ ε ω^{2}}{c^{2}}) \partial_{y} Π_{y} - \frac{4 π}{ε} \partial_{y} P_{y} + 4 π \partial_{y y} l_{1} .

Substituting these back in makes almost everything cancel:

(2.75)

\nabla \cdot E \approx - \frac{4 π}{ε} \nabla \cdot P .

So we have made the scalar approximation after commuting derivatives as regards $ε$ , but not regarding the polarization. The propagation code is fitted with a flag ISCALAR to make this a runtime option. If the divergence of the nonlinear polarization is sufficiently small this should give results close to the scalar approximation. For comparison with a true scalar approximation, a separate routine was written propagating the electric field directly with the scalar and paraxial approximations using the same numerical methods as were used for the Hertz vector program.

The Equations for Beam Propagation

The numerical effort of solving these equations is radically reduced when it can be assumed that the beam is traveling in one direction and that its frequency components are tightly centered around a carrier frequency $ω$. The first assumption allows the possibility of taking $z$ steps larger than the wavelength. The second allows time steps longer than its shortest period. To make use of these conditions we write $Π = Re \tilde{Π} exp (i β_{0} z - i ω t)$, $q = Re \tilde{q} exp (i β_{0} z - i ω t)$. Then the equations of motion become

(2.76)

\frac{μ ε}{c^{2}} (\partial_{t t} \tilde{Π} - 2 i ω \partial_{t} \tilde{Π} - ω^{2} \tilde{Π}) + β_{0}^{2} \tilde{Π} - 2 i β_{0} \partial_{z} \tilde{Π} - \partial_{z z} \tilde{Π} = W \tilde{Π} + 4 π \tilde{q} .

For a beam with tightly centered frequency components, the amplitude varies slowly compared to oscillation at the carrier frequency so $| \partial_{t t} \tilde{Π} | ≪ | ω \partial_{t} \tilde{Π} |$. For the paraxial approximation a similar assumption is made about the rate of variation of the envelope function compared to the propagation constant: $| \partial_{z z} \tilde{Π} | ≪ | β_{0} \partial_{z} \tilde{Π} |$. In free space this is equivalent to stating that all the light rays of interest are nearly parallel to the $z$ axis. Several attempts were made to correct the paraxial approximation by treating the $\partial_{z z} \tilde{Π}$ term as a small perturbation, but none of the methods tried proved to be stable. Another possibility is to follow optical cycles, but this is too computationally intensive. Some versions of the Beam-Propagation method correct the paraxial approximation by exponentiating the square root of the transverse operator. This method depends on being able to represent sources in terms of a varying $ε$. This is not so easily done for the Hertz vectors since the relation between the propagated field and the polarization is complicated by the conversion from $\tilde{Π}$ to $\tilde{E}$. Thus, even though there is some question as to whether the paraxial or the scalar approximation breaks down first, the paraxial approximation is used in this work. As a later chapter will show, this still leads to interesting results.

Let us split the dielectric permittivity into constant and space-varying parts; i.e., $ε = ε_{0} + ε_{1}$, where $ε_{0}$ is constant. $ε_{1}$ can be much smaller than $ε_{0}$ and still produce large effects. The equations become:

(2.77)

- \frac{2 i ω μ ε_{0}}{c^{2}} \partial_{t} \tilde{Π} - \frac{μ ε_{0} ω^{2}}{c^{2}} \tilde{Π} + β_{0}^{2} \tilde{Π} - 2 i β_{0} \partial_{z} \tilde{Π} = W \tilde{Π} + \frac{2 i μ ε_{1} ω}{c^{2}} \partial_{t} \tilde{Π} + 4 π \tilde{q} + \frac{μ ε_{1} ω^{2}}{c^{2}} \tilde{Π} + \partial_{z z} \tilde{Π} .

Two very large terms cancel if we set $β_{0} = ω \sqrt{μ ε_{0}} / c$, which makes the left-hand side of the above equation

(2.78)

= - \frac{2 i μ ε_{0} ω}{c^{2}} \partial_{t} \tilde{Π} - \frac{2 i ω \sqrt{μ ε_{0}}}{c} \partial_{z} \tilde{Π} .

These two terms can be consolidated if we switch to a frame of reference which moves at the speed of a carrier frequency beam through an $ε_{0}$ dielectric. To do so we go to a "retarded time" $t' = t - \frac{\sqrt{ε_{0} μ}}{c} z$ (with $z' = z$) so that

(2.79)

\partial_{z} \tilde{Π} = \partial_{z'} \tilde{Π} - \frac{\sqrt{ε_{0} μ}}{c} \partial_{t'} \tilde{Π}, \partial_{t} \tilde{Π} = \partial_{t'} \tilde{Π} .

The equations of motion are reduced to

(2.80)

\partial_{z'} \tilde{Π} = \frac{i c}{2 ω \sqrt{μ ε_{0}}} (W + \frac{μ ε_{1} ω^{2}}{c^{2}}) \tilde{Π} - \frac{ε_{1} \sqrt{μ}}{c \sqrt{ε_{0}}} \partial_{t'} \tilde{Π} + \frac{i c 4 π}{2 ω \sqrt{μ ε_{0}}} \tilde{q} + \frac{i c}{2 ω \sqrt{μ ε_{0}}} \partial_{z z} \tilde{Π} .

For the case of a constant dielectric medium the time derivative disappears entirely, with the possible exception of the computation of the nonlinear source terms. This allows each time slice to be propagated separately, a tremendous saving. Fortunately, this can still be done as long as the time variation is sufficiently slow. $ε_{1}$ has its importance in that it is multiplied by $ω / c$ in its first appearance. At optical and infrared frequencies this is a large number. The time derivative term carries no such large factor and is still multiplied by the usually small $ε_{1}$, so there are many cases of interest where this term can be safely dropped. However, this research is restricted to cases of no time variation of the envelope, as storing three dimensions of field is prohibitive in terms of computer memory. Also, the effects of vector coupling are more readily studied, at first, by simplifying other conditions.
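
To illustrate how a single time slice can be marched in $z$, here is a minimal sketch for the simplest special case: free space ($ε_{1} = 0$, no sources), one transverse dimension, and the paraxial term dropped, so that the envelope obeys $\partial_{z} u = \frac{i}{2 β_{0}} \partial_{x x} u$. The Crank-Nicolson scheme, the function name `paraxial_step`, and the zero boundary values are assumptions chosen for illustration; the propagation method actually used in this work is developed in the next chapter and is not reproduced here.

```python
import math


def paraxial_step(u, beta0, h, dz):
    """Advance u by dz under d_z u = (i / (2 beta0)) d_xx u (Crank-Nicolson).

    u is a list of complex samples on a uniform grid of spacing h,
    with u taken to be zero outside the grid.
    """
    n = len(u)
    r = 1j * dz / (4.0 * beta0 * h * h)
    # Right-hand side (1 + r*T) u, where T is the second-difference stencil.
    rhs = []
    for j in range(n):
        left = u[j - 1] if j > 0 else 0.0
        right = u[j + 1] if j < n - 1 else 0.0
        rhs.append(u[j] + r * (left - 2.0 * u[j] + right))
    # Solve (1 - r*T) u_new = rhs with the Thomas algorithm.
    diag = 1.0 + 2.0 * r
    off = -r
    c_prime = [0j] * n
    d_prime = [0j] * n
    c_prime[0] = off / diag
    d_prime[0] = rhs[0] / diag
    for j in range(1, n):
        denom = diag - off * c_prime[j - 1]
        c_prime[j] = off / denom
        d_prime[j] = (rhs[j] - off * d_prime[j - 1]) / denom
    out = [0j] * n
    out[-1] = d_prime[-1]
    for j in range(n - 2, -1, -1):
        out[j] = d_prime[j] - c_prime[j] * out[j + 1]
    return out


def gaussian_beam(n, h, waist):
    """Sample exp(-x^2 / waist^2) on n points centered on the grid."""
    mid = (n - 1) / 2.0
    return [complex(math.exp(-(((j - mid) * h) / waist) ** 2)) for j in range(n)]
```

Because the transverse operator is Hermitian, this step preserves the discrete sum of $|u|^{2}$, which makes a convenient sanity check when stepping a Gaussian beam and watching it spread.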

The Equations for Guided Modes

One very useful aspect of this Hertz vector formulation is the simplification of the equation for the guided modes in a dielectric waveguide with a graded index of refraction. All that needs to be done is to remove the nonlinear polarization from the previous analysis and set the envelope function to be constant in time and the direction of propagation. That is, let $Π = Re Π_{n} (x, y) exp (i β_{n} z - i ω t)$. The equation of motion becomes

(2.81)

(W + \frac{μ ε_{1} ω^{2}}{c^{2}}) Π_{n} = (β_{n}^{2} - \frac{μ ε_{0} ω^{2}}{c^{2}}) Π_{n} .

The propagation constant for each mode is directly related to the eigenvalues $λ_{n}$ of the operator on the left.

(2.82)

β_{n} = \pm {(λ_{n} + \frac{μ ε_{0} ω^{2}}{c^{2}})}^{\frac{1}{2}} .

As previously advertised, here we have an explicit eigenvalue equation for finding the bound modes given a particular frequency $ω$. All that remains is to make a matrix representation of the operator and send it to the proper EISPACK routine. However, the remaining task is not trivial. Taking two transverse dimensions into account results in a product space easily requiring a huge number of elements to get enough accuracy if too primitive a method for estimating the derivatives is used. The above formulation can be used as the basis for a finite element formulation, or the field could be expanded in orthogonal basis functions. Once the eigenvalues are found, there is still the task of sorting out the resulting data. For most cases of interest, the number of bound modes is finite, but the number of unbound modes is uncountably infinite. Fortunately, these problems have already been dealt with for the scalar case. This formulation of the vector case presents no new difficulties save that the order of the matrix must be twice as large. The electric field can be obtained by suitable modification of (2.68).
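
As a rough illustration of the "matrix representation plus eigenvalue routine" idea, here is a minimal sketch for a one-dimensional scalar analogue of (2.81) and (2.82): the operator $\partial_{x x} + k^{2} ε_{1} (x)$ with $k^{2} = μ ω^{2} / c^{2}$, discretized by second differences with zero boundary values. The function names, the Sturm-sequence bisection (used here in place of a library routine), and the reduction to one dimension are all assumptions made to keep the example self-contained; it is not the two-dimensional vector operator, and it does not call EISPACK.

```python
import math


def count_below(diag, off, x):
    """Sturm count: eigenvalues below x of a symmetric tridiagonal matrix
    with diagonal entries diag[j] and a constant off-diagonal entry off."""
    count = 0
    q = 1.0
    for j, d in enumerate(diag):
        q = d - x - (off * off / q if j > 0 else 0.0)
        if q == 0.0:
            q = 1e-30  # nudge off an exact zero pivot
        if q < 0.0:
            count += 1
    return count


def largest_eigenvalue(diag, off, tol=1e-12):
    """Bisect between Gershgorin bounds for the largest eigenvalue."""
    lo = min(diag) - 2.0 * abs(off)
    hi = max(diag) + 2.0 * abs(off)
    n = len(diag)
    while hi - lo > tol * max(1.0, abs(lo), abs(hi)):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) == n:
            hi = mid  # every eigenvalue lies below mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def propagation_constant(eps1, h, k2, eps0):
    """Return (beta, lam) for the fundamental mode, following (2.82).

    eps1 holds samples of the varying part of the permittivity, h is the
    grid spacing, and k2 stands for mu * omega^2 / c^2. Assumes
    lam + k2 * eps0 > 0, i.e. a propagating mode.
    """
    diag = [-2.0 / (h * h) + k2 * e for e in eps1]
    off = 1.0 / (h * h)
    lam = largest_eigenvalue(diag, off)
    return math.sqrt(lam + k2 * eps0), lam
```

The largest eigenvalue corresponds to the fundamental mode; higher modes could be found the same way by bisecting on a different Sturm count, though for the full two-dimensional problem a library eigenvalue routine is the more practical choice.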

(2.83)

\left[ \begin{array}{c} E_{n x} \\ E_{n y} \\ E_{n z} \end{array} \right] = \frac{1}{μ ε} \left[ \begin{array}{cc} (\partial_{x} - \frac{(\partial_{x} ε)}{ε}) \partial_{x} & (\partial_{x} - \frac{(\partial_{x} ε)}{ε}) \partial_{y} \\ (\partial_{y} - \frac{(\partial_{y} ε)}{ε}) \partial_{x} & (\partial_{y} - \frac{(\partial_{y} ε)}{ε}) \partial_{y} \\ i β_{n} \partial_{x} & i β_{n} \partial_{y} \end{array} \right] \left[ \begin{array}{c} Π_{n x} \\ Π_{n y} \end{array} \right] + \frac{ω^{2}}{c^{2}} \left[ \begin{array}{c} Π_{n x} \\ Π_{n y} \\ 0 \end{array} \right] .

A bit of general insight is available from just looking at the above equation. We see that in order to have a transverse electric mode it is necessary that $\partial_{x} Π_{n x} = - \partial_{y} Π_{n y}$. This is equivalent to saying that $Π_{n x} = \partial_{y} f, Π_{n y} = - \partial_{x} f$ for an arbitrary scalar $f$. This condition for $Π_{n}$ causes all the matrix terms to cancel, making $E_{n} = (ω^{2} / c^{2}) Π_{n}$. This also causes $E_{n}$ to obey the scalar approximation since

(2.84)

\nabla \cdot E_{n} = \frac{ω^{2}}{c^{2}} \nabla \cdot Π_{n} = \frac{ω^{2}}{c^{2}} (\partial_{x} \partial_{y} f - \partial_{y} \partial_{x} f) \equiv 0 .
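
The cancellation in (2.84) can also be seen numerically. The following minimal sketch builds $Π_{n x} = \partial_{y} f$ and $Π_{n y} = - \partial_{x} f$ from a sampled scalar $f$ and evaluates the transverse divergence. The function names and the use of central differences are assumptions for illustration; central differences in $x$ and $y$ commute, so the discrete divergence vanishes to roundoff for any $f$.

```python
import math


def central_dx(f, h):
    """Central difference in x; rows on the x boundary are left at zero."""
    nx, ny = len(f), len(f[0])
    g = [[0.0] * ny for _ in range(nx)]
    for i in range(1, nx - 1):
        for j in range(ny):
            g[i][j] = (f[i + 1][j] - f[i - 1][j]) / (2.0 * h)
    return g


def central_dy(f, h):
    """Central difference in y; columns on the y boundary are left at zero."""
    nx, ny = len(f), len(f[0])
    g = [[0.0] * ny for _ in range(nx)]
    for i in range(nx):
        for j in range(1, ny - 1):
            g[i][j] = (f[i][j + 1] - f[i][j - 1]) / (2.0 * h)
    return g


def te_hertz_from_scalar(f, h):
    """Return (pi_x, pi_y) = (d_y f, -d_x f) for a sampled scalar f."""
    pi_x = central_dy(f, h)
    pi_y = [[-v for v in row] for row in central_dx(f, h)]
    return pi_x, pi_y


def divergence(pi_x, pi_y, h):
    """Transverse divergence d_x pi_x + d_y pi_y (meaningful away from edges)."""
    ddx = central_dx(pi_x, h)
    ddy = central_dy(pi_y, h)
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(ddx, ddy)]
```

A Gaussian is a natural choice of `f` to try, since it is the "ground state" that appears in the parabolic-dielectric discussion.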

By expanding W in equation (2.81), we can see the prospects for getting transverse electric modes:

(2.85)

\left[ \begin{array}{cc} \partial_{x x} + \partial_{y y} - \frac{(\partial_{x} ε)}{ε} \partial_{x} + \frac{μ ε_{1} ω^{2}}{c^{2}} & - \frac{(\partial_{x} ε)}{ε} \partial_{y} \\ - \frac{(\partial_{y} ε)}{ε} \partial_{x} & \partial_{x x} + \partial_{y y} - \frac{(\partial_{y} ε)}{ε} \partial_{y} + \frac{μ ε_{1} ω^{2}}{c^{2}} \end{array} \right] Π_{n} = λ_{n} Π_{n} .

Most pleasantly, the condition for a transverse mode causes the terms with derivatives of epsilon to cancel. Thus, we effectively have two separate equations for $Π_{n x}$ and $Π_{n y}$ . Unfortunately, it is the same equation. Since $Π_{n x}$ cannot equal $Π_{n y}$ (and still obey $\partial_{x} Π_{n x} = \partial_{y} Π_{n y}$ ) without being constant, this leads to difficulty. Otherwise, it would be a very nice situation. The equation to be solved is merely a time independent Schr $\overset{..}{o}$ dinger equation in which variation in the dielectric performs the same role that the potential plays in Quantum Mechanics. This is the case for the scalar approximation. We could merely make trivial changes in existing techniques and perform the conversion from Hertz potential to electric field. But the requirement that two different functions must obey this equation and have the same eigenvalue to boot restricts us to degenerate solutions. And not all degenerate solutions are permissible. The two functions must be the x and -y partials of a third function. This condition rules out most separable solutions. For this case, the eigenfunctions are products of one-dimensional eigenfunctions. For the last condition to be satisfied, the $\partial_{x}$ and $\partial_{y}$ operators must convert one eigenfunction into another much like the raising or lowering operators for the harmonic oscillator. For example, for a parabolic dielectric, this limits us to one possible transverse electric mode, and the parabola must be symmetric for that. The $\partial_{x}$ and $\partial_{y}$ operators behave like raising operators when operating on the ground state (only). So setting $f$ to be the "ground state" Gaussian allows $\partial_{x} f$ and $\partial_{y} f$ to be eigenfunctions of our "Schr $\overset{..}{o}$ dinger" equation. 
Of course with a sufficiently large waveguide, very nearly transverse modes are quite feasible without degenerate solutions to the "Schrödinger" equation. As long as the transverse derivatives of the Hertz field are negligible compared to $ω / c$ times the Hertz field, the electric field is dominated by the terms involving $ω^{2} / c^{2}$ all of which affect the transverse components only. A similar simple condition can be found for transverse magnetic waves. Since $Π_{m}$ and $R_{m}$ are zero, the expression for the magnetic field is particularly simple.

(2.86)

B = \frac{1}{c} \partial_{t} \nabla \times Π_{n} .

This can also be put into matrix form:

(2.87)

\left[ \begin{array}{c} B_{x} \\ B_{y} \\ B_{z} \end{array} \right] = - i \frac{ω}{c} \left[ \begin{array}{c|c} 0 & - i β \\ \hline i β_{n} & 0 \\ \hline - \partial_{y} & \partial_{x} \end{array} \right] \left[ \begin{array}{c} Π_{n x} \\ Π_{n y} \end{array} \right] .

By trivial inspection, the condition for a transverse magnetic mode is $\partial_{y} Π_{n x} = \partial_{x} Π_{n y}$ . This is equivalent to $Π_{n x} = \partial_{x} f, Π_{n y} = \partial_{y} f$ , where $f$ is some scalar. Unfortunately, the two equations that $f$ must satisfy are not as simple as they were for the transverse electric case. No analytic solutions were found. This look at the bound modes produced only one analytic solution and it is unphysical: a parabolic dielectric which goes to negative infinity. Also, the cancellation of the dielectric-derivative terms hides the singularities that occur when $ε$ goes to zero. Fortunately, the field is only significant in the physical regions, at least for most parameters of interest. So this one analytic solution can be used as a test for the propagation method that is developed in the next chapter.
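
The transverse magnetic condition quoted above can be read off the third row of (2.87). As a short check, assuming only that the mixed partial derivatives of $f$ commute:

B_{z} = - i \frac{ω}{c} (\partial_{x} Π_{n y} - \partial_{y} Π_{n x}) = - i \frac{ω}{c} (\partial_{x} \partial_{y} f - \partial_{y} \partial_{x} f) = 0 .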

An Electric Field Method

To compare the results of the Hertz vector method with known methods, a second program was written to propagate the physical electric field using the same numerical techniques as were used for the Hertz vector program. This program gives the results of completely making the scalar approximation for comparison purposes. By adding an extra source term, it is possible to come close to a full vector treatment using the physical field when the background dielectric is nearly constant. This feature is useful for providing something of a check of the results of the Hertz-vector program. We start with the usual approach for getting the vector wave equation; that is, we take the curl of (2.1) and substitute in (2.3), assuming $μ$ is constant.

(2.88)

\nabla \times \nabla \times E + \frac{1}{c} \partial_{t} (\frac{μ ε}{c} \partial_{t} E + \frac{4 π μ}{c} \partial_{t} P) = 0 .

Here, the source terms have been described in terms of a polarization density so $D = ε E + 4 π P$ . With the use of a vector identity we get

(2.89)

- \nabla^{2} E + \frac{μ ε}{c^{2}} \partial_{t t} E = - \frac{4 π μ}{c^{2}} \partial_{t t} P - \nabla (\nabla \cdot E) .
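
The vector identity used in going from (2.88) to (2.89) is presumably the standard curl-of-curl identity. Written out for reference, it is

\nabla \times \nabla \times E = \nabla (\nabla \cdot E) - \nabla^{2} E ,

so the $\nabla (\nabla \cdot E)$ term moves to the right-hand side with a minus sign, alongside the polarization term.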

Factoring out the usual carrier frequencies and splitting $ε$ , we get

(2.90)

β_{0}^{2} \tilde{E} - 2 i β_{0} \partial_{z} \tilde{E} - \partial_{z z} \tilde{E} - \frac{μ ε_{0} ω^{2}}{c^{2}} \tilde{E} = \frac{μ ε_{1} ω^{2}}{c^{2}} \tilde{E} + \frac{4 π μ ω^{2}}{c^{2}} \tilde{P} - \nabla' (\nabla' \cdot \tilde{E}) .

The $\nabla'$ is to remind us to replace $\partial_{z}$ with $i β_{0} + \partial_{z}$ . With the usual definition for $β_{0}$ and making the paraxial approximation we have

(2.91)

- 2 i β_{0} \partial_{z} \tilde{E} = \frac{μ ε_{1} ω^{2}}{c^{2}} \tilde{E} + \frac{4 π μ ω^{2}}{c^{2}} \tilde{P} - \nabla' (\nabla' \cdot \tilde{E}) .
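
The step from (2.90) to (2.91) can be spelled out as follows. This is a sketch that assumes the "usual definition" is the one matching the carrier to the background dielectric,

β_{0}^{2} = \frac{μ ε_{0} ω^{2}}{c^{2}} ,

so that the $β_{0}^{2} \tilde{E}$ term and the $μ ε_{0} ω^{2} / c^{2}$ term on the left of (2.90) cancel. The paraxial approximation then amounts to assuming a slowly varying envelope,

| \partial_{z z} \tilde{E} | \ll 2 β_{0} | \partial_{z} \tilde{E} | ,

which allows the $\partial_{z z} \tilde{E}$ term to be dropped.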

If the variation of $ε$ is small, we can approximate $\nabla' (\nabla' \cdot \tilde{E})$ in terms of the polarization.

(2.92)

\nabla \cdot D = 0 = \nabla \cdot (ε E + 4 π P) , \qquad \nabla \cdot E \approx - \frac{4 π}{ε} \nabla \cdot P .
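
The approximation in (2.92) can be made explicit with the product rule. As a sketch, using $D = ε E + 4 π P$ and $\nabla \cdot D = 0$ :

ε \nabla \cdot E + E \cdot \nabla ε = - 4 π \nabla \cdot P .

Dropping the $E \cdot \nabla ε$ term, which is what "small variation of $ε$ " is taken to mean here, and dividing by $ε$ gives the approximate expression for $\nabla \cdot E$ .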

For significant variation in the dielectric permittivity, $ε_{1} E$ needs to be incorporated in $P$ for the non-scalar physical field method to work. The ability to incorporate variation in the linear dielectric is a major advantage of the Hertz-vector approach. Finally, we have

(2.93)

- 2 i β_{0} \partial_{z} \tilde{E} = \frac{μ ε_{1} ω^{2}}{c^{2}} \tilde{E} + \frac{4 π μ ω^{2}}{c^{2}} \tilde{P} + \frac{4 π}{ε} \nabla' (\nabla' \cdot \tilde{P}) .

To get the scalar approximation, it is merely necessary to turn off the last term. Note that there are problems with using the above as a full vector method. There are second derivatives with respect to $z$ in the source term, and these are taken somewhat crudely. The Hertz-vector method requires only first derivatives with respect to $z$ in the calculation of the source term. With the electric field method it is also easy to start from an initial field that is far from equilibrium. For example, it is quite tempting to just start up with, say, a Gaussian beam with just an $x$ component. For large transverse variation of ${\tilde{E}}_{x}$ , ${\tilde{E}}_{z}$ must then change rapidly in the $z$ direction to make $\nabla \cdot E = 0$ . That is,

(2.94)

\partial_{x} {\tilde{E}}_{x} + \partial_{z} {\tilde{E}}_{z} + i β_{0} \cdot 0 = 0 .

A better initial selection would have the $i β_{0} {\tilde{E}}_{z}$ term cancel out the $\partial_{x} {\tilde{E}}_{x}$ term. The Hertz vector method does this automatically. This is especially important when the $z$ steps are not small compared to a wavelength.
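
The "better initial selection" can be illustrated with a small numerical sketch. It assumes a one-dimensional Gaussian envelope ${\tilde{E}}_{x} = \exp(- x^{2} / w^{2})$ and takes $\partial_{z} {\tilde{E}}_{z} = 0$ at the launch plane, so the balancing choice is ${\tilde{E}}_{z} = (i / β_{0}) \partial_{x} {\tilde{E}}_{x}$ . The function names and parameter values are hypothetical, and this is not the initialization used in either program.

```python
import math

def ex_envelope(x, w):
    """Assumed Gaussian envelope for the x component of the field."""
    return math.exp(-(x / w) ** 2)

def dx_ex_envelope(x, w):
    """Analytic x derivative of the Gaussian envelope."""
    return -2.0 * x / (w * w) * ex_envelope(x, w)

def balanced_ez(x, w, beta0):
    """E_z chosen so that i*beta0*E_z cancels d_x E_x: E_z = (i/beta0) d_x E_x."""
    return 1j * dx_ex_envelope(x, w) / beta0

def divergence_residual(x, w, beta0, ez):
    """d_x E_x + i*beta0*E_z, with d_z E_z taken as zero at the launch plane."""
    return dx_ex_envelope(x, w) + 1j * beta0 * ez

if __name__ == "__main__":
    x, w, beta0 = 0.5, 2.0, 10.0  # illustrative values only
    print("E_x only:", abs(divergence_residual(x, w, beta0, 0.0)))
    print("balanced:", abs(divergence_residual(x, w, beta0, balanced_ez(x, w, beta0))))
```

For this envelope the balancing component has relative size $| {\tilde{E}}_{z} | / | {\tilde{E}}_{x} | = 2 |x| / (w^{2} β_{0})$ , so it is small for beams that are wide compared to a wavelength but is not zero.
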

References

[1] A. Nisbet, Proceedings of the Royal Society A 231, 250-263 (1955).

[2] A. Nisbet, Proceedings of the Royal Society A 240, 375-381 (1957).

[3] A. Mohsen, Applied Physics 2, 123-128 (1973).

[4] A. Mohsen, Applied Physics 10, 53-55 (1976).


Carl Milsted, Jr