2023 Notes: Another exercise in converting troff into QTML. I used vim to replace dollar signs with .eqni tags for the inline equations. When an inline equation took two lines of source, this method swapped the opening and closing tags. I manually cleaned things up, but I probably missed a couple, and likely messed up some paragraph breaks. Chapter 1, by the way, is here.
The six components of the electric and magnetic fields are redundant as regards the underlying mathematics of the electromagnetic field. Nisbet [1] has shown that with the use of the Hertz vectors and their associated gauge transformations it is possible to describe the electromagnetic field in terms of only two of the six possible Hertz vector components, even in the source regions. In a subsequent paper [2] he develops a similar approach for fields in inhomogeneous, anisotropic media. A literature search finds that little note has been taken of these very fundamental techniques, which have the potential to simplify many problems in electrodynamics. In the hope of popularizing this knowledge, the Hertz vector formulation is here re-derived for the isotropic case with some steps and commentary added, before we state the specific gauge condition to be used for this research. We start with Maxwell's equations in Gaussian cgs units:
and with the constitutive relations
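For reference, a sketch of the equations meant here, assuming the standard macroscopic form in Gaussian units and the numbering implied by the later references to (2.1) through (2.4). The ordering of (2.1) and (2.2) is inferred from the later remark that the curl of (2.1) is taken to obtain the wave equation.

```latex
\begin{align}
\nabla \times \mathbf{E} + \frac{1}{c}\frac{\partial \mathbf{B}}{\partial t} &= 0 \tag{2.1}\\
\nabla \cdot \mathbf{B} &= 0 \tag{2.2}\\
\nabla \times \mathbf{H} - \frac{1}{c}\frac{\partial \mathbf{D}}{\partial t} &= \frac{4\pi}{c}\,\mathbf{J} \tag{2.3}\\
\nabla \cdot \mathbf{D} &= 4\pi\rho \tag{2.4}
\end{align}
% Linear constitutive relations (isotropic medium assumed)
\begin{equation*}
\mathbf{D} = \epsilon\,\mathbf{E}, \qquad \mathbf{B} = \mu\,\mathbf{H}
\end{equation*}
```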
Note that in this formulation these constitutive relations account only for the linear polarization. The nonlinear polarization is contained in $\rho$ and $\mathbf{J}$. We have four sets of coupled three-component vector partial differential equations to solve for the six scalar components of the electric and magnetic fields. Since the fields are not independent, we can reduce the number of components to be solved for by describing the fields in terms of potentials. For example, the vector and scalar potentials can be defined inside an inhomogeneous medium in the same fashion as in free space:
and
These identically satisfy (2.1) and (2.2). We now have only four components to find and two vector equations for them to satisfy. Unfortunately, the two remaining Maxwell equations lead to the coupled equations:
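As a sketch of what these equations look like, assume the usual free-space definitions of the potentials and a medium whose $\epsilon$ and $\mu$ depend on position but not on time. The symbols $\mathbf{A}$ and $\phi$ are the conventional ones and are an assumption here, as is the exact arrangement of terms. Substituting into (2.3) and (2.4) with $\mathbf{D} = \epsilon\mathbf{E}$ and $\mathbf{B} = \mu\mathbf{H}$ gives one vector and one scalar equation in which $\mathbf{A}$ and $\phi$ both appear:

```latex
% Assumed definitions of the potentials
\begin{align*}
\mathbf{B} &= \nabla \times \mathbf{A}, &
\mathbf{E} &= -\nabla\phi - \frac{1}{c}\frac{\partial \mathbf{A}}{\partial t}
\end{align*}
% (2.1) and (2.2) hold identically because
% \nabla \times \nabla\phi = 0 and \nabla \cdot (\nabla \times \mathbf{A}) = 0.
% Remaining equations, from (2.3) and (2.4):
\begin{align*}
\nabla \times \left( \frac{1}{\mu}\, \nabla \times \mathbf{A} \right)
 + \frac{\epsilon}{c}\frac{\partial}{\partial t}
   \left( \nabla\phi + \frac{1}{c}\frac{\partial \mathbf{A}}{\partial t} \right)
 &= \frac{4\pi}{c}\,\mathbf{J} \\
-\nabla \cdot \left[ \epsilon \left( \nabla\phi
 + \frac{1}{c}\frac{\partial \mathbf{A}}{\partial t} \right) \right]
 &= 4\pi\rho
\end{align*}
```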
The main advantage of potentials is that they are not unique. If we set
$\mathbf{E}$ and $\mathbf{B}$ are unaffected. This allows various gauge transformations to be used to simplify the equations for a particular problem. For example, the gauge transformations allow us to set
This separates the scalar and vector equations similarly to the Lorentz equation in free space. The equations of motion are now
If the permittivity or permeability varies in space, however, the equations for the components of are coupled. Also, the mixed second partial derivatives with respect to and the transverse coordinates rule out an explicit eigenvalue-eigenfunction formulation for finding the bound modes. One important idea behind using the Hertz vector formulation is that it allows a much larger selection of gauge transformations to choose from than and . This is accomplished by expressing the source terms as well as the fields in terms of potentials. To start with, we define stream potentials and such that
and
These potentials fit in Maxwell's equations in the same positions as the polarization and magnetization densities. If there are no free charges, they could well be those densities. Even without free charges, and could differ from and because we allow the following gauge transformations which leave the physical sources and unaffected:
Next, we define stream potentials and for the magnetic monopole charge and current densities. Of course there are no such (known) things as magnetic monopoles so the right-hand sides of our definitions are set to zero:
and
The (un)physical sources are invariant when the following gauge transformations are made.
When these two potentials are used, and must become the potentials for and . That is,
The Hertz vectors are potentials which use these stream potentials instead of their derivatives as their source terms. This is done by defining and through the equations:
This choice automatically satisfies our variant of the Lorentz condition. To find the physical fields in terms of the Hertz vectors, we plug into the definitions of the vector and scalar potentials:
This can be simplified by letting in the gauge transformation for . This gives
And for :
To obtain equations of motion for the Hertz vectors, we first plug into (2.4):
We can remove the divergence at the price of introducing a curl:
where is an as yet arbitrary vector. From (2.3) we get
Substituting our relation for we have:
To get rid of , we let and cancel terms to get
or
and the equation for becomes
We could have times the gradient of an arbitrary scalar tacked onto the end of this equation, but this is taken into account by the available gauge transformation of . Finally, we can use these equations of motion to get symmetric equations for and :
The next step is to find the available gauge transformations for the Hertz vectors. The first two gauge transformations follow easily from the definition of the Hertz vectors in terms of the scalar and vector potentials (2.23,2.24). The scalar potential is obviously unaffected when undergoes the transformation . In order to keep constant we must make the transformation . Setting and to zero and substituting into (2.36) gives
Since this must hold true even without the introduction of stream potentials for magnetic monopoles, and and transform the monopole potentials, we set them to zero. Thus must satisfy:
The last term is the as yet arbitrary gradient that is allowed when a curl is set to zero. The remaining obvious gauge transformation is that of adding the gradient of a scalar to which leaves the vector potential
unaffected ( ). Applying both transformations, setting the monopole stream potentials to zero and plugging into the equation of motion for gives
Substitute back in the equation for :
So
The gauge functions for the Hertz vectors resulting from the gauge transformations for the magnetic monopole potentials are less obvious by themselves since both the Hertz vector gauge functions and the magnetic monopole gauge functions appear in the redefinition of and (2.23). However, from the abundance of symmetries encountered so far (in and , and , and and , and and ), it is possible to guess the Hertz vector gauge transformations resulting from the transformations for and from those resulting from the transformations of and . So here is the complete set of available gauge transformations; the remaining verifications, if desired, are left to the reader:
if the gauge functions are related by the following relations:
where and are any arbitrary functions. Nisbet claims that this set of gauge transformations allows us to set all but any two components to zero. In actual practice this is accomplished by setting all but two of the Hertz vector components to zero at a particular time. Then the stream potentials are transformed in order to make the time derivatives of the zeroed-out components equal to zero in the equations of motion for the Hertz vectors. Nisbet and later Mohsen [2-4] then describe various special cases in which this formalism produces simplified equations. The emphasis is on situations in which one can obtain two uncoupled scalar partial differential equations. Such situations include materials which vary in only one direction, spherically symmetric media, and materials which vary in two dimensions while the field does not vary in the third dimension. Uncoupled scalar equations can be obtained by setting all but one component each of and
to be zero. For media which vary in the direction this is the
component. For spherically symmetric media this is the direction. For media which vary in and , while the field does not vary in , this is the component. The original research begins with the choice of gauges. For the case of guided optical devices it is desirable to have equations of motion which have no cross derivatives with respect to the propagation direction and either of the transverse directions. Such equations allow the solution of bound modes to be an explicit eigenvalue equation in which the propagation constant is proportional to the eigenvalues of a transverse operator. Also, such a formulation simplifies the task of formulating a vector beam propagation method since the task (for no nonlinearity) is reduced to approximating the exponential of a transverse operator. This is, of course, the case for the scalar beam propagation method. For the purposes of this work we represent all non-linear phenomena through the polarization density . Thus, we can set . The equations of motion for the Hertz vectors become
For a waveguide device pointing in the direction, we can get the desired form by setting and to be the non-zero components. We can keep zero if we set
Here and are arbitrary. As it turns out, the trivial choice of setting and to be zero is quite sufficient. This leaves as a free parameter to make to be zero. Expanding what is left into components gives:
Setting zero in the last equation leaves
For a waveguide in the direction, and are not dependent on . So we can commute them with the operator to get
where
Substituting this value of into (49) and (50) provides our equations of motion for the remaining components.
We can write this as
where
and
Note that when and are constant in space, W becomes . In the case where is constant we can use the fact that
to get
Throughout the rest of this work, it will be assumed that is constant. Under this set of conditions, it is now useful to expand the equation for getting the electric field from the Hertz vectors.
We can substitute in the relations for and write the result in matrix form.
It turns out that by ignoring the terms involving the derivatives of $\epsilon$, we almost regain the scalar approximation. To show this, we first expand each component of the electric field given this approximation:
Next we take the divergence of this approximation to the electric field making the further approximation of once again commuting the transverse derivatives with :
To take care of the derivatives of the Hertz field we make use of the equation of motion once again making the approximation that the transverse derivatives commute with :
Substituting these back in makes almost everything cancel:
So we have made the scalar approximation after commuting derivatives as regards , but not regarding the polarization. The propagation code is fitted with a flag ISCALAR to make this a runtime option. If the divergence of the nonlinear polarization is sufficiently small this should give results close to the scalar approximation. For comparison with a true scalar approximation, a separate routine was written propagating the electric field directly with the scalar and paraxial approximations using the same numerical methods as were used for the Hertz vector program.
The numerical effort of solving these equations is radically reduced when it can be assumed that the beam is traveling in one direction and that its frequency components are tightly centered around a carrier frequency
. The first assumption allows the possibility of taking
steps larger than the wavelength. The second allows time steps longer than its shortest period. To make use of the conditions we write
, . Then the equations of motion become
For a beam with tightly centered frequency components, the amplitude varies slowly compared to oscillation at the carrier frequency so . For the paraxial approximation a similar assumption is made about the rate of variation of the envelope function compared to the propagation constant;
. In free space this is equivalent to stating that all the light rays of interest are nearly parallel to the axis. Several attempts were made to correct the paraxial approximation by treating the term as a small perturbation, but none of the methods tried proved to be stable. Another possibility is to follow optical cycles, but that is too computationally intensive. Some versions of the beam propagation method correct the paraxial approximation by exponentiating the square root of the transverse operator. This method depends on being able to represent sources in terms of a varying . This is not so easily done for the Hertz vectors since the relation between the propagated field and the polarization is complicated by the conversion from
to . Thus, even though there is question as to whether the paraxial or the scalar approximation breaks down first, the paraxial approximation is used in this work. As a later chapter will show, this still leads to interesting results. Let us split the dielectric permittivity into constant and space-varying parts; i.e., , where is constant. can be much smaller than and still produce large results. The equations become:
Two very large terms cancel if we set
which makes the above equation
These two terms can be consolidated if we switch to a frame of reference which moves at the speed of a carrier frequency beam through an
dielectric. To do so we go to a "retarded time" so that
The equations of motion are reduced to
For the case of a constant dielectric medium the time derivative disappears entirely, with the possible exception of the computation of the nonlinear source terms. This allows each time slice to be propagated separately, a tremendous saving. Fortunately, this can still be done as long as the time variation is sufficiently slow. has its importance in that it is multiplied by in its first appearance. At optical and infrared frequencies this is a large number. The time derivative term is divided by this large number as well as multiplied by the usually small so there are many cases of interest where this term can be safely dropped. However, this research is restricted to cases of no time variation of the envelope as storing three dimensions of field is prohibitive in terms of computer memory. Also, the effects of vector coupling are more readily studied, at first, by simplifying other conditions.
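The cancellation of the large terms and the change to a retarded frame can be checked on a scalar model. This is a sketch under stated assumptions, not the chapter's own equations: a single field component $u$ obeying a wave operator with constant index $n_0 = \sqrt{\epsilon_0 \mu}$, the carrier convention $e^{i(kz - \omega t)}$, and the symbols $\psi$, $\zeta$ and $\tau$ are all chosen here for illustration.

```latex
\begin{align*}
u &= \psi(x,y,z,t)\, e^{i(kz - \omega t)} \\[4pt]
\frac{\partial^2 u}{\partial z^2} - \frac{n_0^2}{c^2}\frac{\partial^2 u}{\partial t^2}
 &= \Big[ \psi_{zz} + 2ik\,\psi_z - k^2\psi
   - \frac{n_0^2}{c^2}\,\psi_{tt} + \frac{2i\omega n_0^2}{c^2}\,\psi_t
   + \frac{n_0^2\omega^2}{c^2}\,\psi \Big] e^{i(kz - \omega t)} \\[4pt]
% Choosing k = n_0 \omega / c cancels the two large terms -k^2\psi and n_0^2\omega^2\psi/c^2.
% Dropping \psi_{zz} (paraxial) and \psi_{tt} (slowly varying envelope) leaves
 &\approx 2ik \Big( \psi_z + \frac{n_0}{c}\,\psi_t \Big) e^{i(kz - \omega t)} \\[4pt]
% Retarded frame: \zeta = z, \ \tau = t - n_0 z / c, so that
\frac{\partial}{\partial z}\Big|_t &= \frac{\partial}{\partial \zeta} - \frac{n_0}{c}\frac{\partial}{\partial \tau},
 \qquad \frac{\partial}{\partial t}\Big|_z = \frac{\partial}{\partial \tau} \\[4pt]
\psi_z + \frac{n_0}{c}\,\psi_t &= \frac{\partial \psi}{\partial \zeta}
\end{align*}
```

In this model the first-order $z$ and $t$ derivatives merge into a single derivative along the propagation direction, which is why each time slice can be advanced independently when the medium is uniform.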
One very useful aspect of this Hertz vector formulation is the simplification of the equation for the guided modes in a dielectric waveguide with a graded index of refraction. All that needs to be done is to remove the nonlinear polarization from the previous analysis and set the envelope function to be constant in time and the direction of propagation. That is, let
The equation of motion becomes
The propagation constant for each mode is directly related to the eigenvalues of the operator on the left.
As previously advertised, here we have an explicit eigenvalue equation for finding the bound modes given a particular frequency . All that remains is to make a matrix representation of the operator and send it to the proper EISPACK routine. However, the remaining task is not trivial. Taking two transverse dimensions into account results in a product space that can easily require a huge number of elements to get enough accuracy if too primitive a method for estimating the derivatives is used. The above formulation can be used for a finite element formulation. Or the field could be expanded in orthogonal basis functions. Once the eigenvalues are found, there is still the task of sorting out the resulting data. For most cases of interest, the number of bound modes is finite, but the number of unbound modes is uncountably infinite. Fortunately, these problems have already been dealt with for the scalar case. This formulation of the vector case presents no new difficulties save that the order of the matrix must be twice as large. The electric field can be obtained by suitable modification of (2.68).
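To make the "matrix representation plus eigenvalue routine" step concrete, here is a minimal sketch on a toy problem. It is not the vector operator of this chapter: it assumes a one-dimensional scalar operator $-d^2/dx^2 + x^2$ in scaled units (a parabolic profile), discretized with a three-point finite difference, and it swaps in a hand-written Sturm-sequence bisection for the library eigenvalue routine. All function names and grid parameters are hypothetical.

```python
def count_below(diag, off, lam):
    """Count eigenvalues of a symmetric tridiagonal matrix that lie below lam."""
    count = 0
    q = 1.0
    for i, d in enumerate(diag):
        coupling = off[i - 1] ** 2 if i > 0 else 0.0
        q = d - lam - coupling / q      # pivot of the LDL^T factorization
        if q == 0.0:
            q = -1e-30                  # nudge off an exact zero pivot
        if q < 0.0:
            count += 1                  # each negative pivot is one eigenvalue below lam
    return count


def kth_eigenvalue(diag, off, k, tol=1e-10):
    """Return the k-th smallest eigenvalue (k = 0 is the lowest) by bisection."""
    radius = 2.0 * max(abs(e) for e in off)   # Gershgorin bound on the spectrum
    lo = min(diag) - radius
    hi = max(diag) + radius
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def parabolic_guide_matrix(half_width=8.0, n_points=400):
    """Finite-difference matrix for -d^2/dx^2 + x^2 with the field zero at the walls."""
    h = 2.0 * half_width / (n_points + 1)
    xs = [-half_width + (i + 1) * h for i in range(n_points)]
    diag = [2.0 / h**2 + x * x for x in xs]
    off = [-1.0 / h**2] * (n_points - 1)
    return diag, off


diag, off = parabolic_guide_matrix()
lowest = [kth_eigenvalue(diag, off, k) for k in range(3)]
print(lowest)  # should be close to the exact values 1, 3, 5
```

The exact eigenvalues of this toy operator are the odd integers, so the printed values give a direct measure of the discretization error. In two transverse dimensions the matrix is no longer tridiagonal and its order grows as the product of the grid sizes, which is the cost the paragraph above warns about.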
A bit of general insight is available from just looking at the above equation. We see that in order to have a transverse electric mode it is necessary that This is equivalent to saying that for an arbitrary scalar . This condition for causes all the matrix terms to cancel, making . This also causes to obey the scalar approximation since
By expanding W in equation (2.81), we can see the prospects for getting transverse electric modes:
Most pleasantly, the condition for a transverse mode causes the terms with derivatives of $\epsilon$ to cancel. Thus, we effectively have two separate equations for and . Unfortunately, it is the same equation. Since cannot equal (and still obey ) without being constant, this leads to difficulty. Otherwise, it would be a very nice situation. The equation to be solved is merely a time-independent Schrödinger equation in which variation in the dielectric performs the same role that the potential plays in quantum mechanics. This is the case for the scalar approximation. We could merely make trivial changes in existing techniques and perform the conversion from Hertz potential to electric field. But the requirement that two different functions must obey this equation and have the same eigenvalue to boot restricts us to degenerate solutions. And not all degenerate solutions are permissible. The two functions must be the x and -y partials of a third function. This condition rules out most separable solutions. For this case, the eigenfunctions are products of one-dimensional eigenfunctions. For the last condition to be satisfied, the and operators must convert one eigenfunction into another much like the raising or lowering operators for the harmonic oscillator. For example, for a parabolic dielectric, this limits us to one possible transverse electric mode, and the parabola must be symmetric for that. The and operators behave like raising operators when operating on the ground state (only). So setting to be the "ground state" Gaussian allows and to be eigenfunctions of our "Schrödinger" equation. Of course with a sufficiently large waveguide, very nearly transverse modes are quite feasible without degenerate solutions to the "Schrödinger" equation. As long as the transverse derivatives of the Hertz field are negligible compared to times the Hertz field, the electric field is dominated by the terms involving all of which affect the transverse components only.
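The harmonic-oscillator argument can be verified directly in scaled units. The following is a sketch with assumed notation: $G$ stands for the "ground state" Gaussian, the operator is written as $-\nabla_T^2 + a x^2 + b y^2$ with positive constants $a$ and $b$, and the overall sign and scaling relative to the chapter's operator are not reproduced.

```latex
\begin{align*}
% Symmetric parabola, a = b = 1
G &= e^{-(x^2 + y^2)/2}, &
\left( -\nabla_T^2 + x^2 + y^2 \right) G &= 2\,G \\
\frac{\partial G}{\partial x} &= -x\,G, &
\left( -\nabla_T^2 + x^2 + y^2 \right) (x\,G) &= 4\,x\,G \\
\frac{\partial G}{\partial y} &= -y\,G, &
\left( -\nabla_T^2 + x^2 + y^2 \right) (y\,G) &= 4\,y\,G
\end{align*}
% Asymmetric parabola: G = \exp[-(\sqrt{a}\,x^2 + \sqrt{b}\,y^2)/2]
\begin{align*}
\left( -\nabla_T^2 + a x^2 + b y^2 \right)(x\,G) &= \left( 3\sqrt{a} + \sqrt{b} \right) x\,G \\
\left( -\nabla_T^2 + a x^2 + b y^2 \right)(y\,G) &= \left( \sqrt{a} + 3\sqrt{b} \right) y\,G
\end{align*}
```

Both partial derivatives of the Gaussian are eigenfunctions, and their eigenvalues coincide only when $a = b$, which is the sense in which the parabola must be symmetric. Differentiating an excited state instead produces a mixture of the states above and below it, so the derivative acts as a pure raising operator on the ground state only.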
A similar simple condition can be found for transverse magnetic waves. Since and are zero, the expression for the magnetic field is particularly simple.
This can also be put into matrix form:
By trivial inspection, the condition for a transverse magnetic mode is . This is equivalent to , where is some scalar. Unfortunately, the two equations that must satisfy are not as simple as they were for the transverse electric case. No analytic solutions were found. This look at the bound modes produced only one analytic solution and it is unphysical: a parabolic dielectric which goes to negative infinity. Also, the cancellations of the derivative of the dielectric terms hide the singularities that occur when goes to zero. Fortunately, the field is only significant in the physical regions, at least for most parameters of interest. So this one analytic solution can be used as a test for the propagation method that is developed in the next chapter.
To compare the results of the Hertz vector method with known methods, a second program was written to propagate the physical electric field using the same numerical techniques as were used for the Hertz vector program. This program gives the results of completely making the scalar approximation for comparison purposes. By adding an extra source term, it is possible to come close to a full vector treatment using the physical field when the background dielectric is nearly constant. This feature is useful for providing something of a check of the results of the Hertz-vector program. We start with the usual approach for getting the vector wave equation; that is, we take the curl of (2.1) and substitute in (2.3), assuming $\mu$ is constant.
Here, the source terms have been described in terms of a polarization density so . With the use of a vector identity we get
Factoring out the usual carrier frequencies and splitting , we get
The is to remind us to replace with . With the usual definition for and making the paraxial approximation we have
If the variation of is small, we can approximate
in terms of the polarization.
For significant variation in the dielectric permittivity, needs to be incorporated in for the non-scalar physical field method to work. The ability to incorporate variation in the linear dielectric is a major advantage of the Hertz-vector approach. Finally, we have
To get the scalar approximation, it is merely necessary to turn off the last term. Note there are problems with using the above as a full vector method. There are second derivatives with respect to in the source term. These are taken somewhat crudely. The Hertz-vector method requires only first derivatives with respect to in the calculation of the source term. It is also easy to use an initial field that is far from equilibrium using this method. For example, it is quite tempting to just start up with, say, a Gaussian beam with just an component. For large transverse variation , must rapidly change in the direction to make . That is,
A better initial selection would have the term cancel out the term. The Hertz vector method does this automatically. This is especially important when the steps are not small compared to a wavelength.
[1] A. Nisbet, Proceedings of the Royal Society A 231, 250-263 (1955).
[2] A. Nisbet, Proceedings of the Royal Society A 240, 375-381 (1957).
[3] A. Mohsen, Applied Physics 2, 123-128 (1973).
[4] A. Mohsen, Applied Physics 10, 53-55 (1976).