2023 Notes: This is chapter 3 of my dissertation from 1991. If I were trying to solve the same problem today, I'd probably do something different, as computers are enormously more powerful than what I had available at the time. Even the supercomputers of the day were lame compared to today's consumer-grade desktop machines. 100MB disk drives were the size of file cabinet drawers when I was doing this work, and RAM was incredibly expensive by today's standards. I forget how much we had available.
To see the equations I was working up to solve, see Chapter 2 of my dissertation.
To solve the propagation equation, we will ignore the term containing the second derivative with respect to the propagation coordinate. Attempts were made to treat this term as a perturbation from the paraxial approximation, but none of these proved to be stable. This done, the propagation equation can be written
The reason for writing the equation in this fashion is that in most cases the first matrix dominates the equation. The second matrix can be treated as a perturbation and is thus lumped in with the nonlinear source term.
For the development of a numerical method, the above equation can be cast in the abstract form
Here, the operator in the first term is a transverse operator. We can get away with a method designed for a scalar equation, since the first operator in (3.1) is block-diagonal. Once the equation is discretized in the transverse directions, we essentially have a very large system of ordinary differential equations. Due to storage considerations, it is desirable to use a method that does not have to store many copies of the field at a time.
One reason for separating this operator from the source terms is that it has some very large eigenvalues, which can severely limit the step size for the wrong method. The large eigenvalues come from the transverse Laplacian operator. Any finite-difference approximation to it will have elements inversely proportional to the square of the transverse spacing. For single-mode fibers and similar structures, the overall window in each transverse dimension is on the order of a few wavelengths of the light in question. Thus, even after division by the propagation constant, the matrix elements can become considerably larger than the reciprocal of a wavelength.
Another way of looking at the discretized operator is that it approximates the derivative for all spatial frequencies up to the Nyquist frequency. That is, there need to be at least four points per cycle or the result is aliased to lower frequencies. Thus the highest eigenvalue of the discretized operator is roughly the square of the Nyquist frequency. Were the whole of this operator treated explicitly, this number would set the Nyquist frequency required in the propagation direction, and the propagation step would have to be smaller than
This forces the propagation step to be much smaller than the light wavelength, which completely nullifies the benefit of using an envelope formulation. For specific equations and difference methods, more precise results can often be obtained using von Neumann stability analysis. For example, when the coefficient of the transverse second derivative is positive and real, an Euler method using a three-point transverse discretization requires that the step size be smaller than (see Ref. [1], page 636)
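For orientation, here is the textbook version of that bound for a simple one-dimensional diffusion equation. The symbols are mine and purely illustrative, not those of the dissertation's equations:

\[
\frac{\partial u}{\partial z} = D\,\frac{\partial^{2} u}{\partial x^{2}},
\qquad
\Delta z \le \frac{(\Delta x)^{2}}{2D},
\]

so the step in the evolution variable is tied to the square of the transverse spacing.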
This is not much different from the very crude estimate (3.3), which assumes an exact estimate of the transverse second derivative at the Nyquist frequency.
One way of dealing with these high eigenvalues is to exponentiate the operator, as in the Beam Propagation Method. Another is to use an implicit method. For second-order accuracy in the propagation direction there is the trapezoidal method
This can be rewritten
The implicit value at the new step is found by a series of predictions and corrections. Euler predictions were used for this particular case in order to save memory.
Were the matrix inversions shown above explicitly performed, we would have the Crank-Nicolson method. However, for two transverse dimensions we do not get the easily solved tridiagonal systems that result from a single dimension, so this is not feasible. Instead, we make an approximation by splitting the operator into two parts, where
Then we approximate
so
Now we have two matrix inversions of band-diagonal matrices. The price is the introduction of a spurious cross term. The effect of the cross term can be reduced by factoring the explicit matrix in the same fashion. The actual difference equation that is solved is
This is the composite form of the Peaceman-Rachford ADI (alternating direction implicit) method [2].
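In generic notation (mine, not necessarily that of (3.1)), for an equation of the form $\partial\psi/\partial z = (L_x + L_y)\psi + S$ the factored trapezoidal step sketched above takes the familiar composite form

\[
\Bigl(1 - \tfrac{\Delta z}{2} L_x\Bigr)\Bigl(1 - \tfrac{\Delta z}{2} L_y\Bigr)\psi^{\,n+1}
=
\Bigl(1 + \tfrac{\Delta z}{2} L_x\Bigr)\Bigl(1 + \tfrac{\Delta z}{2} L_y\Bigr)\psi^{\,n}
+ \Delta z\, S,
\]

with the source S handled by the predictions and corrections described above. Each step then requires only one band-diagonal solve per transverse direction, and the spurious cross term appears symmetrically on both sides.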
The foregoing discussion justifies splitting off part of the matrix operator in Equation (3.1) and treating the rest as a black-box source term. An unconditionally stable method can be applied to the first matrix because of its simplified form, while the second matrix is assumed to be a small enough perturbation that a predictor-corrector will work. After all, much good work has been done by effectively treating this matrix as zero.
The question remains of how to perform the matrix inversions in Equation (3.9). If a three-point method is used to estimate the transverse derivatives, then the operators are tridiagonal and a simple recursion exists to accomplish the inversion. This is the standard way of solving Crank-Nicolson systems. Here we (mostly) follow the notation of Adams et al. [3].
The inverse operation is first written as a system of linear equations:
Here, the subscript is now an index along one of the transverse dimensions; the unknown is the vector to be found; A, B, and C are the three numbers found on each row of the tridiagonal matrix; and D is the column vector to be operated on. If we define two auxiliary sequences so that the unknown obeys a first-order recursion relation between adjacent points, then the auxiliary sequences must obey
To start the recursions, we need a boundary condition at the first point. For example, a homogeneous Dirichlet boundary condition (the field vanishing just outside the window) is satisfied by a starting choice, which determines the first values of the two auxiliary sequences
for the remaining interior points. Then an assumed value (zero for this case) is used at the far boundary to start off the backward recursion for the unknown vector.
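A minimal sketch of this forward-elimination and back-substitution recursion, written here in Python with illustrative names (the original code was not in this language), might look like the following:

import numpy as np

def solve_tridiagonal(A, B, C, D):
    """Solve A[j]*X[j-1] + B[j]*X[j] + C[j]*X[j+1] = D[j] for X,
    assuming the field vanishes just outside the grid (homogeneous
    Dirichlet boundaries), as in the text."""
    n = len(D)
    E = np.zeros(n, dtype=complex)   # X[j] = E[j]*X[j+1] + F[j]
    F = np.zeros(n, dtype=complex)
    E[0] = -C[0] / B[0]              # first row has no X[j-1] term
    F[0] = D[0] / B[0]
    for j in range(1, n):            # forward recursion for the auxiliaries
        denom = B[j] + A[j] * E[j - 1]
        E[j] = -C[j] / denom
        F[j] = (D[j] - A[j] * F[j - 1]) / denom
    X = np.zeros(n, dtype=complex)
    X[-1] = F[-1]                    # field just past the last point taken as zero
    for j in range(n - 2, -1, -1):   # back substitution
        X[j] = E[j] * X[j + 1] + F[j]
    return X

The forward loop is an unavoidable recursion in the sweep direction, but, as noted below, the loop over the other transverse dimension can be run over whole rows at once.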
The cost of this method scales as the product of the number of points along the direction of the derivative and the number of points along the other transverse dimension. The loop across the other dimension is trivially vectorized, so the recursion is not a major impediment to optimization. The above derivation was made for a radial Laplacian, which is not symmetric. For the case of this research (at least for even point spacings) the operator is symmetric, so we have a small simplification:
The problem is that the three-point finite difference operator is not very accurate. This produces some ugly phase graphs when Gaussian or other exponentially decaying fields are involved. As the next section will show, operator symmetry is crucial for getting around this problem.
In comparing various means of solution for the scalar approximation, Yevick and Hermansson [4] find that the only significant difference between the Fourier-transform-based Beam Propagation Method and certain finite-difference methods is that the finite-difference methods estimate the transverse derivatives less accurately. Using a five-point method for estimating the derivatives would take care of part of this objection. The question then arises of how to solve the resulting pentadiagonal systems efficiently.
The method used here is to factor the pentadiagonal matrix approximately into a product of two tridiagonal matrices. If the pentadiagonal matrix can be written as such a product, then the task of solving a pentadiagonal system reduces to solving two successive tridiagonal systems. To see whether this is possible, we write out the expansion of the product of two tridiagonal matrices:
If we denote the pentadiagonal matrix by its diagonals we end up with a system of five nonlinear equations to solve per original row:
This author could find no general solution to this system and doubts that there is one. An approximation has been found which works for this program for most parameters of interest. For this particular case, the pentadiagonal operator is symmetric and varies only with position along the diagonal, due to possible variations in the background dielectric. The first and fifth equations are trivially solved:
The symmetry allows corresponding symmetric choices for the factors. This considerably simplifies the middle equation:
so that
Some trickery has been played here. The assumption has to be made that the coefficient in question is constant along the diagonal. This is not true when the background dielectric varies. However, it is correct to first order if the variation is slow enough. (This approximation was first made by mistake, by forgetting the variation along the diagonals. After the mistaken method worked nicely, it became necessary to rationalize the success.) The coefficient will vary slowly if there are enough points that it changes only a small amount from point to point, or if the transverse spacing is small enough that the constant term inversely proportional to its square is much greater than the varying term. This is just the case for a waveguide whose dimensions are on the order of a wavelength, as in single-mode systems.
From the second (and, by symmetry, the fourth) equation comes the need for the remaining factor elements to vary.
Next, we make a first-order approximation. This is only a local approximation; the quantity still varies along the diagonal. If its relative variation from one row to an adjacent row is significant, this approximation is of low validity. With this approximation, we get a quadratic equation.
This is readily solved:
This and (3.23) are most easily rendered consistent if one of the factor elements is chosen to be constant. To eliminate a denominator, let us choose it to be one-half; then (3.23) gives the remaining element in terms of the original pentadiagonal operator:
Likewise:
In the code, the plus sign was chosen arbitrarily and proved to work.
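Given such an approximate factorization of the pentadiagonal operator into two tridiagonal factors, the solve itself is just two passes of the tridiagonal routine sketched earlier. Schematically (again in Python, with a hypothetical interface; building the factor coefficients from the formulas above is not shown):

def solve_pentadiagonal(T1, T2, D):
    """Solve (T1 T2) X = D, where T1 and T2 are the tridiagonal factors
    of the pentadiagonal operator, each given as a tuple (A, B, C) of
    its three diagonals."""
    Y = solve_tridiagonal(*T1, D)     # first solve T1 Y = D ...
    return solve_tridiagonal(*T2, Y)  # ... then T2 X = Y, so (T1 T2) X ~ D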
To watch out for cases in which this factorization is not valid, the maximum relative change in the coefficient between two adjacent rows is found at startup. Should this number be unacceptably large, it is possible to drop back to a three-point method.
The standard way of generating a five-point method for calculating the second derivative is to look at the Taylor series and choose weights in order to cancel the lowest-order error terms. Kuo and Levy [5] point out that there are other possible criteria for choosing a five-point (or larger) rule. They review methods in which the finite difference is exact for some of the homogeneous solutions of the differential equation to be solved, or exact for low-order polynomials locally multiplied by homogeneous solutions. This idea is known as "mode-dependent discretization."
For Gaussian beams, and for bound modes in dielectric media in general, the field dies off exponentially towards the edge. This results in a significant amount of phase error in that region. Given the memory restrictions on increasing the density of transverse points, I decided to look into this possibility of improving upon the five-point derivative operator.
To guarantee proper convergence as the step size goes to zero, a five-point method for estimating the second derivative can be written as a weighted average of two three-point methods of differing step sizes. To have a scalable convergence analysis, these two step sizes should be written as constant factors multiplying some basic spacing. For even point spacing the obvious choices are the spacing itself and twice the spacing. With this choice, the estimate for the second derivative becomes:
The weight is a parameter to be adjusted to minimize the error according to criteria yet to be determined.
The standard criterion for choosing the weight is to make the estimate exact for as many terms as possible of the Taylor series expansion about the central point [6]. The Taylor expansion gives:
So the inner finite difference results in
The lowest-order error term goes to zero as the square of the spacing. The outer finite difference results in
This also converges as the square of the spacing, but the constant multiplying the error term is larger. The higher-order terms are also more important, as each successive term gets multiplied by an additional factor of 4. Getting these higher-order terms to partly cancel can be more important than the convergence rate of the lowest-order term in certain circumstances.
For the standard central finite difference, we should choose
so that the lowest-order error term is eliminated.
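Writing the weight explicitly (the symbol $\gamma$ here is mine), the combination and the standard choice can be summarized as follows. The inner and outer three-point differences have leading error terms proportional to the fourth derivative times the square of their respective spacings, so

\[
f''(x) \approx \gamma\,\frac{f(x+h)-2f(x)+f(x-h)}{h^{2}}
 + (1-\gamma)\,\frac{f(x+2h)-2f(x)+f(x-2h)}{(2h)^{2}},
\]
\[
\text{error} \approx \bigl[\gamma + 4(1-\gamma)\bigr]\,\frac{h^{2}}{12}\,f^{(4)}(x)
\quad\Longrightarrow\quad
\gamma_{\rm std} = \tfrac{4}{3} \approx 1.33 .
\]

With $\gamma = 4/3$ the combination reduces to the usual five-point stencil, $[-f(x-2h) + 16f(x-h) - 30f(x) + 16f(x+h) - f(x+2h)]/(12h^{2})$, and 4/3 is the "standard method" value of about 1.33 referred to below.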
For taking the derivative of an unknown function, the standard method is usually best. It converges most rapidly when high accuracy is sought by taking successively smaller step sizes. But for a differential equation we really do not have a given function; when the step size decreases, the space of possible functions from which the differential equation can choose is altered. As previously stated, a finite-difference scheme "sees" a function as a weighted sum of frequencies up to the Nyquist limit. One possibility for choosing the weight is to optimize over this space of functions.
Let us look at a particular frequency :
and so on. The general five-point approximation to the second derivative leads to
This leads to the following expression for the relative error:
At the Nyquist frequency the product of the frequency and the point spacing reaches its maximum value. At this frequency the relative error becomes:
It takes a while for the factorial in the denominator to cancel out the growing terms in the numerator. The high-order terms are important at the upper frequencies.
Figure 3.1 shows a graph of the error at various frequencies for the three-point method, i.e., with all of the weight on the inner difference. The graph was made by applying the differentiation scheme to a sinusoid, subtracting off the exact second derivative, dividing by it, and taking the absolute value, so that the graph shows the absolute value of the relative error. The frequency is normalized by the Nyquist frequency, which makes the graph independent of the spacing and covers the full range of applicable frequencies. The same procedure is applied to the standard five-point method for taking the second derivative in Figure 3.2. The error is considerably reduced for low spatial frequencies and is about one-third of the three-point error at the Nyquist frequency. In both of these graphs the error increases monotonically. The curve for the five-point method has the appearance of an exponential; obviously the higher-order terms are making a contribution.
The next step is to look for a five-point method which minimizes the error over the whole interval instead of concentrating on accuracy near the central point. To do this, a program was written to compute the area under curves like that in Figure 3.2 and print the result for a range of values of the weight. Figure 3.3 shows the result. Note that the minimum is around 1.4, whereas the standard method is at approximately 1.33. Figure 3.4 shows a closer look at this minimum. If this criterion were used, we would have the error shown in Figure 3.5. This is the same type of plot as Figure 3.2, with the weight now equal to 1.4. The error is no longer monotonically increasing; between normalized spatial frequencies of 0.7 and 0.8, the error actually goes to zero. Were this plot not an absolute value, the error would pass through zero instead of bouncing off it. Were a least-squares criterion used for the minimization, the zero value of the error would occur a bit closer to the Nyquist limit so as to clamp down a bit more on the peak error. This would move us further still from the standard method, though not by much.
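A small script along the following lines (a modern reconstruction, not the original program) reproduces the procedure behind Figures 3.1 through 3.5: it applies the weighted stencil to a sampled sinusoid, forms the relative error as a function of frequency normalized to the Nyquist value (taken here, as in the text, at four points per cycle), and averages that error over the band for a range of weights.

import numpy as np

def relative_error(gamma, nu, h=1.0):
    """Relative error of the weighted five-point second derivative applied
    to sin(k*x), with nu the frequency normalized to the Nyquist value
    (four points per cycle, i.e. k = nu*pi/(2*h))."""
    k = nu * np.pi / (2.0 * h)
    # The stencil acting on sin(k*x) returns a multiple of sin(k*x),
    # so the ratio to the exact value -k**2 is independent of x.
    inner = (2.0 * np.cos(k * h) - 2.0) / h**2
    outer = (2.0 * np.cos(2.0 * k * h) - 2.0) / (2.0 * h)**2
    approx = gamma * inner + (1.0 - gamma) * outer
    return np.abs((approx + k**2) / k**2)

nu = np.linspace(0.01, 1.0, 500)             # normalized frequencies
for gamma in (1.0, 4.0 / 3.0, 1.4):          # three-point, standard, near-optimal
    print(f"gamma = {gamma:.3f}   mean relative error = "
          f"{relative_error(gamma, nu).mean():.5f}")

Repeating this for a real exponential, as described below, only requires changing the cosines to hyperbolic cosines and the exact value -k**2 to +k**2.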
The trade-off for this procedure is that the error is increased at low frequencies, although it is still better there than with the three-point method. If the goal is to model a high-frequency function with a minimum number of points, the trade-off is worth it.
Actually, the problem that led to this analysis was not high frequencies but steep exponentials. For fields such as Gaussian beams or bound modes with an evanescent tail, the field dies off exponentially near the edge. Low-order polynomial fits do a worse job locally for real exponentials than for sinusoids (complex exponentials). The resulting error term is given by (3.53) with the alternating-sign factor removed; that is, the sinusoid is replaced by a real exponential in the equations leading up to (3.53). Thus all the error terms work together and there is no cancellation. This error is a particular problem for bound modes, since the normally flat phase has the phase of the error added to it, resulting in ugly phase graphs even when the overall absolute error is small.
To minimize this error, the sinusoid was replaced by a real exponential in the programs that produced Figures 3.1 through 3.5, and the procedure used for the sinusoidal case was repeated. Figure 3.6 shows the error of the three-point method for exponentials. Note that the decay rate is still normalized by the same value for comparison purposes, even though it is no longer the Nyquist frequency. A comparison of Figure 3.1 and Figure 3.6 shows a small increase in error due to the constructive addition of the error terms. The effect is far more noticeable for the standard five-point method, since the higher-order terms have more effect. This can be seen by comparing Figure 3.2 to Figure 3.7.
The same procedure of graphing the area under the error curve for a range of weights is carried out in Figure 3.8. This time the minimum error occurs for a weight less than that of the standard method. Figure 3.9 zooms in to show that the minimum is around 1.27. Figure 3.10 shows the resulting error curve for this weight. The zero-error point has moved to a somewhat higher normalized frequency, a bit above 0.8.
This whole procedure can be repeated for the first derivatives. Here, we will use a weighted average of an inner and an outer central difference. The inner finite difference is:
The lowest-order error term involves the same power of the spacing as for the three-point second-derivative estimate. However, these powers of the spacing multiply a lower-order derivative from the Taylor series. For sinusoids and exponentials the derivatives of all orders have the same magnitude, so this in itself makes little difference; but the factorials in the denominators are smaller for the lower-order terms, so the error for the first derivative is larger despite the division by 2. A comparison of Figure 3.11 with Figure 3.1 shows the increase in error for sinusoids.
Next, we look at the outer finite difference:
This results in the same standard value of the weight as for the second derivative.
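A hedged reconstruction of that combination, in the same notation as the second-derivative sketch above (the symbol $\gamma$ is again mine), is

\[
f'(x) \approx \gamma\,\frac{f(x+h)-f(x-h)}{2h}
 + (1-\gamma)\,\frac{f(x+2h)-f(x-2h)}{4h},
\qquad
\text{error} \approx \bigl[\gamma + 4(1-\gamma)\bigr]\,\frac{h^{2}}{6}\,f'''(x),
\]

so the same standard weight $\gamma = 4/3$ cancels the leading error term and recovers the usual five-point rule $[f(x-2h) - 8f(x-h) + 8f(x+h) - f(x+2h)]/(12h)$.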
Using this weight we get Figure 3.12. The difference in error compared with Figure 3.2 is more pronounced than it was for the three-point method, because the difference in factorial order has more impact for the higher factorials. Figure 3.13 shows the integral of the error versus the weight. The minimum is not as sharp as for the second derivative; Figure 3.14 shows a closer look at its location, which is a bigger deviation from the standard method than was the case for the second derivative. Figure 3.15 shows the error curve for this choice of weight.
Finally, the procedure is performed one more time for exponentials. Figures 3.16 through 3.20 show the by-now-familiar sequence of graphs. The resulting weight is 1.24, which is further from the standard choice than the 1.27 found for the second-derivative case.
The impact of these choices of weight will be seen in a later section, when the output of the propagation code is compared to an analytic solution.
The preceding sections gloss over some of the difficulties involved in obtaining the source term. To get any nonlinear polarization, it is first necessary to get the physical field. This is done with small modifications of (2.68).
The bottom element of the last vector comes from the equation of motion for the gauge field, (2.60).
There are a couple of difficulties with this equation. First, it involves the derivative of the Hertz vector along the propagation direction; this is the only undesirable mixed partial derivative that was not eliminated from the equation of motion. Obtaining it means subtracting two successive values of the Hertz field and dividing by the step size, which requires an extra vector of computer memory. The biggest potential difficulty is at startup, where there is only one value of the Hertz field to work with. For a cold start, the code estimates the source terms by assuming that this derivative is zero and then uses the source terms to estimate it. That value can be used to get a better value for the source term, with the process iterated as desired. For restarts, the old value of the derivative is stored so that this process does not have to be repeated. This is potentially important for studies of self-focusing: as the beam focuses, the polarization may get large enough that such iteration does not converge.
A more important problem is that the physical field depends on the nonlinear polarization. Since the nonlinear polarization is itself a function of the electric field, we have an implicit relation for the electric field. This means possible iteration at every step in regions where the polarization is large. For a first guess, the electric field is predicted by assuming that the nonlinear polarization is unchanged; this field gives a new polarization, which in turn corrects the field. This iteration is nested inside another in which the Hertz field is corrected for its source terms. The number of corrections for each loop is set by the user at start-up time. If the nonlinearity increases, these counts can be changed by dumping a restart file and restarting with new values for the loop ranges.
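The inner iteration is an ordinary fixed-point iteration, and its failure at high nonlinearity can be illustrated with a toy scalar model (purely illustrative; this is not the dissertation's actual field and polarization relation):

def solve_implicit_field(E_linear, chi, n_iter=8):
    """Fixed-point iteration for a toy Kerr-like implicit relation
    E = E_linear + chi * |E|**2 * E."""
    E = E_linear                 # first guess: nonlinear polarization ignored
    for _ in range(n_iter):
        E = E_linear + chi * abs(E)**2 * E
    return E

for chi in (0.01, 0.1, 0.3):
    print(f"chi = {chi}:  E -> {solve_implicit_field(1.0, chi):.4g}")

The first two cases settle down after a few corrections, while the last runs away, which is the behavior described above when the beam focuses and the polarization becomes too large.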
A problem that is not confined to the vector case is the lack of physical boundary conditions. The matrix methods described in the previous sections simply assume that the field is zero beyond the numerical window. This incorrect assumption can generate error at the edges. In order to keep this error from affecting the solution closer to the center, a filter routine was written. This routine multiplies the edges of all the fields by one minus a Gaussian whose peak is on the edge. For the parameters tried, this filter has had only limited success. Generally, making sure that the numerical window is large enough has proven to be the best way of reducing edge effects in most situations, at the price of increasing the transverse step size for a given array dimension.
Finally, there is the question of the gauge field. Its original equation of motion, (2.60), is quite trivial. However, when we factor out the carrier wave, some difficulties arise:
For an explicit method it would be necessary to follow optical cycles, which defeats the purpose of using the carrier-wave formalism in the first place. Fortunately, we are still able to take advantage of the slow variation of the envelope over a step, even when a step covers multiple wavelengths. This can be done with either a fully implicit or a trapezoidal method for the first term on the right. The latter was chosen, since it pulls the solution towards the adiabatic solution when that term grows large compared to the others. That is, we use
or
When this term is large enough, we have
This is the solution of (3.46) obtained if we assume that the derivative term is zero.
For linear partial differential equations with constant coefficients, it is possible to determine stability using von Neumann stability analysis (see Numerical Recipes, pages 625-638 [1]). This works by noting that sinusoids are eigenfunctions of the transverse part of the resulting difference equation. That is, the field can be considered to be a continuous sum of functions of the form
Here, the amplification factor is a complex constant raised to the power of the step number. The two spatial frequencies belong to the two transverse directions, and each multiplies the corresponding transverse index. For each pair of spatial frequencies there is some value of the amplification factor that satisfies the difference equation. If its magnitude is greater than one, that particular mode grows exponentially with increasing step number; if the magnitude is less than one, that mode decays. If the magnitude is no greater than one for all spatial frequencies, then the difference equation is stable.
The above can also be used for a non-rigorous estimate of the stability of nonlinear equations, or of linear equations with non-constant coefficients, by looking at the effective coefficients at a particular point and considering them to be constant. This will be done at the end of this section, but first the analysis will be applied to (3.10) for the homogeneous case.
For this case the source term vanishes and the coefficients are constant. Let us first look at (3.50) with the second derivative represented by a three-point operator:
For a five-point method this becomes
For the other transverse direction, just replace the corresponding quantities. For the cross term, we merely multiply by both factors.
Plugging into (3.10) and dividing out the common factor, we get
Here, we have let
We then have
Since the two quantities defined above are always real, the amplification factor always has a magnitude of one, because the factor multiplying them is purely imaginary. That is, the real parts of the numerator and denominator are the same, and the imaginary parts differ only by a sign. This cancellation is very important, since those quantities can become very large for some spatial frequencies, so that relying on multiplication by a small step size for stability would require very small steps.
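In generic terms (my notation), this is the familiar property of the trapezoidal rule applied to a purely imaginary eigenvalue: for a mode obeying $\partial\psi/\partial z = i\lambda\psi$ with real $\lambda$,

\[
g \;=\; \frac{1 + i\lambda\,\Delta z/2}{\,1 - i\lambda\,\Delta z/2\,},
\qquad
|g| \;=\; 1 ,
\]

no matter how large $\lambda$ becomes, which is why the split-off transverse operator can be advanced with steps far larger than an explicit method would allow.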
In any analysis involving source terms, the denominator of (3.57) would show up in the additional terms. Because of its large size at high spatial frequencies, it tends to ensure stability. The denominator is particularly important for the vector equations, since (3.45) shows that second derivatives are involved in getting the physical field from the Hertz field, and these second derivatives therefore show up in the source terms. As long as the coefficients of those terms in the denominator are larger than the corresponding coefficients in the numerator, stability is likely. Should the nonlinearity become strong enough, this is no longer the case, and instability should be expected.
This actually occurs in practice. In the study of self-focusing, there was stability as long as the nonlinearity was small enough. If it became too large, the characteristic fuzzy phase of instability showed up in just the center region of the beam, with the program crashing soon afterwards: the high transverse frequency components blew up. However, this only occurs for very high nonlinearity. Roughly speaking, the absolute change in the permittivity must reach around 1 for instability to begin.
[1] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical Recipes: The Art of Scientific Computing (New York: Cambridge University Press, 1987).
[2] L. Lapidus and G. F. Pinder, Numerical Solution of Partial Differential Equations in Science and Engineering (New York: John Wiley & Sons, 1982).
[3] D. R. Adams, C. D. Cantrell, and W. M. Lee, pp. 456-463 in Proceedings of the International Conference on Lasers '83, R. C. Powell, ed. (STS Press, 1983).
[4] D. Yevick and B. Hermansson, IEEE Journal of Quantum Electronics QE-26, 109-112 (1990).
[5] C.-C. J. Kuo and B. C. Levy, Proceedings of the IEEE 78, 1807-1842 (1991).
[6] R. L. Burden, J. D. Faires, and A. C. Reynolds, Numerical Analysis (Boston: Prindle, Weber & Schmidt, 1981).
Tags: dissertation physics numerical methods