The Eigenray Acoustic Ray Tracing Code

This is a page to download the source code for the "eigenray" ray propagation code for calculating the basic properties of rays over long ranges in deep water. Other hints for using this code and benchmarks can also be found here.

Codes for Acoustic Propagation

RAM in Fortran 95. A package for calculating acoustic propagation using the RAM Parabolic Equation approach and Fortran 95.
RAM in Matlab. A package for calculating acoustic propagation using the RAM Parabolic Equation approach and matlab.
Eigenray This page.
Modes in Matlab. A simple matlab script for calculating acoustic modes by solving an eigenvalue/eigenfunction problem.

Other formulations/code for running acoustic ray calculations in the oceans can be found here: http://www.hlsresearch.com/oalib/Rays/index.html Of these, the Bowlin Ray code is most similar to the code below, although that code is written in C.

Table of Contents

Documentation

Early in the development of the source code, I wrote a short Technical Memorandum:

Dushaw, B. D. and J. A. Colosi, Ray tracing for ocean acoustic tomography, Applied Physics Laboratory, University of Washington, APL-UW TM 3-98, 1998. Download PDF

This document describes the basics of what the code does and how it is set up, but the code has evolved considerably from what is described in the TM. Nevertheless, the report serves as a useful way to reference the code. The PDF of the TM is included in the documentation tarball, which also has a few blurbs on Runge-Kutta integration.

* eigenray.doc.tar.gz

Source Code (Eigenray Version 3.4b - April 2016)

Fixed bug in bottom reflection per I. Udovydchenkov; needed flat earth transformation.

* eigenraymp_04.11.2016.tgz: eigenraymp_04.11.2016.tgz

Source Code (Eigenray Version 3.4 - April 2013)

Fixed bug in eigenray finder. Parallelization of eigenray finding routines.

* eigenraymp3.4_April_2013.tgz: eigenraymp3.4_April_2013.tgz

Source Code (Eigenray Version 3.3 - November 2012)

Cleaned up initial writeout of input variables. Bug fixes to bottom interaction code. Default parameters for sound speeds, ray paths, etc. made larger.

* eigenraymp3.3_November_2012.tgz

Source Code (Eigenray Version 3.2 - November 2011)

The code has been slightly modified to accept sound speed profiles to unequal depths. The code still needs to run using all sound speed profiles to a constant maximum depth value, so it just extrapolates linearly from the deepest profile depth to the maximum depth - there is a potential for problems here, but the functionality is handy. Short profiles implies the use of topography to keep the rays from entering the depths of fake interpolated sound speed. It extrapolates (rather than using a nominal value such as zero) to keep the splining that is essential to the calculation well behaved. The code also fixes a couple of bugs causing crashes (segmentation faults/memory overruns) having to do with rays inching deeper than their maximum allowed depths. Version 3a (Dec. 2011) fixes a rare bug that arose when using a tracking template in parallel computations, requiring a minor redesign of the tracking procedure.

eigenraymp3.2_November_2011.tgz
eigenraymp3.2a_December_2011.tgz

February 2012 - the input file now includes a parameter that sets a maximum depth a ray can go. Rays deeper than this depth are dropped. Old input files should work o.k., but this parameter should be included at some point.

eigenraymp3.2b_February_2012.tgz

Source Code (Eigenray Version 3.1 - September 2010)

The code has been revised to take advantage of OPENMP. The MPI version has been removed. Reorganization of the code to make it "thread safe." Simplification/reorganization of the include files. Note that a ca. 10% improvement in computation speed may result from using many more threads than actual processors - set by the OMP_NUM_THREADS environment variable; see README.OMP.

eigenraymp.tgz

Source Code (Eigenray Version 3.0 - February 2008)

Revised code to implement a driver+subroutine approach to eigenray. This new design makes the code more versatile, e.g., for numerical ocean models, or for a matlab mex file for tracing rays.

eigenray.3_March_6_2008.tar.gz: Minor clean up.
eigenray.3_March_3_2008.tar.gz: Minor clean up. Use 10-m depth intervals by default - will work fine for the World Ocean Atlas profiles. matlab mex file binary for 32-bit linux.
eigenray.3_February_23_2008.tar.gz: A major revision to the cubic spline routines and Runge-Kutta-Merson integration approach. Seems to work much better; not an incremental improvement, but really a whole new ray code. See: RELEASE_NOTES_VERSION_3.0. N.B. Increase the value of epsilon to 1e-4 to 1e-5 for normal usage. Accuracy is much better even for the larger epsilon. For quick ray traces, increase the step sizes and epsilon aggressively and cut computation time by a factor of 3-4 for still reasonable results! I recommend step size parameters of "10 400 700" and epsilon of 1e-5 as good for both speed and accuracy. Please send me feedback on problems/bugs/suggestions. (If you've downloaded a version earlier than Feb. 23, please get the latest version.)
eigenray.new_Sep_20_2007.tar.gz: Rework bottom reflection code; beware the optical illusion caused by the aspect ratio.
eigenray.new_Mar_9_2007.tar.gz: Fixed a bug in the mex file, meig.f.
eigenray.new_Mar_7_2007.tar.gz: Some clean up; fix a bug in the tracking elements of the code (only in MPI version).
eigenray.new.tar.gz: Original upload; Feb. 13, 2007

See the included src/driver.f routine for how to initiate the various parameters and data and call the src/eig.f subroutine. The src/driver.f routine is used to compile the revised, standalone "eigenray" program.

This code is not well tested. Please report problems or suggestions to the author.

The old original source code is included at the bottom of this page, should you prefer that version.

Geodesy Programs

Courtesy of George Dworski and Craig Friesen, this tarball has c and fortran programs for accurately (?) calculating the ranges between two points, or for calculating points at a given range interval between two points. In all cases the WGS84 ellipsoid is used.

* geo.tar.gz

I don't know much about these. The tarball includes precompiled binaries that have been there for 10 years, yet still seem to work on my linux machine!

Compiling on OSX

I was able to compile eigenray by installing:

(1) The Xcode package from the OSX DVD (and optional package of considerable size; it has "make" and other unix utilities. If "which make" finds make already you are good to go.) and

(2) gfortran from : http://hpc.sourceforge.net/ This involves unpacking the tarball of the binary with "sudo tar xfz package.tgz -C /" which puts gfortran in /usr/local/bin.

After this, just "make" made the eigenray binary with no changes required. One doesn't REALLY need "make" - one can compile and link all the elements by hand, perhaps by a script. There are not that many components (in src) to eigenray. It appears not to be possible to compile a static binary for OSX, alas.

Important Notes on Sound Speeds

The code uses a precomputed table of sound speeds as a technique to speed up the calculations. This table is constructed using cubic splines, with values extending to 5500 m. The side effect of this approach is that the sound speed data that is sent to the code has to extend to at least 5500 m.

Sending sound speeds only to 1000 m, say, with zeros below that, will cause the cubic spline routine to badly misbehave, with resulting problems in the table of sound speeds.

So even though you may not be interested in propagation in the deep ocean, you still have to send the code some reasonable values for sound speed to at least 5500 m depth. If you use appropriate bathymetry, the fictional sound speeds below that will have no effect in the results at all. You need those values just to make the sound speed tables behave properly. The filled in values are basically just a convenience for calculation; the results should not depend on the artificial values at all.

Developing the code to handle sound speeds a little more flexibly may be my next development.

Notes on Matlab Mex File Compilation/Binary Download

The eigenray code is in fortran, the mex file is in c. A fortran mex file seems to be impossible to achieve. This means that the fortran subroutines have to be linked to the c mex file. This requires some caution to be sure the c and fortran compilers are compatible and work together properly. I have verified two combinations to work on my Suse 10.2 linux machine:

GNU Compilers
I downloaded and compiled gcc and g77 version 3.4.5, which was the most recent compiler officially supported by Matlab's mex routine. This combination produced a working mex file. Most modern linuxes have gcc 4 or higher; so that is potentially a problem. The latest matlab (R2007a) will compile the mex file with a more uptodate gcc and gfortran.

Intel Compilers
I had success using intel's c compiler (icc) version 9.1.046 and fortran compiler 9.1.041. If the versions of icc and ifort are too different, the code will not compile. These intel compilers are troublesome, but they make an executable that runs 10-15% faster than the GNU compilers.

See the mexopts.sh file included in the tarball for details of compiler options.

The procedure is first to build eigenray make clean ; make to create all the object files in the obj directory. Then, mex src/meig.c should create the working mex file, meig, while linking in the object files. You should ensure that the compiler options in the Makefile for eigenray are consistent with the compiler options of the mexopts.sh file.

In the matlab directory of the source tarball is a routine runeig.m which shows how to set up the calculation and call meig. It is important to allocate the dimensions of the depths, ranges, sound speeds, bathymetry, etc. data correctly before calling "meig". The mex files are very particular about how the memory is allocated. Failure to do this properly will result in a matlab crash/segmentation fault/default on your home loan/etc.

Send in depths, ranges, and sound speed ( zz0, rr0, ssp0). Send in a string of parameters ( par). Send in a bathymetry matrix ( bb - an array of range and depth). Send in a small tracking data matrix ( trk - when calculating eigenrays, trk contains data that can be used to save only the particular eigenrays you want, i.e., tracking.) The bathymetry and tracking are optional, but have to be allocated anyways. The matlab line is then:

[nr,info,rpath] = meig(zz0,rr0,ssp0,par,bb,trk);

Returns with nr the number of rays calculated; info, the ray information (travel time, turning points), and rpath, a three dimensional array of the ray paths and ds/c^2 along the ray path. (Ray paths are only returned if you ask for them, and if you are calculating eigenrays. But the large array rpath is returned in all cases.)

The fortran version of the code requires specifying four values defining the "step sizes" used for the Runge-Kutta-Merson integration. These values are not required for the mex file; I decided that the standard values I use work well enough that I hard coded them into the mex file. If you need to change the step sizes, you will have to edit the mex file source code and recompile. Note that step sizes can be adjusted indirectly somewhat by changing the value of epsilon, although at computational expense. For the record, the four values in the mex file are (1500, 5, 70, 300), which define the vector s.

A few mex files compiled from this code; whether they will work or not in your case is unknown.
Operating System	CC/Fortran Version	Matlab Version	Mex File Binary	Other Comments
Suse Linux 10.2	Intel icc/ifort	Version 7.3.0.298 (R2006b)	meig.mexa64

A Request: If you are able to compile a mex file that is not listed here, please send it to me and I will post it here.

Notes on Parallel Calculations

One issue with benchmarking is that the powersave mode turned on for the CPU, so that the CPU's automatically lower their clock speeds to when not in use. This is a noble thing to do, but it can also takes 0.5-1 s for the clock speeds to spin up to full speed, which is troublesome for accurate benchmarking on a small test like this.

On Tracking

The code has built into it the ability to only report back rays that are identified by travel time, angle, and number of turning points. This is called "tracking". Typically, the identified rays are those that have appeared in data as the stable, resolvable rays, and one is doing a ray trace to determine the ray properties of those arrivals.

The tracking template array has the variable name trk. The format of the matrix for trk is the same as the output to the ray.info file: travel time, source angle, receiver angle, lower turning depth, upper turning depth, depth at receiver, sound speed at receiver, and number of turning points. Of these parameters, only the travel times, angles and number of turning points are used to identify the rays in the ray trace. (The other parameters are kept just to make it easy to select the ID'd rays out of the ray.info file and just cut and paste the lines into the template file.)

The subroutine that performs the identification is IDrays.f . It is set up by default so that the template and ray trace travel times have to agree within 2.5 s, the angles have to agree by 1 deg., and the number of turning points have to agree exactly. On occasion these windows may need some adjustment, e.g. a wider angle window. The subroutine will have to be edited to specify the new windows and the binary recompiled to implement such changes.

Flat Earth Transformation

At present the eigenray code automatically applies the flat earth transformation to instrument depths and sound speeds. One way to disable this transformation is to change the radius of the earth variable "Re" in the includes/constants.h file to be very large, 1e14 or so. Then recompile. Someday I'll arrange for a flat earth option in the input file.

Benchmarks

Compute time for eigenray with 4000 rays at 3 Mm range
The time for the test case included with the new (Mar. '08) source code. This particular benchmark is rather lightweight; it is not sensitive to the cache size of a cpu and it scales closely by cpu speed. While the Intel Core 2 Quad (Q6600) processor is faster for single processes, it does not scale well to four processes. The AMD Phenoms scale almost ideally to four processes.

Machine	cpu Type	Compiler	Time	Options
kingfish	Phenom II X6 1090T 3.2 GHz	f90/gfortran	14.9s	-fast
kingfish	Phenom II X6 6 procs/openmp	f90/gfortran	3.2s	-fast
kingfish	Phenom II X6 40 procs/openmp	f90/gfortran	2.9s	-fast

skipjack	Phenom II X4 965 3.4 GHz	f90/sunstudio	17.0s	-fast
skipjack	Phenom II X4 4 procs/mpich	f90/sunstudio	4.5s	-fast
skipjack	Phenom II X4 4 procs/openmp	f90/sunstudio	5.3s	-fast
skipjack	Phenom II X4 4 procs/openmp	f90/ifort	4.2s	-xW -fPIC -openmp -ipo -O3 -ansi-alias -no-prec-div -pc64
skipjack	Phenom II X4 4 procs/openmp 4 threads	gfortran	4.4s	-fPIC -march=amdfam10 -m64 -mfpmath=sse -Wall -fopenmp -finline-functions -ffast-math -fno-strength-reduce -fomit-frame-pointer -falign-functions=2 -O3
skipjack	Phenom II X4 4 procs/openmp 40 threads	gfortran	4.0s

sardines	Opteron 265 1.8 GHz	ifort	34.0s	-xW -ipo -O3 -ansi-alias
sardines	Opteron 265	f90/sunstudio	42.7s	-fast -xtarget=barcelona
sardines	Opteron 265 4 procs/mpich	f90/sunstudio		-fast -xtarget=barcelona

home	Intel Core 2 Quad Q6600 - 2.4 GHz	ifort	24.0s	-xT -fPIC -ipo -O3 -ansi-alias -no-prec-div
home	Intel Q6600	f90/sunstudio	25.7s	-fast -xtarget=barcelona
home	Intel Q6600	gfortran	31.5s	-fastmath -O3 -march=athlon64, etc.
home	Intel Q6600 4 procs/mpich	f90/sunstudio	7.7s	-fast -xtarget=barcelona

swordfish	AMD Phenom 9600 - 2.3 GHz	ifort	24.0s	-fast -fPIC
swordfish	AMD Phenom	f90/sunstudio	27.0s	-fast -xtarget=barcelona
swordfish	AMD Phenom 4 procs/mpich	f90/sunstudio	7.2s	-fast -xtarget=barcelona
swordfish	AMD Phenom 12 procs/mpich	f90/sunstudio	2.9s	-fast -xtarget=barcelona

aogreef*	Opteron 2210 1.8 GHz	ifort	35.1s	-fast -fPIC
aogreef	Opteron 2210	f90/sunstudio	43.0s	-fast -xtarget=barcelona
aogreef	Opteron 2210 4 procs/mpich	f90/sunstudio	12.8s	-fast -xtarget=barcelona
aogreef	Opteron 2210 36 procs/mpich	f90/sunstudio	3.6s	-fast -xtarget=barcelona

raspberry pi	ARMv6 700 MHz	gfortran	345s	-O3 -ffast-math

skipjack	intel i7-3820 1 proc/OMP	gfortran	12.0s	-mtune=corei7-avx -march=corei7-avx -fopenmp -Wall -ffast-math -O3 -fmax-stack-var-size=32000
	4 procs		3.5s
	40 procs (single CPU)		2.4s

Of all the compilers, sunstudio, freely available from http://developers.sun.com/sunstudio/

, is the one I prefer (easy install/maintenance!). Note that for this code, the binaries from the intel compiler (ifort) performed better than the sunstudio compiler. The intel compiler is available freely for academic/personal use from: http://www.intel.com/cd/software/products/asmo-na/eng/compilers/279636.htm

after a relatively painless registration. (Or more generally: http://www.intel.com/cd/software/products/asmo-na/eng/compilers/284264.htm

- for intel's compilers, C and fortran). The patches to get the executables to run on AMD cpus can be found here: http://www.swallowtail.org/naughty-intel.html#patches

*aogreef's cpu's have their powersaving mode turned on so that when idle, the cpu frequency is 1 GHz. Such cpu's take a moment to spin up for a process, as a result the benchmark numbers on this lightweight test are a slower than they should be.

Benchmarks (old version)

Compute time for eigenray with 4000 rays at 3 Mm range
The time for the test case included with the source code. The more recent code is some 3-4 times faster!

Machine	cpu Type	Compiler	Time	Options
aogreef	Opteron 2210, 1.8 GHz 36 procs/mpich	ifort	6s	-xW -ipo -O3 -ansi-alias
sardines	Opteron 265, 1.8 GHz 4 procs/mpich	ifort	31s	-xW -ipo -O3 -ansi-alias
sardines	Opteron 265, 1.8 GHz 4 procs/mpich	g95	35s
home	AMD 64, 4400 X2, 2.2 GHz 2 procs/mpich	ifort	47s	-xW -ipo -O3 -ansi-alias
home	AMD 64, 4400 X2, 2.2 GHz	ifort	93s	-xW -ipo -O3 -ansi-alias
home	AMD 64, 4400 X2, 2.2 GHz	gfortran	111s	-O3 -march=athlon64 -ffast-math -funroll-all-loops -ftree-vectorize -msse3
home	AMD 64, 4400 X2, 2.2 GHz	pgf77	112s	-tp=k8-64e -fastsse -pc 64 -Mipa=fast -O3
home	AMD 64, 4400 X2, 2.2 GHz	g95	113s
aogreef	Opteron 2210, 1.8 GHz (socket F)	ifort	114s	-xW -ipo -O3 -ansi-alias
Wellington	AMD 64, 3200+, 2.0 GHz	g95	124s
tarpon	3200 XP, 2.2 GHz	pgf90	126s
tarpon	3200 XP, 2.2 GHz	g95	137s
sardines	Opteron 265, 1.8 GHz (socket 939)	g95	137s
home	2500 XP oc'd to 2.0 GHz	g95	152s
aogreef	Opteron 2210, 1.8 GHz (socket F)	g95	165s
aogreef	Xeon, 2.2 GHz	pgf77	186 s
aogreef	Xeon, 2.2 GHz	g95	215s

Of all the compilers, g95, freely available from http://www.g95.org

, is the one I prefer (easy install/maintenance!). Note that for this code, on my specific platform, the binaries from the intel compiler (ifort) performed about 20% faster. This compiler is available freely for academic/personal use from: http://www.intel.com/cd/software/products/asmo-na/eng/compilers/279636.htm

after a relatively painless registration. (Or more generally: http://www.intel.com/cd/software/products/asmo-na/eng/compilers/284264.htm

- for intel's compilers, C and fortran). The patches to get the executables to run on AMD cpus can be found here: http://www.swallowtail.org/naughty-intel.html#patches

Caveats and Warnings

This code is designed for the sole purpose of obtaining rays and travel times for ocean acoustic tomography. It is designed to be as fast as possible in doing that. Hence, it has limitations for other applications. Use the proper tool for the job...

The code does not calculate an intensity or other parameters that may be of interest. The Bowlin Ray code makes an estimate of intensity, although the accuracy of such results from rays leaves much to be desired. For intensity, a full-wave calculation, such as from the Parabolic Equation (PE) approach, would be better suited.

For really accurate calculations, such as for researching the detailed effects of internal waves on acoustic propagation, the code is probably not suitable, though it could likely give a rough idea of the effects. The problem is that there is a slight numerical "jitter" in the results, so that, when examined under a microscope, the arrivals on the branches of a time front will not be continuous as angle is varied. The code is perfectly suitable for tomography purposes, but not so suitable for this kind of detailed study that demands extraordinary accuracy. For code that will work for those purposes, contact, e.g., Mike Wolfson or John Colosi.

The code is dimensioned for ordinary acoustic propagation in the ocean - with sound speed profiles every 100 km or so, say (200 profiles maximum), and less than 100 values of sound speed with depth. In addition, the code relies on sound speed tables that sample sound speed at 3-m intervals over the water column. So attempting to use profiles every kilometer horizontally and every meter vertically is likely to fail. Such situations will require a revision to the code and a recompile...and a revision requiring more than just a change in the dimensions of the arrays, generally.

TODO List

These are a list of improvements or changes to the code that would be nice to implement someday. If someone should resolve some of these, by all means send the code on to me for implementation!

Handle sound speed profiles to irregular maximum depths better.
Solve the "jitter" problem, or that the arrivals along the time front do not seem to be continuous, but have some kind of numerical jitter. Code with this problem solved may well run much slower, however.
A mex file written in fortran rather than C. The mixed C/fortran coding is just trouble. However, even matlab's test fortran mex files fail to work, so the issue here lies with matlab for now.
?