
The ECN No Name Newsletter is no longer being published. This is an archived issue.
If you compile and run your FORTRAN programs on a Sun 3 or a Sun 4, you can take advantage of IEEE's floating-point error-handling function, which can detect invalid operands, division by zero, overflow, and underflow. The error-handling routine is a C function that you call from your FORTRAN program. When an exception occurs, it prints an error code and aborts, producing a core dump. Both can prove handy in debugging (for more details, see ieee_handler(3) in the SunOS 4.0 Manual, Math Library Functions, Section 3). First, let's look at and compile a small FORTRAN program that has an overflow problem.
$ cat overflow.f
      PROGRAM OVERFLOW
      REAL A
      A = 2.0E+20
      A = A * A
      STOP
      END
$ f77 -g overflow.f
$ a.out
Warning: the following IEEE floating-
point arithmetic exceptions occurred
in this program and were never cleared:
Overflow;
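To see why this program overflows, it can help to reproduce the arithmetic outside FORTRAN. The sketch below is an illustration only, and it assumes that a FORTRAN REAL corresponds to an IEEE single-precision C float, whose largest finite value is FLT_MAX (about 3.4E+38). The function names square_single and square_overflows are made up for this example and are not part of any Sun library.

```c
#include <float.h>

/* Square x in single precision, mirroring A = A * A on a REAL.
 * Assumption: FORTRAN REAL == IEEE single precision == C float. */
float square_single(float x)
{
    /* volatile forces the product to be rounded to float,
     * even if the compiler computes it in a wider format. */
    volatile float result = x * x;
    return result;
}

/* Return nonzero when squaring a finite x exceeds the largest
 * finite float. Under default IEEE rounding an overflowed result
 * is infinity, which compares greater than FLT_MAX. */
int square_overflows(float x)
{
    return square_single(x) > FLT_MAX;
}
```

With these definitions, square_overflows(2.0e20f) is nonzero, because the exact product 4.0E+40 is far beyond the single-precision range, while square_overflows(2.0e10f) is zero.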
The warning that a.out prints is rather cryptic. For a large program, this poses a problem, since no line number is given. Even running the program under the debugger dbx at this point is not much help: execution completes without so much as a hint of any problem. To solve our dilemma, let's create a simple C source file containing two small functions:
$ cat setieee.c
#include <stdio.h>
fp_handler(sig, code)
int sig, code;
{
    printf("ieee exception code %x\n",code);
    abort();
}
setieee_()
{
    ieee_handler("set","common",fp_handler);
}
The first of these "homemade" functions, fp_handler(), prints a brief error message and aborts at the point of error, causing a core dump. The second, setieee_(), calls the library routine ieee_handler() to install fp_handler() as the handler for the "common" class of exceptions. This second function is the one that our FORTRAN program will call; f77 appends an underscore to external names, so the FORTRAN statement CALL SETIEEE reaches the C function setieee_(). Let's add this call to our original program and recompile the two files (don't forget the -g option for dbx):
$ cat overflow.f
      PROGRAM OVERFLOW
      REAL A
      CALL SETIEEE
      A = 2.0E+20
      A = A * A
      STOP
      END
$ f77 -g overflow.f setieee.c
$ a.out
ieee exception code d4
*** IOT Trap = signal 6 code 0
Abort (core dumped)
When we run a.out, our trap-handling function prints a message with the code for the kind of error we encountered. The code shown, d4, means "floating overflow", as can be seen from the table below. (This table can also be found in the file /usr/include/signal.h.)
code    exception description
c4      floating inexact result
c8      floating divide by zero
cc      floating underflow
d0      floating operand error
d4      floating overflow
d8      floating Not-A-Number
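If you would rather have the description than the raw code, the table above can be turned into a small lookup. This is only a sketch: the codes and descriptions are copied from the table, while the names fpe_entry, fpe_table, and fpe_describe are invented for this example and do not come from signal.h.

```c
#include <stddef.h>

/* One row of the exception-code table shown above. */
struct fpe_entry {
    int code;                  /* value passed to the handler */
    const char *description;   /* text from the table */
};

/* Codes and descriptions copied from the table above. */
static const struct fpe_entry fpe_table[] = {
    { 0xc4, "floating inexact result" },
    { 0xc8, "floating divide by zero" },
    { 0xcc, "floating underflow" },
    { 0xd0, "floating operand error" },
    { 0xd4, "floating overflow" },
    { 0xd8, "floating Not-A-Number" }
};

/* Return the description for a code, or NULL if the code
 * does not appear in the table. */
const char *fpe_describe(int code)
{
    size_t i;
    for (i = 0; i < sizeof fpe_table / sizeof fpe_table[0]; i++) {
        if (fpe_table[i].code == code)
            return fpe_table[i].description;
    }
    return NULL;
}
```

Note that fpe_describe(0xd4) and fpe_describe(212) return the same string. Our handler prints the code in hexadecimal (d4), whereas a debugger may show the same value in decimal (212).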
The real benefit of our homemade functions comes in when we use dbx. The abort in our fp_handler routine allows us to use dbx to track down not only the nature of the problem, but where it occurred as well:
$ dbx a.out
Reading symbolic information...
Read 195 symbols
(dbx) run
Running: a.out
ieee exception code d4
signal IOT (IOT instruction)
in kill at 0xf7728f70
0xf7728f70: bgeu 0xf7728f98
(dbx) where
kill() at 0xf7728f70
fp_handler(sig = 8,code = 212)
line 6 in "setieee.c"
_sigfpe_master() at 0xf7720560
_sigtramp() at 0xf7702e20
MAIN(0x239c, 0x0, 0x4, 0x4,
0x7, 0xffffef58)
line 3 in "overflow.f"
main() at 0x23f8
(dbx) quit
The dbx output above tells us not only that this is an overflow problem, but also where to look: the MAIN frame in the traceback reports line 3 of overflow.f. Now it should be relatively easy to fix our floating-point error.