 
This page is a reproduction of the project webpage

http://www.ccs.neu.edu/home/wahl/Research/fpa-heterogeneous.html

maintained by Thomas Wahl, College of Computer Science, Northeastern University. The page below may therefore be slightly out of date.
Ensuring Reliability and Portability of Scientific Software for Heterogeneous Architectures
 
Floating-point arithmetic is used in scientific software to perform calculations with (approximations of) real numbers. Despite successful efforts to standardize floating-point arithmetic, reflected in the universally accepted IEEE 754 floating-point standard (the "Standard"), the results of floating-point calculations are generally not portable across computer architectures and can in fact differ vastly.
There are a number of reasons for this phenomenon. One is the difference in floating-point hardware available on different architectures. For instance, the presence or absence of a fused multiply-add (FMA) unit significantly impacts the precision of evaluating expressions of the form a × b + c: an FMA computes the product and sum with a single rounding, whereas separate multiply and add instructions round twice. Both evaluation strategies for compound expressions are sanctioned by the Standard, as evaluation rules for expressions are mostly left to the programming language (unlike the results of basic arithmetic operations such as a + b, which the Standard fixes exactly).
Another reason for the differences in floating-point results is especially relevant for parallel architectures: for efficiency, complex expressions such as sums of many operands are split by the compiler into sub-expressions to be computed by individual threads; the partial results are combined in the end to obtain the final sum, e.g. as (a + b) + (c + d) for a sum of four arguments.
Unfortunately, because floating-point arithmetic loses precision compared to real arithmetic, many common laws of arithmetic known from high school no longer hold, in particular the associativity of addition. Different associations of the summands in a long addition will therefore typically yield different results.
Most programmers are unaware of these vagaries of floating-point arithmetic. As a result, parallel scientific programs are susceptible to reliability and portability issues ranging from small deviations in precision to changes in program control flow when moving from one architecture to another. This threat of non-portability stands in contrast to the promise of "write-once, run-anywhere" functionality made by parallel programming standards such as OpenCL.
This research aims at tools and techniques that help programmers find issues with the use of floating-point arithmetic in parallel scientific code, specifically code written in OpenCL. The goal is to detect potential sources of reliability and portability deficiencies in such code that are due to dependencies of the floating-point behavior on the underlying (IEEE-compliant) architecture. This will have important implications for the reliability of scientific programs such as those used in biomedical imaging, climate modeling, and vehicle design.
People:
Publications:
- Miriam Leeser, Jaideep Ramachandran, Thomas Wahl, and Devon Yablonski. OpenCL floating point software on heterogeneous architectures — portable or not? In Workshop on Numerical Software Verification (NSV), 2012. [pdf]
Sponsorship:
National Science Foundation, under award number CCF-1218075.
External links (follow at your own risk!):