Reconfigurable and GPU Computing Laboratory

Accelerating Formal Verification of Cache Coherence Protocols Using FPGAs

Mary Ellen Tie, Miriam Leeser

Abstract

PHAST, a Pipelined Hardware Accelerated STate Checker, achieves a 50x speedup for explicit state model checking of Murφ models. PHAST is a reimplementation, to accommodate FPGA hardware, of the Murφ verifier developed at Stanford University. PHAST takes advantage of the flexible memory architecture and inherent concurrency provided by an FPGA to accelerate model checking. PHAST is currently implemented on an Alpha-Data board with a Xilinx Virtex 5 chip and 1 GB of DDR-II SDRAM connected to the host computer by a PCI-X bus. In designs such as PHAST, where the data is created and managed locally and the connection is not the bottleneck, FPGAs can yield very good acceleration. Using PHAST, we achieved a fifty times application speedup in actual running hardware compared to Murφ on an example of a counter. The same structure developed for this simple example can be reused for much more complex models. Our model of the DASH protocol is similar in size and complexity to models Intel uses to validate proposed features of future processors: state sizes between 1200 and 1800 bits and a transition relation with more than 100 rules. Preliminary analysis of the DASH model as verified by PHAST indicates that with a careful implementation, the speedup will stay constant. This paper focuses on the generic structure developed for a hardware implementation of model checking as well as the changes to Murφ to accommodate FPGA hardware. PHAST is the first complete implementation of model checking in FPGA hardware.

FPGAs

Typically used in prototyping
Reconfigurable Hardware blocks
Flexible memory access - single cycle access to block memory
Pipelined designs
Exploit parallelism in algorithm
Typical bottleneck occurs in moving data between FPGA and host

Target Hardware: FPGA Board from Alpha Data

Chip: Xilinx
4,608 Kb Block RAM
17,280 logic slices

Virtex 5 Board:
1 GB DDR-II RAM
133MHz PCI-X

Xilinx, 2008

Software Implementation - Murφ

Model

Phast's target models are of similar complexity to models of an Intel processor feature

The DASH multiprocessor and cache coherency protocol were developed to have a scalable shared-memory architecture. With current processors adding more and more cores on a single die, scalable memory architectures are of interest. The above figure shows a simple layout for the DASH multiprocessor.

Hardware Implementation - PHAST

PHAST is a pipelined hardware implementation of the Murφ verifier. States start in the Next State Generator, and stored in the Lookup Pending Queue while they are hashed and looked up in the hash table.

Blue modules are RAM
stored full and shifted states
Green modules are model specific
safety properties
state transitions
White modules are PHAST specific

3 possibilities on a hash table lookup:

the state already exists in the table - state is discarded
state does not exist in the table - new state, written into the unvisited queue
lookup collided with another state - state is written into two collision queues in original and shifted form

Hash Compaction

The Hash Compaction module was modified before it could be pipelined with the rest of PHAST.

New xor tree version of Hash Compaction

References

M. E. Fuess, M. Leeser, and T. Leonard, "An FPGA Implementation of Explicit State Model Checking". FCCM, 2008.

Future Work

Implement Symmetry Reduction
Get hardware results for the DASH model
Create a Murφ to PHAST generator

Acknowledgements

We would like to thank Tim Leonard and Mark Tuttle from Intel for their assistance. This project is also supported by gifts from Intel and Xilinx Corporations.

Go Back to Home Page