The 2005 EDA Branding Study shows that
71% of FPGA projects have difficulty
meeting their timing budgets. Several
strategies exist to help you meet your timing
goals, such as HDL code changes and
synthesis and implementation tools settings.
In this article, we'll describe the
Xplorer implementation tools strategy to
maximize design performance, whether
you are evaluating the best achievable performance
for a specified clock domain or
attempting to meet timing requirements
for designs with user constraints.
Implementation with Xplorer
Xplorer is a Perl script that seeks the best
design performance using Xilinx® ISE
software. After synthesis generates an EDIF
(*.edf) file, the design is ready for implementation.
During this phase, you could
use Project Navigator in ISE software to
apply design constraints and explore different
tools settings for best performance.
An alternative approach may be to use
Xplorer. Xplorer is designed to help achieve
optimal results by employing smart constraining
techniques and various physical
optimization strategies. Because no unique
set of ISE options or timing constraints
works best on all designs, Xplorer finds the
right set of tools options to either meet
design constraints or find the best performance
for the design. Hence, Xplorer has two
modes of operation: best performance mode
and timing closure mode.
Best Performance Mode
In this mode of operation, Xplorer optimizes
design performance for a user-specified clock
domain, allowing easy evaluation of the maximum
achievable performance. You specify
the design name and a single clock to optimize.
Xplorer implements the design with
different architecture-specific optimization
strategies in conjunction with timing-driven
place and route (PAR). It tightens or relaxes
the timing constraints depending on
whether or not the frequency goal is
achieved, as shown in Figure 1. Xplorer estimates
the starting frequency based on pre-PAR timing data. Adjusting timing
constraints such that PAR is neither under- nor
over-constrained enables Xplorer to deliver optimal design performance.
In addition to timing constraints,
Xplorer also uses physical optimization
strategies such as global optimization and
timing-driven packing and placement.
Global optimization performs pre-placement
netlist optimizations on the critical
region, while timing-driven packing and
placement provides closed-loop packing
and placement such that the placer can recommend
logic packing techniques that
deliver optimal placement. If the design has
a user constraint file (UCF), Xplorer optimizes
for the user constraints in addition to
the specified clock domain.
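The tighten-or-relax behavior described above can be pictured with a minimal Python sketch. This is not Xplorer's actual algorithm: the function names, the 5% step size, and the toy stand-in for a PAR run (which simply pretends the design tops out at a fixed frequency) are all assumptions made for illustration.

```python
def explore_frequency(run_par, start_mhz, step=0.05, max_iters=10):
    """Tighten the target after a met run, relax it after a missed run.

    run_par(target_mhz) is a hypothetical callback that returns the
    frequency (MHz) actually achieved for that constraint.
    Returns the best achieved frequency seen across all iterations.
    """
    target = start_mhz
    best = 0.0
    for _ in range(max_iters):
        achieved = run_par(target)
        best = max(best, achieved)
        if achieved >= target:
            target *= 1.0 + step  # goal met: tighten the constraint
        else:
            target *= 1.0 - step  # goal missed: relax the constraint
    return best


def toy_par(target_mhz, ceiling_mhz=200.0):
    """Toy stand-in for a PAR run (an assumption, not real tool behavior)."""
    if target_mhz <= ceiling_mhz:
        return target_mhz  # constraint is achievable, so it is met
    return ceiling_mhz * 0.95  # over-constrained runs end up slightly worse


best = explore_frequency(toy_par, start_mhz=150.0)
print(f"best achieved: {best:.1f} MHz")
```

The starting frequency plays the role of the pre-PAR estimate; in this toy model the loop climbs toward the ceiling, overshoots, and backs off, which mirrors the idea of keeping PAR neither under- nor over-constrained.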
Timing Closure Mode
If you have a design with timing constraints
and your intent is for the tools to meet the
specified constraints, use the timing closure
mode. In this mode, you should not specify
a clock using the -clk <clock name> switch.
Xplorer looks at the UCF to determine the
timing constraint goals. Using these constraints
together with optimization strategies
such as global optimization,
timing-driven packing and placement, register
duplication, and cost tables, Xplorer
implements the design in multiple ways to
deliver optimal design performance.
Because Xplorer runs approximately 10
iterations, you will experience longer PAR
runtimes. However, Xplorer is typically
run only once during a design cycle.
After an Xplorer run, you can
capture the set of options that gave the
best result from the xplorer.rpt file and reuse
that set of options for future design runs.
Typically, designers will run the tools
many times in a design cycle, so a longer
initial runtime will likely reduce the number
of PAR iterations later.
All Xilinx FPGA architectures are supported
by Xplorer; optimizations are performed
based on architecture features.
Using Xplorer
Xplorer is run from the command prompt
by typing:
xplorer <design name> [-clk <clkname>]
[-p <partname>] [-uc <ucf file>]
<design name>: Name of the top-level
EDIF/NGC file.
-clk <clkname>: Name of the clock to be
optimized. If the -clk option is omitted,
the script uses the timespecs defined in the
UCF file.
-p <partname>: The device name (for
example, XC4VLX100-11FF1152). The
default value is the part specified in the
input design.
-uc <ucf file>: The UCF file name. The
default value is <design name>.ucf.
Here is an example command for Best
Performance Mode:
xplorer cordic -clk clk -p XC4VLX15-12FF668
Here is an example command for
Timing Closure Mode:
xplorer <design name> -uc <ucf file> -p
<partname>
Results, with the tools settings used, are
summarized in xplorer.rpt. The best run is
identified at the end of the report file.
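If you launch Xplorer from a script rather than by hand, the two modes differ only in which switches you pass. The following Python sketch assembles the argument list using only the switches listed above; the helper name build_xplorer_command and the file name cordic.ucf are hypothetical, and the sketch assumes xplorer is on your PATH when you actually run it.

```python
def build_xplorer_command(design, clk=None, ucf=None, part=None):
    """Return an argument list for an xplorer run.

    Pass clk for Best Performance Mode; leave it out (and rely on the
    UCF timespecs) for Timing Closure Mode.
    """
    cmd = ["xplorer", design]
    if clk is not None:
        cmd += ["-clk", clk]
    if ucf is not None:
        cmd += ["-uc", ucf]
    if part is not None:
        cmd += ["-p", part]
    return cmd


# Best Performance Mode: optimize the single clock named "clk".
best_perf = build_xplorer_command("cordic", clk="clk", part="XC4VLX15-12FF668")

# Timing Closure Mode: no -clk; constraints come from the UCF
# (cordic.ucf is an illustrative file name).
closure = build_xplorer_command("cordic", ucf="cordic.ucf", part="XC4VLX15-12FF668")

print(" ".join(best_perf))
print(" ".join(closure))
```

Either list can be handed to Python's subprocess.run to start the run.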
Performance Improvement Results
To highlight the performance impact of
these optimization strategies, we compared
baseline results (attainable using
tightly constrained, high-effort timing-driven
PAR) with Xplorer. Figure 2 shows
the two flows.
For a suite of more than 75 high-density,
high-performance Virtex-4 customer designs,
Xplorer provides up to 70%, and on average
10%, performance improvement,
as shown in Figure 3. The
designs range in density from LX15 to
LX200, covering (but not limited to) market
segments such as consumer, video, storage,
telecom/datacom, DSP, and glue logic.
In addition, performance improvements
for eight OpenCores designs for
Spartan-3 FPGAs are shown in Figure 4.
These OpenCores designs are written in
synthesizable RTL and synthesized to the
target technology without any code modifications.
The designs can be downloaded
from www.opencores.org. For the eight
OpenCores designs, the average performance
improvement using Xplorer is 10%,
with the AES design realizing a 38% performance
improvement.
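To make the percentages concrete, here is a small Python sketch of how such a figure could be computed. It assumes that improvement is measured as the relative gain in achieved clock frequency over the baseline flow, which the article does not state explicitly; the design names and frequencies are hypothetical and are not taken from the study.

```python
def improvement_percent(baseline_mhz, xplorer_mhz):
    """Relative frequency gain of the Xplorer run over the baseline, in percent."""
    return (xplorer_mhz - baseline_mhz) / baseline_mhz * 100.0


# Hypothetical (baseline MHz, Xplorer MHz) pairs, for illustration only.
runs = {
    "design_a": (100.0, 110.0),
    "design_b": (80.0, 100.0),
}

gains = {name: improvement_percent(b, x) for name, (b, x) in runs.items()}
average = sum(gains.values()) / len(gains)

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}% improvement")
print(f"average: {average:.1f}%")
```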
How to Achieve
Additional Performance Gains
At times, the Xplorer implementation strategy
still might not be enough to meet your
target timing goals. In these cases, adopt
synthesis and RTL coding strategies geared
towards performance.
Conclusion
Xplorer helps you optimize logic performance
and meet your timing goals by using
smart constraining techniques and
employing the right set of implementation
tools strategies. In the design suites described
here, Xplorer provided an average performance
improvement of 10%.
To view advanced options and to download
Xplorer, visit www.xilinx.com/xplorer.
Printable PDF version of this article with graphics. (12/1/05) 305 KB