SEU Mitigation for Arria 10 Devices

8
2014.08.18
A10-SEU
The single event upset (SEU) mitigation capabilities include fast error detection cyclic redundancy check
(EDCRC) and scrubbing for configuration RAM (CRAM), error correction code (ECC) for user memory,
usage of ECC in Altera's hard intellectual property (IP), and usage of ultra-low alpha materials for
packaging. This chapter covers the fast EDCRC and scrubbing capabilities.
Related Information
• Arria 10 Device Handbook: Known Issues
Lists the planned updates to the Arria 10 Device Handbook chapters.
• Embedded Memory Blocks in Arria 10 Devices
Provides more information about hard ECC for M20K memory blocks.
Error Detection Features
The hardened on-chip EDCRC circuitry allows you to perform the following operations without any
impact on the fitting or performance of the device:
• Auto-detection of cyclic redundancy check (CRC) errors during configuration.
• Optional detection and identification of soft errors (SEU and MBU) in user mode.
• Fast soft error detection. The error detection speed is improved.
• Two types of check bits:
  • Frame-based check bits: stored in CRAM and used to verify the integrity of the frame.
  • Column-based check bits: stored in registers and used to protect the integrity of all frames.
Configuration Error Detection
When the Quartus® II software generates the configuration bitstream, the software also computes a 32-bit
CRC value for each CRAM frame. A configuration bitstream can contain more than one CRC value
depending on the number of data frames in the bitstream. The length of the data frame varies for each
device.
As each data frame is loaded into the FPGA during configuration, the precomputed CRC value shifts into
the CRC circuitry. At the same time, the CRC engine in the FPGA computes the CRC value for the data
frame and compares it against the precomputed CRC value. If the two CRC values do not match, the
nSTATUS pin is driven low to indicate a configuration error.
© 2014 Altera Corporation. All rights reserved. ALTERA, ARRIA, CYCLONE, ENPIRION, MAX, MEGACORE, NIOS, QUARTUS and STRATIX words and logos are
trademarks of Altera Corporation and registered in the U.S. Patent and Trademark Office and in other countries. All other words and logos identified as
trademarks or service marks are the property of their respective holders as described at www.altera.com/common/legal.html. Altera warrants performance
of its semiconductor products to current specifications in accordance with Altera's standard warranty, but reserves the right to make changes to any
products and services at any time without notice. Altera assumes no responsibility or liability arising out of the application or use of any information,
product, or service described herein except as expressly agreed to in writing by Altera. Altera customers are advised to obtain the latest version of device
specifications before relying on any published information and before placing orders for products or services.
www.altera.com
101 Innovation Drive, San Jose, CA 95134
You can test the capability of this feature by modifying the configuration bitstream or intentionally
corrupting the bitstream during configuration.
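The per-frame check described in this section can be sketched in software as follows. This is only an illustration: the device's actual CRC polynomial and bitstream layout are not given in this chapter, so `zlib.crc32` is used as a stand-in 32-bit CRC, and the names `frame_crc`, `build_bitstream`, and `load_bitstream` are hypothetical.

```python
import zlib

def frame_crc(frame: bytes) -> int:
    """Stand-in 32-bit CRC (assumption: the real polynomial is not specified here)."""
    return zlib.crc32(frame) & 0xFFFFFFFF

def build_bitstream(frames):
    """Pair each data frame with a precomputed CRC, as the design software would."""
    return [(frame, frame_crc(frame)) for frame in frames]

def load_bitstream(bitstream):
    """Model configuration: recompute each frame's CRC and compare it.

    Returns (nstatus, bad_frame_index). nstatus is 0 at the first mismatch,
    mirroring nSTATUS being driven low, and 1 if every frame matches.
    """
    for index, (frame, precomputed) in enumerate(bitstream):
        if frame_crc(frame) != precomputed:
            return 0, index
    return 1, None

# Four toy 16-byte frames; real frame lengths vary by device.
frames = [bytes([i]) * 16 for i in range(4)]
good = build_bitstream(frames)

# Intentionally corrupt one bit of frame 2 while keeping its original CRC.
corrupted = list(good)
frame, crc = corrupted[2]
corrupted[2] = (bytes([frame[0] ^ 0x01]) + frame[1:], crc)

print(load_bitstream(good))
print(load_bitstream(corrupted))
```

Corrupting a single bit of a frame while leaving its CRC untouched models the bitstream modification test mentioned above.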
User Mode Error Detection
In user mode, the contents of the configured CRAM bits may be affected by soft errors. These soft errors,
which are caused by an ionizing particle, are not common in Altera® devices. However, high-reliability
applications that require the device to operate error-free may require that your designs account for these
errors.
During error detection in user mode, a number of EDCRC engines run in parallel for Arria® 10 devices.
The number of error detection CRC engines depends on the frame length, that is, the total number of bits in a frame.
Each column-based error detection CRC engine reads 128 bits from a frame and processes them within four
cycles. To detect errors, the error detection CRC engine needs to read back all frames.
Figure 8-1: Check Bits Calculation for Error Detection CRC
[Figure: every frame (Frame 0, Frame 1, Frame 2, through the Last Frame) is divided into 128-bit data words. The words are arranged in columns (Column 0, Column 1, through the Last Column), and each column has its own 32-bit column-based CRC.]
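The column arrangement in Figure 8-1 can be illustrated with a small software model. It is a sketch under stated assumptions: each frame is split into 128-bit words, word k of every frame belongs to column k, and `zlib.crc32` stands in for the device's unspecified 32-bit CRC. The function names are hypothetical.

```python
import zlib

WORD_BYTES = 16  # one 128-bit data word

def split_words(frame: bytes):
    """Split a frame into 128-bit words; the frame length must be a multiple of 128 bits."""
    if len(frame) % WORD_BYTES:
        raise ValueError("frame length must be a multiple of 128 bits")
    return [frame[i:i + WORD_BYTES] for i in range(0, len(frame), WORD_BYTES)]

def column_check_bits(frames):
    """Return one 32-bit CRC per column, accumulated over that column's word in every frame."""
    columns = len(split_words(frames[0]))
    crcs = [0] * columns
    for frame in frames:
        words = split_words(frame)
        if len(words) != columns:
            raise ValueError("all frames must have the same length")
        for col, word in enumerate(words):
            crcs[col] = zlib.crc32(word, crcs[col])  # running CRC for this column
    return crcs

def find_bad_columns(frames, reference):
    """Compare freshly computed column CRCs against stored reference check bits."""
    current = column_check_bits(frames)
    return [col for col, (a, b) in enumerate(zip(current, reference)) if a != b]

# Four toy frames of three 128-bit words each.
frames = [bytes([f * 3 + w]) * WORD_BYTES * 1 for f in range(4) for w in range(3)]
frames = [b"".join(frames[f * 3:(f + 1) * 3]) for f in range(4)]
reference = column_check_bits(frames)

# Flip one bit in frame 2, column 1, and read everything back again.
upset = bytearray(frames[2])
upset[WORD_BYTES + 5] ^= 0x10
upset_frames = frames[:2] + [bytes(upset)] + frames[3:]
print(find_bad_columns(upset_frames, reference))
```

Because each column CRC covers one word from every frame, a mismatch narrows the upset to a column but not to a frame, which is consistent with all frames having to be read back.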
EDCRC Check Bits Updates
Frame-based EDCRC is calculated on-chip during configuration. Column-based EDCRC is updated after
configuration.
When you enable the EDCRC feature, after the device enters user mode, the EDCRC function starts
reading CRAM frames. The data collected from the read-back frame is validated against the frame-based
CRC.
After the initial frame-based verification is completed, the column-based CRC engine calculates the column-based CRC check bits.
CRC_ERROR Pin Behavior
A rising edge of the CRC_ERROR signal indicates that a CRAM error has been detected. A falling edge of the
CRC_ERROR signal indicates that the error message register (EMR) contains valid information about the error
location and type.
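The edge semantics can be sketched as a small event detector over sampled pin levels. This is a behavioral illustration only; the function name `crc_error_events` and the event labels are hypothetical, and the pin is assumed to be low before the first sample.

```python
def crc_error_events(samples):
    """Translate sampled CRC_ERROR levels (0 or 1) into (sample_index, event) pairs.

    A rising edge maps to "error_detected"; a falling edge maps to "emr_valid",
    the point at which the EMR contents can be read.
    """
    events = []
    previous = 0  # assumption: the pin is low before sampling starts
    for index, level in enumerate(samples):
        if level and not previous:
            events.append((index, "error_detected"))
        elif previous and not level:
            events.append((index, "emr_valid"))
        previous = level
    return events

print(crc_error_events([0, 0, 1, 1, 0, 0, 1, 0]))
```

A monitor built this way would wait for the "emr_valid" event before reading the EMR, rather than reading as soon as the pin goes high.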
Retrieving Error Information
You can retrieve the EMR contents via the core interface or via the JTAG interface using the
SHIFT_EDERROR_REG JTAG instruction. Altera provides a soft EMR Unloader IP core that retrieves the EMR
contents via the core interface and allows them to be shared among several design components.
Error Correction
When internal scrubbing is enabled, Arria 10 devices use column-based CRC to detect errors. When
errors are found, frame-based CRC is used to locate and correct the errors.
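To illustrate how a frame CRC can locate and correct an error, the sketch below searches by brute force for the single bit whose flip makes the frame match its stored CRC again. This is not a description of the device's hardware search; `zlib.crc32` is a stand-in for the unspecified CRC, and `scrub_frame` is a hypothetical name.

```python
import zlib

def scrub_frame(frame: bytes, stored_crc: int):
    """Return (frame, status), where status is "clean", "corrected", or "uncorrectable".

    Only single-bit errors are corrected in this sketch.
    """
    if zlib.crc32(frame) == stored_crc:
        return frame, "clean"
    data = bytearray(frame)
    for bit in range(len(data) * 8):
        data[bit // 8] ^= 1 << (bit % 8)      # try flipping this bit
        if zlib.crc32(bytes(data)) == stored_crc:
            return bytes(data), "corrected"   # the flip restored the CRC
        data[bit // 8] ^= 1 << (bit % 8)      # undo and try the next bit
    return frame, "uncorrectable"

golden = bytes(range(32))          # toy 256-bit frame
stored = zlib.crc32(golden)

single = bytearray(golden)
single[9] ^= 0x20                  # one upset bit
double = bytearray(golden)
double[0] ^= 0x08                  # two upset bits, far apart
double[25] ^= 0x01

print(scrub_frame(golden, stored)[1])
print(scrub_frame(bytes(single), stored)[1])
print(scrub_frame(bytes(double), stored)[1])
```

The corrected frame returned for the single-bit case is what a scrubber would write back to memory.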
Recovering from CRC Errors
Arria 10 devices support both internal scrubbing and external scrubbing. The internal
scrubbing feature corrects CRAM upsets automatically when an upset is detected.
The system that hosts the Arria 10 device must control the device reconfiguration. If you do not use the
external scrubbing feature, you can reconfigure the Arria 10 device by driving the nCONFIG signal low.
When reconfiguration completes successfully, the Arria 10 device operates as intended.
Related Information
Configuration, Design Security, and Remote System Upgrades for Arria 10 Devices
Provides more information about configuration sequence.
Specifications
This section lists the error detection frequencies and CRC calculation time for error detection in user
mode.
Error Detection Frequency
You can control the speed of the error detection process by setting the division factor of the clock
frequency in the Quartus II software. The divisor is $2^n$, where $n$ can be any value listed in the following
table.
The speed of the error detection process for each data frame is determined by the following equation:
Figure 8-2: Error Detection Frequency Equation
$$\text{Error Detection Frequency} = \frac{\text{Internal Oscillator Frequency}}{2^n}$$
Table 8-1: Error Detection Frequency Range for Arria 10 Devices
The following table lists the frequencies and valid values of n.

| Internal Oscillator Frequency | Maximum Error Detection Frequency | Minimum Error Detection Frequency | n | Divisor Range |
|---|---|---|---|---|
| 50 – 100 MHz (1) | 100 MHz | 390 kHz | 0, 1, 2, 3, 4, 5, 6, 7, 8 | 1 – 256 |
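As a worked example of the frequency equation, the sketch below evaluates the error detection frequency for each valid n. The helper name `error_detection_frequency` is hypothetical, and the 100 MHz oscillator value is taken from the top of the range in Table 8-1, which is itself marked as pending characterization.

```python
VALID_N = range(0, 9)  # n = 0 through 8, giving divisors 1 through 256

def error_detection_frequency(oscillator_hz: float, n: int) -> float:
    """Internal oscillator frequency divided by 2**n."""
    if n not in VALID_N:
        raise ValueError("n must be an integer from 0 to 8")
    return oscillator_hz / (2 ** n)

for n in VALID_N:
    freq = error_detection_frequency(100e6, n)
    print(f"n={n}  divisor={2 ** n:3d}  frequency={freq / 1e3:10.3f} kHz")
```

With a 100 MHz oscillator, n = 0 gives 100 MHz and n = 8 gives 390.625 kHz, which matches the maximum and the rounded minimum listed in Table 8-1.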
CRC Calculation Time
The time taken by the error detection circuitry to calculate the CRC for each frame is determined by the
device in use and the frequency of the error detection clock.
Using Error Detection Features in User Mode
This section describes the pin, registers, and procedures for error detection in user mode. The error
detection features will be available only in a future Quartus II software release.
CRC_ERROR Pin
Table 8-3: Pin Description

| Pin Name | Pin Type | Description |
|---|---|---|
| CRC_ERROR | I/O, output or output open-drain | An active-high signal, when driven high indicates that an error is detected in the CRAM bits. This dedicated pin is only used when you enable error detection in user mode. Otherwise, the pin can be used as a user I/O pin. |
Error Detection Registers
This section describes the registers used in user mode.
Figure 8-3: Block Diagram for Error Detection in User Mode
The block diagram shows the registers and data flow in user mode.
[Figure: the readback bitstream feeds the CRC calculation block, which produces the syndrome used for error detection and for the CRC_ERROR output. A search engine derives a correction pattern that is written back to CRAM for correction. The error message register feeds the JTAG, user, and HPS update registers; each update register feeds its matching shift register, and the shift registers output to JTAG TDO, general routing, and the HPS, respectively.]
(1) Pending characterization.
Figure 8-4: Error Message Register Map
Fields are listed from MSB to LSB.

| Field | Width |
|---|---|
| Column-Based Syndrome | 32 bits |
| Frame Address | 16 bits |
| Column-Based Double Word | 2 bits |
| Column-Based Bit | 5 bits |
| Column-Based Type | 3 bits |
| Frame-Based Syndrome | 32 bits |
| Frame-Based Double Word | 10 bits |
| Frame-Based Bit | 5 bits |
| Frame-Based Type | 3 bits |
| VEH/HEC Consistency | 1 bit |
| Column Check Bits Update | 1 bit |
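A minimal sketch of unpacking an EMR value into its fields follows. It assumes the field order and widths read from Figure 8-4, MSB first, which sum to the 110-bit EMR width; the Python field names and the helpers `decode_emr` and `encode_emr` are hypothetical and exist only for illustration.

```python
# (name, width in bits), listed from MSB to LSB as read from Figure 8-4.
EMR_FIELDS = [
    ("column_based_syndrome", 32),
    ("frame_address", 16),
    ("column_based_double_word", 2),
    ("column_based_bit", 5),
    ("column_based_type", 3),
    ("frame_based_syndrome", 32),
    ("frame_based_double_word", 10),
    ("frame_based_bit", 5),
    ("frame_based_type", 3),
    ("veh_hec_consistency", 1),
    ("column_check_bits_update", 1),
]
EMR_WIDTH = sum(width for _, width in EMR_FIELDS)  # 110 bits

def decode_emr(value: int) -> dict:
    """Split a 110-bit EMR value (as an integer) into named fields."""
    if not 0 <= value < (1 << EMR_WIDTH):
        raise ValueError("EMR value does not fit in 110 bits")
    fields = {}
    remaining = EMR_WIDTH
    for name, width in EMR_FIELDS:
        remaining -= width  # bit position of this field's LSB
        fields[name] = (value >> remaining) & ((1 << width) - 1)
    return fields

def encode_emr(fields: dict) -> int:
    """Inverse of decode_emr; missing fields default to 0. Handy for test vectors."""
    value = 0
    for name, width in EMR_FIELDS:
        field = fields.get(name, 0)
        if not 0 <= field < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        value = (value << width) | field
    return value

# Round-trip a made-up example value.
raw = encode_emr({"frame_address": 0x1234, "frame_based_bit": 17, "frame_based_type": 0b001})
decoded = decode_emr(raw)
print(hex(decoded["frame_address"]), decoded["frame_based_bit"], decoded["frame_based_type"])
```

The order in which bits leave the user or JTAG shift register is not stated in this chapter, so any code that assembles `value` from a serial stream would need to confirm that order first.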
Table 8-4: Error Detection Registers

| Name | Width (Bits) | Description |
|---|---|---|
| Frame-based syndrome register | 32 | Contains the 32-bit CRC signature calculated for the current frame. If the CRC value is 0, the CRC_ERROR pin is driven low to indicate no error. Otherwise, the pin is pulled high. |
| Column-based syndrome register | 32 | Contains the 32-bit combined CRC signature for all columns of EDCRC. If the CRC value is 0, no CRC error is detected. |
| Error message register (EMR) | 110 | Contains error details for single-bit and double-adjacent errors. The error detection circuitry updates this register each time the circuitry detects an error. Figure 8-4 shows the fields in this register and Table 8-5 lists the possible error types. |
| User update register | 110 | This register is automatically updated with the contents of the EMR one clock cycle after the contents of this register are validated. The user update register includes a clock enable, which must be asserted before its contents are written to the user shift register. This requirement ensures that the user update register is not overwritten when its contents are being read by the user shift register. |
| User shift register | 110 | This register allows user logic to access the contents of the user update register via the core interface. |
| JTAG update register | 110 | This register is automatically updated with the contents of the EMR one clock cycle after the content of this register is validated. The JTAG update register includes a clock enable, which must be asserted before its contents are written to the JTAG shift register. This requirement ensures that the JTAG update register is not overwritten when its contents are being read by the JTAG shift register. |
| JTAG shift register | 110 | This register allows you to access the contents of the JTAG update register via the JTAG interface using the SHIFT_EDERROR_REG JTAG instruction. |
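Table 8-5: Error Type in EMR
The following table lists the possible error types reported in the error type field in the EMR.

Frame-based Type

| Bit 2 | Bit 1 | Bit 0 | Description |
|---|---|---|---|
| 0 | 0 | 0 | No error |
| 0 | 0 | 1 | Single-bit error |
| 0 | 1 | X | Double-adjacent error |
| 1 | 1 | 1 | Uncorrectable error |

Column-based Type

| Bit 2 | Bit 1 | Bit 0 | Description |
|---|---|---|---|
| 0 | 0 | 0 | No error |
| 0 | 0 | 1 | Single error |
| 0 | 1 | 0 | Double-adjacent error |
| 0 | 1 | 1 | Double-adjacent error |
| 1 | 0 | 0 | Double-adjacent error |
| 1 | 0 | 1 | Double-adjacent error |
| 1 | 1 | 0 | Double-adjacent error |
| 1 | 1 | 1 | Uncorrectable error |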
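The error type encodings in Table 8-5 can be turned into a small lookup, sketched below. It assumes the reading of the table in which column-based codes 010 through 110 all denote a double-adjacent error; frame-based codes that the table does not list are reported as "Undefined", which is a label chosen here and not a documented error type. The function names are hypothetical.

```python
def frame_based_type(code: int) -> str:
    """Decode the 3-bit frame-based error type field."""
    if not 0 <= code <= 0b111:
        raise ValueError("type field is 3 bits wide")
    if code == 0b000:
        return "No error"
    if code == 0b001:
        return "Single-bit error"
    if code in (0b010, 0b011):       # 01X in Table 8-5
        return "Double-adjacent error"
    if code == 0b111:
        return "Uncorrectable error"
    return "Undefined"               # 100, 101, 110 are not listed for frame-based types

def column_based_type(code: int) -> str:
    """Decode the 3-bit column-based error type field."""
    if not 0 <= code <= 0b111:
        raise ValueError("type field is 3 bits wide")
    if code == 0b000:
        return "No error"
    if code == 0b001:
        return "Single error"
    if code == 0b111:
        return "Uncorrectable error"
    return "Double-adjacent error"   # assumed for codes 010 through 110

for code in range(8):
    print(f"{code:03b}  frame: {frame_based_type(code):22s} column: {column_based_type(code)}")
```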
Document Revision History

Date: August 2014
Version: 2014.08.18
Changes:
• Updated the Error Detection Features section.
• Updated the Configuration Error Detection section to revise the CRC value.
• Updated the User Mode Error Detection section to add the check bits calculation for error detection CRC.
• Updated the CRC_ERROR Pin Behavior section.
• Updated the Retrieving Error Information section.
• Updated the CRC_ERROR Pin section to update the pin description.
• Updated Table 8-4 to update the description of the frame-based syndrome register, user update register, and user shift register.
• Updated Table 8-5 to update the error types naming to frame-based and column-based types.

Date: December 2013
Version: 2013.12.02
Changes:
Initial release.