ADSP-21467/ADSP-21469

As shown in Figure 1 on Page 1, the processor uses two computational units to deliver a significant performance increase over the previous SHARC processors on a range of DSP algorithms. With its SIMD computational hardware, the processors can perform 2.7 GFLOPS running at 450 MHz and 2.4 GFLOPS running at 400 MHz.

FAMILY CORE ARCHITECTURE

The processors are code compatible at the assembly level with the ADSP-2137x, ADSP-2136x, ADSP-2126x, ADSP-21160, and ADSP-21161, and with the first generation ADSP-2106x SHARC processors. The ADSP-21467/ADSP-21469 processors share architectural features with the ADSP-2126x, ADSP-2136x, ADSP-2137x, and ADSP-2116x SIMD SHARC processors, as shown in Figure 2 and detailed in the following sections.

SIMD Computational Engine

The processor contains two computational processing elements that operate as a single-instruction, multiple-data (SIMD) engine. The processing elements are referred to as PEX and PEY, and each contains an ALU, multiplier, shifter, and register file. PEX is always active, and PEY may be enabled by setting the PEYEN mode bit in the MODE1 register. When this mode is enabled, the same instruction is executed in both processing elements, but each processing element operates on different data. This architecture is efficient at executing math-intensive DSP algorithms.

Entering SIMD mode also affects the way data is transferred between memory and the processing elements. In SIMD mode, twice the data bandwidth is required to sustain computational operation in the processing elements. Because of this requirement, entering SIMD mode also doubles the bandwidth between memory and the processing elements. When using the DAGs to transfer data in SIMD mode, two data values are transferred with each access of memory or the register file.

Independent, Parallel Computation Units

Within each processing element is a set of computational units. The computational units consist of an arithmetic/logic unit (ALU), multiplier, and shifter. These units perform all operations in a single cycle. The three units within each processing element are arranged in parallel, maximizing computational throughput. Single multifunction instructions execute parallel ALU and multiplier operations. In SIMD mode, the parallel ALU and multiplier operations occur in both processing elements. These computation units support IEEE 32-bit single-precision floating-point, 40-bit extended precision floating-point, and 32-bit fixed-point data formats.

Timer

The core timer can generate periodic software interrupts. It can be configured to use FLAG3 as a timer expired signal.

Data Register File

A general-purpose data register file is contained in each processing element. The register files transfer data between the computation units and the data buses, and store intermediate results. These 10-port, 32-register (16 primary, 16 secondary) register files, combined with the processor's enhanced Harvard architecture, allow unconstrained data flow between computation units and internal memory. The registers in PEX are referred to as R0-R15 and in PEY as S0-S15.

Context Switch

Many of the processor's registers have secondary registers that can be activated during interrupt servicing for a fast context switch. The data registers in the register file, the DAG registers, and the multiplier result registers all have secondary registers. The primary registers are active at reset, while the secondary registers are activated by control bits in a mode control register.

Universal Registers

These registers can be used for general-purpose tasks. The four USTAT registers allow easy bit manipulations (Set, Clear, Toggle, Test, XOR) for all system registers (control/status) of the core.

The data bus exchange register (PX) permits data to be passed between the 64-bit PM data bus and the 64-bit DM data bus, or between the 40-bit register file and the PM/DM data buses. These registers contain hardware to handle the data width difference.

Single-Cycle Fetch of Instruction and Four Operands

The processors feature an enhanced Harvard Architecture in which the data memory (DM) bus transfers data and the program memory (PM) bus transfers both instructions and data (see Figure 2). With its separate program and data memory buses and on-chip instruction cache, the processor can simultaneously fetch four operands (two over each data bus) and one instruction (from the cache), all in a single cycle.

Instruction Cache

The processors contain an on-chip instruction cache that enables three-bus operation for fetching an instruction and four data values. The cache is selective: only the instructions whose fetches conflict with PM bus data accesses are cached. This cache allows full speed execution of core, looped operations such as digital filter multiply-accumulates and FFT butterfly processing.

Data Address Generators With Zero-Overhead Hardware Circular Buffer Support

The two data address generators (DAGs) are used for indirect addressing and implementing circular data buffers in hardware. Circular buffers allow efficient programming of delay lines and other data structures required in digital signal processing, and are commonly used in digital filters and Fourier transforms. The two DAGs of the processors contain sufficient registers to allow the creation of up to 32 circular buffers (16 primary register sets, 16 secondary). The DAGs automatically handle address pointer wraparound, reduce overhead, increase performance, and simplify implementation. Circular buffers can start and end at any memory location.

Rev.
B | Page 4 of 76 | March 2013
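
The pointer wraparound that the DAGs perform for circular buffers can be illustrated with a small software model. The following C sketch is not processor code and is not taken from this data sheet: the `circ_dag` structure, `circ_post_modify`, and `fir_step` are hypothetical names introduced only for illustration, and the model assumes a post-modify access whose step is smaller in magnitude than the buffer length. It shows a delay line for a digital filter that starts at an arbitrary location in a larger memory array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical software model of one circular-buffer register set.
 * Field names are illustrative, not the hardware register names. */
typedef struct {
    size_t base;    /* first location of the buffer (may be anywhere) */
    size_t length;  /* number of locations in the buffer */
    size_t index;   /* current pointer, always in [base, base + length) */
} circ_dag;

/* Post-modify access: return the current address, then advance the
 * pointer by `modify`, wrapping it back into the buffer if needed.
 * Assumption: |modify| < length, so one correction step is enough. */
static size_t circ_post_modify(circ_dag *d, long modify)
{
    assert(d->length > 0);
    assert(modify < (long)d->length && -modify < (long)d->length);

    size_t current = d->index;
    long lo = (long)d->base;
    long hi = lo + (long)d->length;
    long next = (long)d->index + modify;

    if (next >= hi) {
        next -= (long)d->length;   /* ran off the end: wrap to the start */
    } else if (next < lo) {
        next += (long)d->length;   /* ran off the start: wrap to the end */
    }
    d->index = (size_t)next;
    return current;
}

/* One FIR filter step using the circular buffer as the delay line.
 * `mem` models memory; `coeffs` holds d->length coefficients. */
static float fir_step(float *mem, circ_dag *d, const float *coeffs, float sample)
{
    mem[d->index] = sample;                       /* overwrite oldest sample */
    float acc = 0.0f;
    for (size_t k = 0; k < d->length; ++k) {
        size_t addr = circ_post_modify(d, -1);    /* newest to oldest */
        acc += coeffs[k] * mem[addr];
    }
    circ_post_modify(d, 1);                       /* next write position */
    return acc;
}
```

In this model every access pays for the comparison and correction in software. The point of the hardware support described above is that the DAGs apply the equivalent wraparound automatically as part of the access.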