DSP libraries for Cortex M3 and other ARM processors
We have developed a fast DSP library for the Cortex M3.
For an evaluation version and commercial license details, please contact us at
imellen@embeddedsignals.com
Quick summary:
Four groups of functions:
- Windowing functions
- Fast Fourier Transform
- Complex magnitude (absolute value of complex frequency bins)
- Miscellaneous functions: logarithm(x), exp(x), pseudorandom generator
Three library versions
Windowing functions (e.g. Hamming window)
- Windowing is a very common step before FFT calculation
- Performs speed optimized windowing of the input signal before the FFT
- The 16 to 32 bit version performs proper scaling of a 16 bit signal for the 32 bit FFT
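The windowing step above can be sketched in a few lines of fixed point arithmetic. This is only an illustrative reference model, not the library's assembly code: the function names, the Q15 coefficient format, and the left shift used for the 16 to 32 bit case are assumptions made for this example.

```python
import math

def hamming_q15(n):
    """Hamming window coefficients quantized to Q15 (hypothetical helper)."""
    return [round((0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1))) * 32767)
            for i in range(n)]

def window_16_to_16(samples, coeffs):
    """16 bit in, 16 bit out: Q15 multiply with rounding."""
    return [(x * w + (1 << 14)) >> 15 for x, w in zip(samples, coeffs)]

def window_16_to_32(samples, coeffs):
    """16 bit in, 32 bit out: keep the full product and shift it left by one,
    so a full-scale 16 bit input maps close to full-scale 32 bit (assumed scaling)."""
    return [(x * w) << 1 for x, w in zip(samples, coeffs)]
```

With this assumed scaling the 32 bit output keeps every bit of the product, so no precision is lost before a 32 bit FFT.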
FFT functions
- Complex and real FFT, 16 and 32 bit FFT versions
- Radix 4/2 FFT, sizes 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048 and 4096
- Inverse FFT available
- Real FFT enables much more efficient processing of real signals
- 16 bit FFT precision is comparable with other fixed point implementations; precision is determined by the necessary scaling by 0.5 in every FFT stage
- 32 bit FFT increases dynamic range by 90 dB but needs an extra 20% to 50% cycles
- Coefficients are located in Flash. Placing them in RAM gives a faster FFT at higher Flash latencies.
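The effect of scaling by 0.5 in every stage can be illustrated with a small integer model. The sketch below is a plain radix 2 decimation-in-time FFT (the library itself is radix 4/2 assembly), with Q15 twiddle factors and a right shift by one after each butterfly; the function names are hypothetical. With log2(N) stages the result is FFT(x)/N, and each shift discards one low-order bit, which is why the scaling determines the achievable precision.

```python
import math

def bit_reverse(values):
    """Return a copy of `values` in bit-reversed index order (length 2**m, m >= 1)."""
    bits = len(values).bit_length() - 1
    out = [0] * len(values)
    for i, v in enumerate(values):
        out[int(format(i, f"0{bits}b")[::-1], 2)] = v
    return out

def fft_q15_scaled(re, im):
    """Integer radix 2 FFT that halves every butterfly output; returns FFT(x)/N."""
    n = len(re)
    re, im = bit_reverse(re), bit_reverse(im)
    half = 1
    while half < n:
        for start in range(0, n, 2 * half):
            for j in range(half):
                angle = -math.pi * j / half          # twiddle W = exp(-2*pi*i*j/(2*half))
                wr = round(math.cos(angle) * 32767)  # Q15 twiddle, real part
                wi = round(math.sin(angle) * 32767)  # Q15 twiddle, imaginary part
                a, b = start + j, start + j + half
                tr = (re[b] * wr - im[b] * wi) >> 15
                ti = (re[b] * wi + im[b] * wr) >> 15
                # Scale by 0.5 in this stage so the data does not grow.
                re[a], re[b] = (re[a] + tr) >> 1, (re[a] - tr) >> 1
                im[a], im[b] = (im[a] + ti) >> 1, (im[a] - ti) >> 1
        half *= 2
    return re, im
```

For a 16 point cosine of amplitude 10000 at bin 3, this model returns roughly 5000 in bins 3 and 13, within a few least significant bits of the floating point result.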
Magnitude functions
- Calculate complex frequency magnitude mag = sqrt(re^2 + im^2)
- Based on a custom 32 bit square root algorithm (7/13 cycles)
- Multiple versions with different speed / precision tradeoffs for 64 bit sqrt
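As a reference for what the magnitude functions compute, here is a minimal sketch using the textbook bit-by-bit integer square root. It is not the custom algorithm mentioned above, and the names isqrt_u32 and magnitude are made up for this example. For 16 bit re and im the sum of squares is at most 2^31, so it fits an unsigned 32 bit value.

```python
def isqrt_u32(v):
    """Floor of the square root of an unsigned 32 bit value (digit-by-digit method)."""
    root = 0
    bit = 1 << 30              # highest power of four that fits in 32 bits
    while bit > v:
        bit >>= 2
    while bit:
        if v >= root + bit:
            v -= root + bit
            root = (root >> 1) + bit
        else:
            root >>= 1
        bit >>= 2
    return root

def magnitude(re, im):
    """mag = floor(sqrt(re^2 + im^2)) for 16 bit signed re and im."""
    return isqrt_u32(re * re + im * im)
```

A model like this is handy as a bit-exact baseline when checking a faster, approximate implementation.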
Logarithm and exponent functions
- Calculate log2(x) and exp2(x) = 2^x
- log2 input, exp2 output: 16q16 unsigned, 1/65536 to 65535+65535/65536
- log2 output, exp2 input: 5q27 signed, -15.99999 to 15.99999
- Speed: 11/10 cycles; precision 0.4 ppm / 3 ppm for log2 / exp2
- Single multiply conversion to log10(x), ln(x), 10^x, e^x and generic base log, exp
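The number formats and the single multiply base conversion can be shown with a small floating point reference model. It only mirrors the 16q16 and 5q27 formats listed above; the helper names and the Q31 constant are assumptions for illustration, and math.log2 stands in for the library's own algorithm.

```python
import math

def to_16q16(x):
    """Encode a positive real number as unsigned 16q16 (16 integer, 16 fraction bits)."""
    return round(x * 65536)

def log2_5q27(x_16q16):
    """Reference log2: 16q16 unsigned input, 5q27 signed output."""
    return round(math.log2(x_16q16 / 65536) * (1 << 27))

def exp2_16q16(y_5q27):
    """Reference exp2: 5q27 signed input, 16q16 unsigned output."""
    return round(2.0 ** (y_5q27 / (1 << 27)) * 65536)

# log10(x) = log2(x) * log10(2): one multiply by a Q31 constant.
LOG10_OF_2_Q31 = round(math.log10(2) * (1 << 31))

def log10_from_log2(y_5q27):
    """Convert a 5q27 log2 result to a 5q27 log10 result with a single multiply."""
    return (y_5q27 * LOG10_OF_2_Q31) >> 31
```

The same pattern covers the other bases: multiply the log2 result by ln(2) for ln(x), or multiply the input by log2(10) before exp2 to get 10^x.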
Parallel MLS pseudorandom generator for ARM CPU
- Maximum Length Sequence generated by Linear Feedback Shift Registers
- Period 2^31-1 to 2^64-1 words (1 to 64 bits wide)
- 1 to 64 bits generated in parallel
- An order of magnitude faster than a bit based approach, 3-10 cycles per whole word
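One way to generate many LFSR bits per step is to pick a feedback trinomial whose recurrence a[t] = a[t-n] XOR a[t-k] lets up to k new bits be computed with a single shift and XOR. The sketch below illustrates that idea only; it is not taken from the MLS_Rnd_Arm document, the function name is hypothetical, and the defaults n=31, k=28 assume the well-known primitive trinomial x^31 + x^3 + 1. This simple form yields at most k bits per step, fewer than the 64 the library supports.

```python
def mls_words(state, count, width, n=31, k=28):
    """Return `count` words of `width` bits from the recurrence a[t] = a[t-n] ^ a[t-k].

    `state` holds the last n bits, oldest bit in the least significant position.
    Bit i of each returned word is the i-th new bit of the sequence.
    """
    mask_n = (1 << n) - 1
    state &= mask_n
    if not 1 <= width <= k:
        raise ValueError("width must be between 1 and k")
    if state == 0:
        raise ValueError("state must be nonzero")
    mask_w = (1 << width) - 1
    words = []
    for _ in range(count):
        # All `width` new bits at once: bit i = state bit i XOR state bit (n - k + i).
        new = (state ^ (state >> (n - k))) & mask_w
        words.append(new)
        # Append the new bits above the history, then drop the oldest `width` bits.
        state = ((state | (new << n)) >> width) & mask_n
    return words
```

Calling mls_words(1, 4, 28) produces four 28 bit words; with width=1 the same function acts as the bit-serial reference that the parallel output can be checked against.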
General information
- Libraries are free for personal use and thoroughly tested by the large developer community
- Successfully deployed in many commercial products
- Libraries passed an extensive validation and verification process and were compared against a baseline floating point implementation in Matlab
- Guaranteed to be overflow safe for valid input data
- Written in hand optimized assembly; the speed gain is based on deep knowledge of ARM processor functionality
- Always tested on real hardware
- Focused on the Cortex M3 core; some libraries have been ported to the ARM 9E core on customer request
- Reasonable compromise between execution speed and code size; can be tailored to customer requests
- Not restricted to a single processor manufacturer, as is the case with manufacturers' libraries
- Currently the fastest FFT / SQRT implementation on Cortex M3 (as of April 2010)
Examples of FFT library customization to match customer needs
- different input / output scaling (e.g. full scale input 32 bit real FFT)
- generate second half of the real FFT (omitted in the standard version due to symmetry)
- calculate 2 real FFTs simultaneously using 1 complex FFT
- calculate only a subset of output frequency bins
- different precision than 16 or 32 bits, for example 20 bit data / 12 bit coefficients
- custom input/output formatting (interleaving, scaling, normal/bit reversed order)
- coefficient location (Flash or RAM)
- speed optimization for higher Flash latency
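Two of the customizations listed above rest on the conjugate symmetry of a real signal's spectrum, X[N-k] = conj(X[k]). The floating point sketch below shows the standard trick of packing two real signals x and y into one complex input z = x + j*y and separating the spectra afterwards. The small recursive fft helper and the function names are written for this example and say nothing about how the library implements it.

```python
import cmath

def fft(x):
    """Plain recursive radix 2 FFT, length must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def two_real_ffts(x, y):
    """Spectra of two real signals from a single complex FFT of x + j*y."""
    n = len(x)
    z = fft([complex(a, b) for a, b in zip(x, y)])
    fx, fy = [], []
    for k in range(n):
        zc = z[-k % n].conjugate()      # conj(Z[N-k]), with index 0 mapping to itself
        fx.append((z[k] + zc) / 2)      # X[k] = (Z[k] + conj(Z[N-k])) / 2
        fy.append((z[k] - zc) / 2j)     # Y[k] = (Z[k] - conj(Z[N-k])) / (2j)
    return fx, fy
```

Because fx[N-k] equals the conjugate of fx[k], only the first half of each spectrum needs to be stored, and the second half can be regenerated on request.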
Downloads
FFTlibrary2bench.pdf - Benchmark document with function list
FFTCM3.s - 16 bit complex FFT, 16, 64, 256, 1024, 4096 points (Crossworks gcc)
FFTr2CM3.s - 16 bit complex FFT, 32, 128, 512, 2048 points (Crossworks gcc)
FFT128real32.zip - 32 bit real FFT, 128 points + windowing + magnitude (IAR, Keil, gcc)
FFT4096Complex32b_ARM9E.s - 32 bit complex FFT for ARM 9E; sizes 16, 64, 256, 1024, 4096
log2exp2.zip - 32 bit logarithm and exponent functions with error plots and description (IAR, Keil, gcc)
MLS_Rnd_Arm.pdf - Parallel MLS pseudorandom generator for ARM, description + code (C, asm)
Please contact us if you need to evaluate functions not posted in the download section.