ULTRASOUND
The processing power now available from many DSP processors has
reduced the amount of analog processing required in ultrasound
applications. In addition, these processors have reached performance
levels that allow additional features to be implemented without a
significant increase in cost.

Visual and auditory feedback are critical to the
placement of the probe, which in turn is extremely important to the
quality of the test results. To assist in this difficult task, the
ultrasound system must process the data it collects and render an
image as quickly as possible, so that the technician receives
essential visual feedback on the placement of the ultrasound probe
without significant lag. This places a tight latency requirement on
the processing of ultrasound data.
A typical ultrasound machine might acquire 12-bit data at 20
million samples per second on each of two channels. This rate
allows operation at carrier frequencies of 10 MHz or less, which is
typical for ultrasound systems.
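As a rough check on these figures, the sketch below computes the Nyquist limit and the raw data rate. It assumes that the 20 million samples per second applies to each of the two channels and that samples are packed at exactly 12 bits; the function names are illustrative only.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency representable without aliasing at this sample rate."""
    return sample_rate_hz / 2


def raw_data_rate_bits_per_s(sample_rate_hz, bits_per_sample, channels):
    """Raw converter output, assuming every channel runs at the full sample rate."""
    return sample_rate_hz * bits_per_sample * channels


fs = 20_000_000   # samples per second, from the text
bits = 12         # bits per sample, from the text
channels = 2      # from the text; the per-channel rate is an assumption

print(nyquist_hz(fs) / 1e6, "MHz Nyquist limit")
print(raw_data_rate_bits_per_s(fs, bits, channels) / 8e6, "MB/s raw data")
```

Under these assumptions the Nyquist limit is 10 MHz, matching the carrier range quoted above, and the processor must absorb 60 MB/s of raw samples.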
Several processing steps must be performed on this
high sample rate data to reliably detect the signals generated by
the sound returns. Two components of the signal are of interest –
the amplitude of the reflection and its frequency. The
amplitude of the return is used to detect density of tissue, while
the frequency is used to detect motion via the Doppler effect.
The emitted sound beam is generated from a phased
array emitter, which is pulsed with a carrier frequency of 2 to
10 MHz. The phased array emitter allows the sound beam to be
positioned electrically (i.e. ‘beam steering’) and focused into a
small area (i.e. ‘beam forming’). The reflected sound is
received by the emitter and passed to an A-D converter for
conversion to digital form. The A-D converter processes the return
signal and the original emitted signal, generating two streams of
digital data: a signal channel and a quadrature channel.
The digital data is processed in two ways. First, the amplitude
of the signal return is demodulated from the high frequency carrier
with a synchronous AM detector. Second, the phase of the returns is
compared to the originally emitted carrier with a synchronous
quadrature detector. This phase measurement is combined with other
phase measurements to detect the Doppler shift in the return signal.
In addition to these demodulation steps, the signals are gated
(that is, selected by time and angle) to report data from a region
selected by the operator. Additional processing can be performed to
remove artifacts such as echo returns, variable sound velocities,
and geometrical artifacts (e.g. off angle Doppler measurements).
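The two demodulation steps can be illustrated with a minimal sketch. It is not the implementation described in this note: the low-pass filter is replaced by a plain average over the record, the quadrature reference is computed directly with a sine rather than a delay, and the function name and test signal are hypothetical.

```python
import math


def quadrature_demodulate(samples, carrier_hz, sample_rate_hz):
    """Return (amplitude, phase_rad) of `samples` relative to the reference carrier.

    Mixes the input with in-phase and quadrature copies of the carrier, then
    low-pass filters each product by averaging over the whole record.
    """
    w = 2 * math.pi * carrier_hz / sample_rate_hz   # radians per sample
    n = len(samples)
    i_avg = sum(s * math.cos(w * k) for k, s in enumerate(samples)) / n
    q_avg = sum(-s * math.sin(w * k) for k, s in enumerate(samples)) / n
    amplitude = 2 * math.hypot(i_avg, q_avg)   # factor 2 undoes the mixer's 1/2
    phase = math.atan2(q_avg, i_avg)
    return amplitude, phase


# Synthetic return: a 5 MHz carrier sampled at 20 MHz with known amplitude and phase.
fs, f0 = 20e6, 5e6
true_amp, true_phase = 0.8, 0.6
echo = [true_amp * math.cos(2 * math.pi * f0 * k / fs + true_phase)
        for k in range(400)]

amp, phase = quadrature_demodulate(echo, f0, fs)
print(round(amp, 6), round(phase, 6))
```

The recovered amplitude corresponds to the AM detector output, and the recovered phase is the quantity that is compared from pulse to pulse to find the Doppler shift. The record length is chosen as a whole number of carrier periods so that the averaging removes the double-frequency mixer product.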

The quadrature phase detector is implemented in
software on the processor array. The pair of synchronous mixers and
the low-pass filters operate at the 20 MHz sample rate. The output of
the low-pass filters is sample rate converted down to 25 kHz for
further processing (see diagram above).
To implement the high sample rate processing, the
following is required: a multiply in each synchronous mixer; a delay
for the 90 degree phase shift; and a multistage rate converting
low-pass filter. This filter is implemented as a three tap filter,
followed by a rate conversion down to 1 MHz, then another three tap
filter and a rate conversion down to 100 kHz, then a 17 tap filter
and a rate conversion down to 25 kHz. The rate conversions reduce
the computational load and allow a sharp cutoff filter
implementation. A non-rate converting filter would require
significantly greater processing bandwidth. Each filter tap involves
a multiply and an add. The phase shifts are essentially free, as they
are implemented by modifying the address from which the data is
taken. Totaling the computational requirement across both channels,
we have 8 multiplies and 6 adds at 20 MHz, then 6 multiplies and 6
adds at 1 MHz, then 34 multiplies and 34 adds at 100 kHz. The
required performance is 14 x 20 + 12 x 1 + 68 x 0.1 = 280 + 12 + 6.8
= 298.8, or about 300 Mflops. This is well within the reach of DSP
processors on the market today.
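The operation count can be reproduced with a short calculation. The sketch below assumes the per-stage counts come from two channels, each with one mixer multiply in the first stage and one multiply and one add per filter tap; the variable and function names are illustrative.

```python
CHANNELS = 2  # signal and quadrature channels

# (label, rate in Hz, multiplies per sample, adds per sample)
stages = [
    ("mixer + 3-tap filter", 20e6, CHANNELS * (1 + 3), CHANNELS * 3),
    ("3-tap filter", 1e6, CHANNELS * 3, CHANNELS * 3),
    ("17-tap filter", 100e3, CHANNELS * 17, CHANNELS * 17),
]


def stage_mops(rate_hz, multiplies, adds):
    """Millions of arithmetic operations per second for one stage."""
    return rate_hz * (multiplies + adds) / 1e6


loads = [stage_mops(rate, mul, add) for _, rate, mul, add in stages]
for stage, load in zip(stages, loads):
    print(f"{stage[0]}: {load:.1f}")
print(f"total: {sum(loads):.1f}")
```

The first stage dominates at 280 of the 298.8 million operations per second, which is why the early rate conversion to 1 MHz matters so much for the total load.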
After this processing the two channels are combined
to form the Doppler shift signal, which is gated and then processed
with an FFT for display. The gating selects a window, determined by
the operator, inside which signals are taken and outside which they
are ignored. This involves varying the data selection window, which
translates into a start and stop time, for each beam position.
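Gating by start and stop time can be sketched as a simple slice. The sample rate, window values, and per-beam table below are hypothetical and chosen only to make the example easy to follow.

```python
def gate(samples, sample_rate_hz, start_s, stop_s):
    """Keep samples whose time since the pulse lies in [start_s, stop_s)."""
    first = max(0, int(round(start_s * sample_rate_hz)))
    last = min(len(samples), int(round(stop_s * sample_rate_hz)))
    return samples[first:last]


# Hypothetical per-beam windows: beam index -> (start, stop) in seconds.
windows = {0: (20e-6, 40e-6), 1: (25e-6, 45e-6)}

fs = 1e6                      # illustrative sample rate, not from the text
record = list(range(100))     # stand-in for one beam's samples
for beam, (start, stop) in windows.items():
    kept = gate(record, fs, start, stop)
    print(beam, kept[0], kept[-1], len(kept))
```

Because each beam position has its own entry in the table, the selected region can follow whatever shape the operator draws on the display.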
The demodulated amplitude signal is corrected by
applying a gain that increases with time, to compensate for
attenuation, and is warped to correct for speed variations. The
amplitude signal is then plotted on the display at the locations
dictated by the beam position. This plot creates the image of the
area being studied.
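The time-dependent gain can be sketched as follows. The attenuation coefficient of 0.5 dB per cm per MHz and the sound speed of 1540 m/s are commonly quoted approximations for soft tissue, not values from this note, and the sample rate and function names are illustrative.

```python
def tgc_gain(t_s, carrier_mhz, atten_db_per_cm_mhz=0.5,
             sound_speed_cm_per_s=154_000.0):
    """Linear gain that offsets round-trip attenuation for an echo arriving at t_s.

    An echo arriving t_s after the pulse has travelled c * t_s centimetres in
    total (out and back), losing atten * f * path decibels along the way.
    """
    path_cm = sound_speed_cm_per_s * t_s
    loss_db = atten_db_per_cm_mhz * carrier_mhz * path_cm
    return 10 ** (loss_db / 20)


def apply_tgc(samples, sample_rate_hz, carrier_mhz):
    """Scale each amplitude sample by the gain for its arrival time."""
    return [s * tgc_gain(k / sample_rate_hz, carrier_mhz)
            for k, s in enumerate(samples)]


# Equal reflectors at increasing depth: the raw echoes decay with time,
# and the corrected echoes come back level.
fs, f0_mhz = 1e6, 5.0     # illustrative values
raw = [1.0 / tgc_gain(k / fs, f0_mhz) for k in range(100)]
corrected = apply_tgc(raw, fs, f0_mhz)
print(round(raw[-1], 4), round(corrected[-1], 4))
```

A real system would likely let the operator adjust the gain curve, since attenuation varies between tissue types.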
These post-processing steps, except for the FFTs,
are insignificant when compared to the processing needed by the
demodulator and the FFTs. The FFT processing is performed to display
the spectrum of the Doppler return, which translates directly into
velocity once the geometry is taken into account. The character of
the spectrum (noisy, smooth, wide band, narrow band) is an indication
of the condition of the area being examined.
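The translation from Doppler frequency to velocity can be sketched with the standard pulse-echo Doppler relation, shift = 2 v f cos(angle) / c. The sound speed of 1540 m/s is a commonly assumed value for soft tissue, not a figure from this note, and the function name is illustrative.

```python
import math


def doppler_velocity(shift_hz, carrier_hz, angle_deg=0.0,
                     sound_speed_m_per_s=1540.0):
    """Velocity along the direction of motion for a measured Doppler shift.

    Uses shift = 2 * v * carrier * cos(angle) / c, solved for v. `angle_deg`
    is the angle between the beam and the direction of motion.
    """
    cos_angle = math.cos(math.radians(angle_deg))
    if abs(cos_angle) < 1e-6:
        raise ValueError("beam is perpendicular to the motion; no Doppler shift")
    return shift_hz * sound_speed_m_per_s / (2 * carrier_hz * cos_angle)


print(doppler_velocity(1000.0, 5e6))          # beam aligned with the motion
print(doppler_velocity(1000.0, 5e6, 60.0))    # same shift seen at 60 degrees
```

With a 5 MHz carrier, a 1 kHz shift corresponds to about 0.154 m/s when the beam is aligned with the motion and about 0.308 m/s at 60 degrees, which shows why the geometry correction matters.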
The processing required by the FFTs depends on the
rates selected by the operator. The 25 kHz Doppler signal can be
rate converted up to give finer frequency (i.e. velocity)
resolution, and the FFTs can be lapped (i.e. computed on overlapping
sliding buffers) to improve the low velocity sensitivity of the
system. All of these choices translate into an equivalent number of
1K complex FFTs per second. The table and graph below show how well
various processors perform the FFT processing.
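One way to see how these choices turn into an FFT rate is a simple sliding-buffer model, sketched below. The model and all of its parameters (`overlap`, `upsample`, `streams`) are assumptions made for illustration; the note does not state how its FFT rates were derived.

```python
def ffts_per_second(sample_rate_hz, fft_size=1024, overlap=0.0,
                    upsample=1, streams=1):
    """Equivalent number of FFTs per second under a simple sliding-buffer model.

    A new FFT starts every `hop` samples, where hop shrinks as overlap grows.
    `upsample` scales the sample rate; `streams` counts independent signals
    (a hypothetical parameter, e.g. several gated regions).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    hop = max(1, round(fft_size * (1.0 - overlap)))
    return streams * sample_rate_hz * upsample / hop


for overlap in (0.0, 0.5, 0.75):
    print(overlap, ffts_per_second(25_000, overlap=overlap))
```

In this model each halving of the hop doubles the FFT rate, so overlap and up-conversion multiply together when estimating the processing load.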
Two considerations are significant to this processing: the rate
at which data is passed to the process performing the FFTs, and the
bus architecture used. As the number of FFTs per second increases,
the computational demand on the processor increases. At the same
time, the processor's demand for data increases as well.

This results in the following processor requirements,
assuming almost linear scaling. That assumption holds for most
designs except the "native" PIII-450 case.
NUMBER OF PROCESSORS REQUIRED

| 16 bit 1024 CFFTs/sec | TM1300 | ADSP21160 | TMS320C6701 | TMS320C6201 | PIII-450 |
|-----------------------|--------|-----------|-------------|-------------|----------|
| 5000                  | 1      | 1         | 1           | 1           | 1        |
| 7000                  | 1      | 1         | 1           | 1           | 2        |
| 10000                 | 1      | 1         | 2           | 1           | 2        |
| 20000                 | 2      | 2         | 3           | 2           | 4        |
| 50000                 | 4      | 5         | 6           | 5           | NP       |
| 70000                 | 6      | 6         | 8           | 7           | NP       |
| 100000                | 8      | 9         | 11          | 9           | NP       |

CONCLUSION
DSP processors available
today are now fast enough to assume responsibility for the signal
processing performed in ultrasound systems at the carrier frequency,
eliminating the requirement for special purpose demodulators
implemented in hardware.