Final1/QuestionSet2_Solutions

1) Below are 2 sets of data of the same sine wave (sorry for my drawing skills) of which an FFT is taken. Draw the
results of the FFT & explain why they are different (they ARE different). (2 marks total, ½ mark per picture, ½ mark
for explanation, ½ mark for explanation of reducing windowing effects)
a) [Plot of V versus n]
b) [Plot of V versus n]
Explanation:
In b), the signal is not synchronously sampled (the record does not hold a whole number of periods), so windowing
effects (spectral leakage) occur, essentially smearing the original frequency impulse across the frequency domain.
For the case in b) above, what could you do to reduce these effects?
One can apply a window function (e.g. Hamming) to smooth the edges so that the signal tapers toward zero by the edge
of the window.
2) For a certain active noise cancellation algorithm (LMS), the FIR and FIR coefficient update equations are as
follows:
$$y(n) = \sum_{i=0}^{N-1} \bar{\mathbf{w}}_i(n)\,\bar{\mathbf{x}}_i(n)$$
$$\bar{\mathbf{w}}(n+1) = \bar{\mathbf{w}}(n) + \mu\,\bar{\mathbf{x}}(n)\,e(n)$$
where:
$\bar{\mathbf{w}}$ is the FIR coefficient buffer of size N.
$e$ is the error from the desired signal detected.
$\bar{\mathbf{x}}$ is the input buffer of size N.
$\mu$ is an arbitrary step size.
Unfortunately, due to external supply limitations, a developer trying to implement this was unable to get a SHARC.
The China-brand SHAYU (shark in Mandarin, according to Google) cannot do SIMD operations, and can only do a
mult OR an add in parallel with a memory operation. In this configuration, what is the best cycles/N theoretically
possible for the above math operations?
First, breaking this apart: the first equation is essentially an FIR (a multiply-accumulate over N taps), and the
second is essentially a multiply by a constant followed by a sum. In pseudocode, the two operations look essentially
as follows:
y = 0;
for(i=0;i<N;i++){
    t1 = w[i];
    t2 = x[i];
    t2 = t1 * t2;   // each FIR tap is a product of coefficient and input
    y = y + t2;
}
// Important to note that u (the step size) & e(n) are constant throughout the vector
// multiply and can be calculated outside of the loop, so one mult is saved
U_mult_e = u*e[n];
for(i=0;i<N;i++){
    t1 = w[i];
    t2 = x[i];
    t2 = t2*U_mult_e;
    t1 = t1 + t2;
    w[i] = t1;      // write the updated coefficient back
}
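For reference, the two loops can be written as plain C functions. This is only a sketch under assumptions: the names `lms_fir` and `lms_update` are made up here, `double` is used instead of whatever fixed- or floating-point type the target DSP would use, and the error `e` is passed in as a scalar already computed for the current sample.

```c
#include <assert.h>
#include <stddef.h>

/* FIR output: y(n) = sum over i of w[i] * x[i]. */
static double lms_fir(const double *w, const double *x, size_t n_taps)
{
    double y = 0.0;
    for (size_t i = 0; i < n_taps; i++)
        y += w[i] * x[i];
    return y;
}

/* Coefficient update: w[i] <- w[i] + mu * e * x[i]. */
static void lms_update(double *w, const double *x, size_t n_taps,
                       double mu, double e)
{
    const double mu_e = mu * e;   /* hoisted: constant across the loop */
    for (size_t i = 0; i < n_taps; i++)
        w[i] += mu_e * x[i];
}
```

The update cannot be fused into the FIR loop for the same sample, because `e` is only known once the full sum `y` has been formed.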
Assuming that w & x are separated into dm & pm space respectively (the split doesn’t really matter here, as the
question indicates a single memory operation per cycle), the table becomes as follows for the loops (unoptimized):
Loop    | ADD & MULT          | DM/PM
--------+---------------------+--------------------------
Loop 1  |                     | t1 = dm(post-mod, W++)
        |                     | t2 = pm(post-mod, X++)
        | t2 = t1 * t2        |
        | y = y + t2          |
--------+---------------------+--------------------------
Loop 2  |                     | t1 = dm(W, mod with 0)
        |                     | t2 = pm(post-mod, X++)
        | t2 = t2 * U_mult_e  |
        | t1 = t1 + t2        |
        |                     | dm(post-mod) = t1
(up to 3 marks here; the answer should show what the resource chart looks like, with counters inside or equivalent.)
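When compressed, the limiting factor ends up being 5 cycles/N due to the DM/PM column: each element needs 2 memory
reads in loop 1 plus 2 reads and 1 write in loop 2, against only 4 arithmetic operations, and only one memory
operation can issue per cycle. (2 marks for answer.) I don’t need to care about register forwarding, as I can simply
unroll the loop without issues.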
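The bound can be reproduced with a small resource count. The sketch below assumes a simplified machine model (one memory operation and one ALU operation, mult or add, may issue per cycle, with perfect overlap after unrolling); `struct loop_cost` and `cycles_per_tap` are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Per-element operation counts for one loop body. */
struct loop_cost {
    int mem_ops;   /* DM/PM reads and writes */
    int alu_ops;   /* mults and adds */
};

/* Lower bound on cycles per element: the busier of the two resources,
   assuming one memory op and one ALU op can issue each cycle. */
static int cycles_per_tap(const struct loop_cost *loops, int n_loops)
{
    int mem = 0, alu = 0;
    for (int i = 0; i < n_loops; i++) {
        mem += loops[i].mem_ops;
        alu += loops[i].alu_ops;
    }
    return mem > alu ? mem : alu;
}
```

With loop 1 counted as 2 memory ops and 2 ALU ops, and loop 2 as 3 memory ops and 2 ALU ops, the memory column totals 5 and the ALU column 4, so memory sets the bound.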
Notes after writing/marking this set of questions: the 2nd question was written in a somewhat confusing way, as the
time sample (n) didn’t correspond to the buffer size N; it would probably have been clearer if the sample variable
were k instead (as mentioned by a student). Personally, I found the 2nd question came out a bit on the easy side, but
it seemed to fit within the time limits. Various modifications of said question (examples):
Compare a processor with SIMD but where a certain operation (mult) takes 2 or more cycles
" " without register forwarding
" " without HW loops