// iir_flp32.asm (TS201 source code)
/*************************************************************************

-rev1.0 PM, 9/2003
-rev1.1 PM, 7/2004, project passed to VDSP3.5

-this program implements a floating-point IIR filter that computes one output
sample per pass, for the case where outputs are needed one at a time
-it can be compiled and run on TS101 and TS201.
-TS101 has only 2 memory blocks (section 1 and 2) where data buffers may be placed.
-TS201 has 5 memory blocks (section 1, 2, 3, 4, 5) where data buffers may be placed.
-In this program only 2 memory blocks have been used to maintain compatibility between
TS101 and TS201

-2 tcl files are provided, one for TS101 and one for TS201. Each of them builds
the corresponding project and saves the output buffer into a file output.dat
-.align_code 4 directives have been placed throughout the main part of the program
to reduce the cycle count on TS201 (on this processor they may be removed
if the cycle count is not of interest). For TS101, the assembler option -align-branch-lines
set in the project properties tab has the same effect (on TS101 this alignment is required for
IF instructions).


Overview:

This program computes an IIR filter one output sample at a time.

The equations of the filter are:

  w(n)=x(n)*scale+w(n-1)*a4+w(n-2)*a3+w(n-3)*a2+w(n-4)*a1
  y(n)=w(n)+w(n-1)*b4+w(n-2)*b3+w(n-3)*b2+w(n-4)*b1

The coefficients a4,a3,a2,a1,b4,b3,b2,b1 are saved in the coeffs buffer
in the following order: a2,a4,b2,b4,a1,a3,b1,b3
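
Worked example (a check by hand, assuming the values defined in the coeffs
buffer further down, i.e. a4=0.6272 and b4=4., a zeroed delay line, and exact
arithmetic; 32-bit floating-point results may differ in the last digits):

  n=0: w(0) = 5000.0*0.05078125               = 253.90625
       y(0) = w(0)                            = 253.90625
  n=1: w(1) = 4333.0*0.05078125 + w(0)*0.6272 = 220.03515625 + 159.25
                                              = 379.28515625
       y(1) = w(1) + w(0)*4.                  = 379.28515625 + 1015.625
                                              = 1394.91015625

These values agree with the first two entries of expected_output.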

The delay line saves w(n-1), w(n-2), w(n-3), w(n-4) in the following order:
    w(n-3), w(n-1), w(n-4), w(n-2).

This order simplifies the process that updates the delay line. For example,
the delay line is loaded into the following registers:
  r3=w(n-1) r2=w(n-3) r1=w(n-2) r0=w(n-4)

When w(n) must be saved into the delay line, r1:0 is shifted right by 32 bits,
discarding w(n-4) and making room for w(n). The delay line now looks like:
  r3=w(n-1) r2=w(n-3) r1=w(n) r0=w(n-2) and is saved in the following order:
  w(n-2), w(n), w(n-3), w(n-1).

This procedure repeats each time the filter is used.
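
Trace of the delayline buffer for the first three samples (a sketch worked by
hand from the filter equations, assuming the inputs and coeffs defined further
down and a zeroed delay line; w(2) is rounded to four decimals):

              delayline[0]  delayline[1]  delayline[2]  delayline[3]
  initial     0.0           0.0           0.0           0.0
  after n=0   0.0           253.90625     0.0           0.0
  after n=1   0.0           379.28515625  0.0           253.90625
  after n=2   253.90625     360.3212      0.0           379.28515625

After each pass the buffer again holds w(n-3), w(n-1), w(n-4), w(n-2) relative
to the next sample, so the next pass can load it with the same two reads.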

-N is the number of input points processed
-the outputs are saved into the output buffer. They may be compared with
the reference values stored in the expected_output buffer

-at the end of the program, the cycle_count variable contains the
cycle count of the main program
************************************************************************/

// N: number of input data points
#define N_MAX    100
#define N         30
#define SECTIONS 4

#define scale    0.05078125

/************************************************************************/

.section data1;
.align 4;
.var inputs[N] =

5000.0, 4333.0, 5465.0, 13556.0, 7423.0, -5000.0, -4333.0, -5465.0, -13556.0, -7423.0,
5000.0, 4333.0, 5465.0, 13556.0, 7423.0, -5000.0, -4333.0, -5465.0, -13556.0, -7423.0,
5000.0, 4333.0, 5465.0, 13556.0, 7423.0, -5000.0, -4333.0, -5465.0, -13556.0, -7423.0;

.var cycle_count;    // execution cycle counts

.align 4;
.var expected_output[N] =

 253.906250,  1394.910156,  3400.899369,  5451.189211,  7461.879024,
 8509.488592,  5519.537707, -1306.039745, -6936.575354, -8920.200002,
-8397.397895, -4821.647412,  1507.253965,  6689.519912,  8738.028980,
 8444.656939,  4922.543154, -1492.260614, -6730.424864, -8759.801131,
-8433.783468, -4908.582993,  1492.343760,  6724.040670,  8757.440081,
 8435.857839,  4910.422599, -1492.627568, -6724.989198, -8757.654792;

.align 4;
.var output[N];

.section data2;
.align 4;
//                       a2,     a4,     b2, b4, a1,     a3,       b1, b3
.var coeffs[2*SECTIONS]= 0.1412, 0.6272, 4., 4., -0.0255, -0.6108, 1., 6.;

.section data3;
.align 4;
.var delayline[SECTIONS]=0.0, 0.0, 0.0, 0.0;

/************************************************************************/
#ifdef __ADSPTS201__
  #include <defts201.h>
#endif

  #include "cache_macros.h"


.section program;
.global _main;

/************************************** Power up code *****************************************/
_main:
powerup:

#ifdef __ADSPTS201__
/*on TS201 the cache must be enabled at the beginning of the program.
The procedure is contained in the cache_enable macro, which takes
the refresh rate as its input parameter
      -if CCLK=500MHz, refresh_rate=750
      -if CCLK=400MHz, refresh_rate=600
      -if CCLK=300MHz, refresh_rate=450
      -if CCLK=250MHz, refresh_rate=375
*/
  cache_enable(750);

    j0 = j31 + coeffs; LC0 = 2;;
//due to a TS201 rev0 anomaly, the initialization of LC0 must be at least
//4 instruction lines before the end of the loop (jump instruction included)
//and the loop must be at least 2 cycles long
    nop;nop;;
.align_code 4;
ini_cache:
    xr3:0 = q[j0+=0];;
.align_code 4;
    if NLC0E, jump ini_cache; q[j0+=4] = xr3:0;;

#endif

end_powerup:


/************************************** Start of code *****************************************/

//j1 is used to fetch the inputs.
  j1 = j31 + inputs; LC0 = N;;

//j2 is used to save the outputs
  j2 = j31 + output;;

.align_code 4;
iir_loop:

//yr8=x(n)
  yr8 = [j1+=1];;

//read cycle counter

  ini_cycle_count;
/************************************** Start of IIR code***************************************/

  j0 = j31 + delayline;;
  k0 = k31 + coeffs;;

//r3:2=w(n-1) w(n-3)
//                 yr7:6=a4,a2 xr7:6=b4,b2
  r3:2 = l[j0+=2]; r7:6 = q[k0+=4];;

//r1:0=w(n-2) w(n-4)
//                 yr5:4=a3,a1 xr5:4=b3,b1
  r1:0 = l[j0+=2]; r5:4 = q[k0+=4];;

//yr9=w(n-1)*a4
//xr9=w(n-1)*b4
  fr9 = r3 * r7; yr13 = scale;;

//yr10=w(n-3)*a2
//xr10=w(n-3)*b2
  fr10 = r2 * r6;;

//yr11=w(n-2)*a3
//xr11=w(n-2)*b3
  fr11 = r1 * r5;;

//yr12=w(n-4)*a1 yr9=w(n-1)*a4+w(n-3)*a2
//xr12=w(n-4)*b1 xr9=w(n-1)*b4+w(n-3)*b2
  fr12= r0 * r4; fr9 = r9 + r10;;

//yr8=x(n)*scale   w(n-4) is shifted out of the delay line
  yfr8 = r8 * r13; lr1:0 = lshift r1:0 by -32;;

//yr10=w(n-2)*a3+w(n-4)*a1
//xr10=w(n-2)*b3+w(n-4)*b1
//                  the upper half of the delay line (w(n-3), w(n-1)) is saved
  fr10 = r11 + r12; l[j31 + delayline+2] = yr3:2;;

//yr9=x(n)*scale+w(n-1)*a4+w(n-3)*a2
  yfr9 = r8 + r9;;

//yr1=x(n)*scale+w(n-1)*a4+w(n-3)*a2+w(n-2)*a3+w(n-4)*a1=w(n)
//xr1=           w(n-1)*b4+w(n-3)*b2+w(n-2)*b3+w(n-4)*b1
//at this point w(n) enters the new delay line (yr1)
  fr1 = r9 + r10;;

//w(n) is copied to the X compute block
  xr4 = yr1;;

//xr4=w(n)+w(n-1)*b4+w(n-3)*b2+w(n-2)*b3+w(n-4)*b1=y(n)
//                the lower half of the delay line (w(n-2), w(n)) is saved
  xfr4 = r4 + r1; l[j31 + delayline] = yr1:0;;

/******************************************* Done ***********************************************/
//read cycle counter and compute the program's cycle count

  comp_cycle_count;

//end of IIR program. Save y(n) into the output buffer
.align_code 4;
  if NLC0E, jump iir_loop; [j2+=1] = xr4;;






_main.end:
___lib_prog_term:

  nop;nop;nop;nop;;


