⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 ofdm_ldpc_sse2.lyx

📁 软件无线电的平台
💻 LYX
📖 第 1 页 / 共 3 页
字号:
 After this, a number of iterations were carried out by calling the \begin_inset Quotes eld\end_inset variablemessagemap()\begin_inset Quotes erd\end_inset  and \begin_inset Quotes eld\end_inset checkmessagemap()\begin_inset Quotes erd\end_inset  functions. This represents the decoding algorithm of the message passing decoder. \layout Standard\align center \begin_inset Graphics	filename ldpc.fig	lyxscale 75	scale 75\end_inset \layout Standard\align center Figure 5: The node structue for the message passing decoder. \layout StandardIn the \begin_inset Quotes eld\end_inset variablemessagemap\begin_inset Quotes erd\end_inset , the outgoing message on each edge is determined as the exponential of the sum of the received value and the incoming messages on all other edges connected to that vnode.\layout StandardIn the \begin_inset Quotes eld\end_inset checkmessagemap\begin_inset Quotes erd\end_inset , the product of all incoming messages is calculated as \layout Standardprod =\begin_inset Formula $\frac{1+prod*message}{prod+message}$\end_inset  \layout Standardwith the erasuernum being incremented if the message =1 = exp(0) (which represents an erasure).\layout StandardIf there are no erasures, the outgoing message on each socket is calculated as \layout Standardoutmessage = \begin_inset Formula $\log\frac{1-prod*message}{prod-message}$\end_inset .\layout StandardThe outgoing message is also restricted to be within certain bounds.\layout StandardIf there is a single erasure, then the outgoing message is 0 on all sockets except the one on which the erasure occured for which the outgoing message is \begin_inset Formula $\log(prod)$\end_inset . \layout StandardIf there are two or more erasures, then the outgoing messages on all sockets are 0.\layout StandardThis program was decoding in 186 ms for the 0.25 rate code.\layout StandardAll the operations like calculations of the new messages were being done sequentially and the arithmetic was in \emph on doubles\emph default . So we needed to optimise this by introducing some parallelism and also by trying to work with integers rather than \emph on doubles\emph default  whose additions take more time.\layout SubsectionOptimisations \layout Subsubsectiontan hyperbolic / arctanh Using Lookup Tables\layout StandardTo optimise the program, we went for a mathematical equivalent of the log/exp method described above. This alternate method used tanh and arc tanh. The idea was to take only the sum of the inputs at the vnode side and send it out as the message.\layout Standard\begin_inset Formula $m_{i(\mathrm{new})}$\end_inset = \begin_inset Formula $\Sigma_{k=0,k\ne i}^{d}m_{k}$\end_inset \layout Standardwhere d is the degree of the vnode and \begin_inset Formula $m_{i(\mathnormal{new})}$\end_inset is the new outgoing message on the socket i.\layout StandardAt the check nodes, first the product of the \begin_inset Formula $\tanh(\frac{message}{2})$\end_inset  over all the sockets is calculated. The outgoing message on each of the sockets would be 2*\begin_inset Formula $\arctan(\frac{prod}{\tanh(\frac{message}{2})})$\end_inset . \layout Standardprod = \begin_inset Formula $\Pi_{i=0}^{d}\tanh(\frac{m_{i}}{2})$\end_inset \layout Standard\begin_inset Formula $m_{i}$\end_inset  = 2*\begin_inset Formula $\arctan(\frac{prod}{\tanh(\frac{m_{i}}{2})})$\end_inset \layout Standardwhere d is the number of degrees of the cnode,\begin_inset Formula $m_{i}$\end_inset  is the message received on the ith socket and \begin_inset Formula $m_{i(\mathrm{new})}$\end_inset  is the new outgoing message.\layout StandardInitailly we used the tanh and atanh of the math library and so the time taken for the execution of the program went up to 285 ms.\layout StandardSo then we implemented lookup tables for both the tanh and the atanh.\layout StandardThe files \begin_inset Quotes eld\end_inset lookup.h\begin_inset Quotes erd\end_inset  and \begin_inset Quotes eld\end_inset lookup.c\begin_inset Quotes erd\end_inset  have the intialization and calling routines pertaining to the lookup tables. The inputs and the outputs of the lookup functions were \emph on doubles\emph default . We chose a step size of 0.001 for both the tanh and the atanh. The maximum input considered for the lookup tanh was \begin_inset Formula $\mathrm{A_{0}(MAX\_ VALUE})$\end_inset  which was set to 25 and the maximum for the arctanh is 1 because the output of tanh is restricted to the range [-1,1]. In the initialization function \begin_inset Quotes eld\end_inset lookup_init()\begin_inset Quotes erd\end_inset  , we initialized the \begin_inset Quotes eld\end_inset tval\begin_inset Quotes erd\end_inset  array to hold the tanh values.\layout Standardtval[i] = \begin_inset Formula $\tanh($\end_inset \begin_inset Formula $-A_{0}+i*\bigtriangleup$\end_inset ) \layout Standardwhere \begin_inset Formula $\bigtriangleup\equiv STEP\_ SIZE$\end_inset \layout Standardand similarly for the aval which held the arctanh values.\layout StandardThe calling functions \begin_inset Quotes eld\end_inset lookup_tanh()\begin_inset Quotes erd\end_inset  and \begin_inset Quotes eld\end_inset lookup_atanh()\begin_inset Quotes erd\end_inset  had a double input x and returned tval[(\begin_inset Formula $\frac{x+A_{0}}{\bigtriangleup}$\end_inset )] and aval\begin_inset Formula $[(\frac{x+1}{\bigtriangleup})]$\end_inset \layout StandardThis gave a speedup of performance and the execution time fell to 185 ms.\layout SubsubsectionConversion to Short Integers\layout StandardUntil now, all the calculations involved \emph on doubles\emph default  which are 64 bit double precision floating point numbers. So we thought we will shift to short integers which are 16 bit and so we can save on the time spent in additions of \emph on doubles\emph default  because operations on short integers take much less time. So we changed all the messages at the vnodes and cnodes to short integers. However the input that we got was in \emph on doubles\emph default . So we worked with \emph on doubles\emph default  upto the demodulation stage. And in the \begin_inset Quotes eld\end_inset initializegraph()\begin_inset Quotes erd\end_inset  function where we assigned the received values to the vnodes, we scaled and converted them to integers in the range -T to T, where T was defined in \begin_inset Quotes eld\end_inset lookup.h\begin_inset Quotes erd\end_inset . This was done so that we could maintain some precision while shifting from \emph on doubles\emph default  to short integers. We also defined another parameter TA which defined the range of the outputs of \begin_inset Quotes eld\end_inset lookup_tanh()\begin_inset Quotes erd\end_inset  and this would be the domain of the \begin_inset Quotes eld\end_inset lookup_atanh()\begin_inset Quotes erd\end_inset  function. We had to change the \begin_inset Quotes eld\end_inset lookup_tanh\begin_inset Quotes erd\end_inset  and the \begin_inset Quotes eld\end_inset lookup_atanh()\begin_inset Quotes erd\end_inset  functions because now they had to take an integer input. So in the initialization function \begin_inset Quotes eld\end_inset lookup_init()\begin_inset Quotes erd\end_inset , for the tval array, we first converted the input from the range -T to T back to \begin_inset Formula $-A_{0}$\end_inset  to \begin_inset Formula $A_{0}$\end_inset and then scaled the output of tanh from -1 to 1 to the range -TA to TA. \layout Standardtval[i] = (short int) ((double)TA*\begin_inset Formula $\tanh(\frac{((double)(i-T)*A_{0}}{2*T})$\end_inset \layout StandardWe have divided by 2 inside the tanh bracket because in the algorithm, the tanh(\begin_inset Formula $\frac{message}{2}$\end_inset ) is calculated. So we have incorporated the division by 2 in the initialization itself so that when we call lookup_tanh, we call it with the message instead of \begin_inset Formula $\frac{message}{2}$\end_inset .\layout StandardIn the initialization of the aval array, we convert the input to the range -1 to 1 from -TA to TA and then the output of the atanh is scaled by \begin_inset Formula $\frac{T}{A_{0}}$\end_inset  and restricted to the range -T to T.\layout Standard\added_space_top smallskip \added_space_bottom smallskip \align center \begin_inset Graphics	filename mapping.fig	lyxscale 75	scale 75	keepAspectRatio\end_inset \layout Standard\align center Figure 6: The scaling of the ranges of tanh and arctanh for integer arguments.\layout StandardWe have also absorbed the multiplication by 2 in the initialization because again always 2 * atanh is called from the program.\layout StandardThe lookup functions just return the value from the array at the index x+T for the tanh and x+TA for the atanh where x is the input integer. This was working in 140 ms. However , this was not giving very good performance.\layout SubsubsectionRestructuring of the graph structure\layout StandardUntil now the parsing of the nodes in the graph was according to the order in which they were found in the \begin_inset Quotes eld\end_inset graphs.c\begin_inset Quotes erd\end_inset  file. We wanted to restructure this so that we could have nodes of the same degree together. The motivation for this came from the fact that if the number of additions for vnodes of same degree are the same and if we group them together, then we can do the additions in parallel by using Intel SSE instructions. So we restructured the graph structure and arranged the vnodes and cnodes as 3 Dimensional arrays to hold the messages received on each edge. The first parameter, degindex, ranges from 0 to the number of different types of vnode/cnode degrees, the second parameter, socket, ranges from 0 to the degree of the vnodes/cnodes. This was because we kept all vnodes/cnodes of same degree together. The third parameter, dispindex, ranges from 0 to the number of vnodes/cnodes of that particular degree.\layout StandardWe have also made an edge structure which has two integers pointers, msgforvnode and msgforcnode. Each edge connects a socket of a vnode vi to a socket of a cnode ci. When the edge is initalized, msgforcnode points to the location, in the 3D array of cnodes, corresponding to the socket of the cnode associated with this edge. Similarly, msgforvnode points to the location, in the 3D array of vnodes, corresponding to the socket of the vnode associated with this edge. This initialization takes place when we initialize the vnodes.\layout StandardWe have made two 3 Dimensional arrays of edges, guided by the same parameters degindex, socket and dispindex, one each for the cnodes and the vnodes. So now for each socket at the vnode side, the corresponding edge can be found in the 3D edge array using the same 3 parameters. The same holds for sockets on the cnode side. When we initialize the edge structure, we copy the edge into the 3D edge arrays of the vnodes and of the cnodes at appropriate locations.\layout StandardWhen a vnode has to send a message on an edge, it will simply write out the message in the msgforcnode pointer of the corresponding edge in the 3D array of vnode edges. \layout Standardvedges[degindex][socket][dispindex].msgforcnode \layout Standard= newmessage from vnode[degindex][socket][dispindex]\layout Standard\added_space_bottom medskip \align center This will place the message in the appropriate location in the 3D array of cnodes. This code was working in 45 ms\newline .\newline \begin_inset Graphics	filename graph.fig	scale 60	keepAspectRatio	rotateOrigin center\end_inset \layout Standard\align center Figure 7: Organisation of the nodes in the graph structure\layout SubsubsectionSSE Optimisations\layout StandardNow the messages from a particular socket(socketindex) of the vnodes/cnodes having the same degree(deg) are arranged is contiguous locations. \layout StandardOn the vnode side, the messages on all the sockets are to be added to obtain the new message. The messages, being short integers(16-bit), can be loaded 8 at a time in an xmm register(128-bit) and added for all the sockets from socketindex=0,1,...,deg-1. The dispindex is then increased by 8 and the same procedure repeated.\layout Standard\align center \begin_inset Graphics	filename vnode.fig	scale 90	keepAspectRatio\end_inset \layout Standard\align center 

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -