<li>The last file is a pin definition file.  It maps the port connections of the top-level Verilog file onto the physical pins of the device.  It can also be used to set different signal drive strengths to compensate for routing or loading effects.
</ol>









<table align="right"><tr><td><center><h3>Figure 6.1: State Diagram<p>video_capture.v</h3><p>
<img src="state_machine.gif"width="264"height="665">
<tr><td><center><h3>Figure 6.2: D1 Res State Digram<p>Modified Algorithm</h3><p><img src="state2_machine.gif"width="264"height="1172">
</table>
<p>
<h2>Algorithm:</h2>
<p>
The primary algorithm that "captures" the digital video stream coming from the SAA7111A Video Input Processor is contained in the Verilog file video_capture.v, presented above.
<p>
The video capture function consists of a conversion process, where the <u>color space information is transformed from a data vs time context to a data vs storage location context</u>.  In this translation process, the timing of the data coming into the FPGA from the VIP must be translated into the address at which to store the data in the external SRAM.  The algorithm is just that simple.
<p>
The state machine that implements the video capture algorithm consists of only 10 discrete states.  The coding style splits it into two pieces: the first half is dedicated to state transitions and latched register operations, while the second half (dependent only upon which state is currently occupied) uses continuous assignment to implement its functionality.
<p>

The state machine simply keeps three running counters: a pixel_counter that tracks pixels along the video image, a line_counter that tracks lines down the video image, and an addr_index_counter that tracks the address where the next "pixel" will be stored in the external SRAM.
<p>
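For readers less familiar with this two-piece style, the fragment below sketches the idea in miniature.  It is illustrative only: the state names, bit widths and encodings are placeholders, not the actual contents of video_capture.v.
<pre>
// Hypothetical sketch of the two-piece coding style; names, widths and state
// encodings are illustrative only -- see video_capture.v for the real code.
reg  [9:0]  state;
reg  [9:0]  pixel_count;
reg  [8:0]  line_count;       // handled in states omitted from this sketch
reg  [16:0] addr_index_count;
wire        store_pulse;

parameter [9:0] ST_DORMANT = 10'd1, ST_WAIT_LLC = 10'd8, ST_STORE = 10'd16;

// First half: state transitions and latched register operations.
always @(posedge clk) begin
    case (state)
        ST_DORMANT:  if (begin_capture)  state <= ST_WAIT_LLC;
        ST_WAIT_LLC: if (llc_edge_found) state <= ST_STORE;
        ST_STORE:    begin
                         addr_index_count <= addr_index_count + 1; // next SRAM address
                         pixel_count      <= pixel_count + 1;      // next pixel in the line
                         state            <= ST_WAIT_LLC;
                     end
        default:     state <= ST_DORMANT;
    endcase
end

// Second half: continuous assignment, dependent only on the state
// currently occupied (the "background continuous assigner").
assign store_pulse = (state == ST_STORE);
</pre>
<p>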
<blockquote>
The detailed list below is best read side by side with the Verilog file video_capture.v and the state machine flow diagram outlined in fig.6.1.
<p>
<ol>
<li>In State one, the system is dormant.  It loops back on itself, always returning to state one until it receives the go-ahead from the processor bus interface.  This transition is denoted by an active-high signal called <code><b>begin_capture</b></code>.  The signal <code><b>begin_capture</b></code> is generated by the processor bus interface in the top-level file dj_top.v.

<li><ul><li>State two initializes a variable that keeps track of the number of video lines (<code><b>line_count</b></code>).<li>It also initializes the address variable (<code><b>addr_index_count</b></code>) that points to the location in SRAM where the first video pixel will be stored.<li>Finally, this state loops back on itself until it recognizes a valid vertical sync pulse, thus locking the capture function to the beginning of a valid video frame.  (This is indicated by the active-high signal <code><b>vsync_edge_found</b></code>, which becomes active after the first 25 lines of the video frame have passed.)</ul>

<li><ul><li>State three initializes a variable (<code><b>pixel_count</b></code>) that keeps track of the "<i>pixel</i>", or time increment, across the video line being captured.<li>This state loops back on itself until the signal <code><b>hsync_edge_found</b></code> becomes active high.  This signal becomes active after the front porch and color burst have expired, indicating the first visible pixel of any given line.</ul>

<li><ul><li>State four waits on the valid data clock signal (<code><b>llc_edge_found</b></code>).  When it is asserted:<ul><li>The address where the video data will be stored (<code><b>addr_index_count</b></code>) is latched.<li>The data to be stored (<code><b>video1_raw</b></code>) is latched.<li>This state is exited.</ul><li>Otherwise, this state loops back on itself awaiting the assertion of <code><b>llc_edge_found</b></code>.</ul>

<li>State five accomplishes three tasks.<ul><li>From the background continuous assigner, a pulse is generated to send the latched address and latched data to the RAM scheduler for storage in the SRAM.<li>The address at which to store the next data value in the SRAM (<code><b>addr_index_count</b></code>) is incremented.<li>The variable that tracks the current pixel within the video line (<code><b>pixel_count</b></code>) is incremented.<li>Finally, this state passes through to the next state automatically.</ul>

<li>State six waits for the next activation of the signal <code><b>llc_edge_found</b></code>.  Waiting for this signal in a repetitive looping fashion and then doing nothing with the data effectively skips every other "pixel".

<li>State seven checks whether the variable <code><b>pixel_count</b></code> has been incremented to the end of a valid line length of pixels.  If the end of the video line has not yet been reached, the machine loops back to state four.  Otherwise, the end of a video line has been reached and the state machine vectors to state eight.

<li>State eight prepares for a new video line.<ul><li>The address to store the next pixel in the SRAM (variable: <code><b>addr_index_count</b></code>) is incremented.<li>The variable that tracks the number of lines down the video image (<code><b>line_count</b></code>) is incremented.<li>The state is automatically exited to state nine.</ul>

<li>State nine checks to see if we're done with the current image capture.  This is done by comparing the variable <code><b>line_count</b></code> to a constant that represents the number of lines expected in the image.  The state machine loops to state three if the number of video lines has not yet expired, otherwise, it progresses to state ten.

<li>State ten outputs a signal pulse through the variable <code><b>end_of_screen_capture</b></code> in the background continuous assignment phase and then loops all the way back to state one to await the launch of another capture cycle.

</ol>
</blockquote>
<p>
At the top of the Verilog file (video_capture.v), there are three other pieces to review.  The signals <code><b>video1_llc</b></code>, <code><b>video1_hsync</b></code> and <code><b>video1_vsync</b></code> are each latched into a register on rising clock edges.  On successive clock edges these signals are again latched into a second set of registers before use.  This is a technique called double buffering, which is required to assure proper setup and hold times for signals that cross clock boundaries.  Since these signals come from external circuitry run from independent (asynchronous) clock sources, they qualify.
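<p>
A minimal sketch of this double buffering, using placeholder register names rather than those in the file, looks like the following:
<pre>
// Hypothetical two-stage (double-buffer) synchronizer for one asynchronous input.
// Register names are illustrative; the real registers are at the top of video_capture.v.
reg vsync_meta, vsync_sync;

always @(posedge clk) begin
    vsync_meta <= video1_vsync;  // first latch: may momentarily go metastable
    vsync_sync <= vsync_meta;    // second latch: now safe to use in this clock domain
end

// Downstream logic (e.g. vsync edge detection) looks only at vsync_sync.
</pre>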

<p>
<u>Challenge to the reader</u>:  First, read through the Verilog file, review the state machine in fig.6.1 and the detailed explanation above.  Once you have a good understanding, try to answer the following question without reading ahead for the answer: How many horizontal pixels and how many vertical lines are being captured?
<p>
If you could not find the answer and need a hint, the answers are in the file video_capture.v, above, in state 7 (title: VID_CAP5_CASE) and in state 9 (title: VID_CAP7_CASE).
<p>
<h2>Changing the Algorithm:</h2>
<p>
The Verilog files presented herein were designed to work at approximately 1/4-D1 resolution.  The constraint was arbitrarily determined, aligning the image capture size to the display pixel resolution of the LCD the author is installing on the robot that will first implement this vision system.  Making changes to the captured digital image resolution is easy; next, a look at how to change the capture resolution to full CCIR/ITU601 D1 resolution.
<p>
The system is capable of capturing full D1 video frames (and has been tested doing so).  With the configuration file provided in this article, the SAA7111A Video Input Processor is already configured to convert analog data at D1 resolution.  It merely requires a bit of cut and paste, with a few minor changes to the state machine logic running in the FPGA, to store the additional data properly.  Fig.6.2, right, illustrates these changes (highlighted green <img src="green_box.gif">) and the rearranging denoted by the state numbers.
<p>
First, remove state 6 in the video_capture.v state machine.  This state was merely throwing away every other pixel.  At the end of each video line, the address must now be incremented by a full line width to bypass a line of video that will be filled in during the next interlaced frame.  To fill in all of these interlaced lines, states 2-10 of the state machine are replicated at the end of the state machine.  State #10 can be discarded, as that point in the state machine only indicates the end of the first interlaced frame, not the end of the complete D1 image.  In the new states appended to the end of the state machine, the <code><b>line_counter</b></code> and <code><b>pixel_counter</b></code> variables are still initialized to zero and treated as they were in the first half of the state machine.  The only piece that changes is that the initial address stored into <code><b>addr_index_count</b></code> is offset by one line length, so as to start filling in the interlaced lines.  Finally, the numbers compared against for the loop tests are adjusted appropriately.
<p>
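As an illustration of the address handling just described, the fragment below uses the hypothetical constants FRAME_BASE and LINE_LENGTH; the actual values are those found in video_capture.v.
<pre>
// Illustrative only: address handling for the D1 (interlaced) modification.
// FRAME_BASE and LINE_LENGTH are hypothetical constants, not names from the file.

// First pass (original states): begin storing at the base of the frame buffer.
addr_index_count <= FRAME_BASE;

// End of every captured line: advance by a full line width to bypass the line
// that the other interlaced pass will fill in.
addr_index_count <= addr_index_count + LINE_LENGTH;

// Second pass (the replicated states): start one line length in, so the
// interlaced lines land between those stored by the first pass.
addr_index_count <= FRAME_BASE + LINE_LENGTH;
</pre>
<p>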
Of course, before the second state machine can be implemented, the state numbers must be linearized.  One simple way to do this, as the author has done, is to give parametrically defined names to the states and then rearrange the numerical values in the parameter list.  An example of this can be seen in the two parameter lists ([9:0]) near the top of the file video_capture.v.
<p>
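The fragment below illustrates the idea with placeholder names and values (they are not the names used in video_capture.v):
<pre>
// Illustrative only: parametrically named states make the renumbering painless.
parameter [9:0] ST_DORMANT    = 10'd1,   // state one
                ST_INIT_FRAME = 10'd2,   // state two
                ST_INIT_LINE  = 10'd3;   // state three
                // ...the second-pass copies simply get new names and values here.

// Because every transition is written as "state <= ST_INIT_LINE;" rather than
// "state <= 3;", rearranging the values in this list renumbers the whole machine.
</pre>
<p>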
Now that the reader has seen the physical components, HDL code and algorithmic approach to this project, it is time to see how all of these come together in a completely functional system.
<p>




<br clear="right">
<p>
<h2>Application 1: Colored Blob Tracking</h2>
<p>
<table align="right"><tr><td><center><h3>Figure 7.1: Color Blob Detection</h3>
      <p> <img src="vid_cap4.jpg"width="400"height="283"><tr><td><center><h3>Figure 7.2: Color Filter Example (blue)</h3><p><img src="vid_cap5.jpg"width="400"height="260">
</table>
One of the many applications the vision system currently implements is simple color blob detection.  The image, right, is a digital snapshot of a color LCD display screen that is also driven by the FPGA on this board.  In this image, the system is identifying the green centroid.  Green is defined to the system as follows:
<ul>
<li>red_bounds   {<b>8:2</b>}
<li>green_bounds {<b>31:10</b>}
<li>blue_bounds  {<b>24:10</b>}
</ul>
<p>
These bounds are specified to the system through registers in the top-level file called <code> red_upper, red_lower, green_upper, green_lower, blue_upper & blue_lower</code>.  The reader can easily identify them by name in the file dj_top.v.  These six (5-bit) registers are written from the u-Processor bus interface (memory mapped), and their contents in turn are output to the blob_detect module for comparison against the incoming raw video.
<p>
Color depth in this system is 5 bits in each of R, G & B.  This leads to a color span of 0~31, with the color white being represented by [31,31,31].  For storage and processing purposes, these are dealt with as a single 16-bit word, formatted MSB first as [0, R4, R3, R2, R1, R0, G4, G3, G2, G1, G0, B4, B3, B2, B1, B0].
<p>
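As a rough sketch of how such a six-way comparison can be expressed, the fragment below unpacks a stored pixel using the word format above; the signal name pixel_word is a placeholder, and this is not the actual blob_detect code:
<pre>
// Illustrative only: unpack a stored 16-bit pixel and test it against the six
// memory-mapped bound registers (5 bits each).
wire [4:0] pix_red   = pixel_word[14:10];
wire [4:0] pix_green = pixel_word[9:5];
wire [4:0] pix_blue  = pixel_word[4:0];

wire in_color_window =
        (pix_red   <= red_upper)   && (pix_red   >= red_lower)   &&
        (pix_green <= green_upper) && (pix_green >= green_lower) &&
        (pix_blue  <= blue_upper)  && (pix_blue  >= blue_lower);
</pre>
<p>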
The second image, right (fig.7.2), shows the system tuned to filter only blue values.  In the image, the FPGA has removed all color information that does not pass the six filter criteria.  Here the reader can see how other, unexpected, objects tend to contain bits of the color in question.  This can be observed in the circular object, which is actually the outer reflection of the lamp in the logic analyzer's display.  The following filter values were used for this image; <i>notice the substantial green content in the blue filter values</i>:
<ul>
<li>red_bounds   {<b>16:5</b>}
<li>green_bounds {<b>30:16</b>}
<li>blue_bounds  {<b>31:12</b>}
</ul>
<p>
The text / font, cursor and locating lines are overlaid on the image in real time via information sent to the FPGA from the u-Processor.  The entire section of the SRAM that contains the video image being captured is memory mapped into the u-Processor's memory space.  The X & Y data points (blob centroid) are read directly from the FPGA through memory-mapped registers available to the u-Processor after video frame conversion is complete.  All that is required of the u-Processor system to overlay text and graphics, then, is to write to the corresponding memory locations.
<p>
There is some hidden complexity in action here, as there are really two storage spaces in the external SRAM for each image being captured.  The first is the location for the image currently being captured.  This location is swapped (ping/pong style) with the secondary location, which is the location currently being drawn on the display screen and also being used for processor graphics & font overlay.  This necessary evil ensures that torn / partially captured video frames are never output to the LCD, which is also a real-time system.
<p>
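A minimal sketch of this ping/pong swap, with placeholder names for the two buffer base addresses, might look like the following:
<pre>
// Illustrative only: swap the capture and display regions of the SRAM when a
// capture completes.  BUF0_BASE / BUF1_BASE are hypothetical base addresses.
reg buffer_select;

always @(posedge clk) begin
    if (end_of_screen_capture)
        buffer_select <= ~buffer_select;   // ping/pong swap
end

// One region is written by the capture state machine while the other is read
// by the display / overlay logic; the roles exchange after every swap.
wire [16:0] capture_base = buffer_select ? BUF1_BASE : BUF0_BASE;
wire [16:0] display_base = buffer_select ? BUF0_BASE : BUF1_BASE;
</pre>
<p>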

<table width="660"align="left"><tr><td><center>
<h3>Figures 7.3 ~ 7.6:</h3>
      <p> <img src="new_set_000.gif"width="320"height="240"><img src="set3_025.gif"width="320"height="240"> 
        <img src="set3_026.gif"width="320"height="240"> <img src="set3_027.gif"width="320"height="240">
    </table>
If the reader is paying close attention, they may notice that the numbers graphically overlaid (fig.7.1 & fig.7.2) are a little "<i>off</i>" for the locations indicated on a typical CCIR/ITU601 video signal.  In this application, 1/2 of the vertical video lines and a little more than 1/2 of the horizontal video samples are being thrown away.  This is only for formatting purposes, given the 320x240 resolution of the LCD being used.  The discussion above in the section on the capture algorithm could be applied if a higher resolution LCD were used.
<p>
The next four images in this section (figures 7.3 ~ 7.6) are direct digital captures from the vision system.  The first pair of images shows the laser light tracked on the author's finger between a distance of ~2' and 4'.  The second image pair shows the differences tracked between 4' and ~6' with a lateral shift.  Graphics overlay takes place in the FPGA prior to image download.  In this setup, the vertical displacement between the laser field generator and the camera is small (~2"), thus leading to lower parallax displacement between images.  At this point the reader may be wondering about the color content of these images and the fact that it looks a little blotchy.  These images were captured before a misalignment in the passing of the 4 lsb green bits was discovered.
<br clear="left">
<p>
<hr>
<p>

<h2>Application 2: Structured Light Extraction</h2>
<p>
If the reader has read the author's earlier articles, like the one from Oct of '91 on laser range finding, they will then understand the concepts and the importance of this functionality to mobile robotics.  As a second application, this system has been used to implement a laser range finding function in the FPGA as well.
<p>
<a name="dj10v"></a>
<table align="left"width="80"background="grid.gif"><tr><td><center>
<a href="djv10.htm"><img src="code.gif"><p>video ranging.v</a></table>
