<p>
<h2>Introduction:</h2>
<p>
<table align="right"border="2"cellpadding="6"background="grid.gif"><tr><td>
<center><b>Article Format:</b></center>
<ul>
<li>Introduction
<li>Overview
<li>Interconnects
<li>VIP
<li>FPGA
<li>Code
<li>Algorithm
<li>Image Transfer
<li>Application
<li>Other Functions
<li>Schematics
<li>Implementation
<li>First Steps
<li>Future Steps
<li>Wrap-up
</ul>
</table>
In the author's past Encoder articles, the review of technology related to amateur robotics has grown in an ever-increasing fashion. Some of the more recent articles focused on digital logic like CPLDs, and still others on advanced logic implementations like laser range finding. This article takes the progression of advanced logic application one step further, detailing the implementation of a generalized color vision system for embedded robotics applications.
<p>
Typical of many of the systems developed by the author, this system offloads most of the functionality from the processor, making it ideally suited for single-processor embedded robotics applications. Offloading the vision data-gathering tasks from the processor makes this system truly real-time. Not that it is extremely fast or resource efficient, but it is real-time in the sense that the processor is unloaded to the point that it may handle any number of other events without being bogged down by the need to service the vision system constantly. Compared with other vision systems available to amateur robotics enthusiasts today, there is no overhead involved in moving the data into or out of the processor's memory space.
<p>
This article will attempt to cover all high-level aspects of the vision system at a level appropriate for an intermediate-to-advanced reader already familiar with some concepts relating to video imaging, programmable logic and general high-speed embedded processor design. An overview of the technology will be followed by application data and wrapped up with implementation details.
<p>
<h2>Overview:</h2>
<p>
Where past vision articles presented specific implementations detailing data extraction from NTSC-style video signals, this article presents a generalized vision system implementation, in that it allows for full-frame, real-time video capture. As the article progresses, detailed implementation as well as some specific applications are presented, like color blob detection and a new version of laser range finding. To start off, an overview with descriptions of the large / granular blocks of functionality follows:
<p>
This design makes use of six basic functional blocks to build the vision system (u-Processor, FPGA, SRAM, Video Input Processor, Camera & LCD display). Referring to fig.1.1 below, the reader can identify each of these blocks and the connectivity they share. The thicker lines with arrows represent the video paths (analog & digital).
<p>
<center>
<h3>Figure 1.1: Design Overview</h3>
<p>
<img src="figure_1.gif"width="448"height="226">
</center>
<p>
The first of these functional blocks is the u-Processor system. Since this is a generic vision system, the article will not focus on any u-Processor specifically. The implementation is generic enough that just about any processor with an address bus and data bus that allows for memory-mapped operations will support it. Near the end of this article the author will demonstrate the specifics of the design implementation and processor choice. It is enough for now to understand that practically any processor will suffice. It is also relevant to note that even underpowered 8-bit processors can be used, as the high overhead of running the system is taken on by the programmable logic.
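<p>
To illustrate what "memory mapped" means here, a minimal 'C' sketch follows. The base address, the offset and the names are hypothetical placeholders, not from this project; the real decode depends entirely on the processor and the FPGA's address decoding.
<p>
<pre>
/* Hypothetical sketch of memory-mapped access to the vision system.
 * VISION_BASE is a placeholder value; the real decode depends on the
 * processor and the FPGA's address decoding.                          */
#define VISION_BASE   0x00400000UL   /* assumed base address */
#define VISION_IMAGE  ((volatile unsigned char *)VISION_BASE)

/* Read one byte of captured image data straight out of the FPGA's
 * memory-mapped image buffer; no copy into processor RAM is needed.   */
unsigned char read_image_byte(unsigned long offset)
{
    return VISION_IMAGE[offset];
}
</pre>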
<p>
The second functional block in the vision system is the heart of the operation. Programmable logic is used to implement a number of state machines, memory-mapped registers and other functions to off-load the routine image capture functions from the processor. The logic quantity and density required to implement all of the functionality of the color vision system call for devices much larger than the CPLDs used in the author's previous projects. In this project an FPGA produced by Xilinx will be used. In choosing an FPGA it was important to find one that used voltage levels compatible with the processor; the need for voltage-level conversion chips just drives up board size and debugging complexity. The choice that best suited the color vision system comes from the Spartan-IIE family by Xilinx, the XC2S300E. A much smaller FPGA could have been used in its place (even the smallest member of the family, the XC2S50E); however, this robot runs multiple vision subsystems concurrently in the same chip, as well as other features, not discussed in this article, that need the extra logic.
<p>
The third functional block in the vision system is the analog front end. In a sense it represents an analog-to-digital conversion function. Since the input to this system is color NTSC video and the output from this system is raw digital video, it takes a bit more circuitry than a standard high-speed analog-to-digital converter. If the goal were black-and-white image capture, a simple A-to-D would be enough, but capturing the phase relationship to the high-speed carriers in color video requires much more power. This raw silicon power comes in the form of a VIP (Video Input Processor). The VIP used in this design is the SAA7111A produced by Philips Semiconductor.
<p>
<table align="right"><tr><td><center>
<h3>Figure 1.2:<p>Camera</h3>
<p> <img src="color_cam.jpg"width="150"height="387"></table>
The next (fourth) functional block in the vision system is optional in some cases. The block represents a static RAM pool. It is connected directly to, and only to, the FPGA. Data is moved into and out of the RAM by a state machine within the FPGA; in this case, the data is that of the video image being captured. This component is optional based on some of the design constraints. Today's modern FPGAs contain large amounts of RAM, and the amount of RAM required to store an image depends on the size (X-Y resolution plus color depth) of the image the system is designed to capture. The FPGA used in this example has enough internal RAM (referred to by Xilinx as Block-RAMs) to store an entire video image (of size sufficient for many robotics applications, like 1/4-D1). This system hosts external SRAM due to system requirements to implement multiple concurrent video systems in the same FPGA as well as other functions. To simplify the interface, a pair of SRAMs was chosen, giving 16-bit-wide access to accommodate color depth and timing issues. More on this later... The parts are part number AS7C34096 from Alliance Semiconductor.
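<p>
As a rough sizing check (assuming a 1/4-D1 capture of roughly 360 x 240 pixels, a figure not stated explicitly in this section): 360 x 240 pixels x 2 bytes per pixel comes to about 169K bytes per frame, while the pair of 512K x 8 SRAMs provides 1Meg-byte in total, leaving room for multiple frames plus the other functions mentioned.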
<p>
The fifth functional block in the vision system is the color video camera. The choice of video camera, made many years back, centered around the physical size required to fit the space available on the robot. In this application, as long as the optics and CCD are of reasonable quality, the NTSC - CCIR/ITU-601 definition assures that most cameras will work. Cameras covering the entire range from Jameco to Edmund Scientific will fit the bill. The image to the right (<u>fig.1.2</u>) shows the camera, including optics, installed in a test fixture on the author's bench. The laser, sonar and IR sensors are also included in the picture to offer the reader a sense of just how small high-quality color NTSC cameras can be packaged these days.
<p>
The sixth and final functional block in the vision system is a built-in, direct-drive color LCD display. It is drawn with a dotted line in the figure above, as it will not be detailed in full in this article. The complexity of the vision system, compounded with that of the display system and their interactions with the same SRAM, will have to wait for yet another article. It is noteworthy that built-in display technology greatly simplifies the debugging of video systems.
<p>
<br clear="right">
<table align="right"background="grid.gif"><tr><td>
<center><b>Table 1:</b></center>
<ul><li>Camera Interface:<ul><li>Power +12VDC & Return<li>Composite video & signal return</ul></ul>
<p>
<ul><li>VIP Digital Output:<ul><li>Digital Color Space Data<ul><li>5-bits Red<li>5-bits Green<li>5-bits Blue</ul><li>H-sync<li>V-sync<li>data clock</ul></ul>
<p>
<ul><li>SRAM Interface:<ul><li>16-data bits (15 used, Bi-Dir)<li>19-address bits<li>Write Enable<li>Output Enable<li>Chip Select</ul></ul>
<p>
<ul><li>Processor Interface:<ul><li>8-data bits (bidirectional)<ul><li>Processor Data Bus [31:24]</ul><li>18-address bits<li>Read/Write<li>Chip Select (8 bit access)</ul></ul>
</table>
<h2>Interconnects:</h2>
<p>
In fig.1.1, above, the arrows represent electrical interconnection. The thicker lines depict data paths for video data, both analog and digital. This data takes on different formats between the different blocks in the picture. The easiest data path to follow is that of data capture (right to left), from the camera to the processor. Between blocks 5 & 3 (Video Camera & VIP) the video is in analog NTSC format and is conducted through impedance-matched 75-ohm coaxial cable. Between blocks 3 & 2 (VIP & FPGA) the data is in a multi-wire, time-synchronous digital format with separate syncs (H&V), a clock wire and 16 bits of color data. Between blocks 2 & 4 (FPGA & SRAM) the video data is content only; the time content (H-Sync, V-Sync & Clock) has been abstracted away as the address at which the data is stored. (More about this later.) Finally, between blocks 1 & 2 (u-Processor & FPGA) the video data is randomly accessed, memory-mapped image data <i>(across the processor's address & data bus)</i>. The application outlined in this article will demonstrate an 8-bit data bus, to directly reach the largest reader base, but any width can be used.
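<p>
To make the "time abstracted as address" idea concrete, here is a minimal 'C' model of the addressing performed by the FPGA's capture state machine. The image width is an assumed placeholder value, not a figure from this project.
<p>
<pre>
/* Model of the capture state machine's addressing: the H/V syncs and the
 * pixel clock are consumed inside the FPGA, and each pixel's position on
 * the screen becomes its SRAM address.  LINE_PIXELS is an assumption.   */
#define LINE_PIXELS 360UL    /* assumed pixels per captured line */

unsigned long pixel_address(unsigned long x, unsigned long y)
{
    return y * LINE_PIXELS + x;   /* time (x,y) -> memory address */
}
</pre>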
<p>
The table, right, outlines the number and type of signals represented by each of the thick black interconnect lines in Fig.1.1, above.
<p>
There are two other types of interconnect depicted in Fig.1.1, represented by thin lines. The first of these interconnects, between blocks 1 & 2, connects some general-purpose IO pins on the u-Processor to the programming pins on the FPGA, used to load the FPGA at boot time. The second interconnect, between blocks 2 & 3, is an I2C connection. Philips uses I2C to interface to the control registers in most of their video processing ICs, so it is no surprise to find it here. The FPGA in this application implements several concurrent I2C engines. (More on this later in section 9, Other Verilog Functions.)
<p>
<h2>Operational Overview:</h2>
<p>
Referring to Fig.2, below, the reader can see the physical instantiation of each of the six elements from the preceding discussion. There are three major differences between the physical implementation and the overview depicted in Fig.1.1. First, the vision system design is a daughter card, and as such the u-Processor system resides on another circuit card. The 64-pin expansion header interconnects the u-Processor address and data bus, including a handful of control signals, between PCB assemblies. There are bi-directional bus buffers on the u-Processor circuit card to preserve signal drive integrity, at speed, through the connector. The second primary difference is that there are 3x VIPs installed on this circuit card for interfacing to 3x video subsystems, all running concurrently within the FPGA. The third and final deviation from the design template of Fig.1.1 is the connection of the I2C busses. These don't run from the FPGA on this board as indicated in the overview Fig.1.1. Rather, these I2C communication channels run from a separate FPGA on the main (u-Processor) circuit card assembly and are connected to the 3x VIPs via discrete signals on the 64-pin expansion stacking bus. Even though a 208-pin TQFP package is used in this design, the article will demonstrate a little later that a limitation of available IO pins drove this '<i>awkward</i>' design choice.
<p>
<center>
<h3>Figure 2: Design Footprint</h3>
<p> <img src="board_top.gif"width="507"height="433">
</center>
<p>
Looking at these two figures (1.1 & 2) as data-flow templates, the reader can identify a right-to-left data path for the video as it passes through the system. (5=>3=>2=>4=>2=>1)
<p>
<a name="saaregload"></a>
<h2>VIP: (Video Input Processor)</h2>
<p>
<table align="right"width="160"background="grid.gif"><tr>
<td><center><a href="http://www.semiconductors.philips.com/acrobat_download/datasheets/SAA7111A_4.pdf">
<img src="adobe.gif">
<p>SAA7111 Data Sheet</a>
<td><center><a href="saaregload.htm"><img src="code.gif"><p>VIP Config 'C'-code</a>
</table>
The article focuses on digital FPGA capture and processing of video images; however, as indicated in the section above, the connection to the video camera is analog NTSC video. It follows that somewhere in between, a conversion from analog to digital must take place for the capture process to begin. The device for this task is called a VIP, or Video Input Processor.
<p>
The VIP used in this project is the SAA7111A made by Philips Semiconductor. A link to the PDF data sheet is available, right. Given the complexity of a 72-page data sheet, with a couple dozen configuration registers controlling wide-ranging filter values and digital interface settings, the 'C' code used in this project to initialize and configure the VIP has also been provided.
<p>
The <code>i2c_write</code> and <code>i2c_read</code> functions used in the 'C' source to the right are presented later in the article, in the section that deals with I2C Verilog functions. For now it is enough to know that the parameters passed to the functions are as follows: 1) the I2C bus number to write to; 2) the address of the unit on the bus to access; 3) the sub-address to access; 4) the data value to write.
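<p>
A short sketch of how such a configuration sequence might look in 'C' follows. The <code>i2c_write</code> prototype matches the parameter order just described, but the register/value pairs shown are illustrative assumptions only; the project's actual settings are in the linked VIP Config 'C'-code.
<p>
<pre>
/* Assumed prototype matching the parameter order described above:
 * (bus number, device address, sub-address, data value).              */
extern void i2c_write(int bus, unsigned char addr,
                      unsigned char sub, unsigned char data);

/* Illustrative register/value pairs -- NOT the project's real settings. */
struct reg_init { unsigned char sub; unsigned char val; };

static const struct reg_init vip_table[] = {
    { 0x02, 0xC0 },   /* assumed: analog input select     */
    { 0x08, 0x88 },   /* assumed: sync / field control    */
    { 0x10, 0x02 },   /* assumed: output format select    */
};

void vip_init(int bus, unsigned char vip_addr)
{
    unsigned int i;
    for (i = 0; i != sizeof(vip_table) / sizeof(vip_table[0]); i++)
        i2c_write(bus, vip_addr, vip_table[i].sub, vip_table[i].val);
}
</pre>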
<p>
<table align="left"background="grid.gif"cellpadding="6"><tr><td><center><b>Table 2:</b></center><ul><li>R = Y + 1.371 Cr<li>G =Y-0.336 Cb - 0.698 Cr<li>B = Y + 1.732 Cb.</ul></table>
The SAA7111A proved the right choice for this project due to its support of the legacy RGB raw digital video format with separate syncs (H&V) plus clock. In contrast, most of the newer VIPs on the market today use the YUV color space. The SAA7111A provides the color space conversion for the designer, which removes computational complexity from the FPGA. The conversion from Y,Cr,Cb to RGB is governed by the equations in table 2, left. Various blocks of open-source Verilog code for FPGAs exist to convert between color spaces. Another design criterion considered was the parallel 19-bit-wide data output format of the SAA7111A vs. the 8-bit, time-multiplexed, coded CCIR-656 format supported by so many other VIPs on the market. At the VIP decision point of the design, this project was already somewhat complex as far as amateur projects go; thus, the decision was made to simplify and choose a chip set with the desired output format.
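<p>
For readers who do want to perform the Table 2 conversion in software rather than in the VIP, a minimal integer 'C' sketch follows. The fixed-point scaling (coefficients multiplied by 1024), the 128 offset assumed on Cb and Cr, and the clamping are illustrative choices, not taken from the project code.
<p>
<pre>
/* Clamp an intermediate result into the 0..255 pixel range. */
static unsigned char clamp8(int v)
{
    if (v > 255) return 255;
    if (v >= 0)  return (unsigned char)v;
    return 0;
}

/* Table 2 conversion, scaled by 1024 for integer-only arithmetic. */
void ycrcb_to_rgb(int y, int cb, int cr,
                  unsigned char *r, unsigned char *g, unsigned char *b)
{
    cb -= 128;   /* assumed offset-binary chroma */
    cr -= 128;
    *r = clamp8(y + (1404 * cr) / 1024);            /* R = Y + 1.371 Cr            */
    *g = clamp8(y - (344 * cb + 715 * cr) / 1024);  /* G = Y - 0.336 Cb - 0.698 Cr */
    *b = clamp8(y + (1774 * cb) / 1024);            /* B = Y + 1.732 Cb            */
}
</pre>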
<p>
In this application, RGB color space is preferred for the ease with which color "closeness" can be determined. Color closeness has application in edge detection and advanced blob detection. The relationship is governed by projecting the 3 components of color (RGB) as Cartesian coordinates into a three-dimensional color space (cube). The closeness of a color match between two differently colored pixels is then given by the radius of the 3D sphere encompassing both pixels with one at its center. The quantitative value is governed by the standard 3D distance formula (fig.3, below), where R2,G2,B2 and R1,G1,B1 are the color components of the respective pixels being compared. Future developments related to this vision system will take advantage of this convenient relationship.
<p>
<center>
<h3>Figure 3: Color Space Distance Equ.</h3>
<p>
<img src="figure_3.gif"width="627"height="72">
</center>
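<p>
In practice the comparison can skip the square root entirely by working with squared distances, as in this minimal 'C' sketch of the fig.3 relationship (the function name and threshold style are illustrative, not from the project):
<p>
<pre>
/* Two pixels "match" when the squared Euclidean distance between their
 * RGB components falls inside a sphere of the given radius.  Comparing
 * squared values avoids a sqrt() on every pixel.                       */
int color_close(int r1, int g1, int b1,
                int r2, int g2, int b2, int radius)
{
    int dr = r2 - r1, dg = g2 - g1, db = b2 - b1;
    return (radius * radius) >= (dr * dr + dg * dg + db * db);
}
</pre>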
<p>
The connectivity of the SAA7111A is pretty straightforward, as the snippet from the app-note below (Fig.4.1) indicates. Complexity arises when dealing with the volume of data produced by the VIP in real time. In the digital configuration used, the VIP outputs a 16-bit word of data + 2 syncs and a data clock at a rate of <u>13.5Meg-words/sec</u> (at 2 bytes per word, that is 27Meg-bytes of raw pixel data every second). Storing and algorithmically processing data at these volumes is where the FPGA, which the article will address next, comes into play. First, a quick review of some issues when mixing analog and very high-speed digital circuitry.
<p>
<center>
<table cellspacing="10"><tr><td><center>
<h3>Figure 4.1: Implementation Snippet</h3>
<p>
<img src="saa_chip.gif"width="386"height="492">
Ctrl + -