Introduction
============

The ideas presented here are part of the results obtained in the framework of my
PhD thesis on visual recognition of gestures at Tec de Monterrey Campus Cuernavaca
in Cuernavaca, Morelos, Mexico. The source code is intended primarily for people
interested in gesture recognition, and to share my implementations of color
detection, feature extraction, etc. using a single videocamera. The visual system
is based on the OpenCV libraries.

General procedure
=================

The system starts with a person positioned in front of the videocamera at a
distance of approximately 2 to 3 m, although this depends on your camera's field
of view. Once the user's face has been located, anthropometric measures based on
the face width are used to estimate the user's torso and the expected initial
position of the right hand, assuming that the arm is initially at a rest position.
At this stage, skin and non-skin "personal" color histograms are generated from
the user's face and torso, respectively; for simplicity, in this implementation
the whole process is done with a single image. The face region is also fixed, so
it is assumed that the person does not move once the face has been located.

Upon completion of the above steps, skin pixels over the expected position of the
hand are segmented with Camshift. The backprojected image required by Camshift is
obtained by fusing the "personal" and "general" skin color histograms, as
described below. If the hand (or a patch of skin pixels) has been found, the
system marks the initial region where the hand has been segmented for the first
time and starts to track the hand motion. Tracking is focused on a region of
interest of fixed size.

A spatial criterion is used to record the user's "gesture" information: if the
hand is outside its initial position, the user's posture is queued, and when the
hand has returned to its initial region, the system stops queueing and records the
observation sequence of the gesture to a file. Gesture information is composed of
two rows. The first row starts with 'TPL:XX', where XX is the number of
observations for that gesture. The second row contains the complete sequence of
the gesture, with each individual observation within parentheses. The values of
each observation are (x, y) coordinates of the user's hand, torso and head, in
the following order:

1) (x, y)-coordinates of the upper and lower corners of the rectangle that
   segments the right hand,
2) (x, y)-coordinates of the upper and lower corners of the rectangle that
   segments the user's torso, and
3) (x, y)-coordinate of the center of the user's face.

(Yes, I know, I wish I had paid more attention to my advisor's comments and
recorded the whole rectangle of the face...)

An example of a complete gesture sequence:

TPL:13
(281 268 308 325 315 183 389 294 353 137) (278 248 307 314 315 183 389 294 353 137)
(276 231 301 265 315 183 389 294 353 137) (277 212 305 239 315 183 389 294 353 137)
(283 189 311 215 315 183 389 294 353 137) (285 159 309 202 315 183 389 294 353 137)
(287 147 308 194 315 183 389 294 353 137) (284 146 307 200 315 183 389 294 353 137)
(281 170 301 210 315 183 389 294 353 137) (282 188 303 224 315 183 389 294 353 137)
(283 206 305 240 315 183 389 294 353 137) (281 237 318 258 315 183 389 294 353 137)
(279 268 318 292 315 183 389 294 353 137)

All (x, y) coordinates are relative to the usual upper-left corner of the image
and assume a resolution of 640x480 pixels.
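For illustration only, here is a minimal C++ sketch of a reader for this file
format. It is not part of the original sources: the struct, the function names
and the assumption of one gesture per file are mine; only the layout (a TPL:XX
row followed by a row of 10-value observations in parentheses) comes from the
description above.

    // Sketch: parse one gesture file in the TPL format described above.
    #include <cstdio>
    #include <fstream>
    #include <sstream>
    #include <string>
    #include <vector>

    struct Observation {
        int hand_x1, hand_y1, hand_x2, hand_y2;     // rectangle of the right hand
        int torso_x1, torso_y1, torso_x2, torso_y2; // rectangle of the torso
        int face_x, face_y;                         // center of the user's face
    };

    std::vector<Observation> read_gesture(const std::string &path)
    {
        std::vector<Observation> gesture;
        std::ifstream in(path.c_str());
        std::string line;

        // First row: "TPL:XX", the number of observations.
        if (!std::getline(in, line)) return gesture;
        int expected = 0;
        std::sscanf(line.c_str(), "TPL:%d", &expected);

        // Second row: the whole sequence, one observation per pair of parentheses.
        if (!std::getline(in, line)) return gesture;
        std::size_t pos = 0;
        while ((pos = line.find('(', pos)) != std::string::npos) {
            std::size_t end = line.find(')', pos);
            if (end == std::string::npos) break;
            std::istringstream obs(line.substr(pos + 1, end - pos - 1));
            Observation o;
            obs >> o.hand_x1 >> o.hand_y1 >> o.hand_x2 >> o.hand_y2
                >> o.torso_x1 >> o.torso_y1 >> o.torso_x2 >> o.torso_y2
                >> o.face_x >> o.face_y;
            gesture.push_back(o);
            pos = end + 1;
        }
        // 'expected' should equal gesture.size(); a mismatch means a truncated file.
        return gesture;
    }

    int main(int argc, char **argv)
    {
        if (argc < 2) return 1;
        std::vector<Observation> g = read_gesture(argv[1]);
        std::printf("read %zu observations\n", g.size());
        return 0;
    }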
You can see an example of how to extract some posture and motion features by
downloading the gesture database provided in the same SourceForge project.

Results
=======

When the face has been located you will see the following in the image:

* A rectangle showing the face region
* A rectangle of the user's torso
* A blue rectangle that shows the estimated initial hand position

Also, when the hand has been located you will see:

* A red rectangle that delimits the position where the hand has been located
* A yellow rectangle that corresponds to the region of interest
* A small green rectangle inside the ROI to segment the hand

A video that shows the execution of the visual system and the test environment
can be seen at:

http://www.youtube.com/watch?v=dFff01Tjvww

Useful parameters
=================

* The user's anthropometric measures (body_dimensions() function). In particular,
  the hand must lie within the expected initial hand position, depicted in blue.
  This avoids some problems, for example if you wear a brown belt like me.
  Silver colors are a real pain in the neck too.
* The fit() function adjusts the rectangle that segments the user's face; modify
  the fitting criteria if you do not have an oval or round face. Since the
  anthropometric features are based on face width, I have found this can be an
  important parameter.
* The size of the region of interest used to track the hand.

Some assumptions and advice
===========================

* OpenCV is installed and running on your computer.
* A videocamera is attached to your computer, accessible via /dev/video0.
* The image resolution is 640x480 pixels.
* Use constant and controlled illumination (as far as possible).
* Environments with white walls and white lamps are not suitable. A background
  color that contrasts with the user's clothes and skin color is convenient.
* You should wear suitable clothes (long sleeves in a color that contrasts with
  skin).
* Fix your camera settings (brightness, contrast, etc.) to proper values. For
  example, the auto-gain function equalizes the image according to the lighting
  conditions; this produces color distortions, so the initial color histograms
  may no longer be valid.
* Please be sure that your face and right hand are visible within the field of
  view of the videocamera at start-up. If you are working alone, place your
  keyboard close to your hand.
* Place the videocamera with its base parallel to the floor and a tilt angle of
  0. I used to put my videocamera 97 cm above the floor.

Skin color adaptation
=====================

Following the work on skin classification:

Michael J. Jones, James M. Rehg: Statistical Color Models with Application to
Skin Detection. International Journal of Computer Vision 46(1): 81-96 (2002)

I constructed "general" skin and non-skin probability functions, P(RGB|skin) and
P(RGB|non-skin) respectively, with skin pixels taken from more than 30 people in
different environments and using 4 videocameras. Around 2 million skin pixels and
20 million non-skin pixels were sampled. Skin pixels of the hand were classified
with the probability rule:

P(RGB|skin) > P(RGB|non-skin)

Unfortunately, due to the extreme lighting conditions of the experimental
environment (which included white walls), the colors perceived by the camera
(Sony EVI-D30) were low-saturated and the system was not able to classify the
skin pixels accurately.
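As a rough illustration of this classification rule, the following sketch marks a
pixel as skin when its skin-histogram probability exceeds its non-skin-histogram
probability. It is my own illustration, not the original implementation: the
function name classify_skin, the 32-bin RGB quantization and the use of two
normalized 3D CV_32F histograms through OpenCV's C++ API are all assumptions of
the sketch.

    // Sketch: per-pixel skin classification with P(RGB|skin) > P(RGB|non-skin).
    #include <opencv2/opencv.hpp>

    // bgr is a CV_8UC3 image; skin_hist and nonskin_hist are 32x32x32 CV_32F
    // histograms normalized so that their entries sum to 1, approximating
    // P(RGB|skin) and P(RGB|non-skin).
    cv::Mat classify_skin(const cv::Mat &bgr,
                          const cv::Mat &skin_hist,
                          const cv::Mat &nonskin_hist)
    {
        const int bins = 32;                        // 8 color levels per bin
        cv::Mat mask(bgr.size(), CV_8U, cv::Scalar(0));

        for (int y = 0; y < bgr.rows; ++y) {
            for (int x = 0; x < bgr.cols; ++x) {
                cv::Vec3b px = bgr.at<cv::Vec3b>(y, x);
                int b = px[0] * bins / 256;
                int g = px[1] * bins / 256;
                int r = px[2] * bins / 256;
                float p_skin    = skin_hist.at<float>(r, g, b);
                float p_nonskin = nonskin_hist.at<float>(r, g, b);
                if (p_skin > p_nonskin)
                    mask.at<uchar>(y, x) = 255;     // mark pixel as skin
            }
        }
        return mask;
    }

The bin count is arbitrary as long as both histograms are quantized the same way.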
Then, I decided to obtain some color samples from the user's face and combine this
information with the general color function previously constructed. I tested
different fusion rules based on the classic expected value

P(RGB|skin) = [V_g * P_g(RGB|skin)] + [V_p * P_p(RGB|skin)], where V_g + V_p = 1.

Unfortunately, this function failed, probably because of the need to select good
parameters V_g and V_p. To solve this problem I used the following rule to combine
both histograms:

P_f(RGB|skin) = P_g(RGB|skin) * P_p(RGB|skin),

where P_g() is the probability function of the "general" color histogram
previously constructed and P_p() is the function of the colors taken from the
user's face. This function detected skin pixels well by applying:

P_f(RGB|skin) > P_f(RGB|non-skin).

P_f(RGB|non-skin) is constructed by fusing P_g(RGB|non-skin), the general non-skin
function, with P_p(RGB|non-skin), built from colors taken from the user's torso.
An intuitive explanation of the behaviour of this rule is that it increases the
probability of pixels that agree in both histograms, while making those that do
not agree less probable. (A small code sketch of this fusion is given at the end
of this README.)

With the system described above, I was able to obtain more than 7000 gesture
samples taken from 15 people.

Development and test environments
=================================

The program was compiled on Linux Ubuntu Gutsy using g++ v4.1 and tested with a
Logitech QuickCam Pro 9000 webcam. The initial versions of this program (actually
lost in one of several departures) were tested on Silicon Graphics machines and
IBM workstations with an EVI-D30 videocamera.

Contact info
============

Please feel free to send me corrections, comments, suggestions, extensions, etc.
to haviles@turing.iimas.unam.mx, with a copy to hector_hugo_aviles@hotmail.com.
In case of corrections or improvements, I will add them with the corresponding
credits. Help me to improve this work! If you find this information useful, I'd
really appreciate it if you could drop me a line, or better, cite one of my
papers:

* H.H. Aviles-Arriaga, L.E. Sucar and C.E. Mendoza. Visual Recognition of Similar
  Gestures. 18th International Conference on Pattern Recognition (ICPR 2006),
  Volume 1, pp. 1100-1103, Hong Kong, 2006.
* Hector Aviles and Enrique Sucar. Real-Time Visual Recognition of Dynamic Arm
  Gestures. In Video-Based Surveillance Systems: Computer Vision and Distributed
  Processing. Remagnino, P.; Jones, G.A.; Paragios, N.; Regazzoni, C.S. (Editors),
  296 pages, 2002. Hardcover ISBN: 978-0-7923-7632-3.
* Hector Aviles and Enrique Sucar. Dynamic Bayesian networks for visual
  recognition of dynamic gestures. Journal of Intelligent and Fuzzy Systems,
  Volume 12, Number 3-4, pages 243-250, 2002. ISSN: 1064-1246.
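The fusion sketch referred to in the skin color adaptation section is given below.
Again, this is my own illustration rather than the original code, and the function
name fuse_histograms is hypothetical; it assumes the same normalized 3D CV_32F
histogram layout used in the classification sketch above.

    // Sketch: P_f(RGB) = P_g(RGB) * P_p(RGB), element-wise, then renormalize.
    #include <opencv2/opencv.hpp>

    cv::Mat fuse_histograms(const cv::Mat &general, const cv::Mat &personal)
    {
        cv::Mat fused;
        cv::multiply(general, personal, fused);       // element-wise product
        double total = cv::sum(fused)[0];
        if (total > 0.0)
            fused.convertTo(fused, -1, 1.0 / total);  // renormalize to sum to 1
        return fused;
    }

The same fusion is applied to the non-skin histograms, and the fused pair is then
used with the rule P_f(RGB|skin) > P_f(RGB|non-skin), as described above.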
