When an image is added to the database, the <b>AnalysisManager</b> calls the
<tt>extract()</tt> member function of the <b>Pattern</b> class
(<A href="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/patternx.h">patternx.h</A>).
<b>Pattern</b> creates a 3-dimensional feature vector in which each element
holds the percentage of bias in the horizontal, diagonal, or vertical direction.
<p>
The pattern extraction algorithm is a multistep process. First, it creates
a greyscale copy of the original image. RGB values are converted to
brightness values using the same translation as black-and-white television.
Television images are transmitted using the YIQ color model. Black-and-white
television receivers display only the Y portion, the luminance signal:
<br>
<br>
<p align=center>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bright.jpg">
</p>
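<p>
The greyscale conversion can be sketched as follows. This is a minimal illustration in Python, not the project's own code; it assumes the standard NTSC luminance weights (0.299, 0.587, 0.114) and 8-bit RGB components, and the function names <tt>luminance</tt> and <tt>to_greyscale</tt> are hypothetical.

```python
def luminance(r, g, b):
    """Return the NTSC luminance (the Y of YIQ) for one RGB pixel.

    Assumes the standard NTSC weights; the original implementation
    may use slightly different constants.
    """
    return 0.299 * r + 0.587 * g + 0.114 * b


def to_greyscale(pixels):
    """Convert rows of (r, g, b) tuples into rows of integer brightness values."""
    return [[int(round(luminance(r, g, b))) for (r, g, b) in row]
            for row in pixels]


if __name__ == "__main__":
    # White and black map to the extremes; pure green is brighter than pure red.
    print(to_greyscale([[(255, 255, 255), (0, 0, 0)],
                        [(255, 0, 0), (0, 255, 0)]]))
```

Green carries the largest weight because the eye is most sensitive to it, so a pure green pixel ends up brighter than a pure red or pure blue one.
<p>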
The greyscale image is then processed to detect edges using the Sobel
operator<a href="#ref10">[10]</a>. This method applies two 3 x 3 kernels to the
neighborhood of each pixel to estimate the brightness derivatives
<em>dB/dx</em> and <em>dB/dy</em>. For the derivative in the horizontal direction,
the following kernel is used:
<pre>
1 0 -1
1 0 -1
1 0 -1
</pre>
For the derivative in the vertical direction, the following kernel is used:
<pre>
1 1 1
0 0 0
-1 -1 -1
</pre>
Conceptually, a kernel is applied to an image by sliding the kernel over the image
and summing the products of the kernel values with the brightness values under
them. The result is the derivative, or brightness slope, at the pixel under the center
of the kernel.
After both kernels have been applied to a pixel neighborhood, the gradient magnitude is computed:
<br>
<br>
<p align=center>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/magn.jpg">
</p>
This value is assigned to the pixel under the center of the kernel. To emphasize the
edges, the algorithm thresholds the value to black or white. The following image
is a result of this process:
<p align=center>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/sobel.jpg"> <img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/plaid1.jpg">
</p>
</p>
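<p>
The edge-detection step described above might look like the sketch below. It uses the two kernels exactly as listed in the text (these are uniformly weighted; the classic Sobel formulation weights the middle row or column by 2). Several details are assumptions made for illustration: the magnitude is taken to be the Euclidean norm of the two derivatives, the threshold of 128 is arbitrary, border pixels are simply left black, and the names <tt>apply_kernel</tt> and <tt>edge_image</tt> are hypothetical.

```python
import math

# Kernels as listed in the text.
KX = [[1, 0, -1],
      [1, 0, -1],
      [1, 0, -1]]      # horizontal derivative dB/dx
KY = [[1, 1, 1],
      [0, 0, 0],
      [-1, -1, -1]]    # vertical derivative dB/dy


def apply_kernel(img, kernel, x, y):
    """Sum the products of a 3x3 kernel with the pixels centered on (x, y)."""
    total = 0
    for j in range(3):
        for i in range(3):
            total += kernel[j][i] * img[y + j - 1][x + i - 1]
    return total


def edge_image(img, threshold=128):
    """Return a black/white edge image; the threshold value is an assumption."""
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    # Skip the one-pixel border, where the kernel would fall off the image.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            dx = apply_kernel(img, KX, x, y)
            dy = apply_kernel(img, KY, x, y)
            magnitude = math.hypot(dx, dy)   # assumed: sqrt(dx^2 + dy^2)
            out[y][x] = 255 if magnitude >= threshold else 0
    return out


if __name__ == "__main__":
    # A dark-to-bright vertical boundary between columns 1 and 2.
    sample = [[0, 0, 255, 255, 255] for _ in range(5)]
    for row in edge_image(sample):
        print(row)
```

In the sample image the only brightness change is across the boundary between the dark and bright columns, so only the interior pixels next to that boundary are marked white.
<p>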
Finally, the edge image is traversed and the two kernels are applied again. This time the
derivatives are much stronger along the directional bias of the image. The direction of the slope is
determined by:
<br>
<br>
<p align=center>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/direct.jpg">
</p>
This is a value between zero and PI. The algorithm quantizes the direction into one
of the three bins in the histogram. After this process is applied to the entire
image, the sums in the bins are converted to percentages.
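<p>
The quantization step can be illustrated with the sketch below. The text does not state the bin boundaries, so the ones used here (eighths of π) are an assumption, as are the names <tt>direction_bin</tt> and <tt>direction_histogram</tt>. Bins are labeled by the direction of the slope, and pixels with no gradient are skipped.

```python
import math


def direction_bin(dx, dy):
    """Return 0 (horizontal), 1 (diagonal) or 2 (vertical) for a gradient.

    The slope direction is folded into [0, pi); the bin boundaries at
    multiples of pi/8 are an assumption, not taken from the original code.
    """
    theta = math.atan2(dy, dx) % math.pi
    eighth = math.pi / 8
    if theta < eighth or theta >= 7 * eighth:
        return 0
    if 3 * eighth <= theta < 5 * eighth:
        return 2
    return 1


def direction_histogram(gradients):
    """Convert (dx, dy) pairs into percentages per direction bin."""
    counts = [0, 0, 0]
    for dx, dy in gradients:
        if dx == 0 and dy == 0:
            continue              # flat region: no direction to record
        counts[direction_bin(dx, dy)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [100.0 * c / total for c in counts]


if __name__ == "__main__":
    print(direction_histogram([(1, 0), (-1, 0), (0, 1), (1, 1), (0, 0)]))
```

Folding the angle into the range from zero to π treats opposite slopes, such as dark-to-bright and bright-to-dark, as the same direction.
<p>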
<p>
The result is a feature vector containing percentages of bias in the horizontal,
diagonal, and vertical directions. The distance metric ranks each of the three
direction biases as very strong, strong,
not strong, or weak. It then computes the <a href="#L1 norm">L1 norm</a>
distance between the two histograms being compared. This distance is compared to a user-definable
threshold value to determine similarity. The returned distance is used by the
<b>QueryManager</b> to order similar images.
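<p>
A rough sketch of the comparison and ordering follows. It covers only the L1 distance, the threshold test, and the sort; the ranking of biases into strength categories is omitted because the text does not specify it. The function names, the dictionary-based database, and the sample vectors are hypothetical.

```python
def l1_distance(a, b):
    """L1 norm distance: the sum of absolute element-wise differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_matches(query, database, threshold):
    """Return names of images within the threshold, most similar first.

    `database` maps an image name to its feature vector; this layout is
    an assumption made for the example.
    """
    matches = []
    for name, vector in database.items():
        distance = l1_distance(query, vector)
        if distance <= threshold:
            matches.append((distance, name))
    matches.sort()
    return [name for _, name in matches]


if __name__ == "__main__":
    db = {
        "a": [50.0, 25.0, 25.0],
        "b": [40.0, 30.0, 30.0],
        "c": [0.0, 0.0, 100.0],
    }
    print(rank_matches([50.0, 25.0, 25.0], db, threshold=38))
```

With percentage histograms, the L1 distance ranges from 0 for identical vectors to 200 for vectors with no overlap, which gives the threshold a fixed scale.
<p>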
<br>
<br>
<IMG SRC="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/line_col.gif"><br>
<H2><a name="Results">Results</a></H2>
The results of my project were positive. Using a scanner, I was able
to produce on-line pixmaps for 50 wallpaper samples. I implemented
software that supported both color and pattern based queries of a wallpaper database.
The software provides an intuitive user interface and supports easy addition of new
query classes.
<h3><a name="Color Queries">Color Queries</a></h3>
There are two ways to query the database by color: dialog and
example. The <b>Query By Color</b> dialog was described <a href="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/qcolor.jpg">above</a>.
A dialog query using three user-selected colors demonstrates the effectiveness of the
<a href="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/clrdlgex.jpg">query</a>. Three images are returned in order of similarity.
<p>
An example-based query uses the color histogram of a displayed wallpaper sample as
the query key. The user clicks the right mouse button on the desired example and
is presented with a popup menu. Selecting <b>Query by Color Example</b> will initiate
the <a href="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/clrbyex.jpg">retrieval</a>. <b>STRIPE7.BMP</b> was used as the example
wallpaper image and the threshold was set at 38.
<h3><a name="Pattern Queries">Pattern Queries</a></h3>
The software currently supports pattern queries by example only. The
example-based queries are initiated from a popup menu. A query based on wallpaper
sample <b>PLAID1.BMP</b> results in a set of <a href="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/patex.jpg">wallpapers</a>
containing strong directional bias in vertical and horizontal directions.
This query was restricted to display 10 images.
<p>
The current implementation
cannot distinguish between directional bias from lines
and bias from <em>noisy</em> patterns. Doing so would require a more sophisticated
feature extraction algorithm. For example, a smoothing step could be used to suppress
the noise before performing the gradient computations.
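<p>
One simple form such a smoothing step could take is a 3 x 3 box filter, sketched below. This is only an illustration of the idea, not part of the prototype; the choice of filter and the name <tt>box_blur</tt> are assumptions, and a Gaussian or median filter would be an equally plausible choice.

```python
def box_blur(img):
    """Replace each interior pixel with the mean of its 3x3 neighborhood.

    Border pixels are copied unchanged, which is a simplification.
    """
    height, width = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = sum(img[y + j][x + i]
                        for j in (-1, 0, 1)
                        for i in (-1, 0, 1))
            out[y][x] = total // 9
    return out


if __name__ == "__main__":
    # A single bright speck on a dark background is strongly attenuated.
    noisy = [[0, 0, 0],
             [0, 90, 0],
             [0, 0, 0]]
    print(box_blur(noisy))
```

An isolated noisy pixel is averaged with its neighbors and so produces a much weaker gradient, while a long line still contributes several bright pixels to each neighborhood along its length.
<p>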
<h3><a name="User Interface">User Interface</a></h3>
The user interface is fairly intuitive. The interface presents a familiar
environment to the Windows user by following
Microsoft Windows interface guidelines.
<p>
A sample user, unfamiliar with image processing, easily navigated through
the system. However, the user was confused about the meaning of the
threshold parameters, which must be entered as numbers. This confusion
could be alleviated by presenting a slider control representing a scale of
relative distance.
<h3><a name="Design">Design</a></h3>
The software design I used takes full advantage of object-oriented principles. The
key features of the design are:
<dl>
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/reddot.gif"> <b>Change is encapsulated</b>
<dd> The area of greatest change is the addition of new query classes. This
functionality is encapsulated in the feature extractor concrete classes.
A new query class is added by copying an existing feature extractor
and modifying it to suit the needs of the new query class.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/reddot.gif"> <b>Implementation details are encapsulated</b>
<dd> Two classes hide implementation details: <b>Pixmap</b> and
<b>StorageManager</b>. To support multiple image types, <b>Pixmap</b> could
be converted into an abstract class providing only interface; specific image types
would inherit from this class. The <b>StorageManager</b> class hides
details of the database and operating system and is responsible
for logical image management. In a full-featured application,
the <b>StorageManager</b> would delegate system details to another class, and
would only manage images.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/reddot.gif"> <b>Abstraction is enforced</b>
<dd> Abstraction is enforced using well-defined manager classes.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/reddot.gif"> <b>Interface inheritance eases expansion</b>
<dd> The <b>FeatureExtractor</b> abstract class defines the interface for all
concrete feature extractors. Client code can be written
to an interface without knowledge of query class-specific details.
</dl>
These design principles produced an extensible and embeddable system.
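<p>
The interface-inheritance idea from the list above can be sketched as follows. This is a Python illustration of the pattern only. <tt>FeatureExtractor</tt> and <tt>extract()</tt> are named in the text, but the <tt>distance()</tt> method, the <tt>MeanBrightness</tt> class, and the <tt>analyze</tt> helper are hypothetical and do not describe the prototype's actual classes.

```python
from abc import ABC, abstractmethod


class FeatureExtractor(ABC):
    """Interface that every concrete feature extractor implements."""

    @abstractmethod
    def extract(self, pixmap):
        """Return a feature vector for the given image."""

    @abstractmethod
    def distance(self, a, b):
        """Return the distance between two feature vectors (assumed method)."""


class MeanBrightness(FeatureExtractor):
    """Hypothetical query class: a one-element vector of mean brightness."""

    def extract(self, pixmap):
        values = [v for row in pixmap for v in row]
        return [sum(values) / len(values)]

    def distance(self, a, b):
        return sum(abs(x - y) for x, y in zip(a, b))


def analyze(extractors, pixmap):
    """Client code written against the interface only."""
    return {type(e).__name__: e.extract(pixmap) for e in extractors}


if __name__ == "__main__":
    print(analyze([MeanBrightness()], [[0, 10], [20, 30]]))
```

Because <tt>analyze</tt> depends only on the abstract interface, adding another query class means writing one more subclass and leaving the client code untouched.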
<p>
The prototype could be made embeddable by converting it to
a server in a client/server arrangement. First, the user interface
code must be externalized. This code is sensitive to change and properly resides in
the client application. Second, a client/server communication protocol must be used.
In the Microsoft Windows environment, OLE would provide this capability.
<h3><a name="Limitations">Limitations</a></h3>
The limitations of the prototype are performance and resource
management. Adding a new image to the database takes a long time because of
image analysis. Although the current algorithms are not optimized, this step
will always be too expensive to perform in real time. A production system would have
so many feature vectors that a batch processing mechanism would be necessary.
<p>
Image management is too crude for a production system. The entire image
database is
loaded into memory and no mechanism exists for maintaining only
a working subset of images.
There is no support for thumbnail versions of images. The size of an
image is about 512x512 even though the software always scales down to
64x64 for display.
<p>
The software is a prototype intended for exploring ideas and
therefore does not contain the polish required in production systems.
<br>
<br>
<IMG SRC="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/line_col.gif"><br>
<H2><a name="Conclusions">Conclusions</a></H2>
The project goals were met: a prototype CBIR system was built demonstrating color
and pattern queries; an intuitive user interface provides ease of use; an
object-oriented design supports extensibility and embeddability.
<h3><a name="Usefulness">Usefulness</a></h3>
With the
vast number of images available on-line, quality CBIR systems are critical. By
using the right system, people can quickly find the image they need.
<p>
In the field of interior design, designers and their
customers search through hundreds of carpet,
drapery, paint, and wallpaper samples. Their selections must be combined to create
a pleasing result. Producing an
aesthetic result often requires even more searching.
<p>
Through the use of CBIR, designers could access vast amounts of material (either
on CD-ROM or in a vendor database) and rapidly create high-qu