exact matching; similarity matching complicates the indexing structure.
Finally, it is difficult to combine multi-dimensional, similarity-based indexing
methods to efficiently support queries composed of multiple query classes.
<p>
By using similarity metrics only for initial indexing, Pickard and
Minka<a href="#ref5">[5]</a> avoid the problem of combining query classes with
different metrics. Similarity is encoded into clusters of image regions in a tree
structure and distance is measured by ancestral distance between clusters. This
has the effect of normalizing different query classes so they can be treated
identically.
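<p>
A minimal sketch of what an ancestral distance between clusters could look like, assuming the tree is stored as an array of parent indices and that distance is counted as the number of edges on the path through the lowest common ancestor. The names <tt>ClusterTree</tt> and <tt>ancestralDistance</tt> are hypothetical and are not taken from the cited system.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cluster tree: parent[i] is the index of node i's parent,
// and the root is its own parent.
struct ClusterTree {
    std::vector<std::size_t> parent;

    // Number of edges between a node and the root.
    std::size_t depth(std::size_t node) const {
        assert(node < parent.size());
        std::size_t d = 0;
        while (parent[node] != node) {
            node = parent[node];
            ++d;
        }
        return d;
    }

    // Edges on the path from a to b through their lowest common ancestor.
    std::size_t ancestralDistance(std::size_t a, std::size_t b) const {
        std::size_t da = depth(a);
        std::size_t db = depth(b);
        std::size_t steps = 0;
        // Lift the deeper node until both are at the same level.
        while (da > db) { a = parent[a]; --da; ++steps; }
        while (db > da) { b = parent[b]; --db; ++steps; }
        // Climb in lockstep until the two paths meet.
        while (a != b) {
            a = parent[a];
            b = parent[b];
            steps += 2;
        }
        return steps;
    }
};
```

Because the measure depends only on tree position, clusters built from color and clusters built from texture can be compared with the same function, which is the normalizing effect described above.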
<h3><a name="Extensibility">Extensibility</a></h3>
Systems must be extensible to overcome the immaturity of indexing methods,
query specification, and feature extraction. Most of the important
work in CBIR lies ahead.
<p>
The importance of extensibility
was recognized by Wu et al.<a href="#ref6">[6]</a> when they
developed CORE based on a generic framework for a multimedia DBMS. To demonstrate
the flexibility of the architecture, they developed two different applications: a
computer-aided facial image inference and retrieval system (CAFIIR), and a trademark
archival and registration system (STAR). A medical information system is currently
in development. In their conclusion, they state: "Object orientation is very
important for a retrieval engine like CORE. One advantage...is an increase in
the reusability of codes...to increase its reusability and extensibility."
<h3><a name="Artificial Intelligence">Artificial Intelligence</a></h3>
Research focuses on three applications of Artificial Intelligence to CBIR: reasoning
about logical features, similarity metrics, and index construction and maintenance.
For example, Pickard and Minka<a href="#ref5">[5]</a> apply AI when annotating
images to dynamically select
from multiple feature models based on how the user labels image regions. Their system
is also capable of improving the indexing structure based on new positive and negative
examples.
<p>
Wu et al.<a href="#ref6">[6]</a> apply fuzzy reasoning to queries where the logical
features of the query are only partially defined by the user. They also used a
<em>learning based on experiences</em>
neural network model to generate self-organizing nodes
in a content-based index tree. This allows their system to fuse composite feature
measures to support complex and fuzzy queries.
<h3><a name="Maximizing Domain Knowledge">Maximizing Domain Knowledge</a></h3>
Query performance can be drastically improved in cases where assumptions can be made
about the nature, or domain, of the images in the database. Wu et al. observe that
"CORE has comprehensive functions. However each application has its domain-specific
problems." And "In application development, domain expertise must be added to customize
the indexing and retrieval module."
<br>
<br>
<IMG SRC="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/line_col.gif"><br>
<H2><a name="Project Overview">Project Overview</a></H2>
<b> Exploration of CBIR problems in the domain of interior design.</b>
I am interested in the methods of CBIR which apply to
problems faced by interior designers. Interior designers work with paint, wallpaper,
fabric, and floor coverings. They also follow general principles of form,
space, color, and style. Designers and their customers regularly face the
tedious chore of manually searching for and matching materials according to
general design principles and taste. There is an opportunity for a high degree
of computer assistance with these tasks.
<p>
Designers could compose queries using primitive and logical features and
specify constraints according to design principles. Results
of a series of queries, for wallpaper, paint, and carpet, must be
self-consistent according to designer-specified rules of form, space, color,
and style. These requirements led my interest toward the following query
classes:<br>
<dl>
<dt> <img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Color</b>
<dd> Fabric, wallpaper, etc., are often selected based upon color
content. This is a natural application of color histograms.
<dt> <img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Texture</b>
<dd> Floor coverings, wallpaper, and fabric all have important textural
components ideal for queries based on textural features.
<dt> <img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Shape</b>
<dd> Examples of shapes are stripes, plaids, and floral patterns.
<dt> <img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Objective features</b>
<dd> Styles such as Victorian could be modeled as objective features.
<dt> <img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Subjective features</b>
<dd> Design concepts related to taste, mood, or sensation can be specified using subjective
feature queries. Examples are: feminine vs. masculine, cheery, cool, warm, etc.
<dt> <img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Text</b>
<dd> Product attributes such as part numbers, supplier information, and first
production date are all text features.
<dt> <img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Domain-specific</b>
<dd>The design rules described above are domain-specific features.
</dl>
Query by example is a powerful tool for interior designers. For
example, given a particular carpet sample, the designer could find a window
covering that complements the carpet.
<h3> <a name="Project Definition">Project Definition</a> </h3>
I implemented
a software prototype that allowed me to explore the following areas: <br>
<dl>
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/reddot.gif"> <b>Color</b>
<dd> Color is an important feature of the materials interior designers
use. Color also allows automatic image analysis. I explored a few
implementations of color histogram feature vectors and related
similarity metrics.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/reddot.gif"> <b>Pattern</b>
<dd> Patterns are also very important to interior designers. I
focused on automated feature vector generation using edge detection.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/reddot.gif"> <b>Software Design</b>
<dd> One of the difficulties of CBIR systems is that no small number of
query classes provides enough flexibility to support interior design.
Further, the technology of image feature vectors,
similarity metrics, and indexing is immature. A
real-world system needs to be extensible and configurable. Using
object-oriented techniques, I designed a system that encapsulates the areas
most likely to change. I created a framework to support the addition of new query
classes and distance metrics. The framework divides major system
tasks among manager objects that interact through well-defined interfaces.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/reddot.gif"> <b>User Interface</b>
<dd> I defined a simple user interface for color histogram queries and
provided the ability to query by example using color histograms and shapes.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/reddot.gif"> <b>Image Management</b>
<dd>Basic to a QBIC system is the ability to efficiently manage the performance,
storage, and memory requirements of images. Because of their size and complexity,
images have special computation and resource
requirements. I have identified some of
these issues and suggest some solutions.
</dl>
I specifically avoided exploring indexing issues in this project.<br>
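<p>
To make the color query class concrete, here is a minimal sketch of a color histogram feature vector and one possible similarity metric. The 64-bin quantization (4 levels per RGB channel), the histogram intersection metric, and the names <tt>Rgb</tt>, <tt>colorHistogram()</tt>, and <tt>histogramIntersection()</tt> are all assumptions chosen for illustration; the text above does not say which histogram layouts or metrics the prototype used.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };

constexpr std::size_t kBins = 64;  // assumed: 4 levels per channel, 4*4*4 bins
using Histogram = std::array<double, kBins>;

// Build a normalized histogram: each bin holds the fraction of pixels
// whose quantized color falls in that bin.
Histogram colorHistogram(const std::vector<Rgb>& pixels) {
    assert(!pixels.empty());
    Histogram h{};
    for (const Rgb& p : pixels) {
        // Keep the top two bits of each channel to form a bin index 0..63.
        std::size_t bin = (p.r >> 6) * 16 + (p.g >> 6) * 4 + (p.b >> 6);
        h[bin] += 1.0;
    }
    for (double& v : h) v /= static_cast<double>(pixels.size());
    return h;
}

// Histogram intersection: 1.0 for identical distributions, 0.0 when the
// two images share no quantized colors.
double histogramIntersection(const Histogram& a, const Histogram& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < kBins; ++i) sum += std::min(a[i], b[i]);
    return sum;
}
```

Normalizing by pixel count lets images of different sizes be compared directly, at the cost of discarding all spatial layout.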
<br>
<IMG SRC="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/line_col.gif"><br>
<H2><a name="Implementation">Implementation</a></H2>
I implemented a software prototype for the purpose of exploring the details of CBIR.
The software was written using Microsoft(tm) Visual C++ Compiler, version 2.0. The
platform was an Intel 486DX2 system running the Windows NT operating system,
version 3.51. The software architecture is shown in the following block diagram:
<br>
<p align=center>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/arch.jpg">
</p>
<br>
This architecture is loosely based on CORE<a href="#ref6">[6]</a>. Each of the
<em>managers</em> is implemented as a singleton object; the class
definition restricts instantiation to only one system-wide object.
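<p>
The singleton restriction can be sketched as follows. This is an illustration in present-day C++ rather than the prototype's code, and <tt>ExampleManager</tt> with its members is a hypothetical stand-in for any of the managers.

```cpp
#include <cassert>

// Hypothetical manager showing how a class definition can restrict
// instantiation to one system-wide object.
class ExampleManager {
public:
    // The only way to reach the single instance.
    static ExampleManager& instance() {
        static ExampleManager manager;  // constructed on first use
        return manager;
    }

    // Copying would create a second object, so it is disabled.
    ExampleManager(const ExampleManager&) = delete;
    ExampleManager& operator=(const ExampleManager&) = delete;

    void recordQuery() { ++queriesServed_; }
    int queriesServed() const { return queriesServed_; }

private:
    ExampleManager() = default;  // private: clients cannot instantiate
    int queriesServed_ = 0;
};
```

Every call to <tt>ExampleManager::instance()</tt> returns a reference to the same object, so state recorded through one reference is visible through any other.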
<h3> <a name="Storage Manager">Storage Manager</a> </h3>
The Storage Manager provides an interface to the image database. It is responsible for
maintaining an in-memory virtual mapping of images and for performing system specific
I/O operations. The following class diagram<a href="#ref14">[14]</a> depicts the key
relationships.
<br>
<p align=center>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/stormgr.jpg">
</p>
The <b>StorageManager</b> class, <A href="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/stormgr.h">stormgr.h</A>, provides
member functions to access the image database. A client uses <tt>store()</tt>
to associate the pixmap in <tt>file</tt> with the <tt>image</tt> object and store
the image in the database. This function is used by the <b>AnalysisManager</b> to
process a user request to add an image to the database. It also attaches the
pixmap to the <tt>image</tt> which allows the <b>AnalysisManager</b> to perform
feature extractions on the image.
<p>
Clients can request a single image using <tt>getImage()</tt>, or the entire image
database list using <tt>getImageList()</tt>. The <tt>loadDB()</tt> function is used
to initialize the <b>StorageManager</b> object.
<p>
The current implementation of the <b>StorageManager</b> maintains an unordered list of
<b>Image</b> objects representing the entire database. In a real system, the
<b>StorageManager</b> would maintain multiple indices used to retrieve
images. The <b>Image</b> objects would be at the leaves of the index, e.g.,
in the case of a tree-based index structure.
<p>
For efficiency, this implementation only loads the pixmap data for an image when
necessary. The <b>Image</b> object determines when to load the data;
the task of loading is delegated to the <b>StorageManager</b>.
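<p>
The load-on-demand arrangement might look roughly like the sketch below. The types here are simplified stand-ins, not the contents of the linked headers: <tt>PixmapLoader</tt> plays the StorageManager's I/O role, its <tt>load()</tt> fabricates a blank 512 x 512 buffer instead of reading a file, and <tt>invalidate()</tt> is a hypothetical addition that the prototype does not have.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the pixel data of one image.
struct Pixmap {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;
};

// Hypothetical loader standing in for the StorageManager's I/O duties.
class PixmapLoader {
public:
    std::unique_ptr<Pixmap> load(const std::string& file) {
        assert(!file.empty());
        ++loads_;
        std::unique_ptr<Pixmap> pm(new Pixmap());
        pm->width = 512;   // placeholder: a real loader would read the file
        pm->height = 512;
        pm->pixels.assign(512 * 512, 0);
        return pm;
    }
    int loadCount() const { return loads_; }

private:
    int loads_ = 0;
};

class Image {
public:
    Image(std::string file, PixmapLoader& loader)
        : file_(std::move(file)), loader_(loader) {}

    // The image decides when pixel data is needed; loading is delegated.
    const Pixmap& pixmap() {
        if (!pixmap_) pixmap_ = loader_.load(file_);
        return *pixmap_;
    }

    // Hypothetical: drop the pixel data; the next pixmap() call reloads it.
    void invalidate() { pixmap_.reset(); }

private:
    std::string file_;
    PixmapLoader& loader_;
    std::unique_ptr<Pixmap> pixmap_;
};
```

Constructing an <tt>Image</tt> costs no I/O; the first call to <tt>pixmap()</tt> triggers a single load and later calls reuse the cached data.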
<p>
The <b>Image</b> class, <A href="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/image.h">image.h</A>, encapsulates the details
of the pixmap implementation and feature vectors. The system manipulates
<b>Image</b> objects for convenience. The <b>Pixmap</b> class provides an interface to
a color pixel representation of an image
(<A href="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/pixmap.h">pixmap.h</A>). This class allows the implementation of the on-screen
image to change without affecting the rest of the code.
<p>
The <b>Features</b> class, <A href="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/features.h">features.h</A>, encapsulates a set
of feature vectors. This class allows new feature vectors to be added
as new query classes are added.
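<p>
One way such a container could stay open to new query classes is to key the vectors by query-class name, as in the sketch below. This is an assumed design for illustration, not the contents of features.h; the member names <tt>set()</tt>, <tt>has()</tt>, and <tt>get()</tt> are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using FeatureVector = std::vector<double>;

// Hypothetical container holding one feature vector per query class.
class Features {
public:
    // Store or overwrite the vector for a query class such as "color".
    void set(const std::string& queryClass, FeatureVector v) {
        assert(!queryClass.empty());
        vectors_[queryClass] = std::move(v);
    }

    bool has(const std::string& queryClass) const {
        return vectors_.count(queryClass) != 0;
    }

    // Throws std::out_of_range if the query class has no vector yet.
    const FeatureVector& get(const std::string& queryClass) const {
        return vectors_.at(queryClass);
    }

private:
    std::map<std::string, FeatureVector> vectors_;
};
```

With this layout, adding a query class means storing a vector under a new key; code that only knows about existing keys is unaffected.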
<p>
Two changes would be required in a production system to efficiently manage memory.
First, the current implementation only loads pixmaps on demand and never
invalidates them. Invalidating pixmaps would conserve large amounts of physical
memory. A typical pixmap requires 512 x 512 = 262144
bytes of memory just to store the image data, assuming 256 colors (one byte
per pixel). The color table and other overhead bring the total to about
264000 bytes.
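<p>
The arithmetic can be checked with a few constants. The 4-byte color table entries and the 40-byte header are assumptions based on the Windows device-independent bitmap layout; the text does not say exactly which overhead it counts, so the total below only approximates the quoted figure.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kWidth = 512;
constexpr std::size_t kHeight = 512;
constexpr std::size_t kBytesPerPixel = 1;         // 256 colors: one palette index per pixel
constexpr std::size_t kPaletteEntries = 256;
constexpr std::size_t kBytesPerPaletteEntry = 4;  // assumed: 4-byte entries as in a Windows DIB
constexpr std::size_t kHeaderBytes = 40;          // assumed: size of a BITMAPINFOHEADER

// Bytes needed for the pixel data alone.
constexpr std::size_t pixelBytes() {
    return kWidth * kHeight * kBytesPerPixel;
}

// Pixel data plus color table and header.
constexpr std::size_t totalBytes() {
    return pixelBytes() + kPaletteEntries * kBytesPerPaletteEntry + kHeaderBytes;
}

// How many whole pixmaps fit in a given memory budget.
constexpr std::size_t pixmapsThatFit(std::size_t budgetBytes) {
    return budgetBytes / totalBytes();
}

static_assert(pixelBytes() == 262144, "matches the figure in the text");
```

Under these assumptions <tt>totalBytes()</tt> is 263208, and a 16 MB budget holds only 63 resident pixmaps, which is why never invalidating them becomes costly as the database grows.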
<p>
A second memory saver would be the use of thumbnail versions of the pixmaps. Thumbnails
are smaller representations, usually 64 x 64 = 4096 bytes, which are used for
display. The current system only requires the entire image during feature
extraction. Support for thumbnails would require either a time
tradeoff, if a large pixmap is reduced each time it is loaded into memory, or a disk space
tradeoff, if the thumbnails are generated once at load time and stored in the database.
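<p>
A minimal sketch of the reduction step, assuming the pixmap is a row-major buffer of 8-bit palette indices. Point sampling is used because averaging palette indices would not produce meaningful colors; the function name <tt>makeThumbnail()</tt> and its signature are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reduce a srcW x srcH image of 8-bit palette indices to dstW x dstH by
// point sampling: each thumbnail pixel copies one source pixel.
std::vector<std::uint8_t> makeThumbnail(const std::vector<std::uint8_t>& src,
                                        std::size_t srcW, std::size_t srcH,
                                        std::size_t dstW, std::size_t dstH) {
    assert(src.size() == srcW * srcH);
    assert(dstW > 0 && dstH > 0 && dstW <= srcW && dstH <= srcH);
    std::vector<std::uint8_t> dst(dstW * dstH);
    for (std::size_t y = 0; y < dstH; ++y) {
        std::size_t sy = y * srcH / dstH;      // source row for this thumbnail row
        for (std::size_t x = 0; x < dstW; ++x) {
            std::size_t sx = x * srcW / dstW;  // source column for this thumbnail column
            dst[y * dstW + x] = src[sy * srcW + sx];
        }
    }
    return dst;
}
```

Reducing 512 x 512 to 64 x 64 this way reads one source pixel in every 8 x 8 block and yields the 4096-byte buffer mentioned above. Running it on every load is the time tradeoff; storing its output in the database is the disk space tradeoff.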