<html>
<head>
<title>Sean Landis' Fall 718 Content-Based Image Retrieval Project Page</title>
</head>
<body>
<h1>Sean Landis' CS718 Project, Fall 1995</h1>
<IMG SRC="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/line_blu.gif"><br>
<h1 align=center><i>Content-Based Image Retrieval Systems for Interior Design</i></h1>
<IMG SRC="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/line_blu.gif"><br>
<br><br>
<h2>Table of Contents</h2>
<a href="#Introduction">Introduction</a>
<br>
<a href="#Background">Background</a><br>
..........<a href="#Manual Image Analysis">Manual Image Analysis</a><br>
..........<a href="#Automated Image Analysis">Automated Image Analysis</a><br>
..........<a href="#Image Features">Image Features</a><br>
..........<a href="#Indexing and Queries">Indexing and Queries</a><br>
<a href="#Current Research">Current Research</a><br>
..........<a href="#Feature Extraction">Feature Extraction</a><br>
..........<a href="#Query Specification">Query Specification</a><br>
..........<a href="#Distance Metrics">Distance Metrics</a><br>
..........<a href="#Indexing">Indexing</a><br>
..........<a href="#Extensibility">Extensibility</a><br>
..........<a href="#Artificial Intelligence">Artificial Intelligence</a><br>
..........<a href="#Maximizing Domain Knowledge">Maximizing Domain Knowledge</a><br>
<a href="#Project Overview">Project Overview</a><br>
..........<a href="#Project Definition">Project Definition</a><br>
<a href="#Implemention">Implementation</a><br>
..........<a href="#Storage Manager">Storage Manager</a><br>
..........<a href="#Analysis Manager">Analysis Manager</a><br>
..........<a href="#Query Manager">Query Manager</a><br>
..........<a href="#Display Manager">Display Manager and the User Interface</a><br>
....................<a href="#Image Menu">Image Menu</a><br>
....................<a href="#View Menu">View Menu</a><br>
..........<a href="#Query by Color Algorithms">Query by Color Algorithms</a><br>
..........<a href="#Query by Pattern Algorithms">Query by Pattern Algorithms</a><br>
<a href="#Results">Results</a><br>
..........<a href="#Color Queries">Color Queries</a><br>
..........<a href="#Pattern Queries">Pattern Queries</a><br>
..........<a href="#User Interface">User Interface</a><br>
..........<a href="#Design">Design</a><br>
..........<a href="#Limitations">Limitations</a><br>
<a href="#Conclusions">Conclusions</a><br>
..........<a href="#Usefulness">Usefulness</a><br>
..........<a href="#Future Work">Future Work</a><br>
<a href="#References">References</a><br>
<br>
<IMG SRC="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/line_col.gif"><br>
<H2><a name="Introduction">Introduction</a></H2>
Computers are beginning to replace photographic archives as the preferred
form of repository. Computer-based image repositories provide a flexibility
that cannot be attained with collections of printed images. Recently there has
been an explosion in the number of images available to computer users.
As this number increases, users require more sophisticated methods of retrieval.
Content-based image retrieval (CBIR) promises to fill this requirement.
<p>
There are many diverse areas where CBIR can play a key role in the use of
images<a href="#ref1">[1]</a>:<br>
<br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Art galleries and museum management</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Architectural and engineering design</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Interior design</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Remote sensing and natural resource management</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Geographic information systems</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Scientific database management</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Weather forecasting</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Retailing</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Fabric and fashion design</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Trademark and copyright database management</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Law enforcement and criminal investigation</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Picture archiving and communication systems</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Education</b> <br>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Entertainment</b> <br>
<p>
With so many applications, CBIR has attracted the attention of researchers
across several disciplines.
<br>
<br>
<IMG SRC="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/line_col.gif"><br>
<H2><a name="Background">Background</a></H2>
Content-based retrieval is based on an understanding of the
semantics of the objects in a collection. Semantic analysis is performed
when the object is inserted into the collection.
Given a semantic representation of the objects in a collection,
a user can compose a query that retrieves a set of objects with
similar semantics. Query analysis is usually performed on an index structure that
summarizes the data in the collection.
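<p>
As a rough sketch of this two-step structure, the Python fragment below separates
insertion-time analysis from query-time lookup. The extract_features and distance
functions and the Index class are hypothetical placeholders, not taken from any
particular retrieval system.
<pre>
class Index:
    """Summarizes the collection as (image_id, feature_vector) pairs."""
    def __init__(self):
        self.entries = []

    def insert(self, image_id, features):
        self.entries.append((image_id, features))

    def search(self, query_features, distance, k=10):
        # Rank stored images by distance between their features and the query's.
        ranked = sorted(self.entries,
                        key=lambda entry: distance(entry[1], query_features))
        return [image_id for image_id, _ in ranked[:k]]

def insert_image(index, image_id, pixels, extract_features):
    # Semantic analysis happens once, when the object enters the collection.
    index.insert(image_id, extract_features(pixels))

def query(index, example_pixels, extract_features, distance):
    # The same analysis is applied to the query, then matched against the index.
    return index.search(extract_features(example_pixels), distance)
</pre>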
<p>
Content-based image retrieval is the semantic analysis and
retrieval of images. Semantic
analysis may involve manual intervention, or it may be entirely
automated. Manual analysis involves human
interpretation to associate semantic properties with an image.
Automated semantic analysis extracts image features that are
correlated with some semantic meaning of the image. Both analysis methods
have their advantages and their drawbacks.
<h3> <a name="Manual Image Analysis">Manual Image Analysis</a> </h3>
Traditional databases use text keywords as labels to efficiently access
large quantities of text data. Even complex text data can be automatically
summarized and labeled using natural language processing and artificial
intelligence<a href="#ref5">[5]</a>.
<p>
When the data are images rather than text, summarizing the data with labels
becomes considerably more difficult. For example, consider a repository of
news photographs. A user may wish to pose a query such as
<blockquote><em> Give me all news photographs containing a US President and a
communist leader.</em></blockquote>
To support queries like this, images require labels indicating
the people in each image, their titles, their nationalities,
and their political alignments.
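<p>
As an illustration only, the labels needed to answer such a query might resemble the
hypothetical records below; the photo names, people, and field names are invented for
the example and are not drawn from any actual system.
<pre>
# Hypothetical label records for the news-photograph example.
photo_labels = {
    "photo_017": [
        {"name": "Bill Clinton", "title": "US President",
         "nationality": "American", "alignment": "democratic"},
        {"name": "Jiang Zemin", "title": "General Secretary",
         "nationality": "Chinese", "alignment": "communist"},
    ],
    "photo_042": [
        {"name": "Boris Yeltsin", "title": "Russian President",
         "nationality": "Russian", "alignment": "democratic"},
    ],
}

def matches(people):
    # "a US President and a communist leader"
    has_president = any(p["title"] == "US President" for p in people)
    has_communist = any(p["alignment"] == "communist" for p in people)
    return has_president and has_communist

results = [photo for photo, people in photo_labels.items() if matches(people)]
print(results)  # -> ['photo_017']
</pre>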
<p>
It is not known how humans process electromagnetic signals and convert
them into highly detailed semantic interpretations, so this ability cannot yet be
reproduced in software. Therefore, human analysis
is required to generate labels that support sophisticated queries like the
one above. But there are problems with human analysis:<br>
<dl>
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Personal perspective</b>
<dd> One person's interpretation
of the important features of an image may not match another person's
interpretation. Personal perspective leads to variance in image analysis and labeling.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Domain mismatch</b>
<dd> A person's domain of interest may influence image feature selection
and analysis.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Interface expressiveness</b>
<dd>Human-computer
interfaces provide a limited bandwidth of expressive capability.
Image analysis is limited by the expressiveness of the interface.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Data entry errors</b>
<dd>Humans are error-prone, especially
when set to a task that is tedious or repetitive.<br>
</dl>
Because of these and other problems, it is best to
automate image analysis as much as possible. Where intervention is required,
the user should be limited to a set of unambiguous choices.
<h3> <a name="Automated Image Analysis">Automated Image Analysis</a> </h3>
Automated image analysis calculates approximately invariant statistics
which can be correlated to the semantics of the image data. Example
statistics are color histograms, invariants
of shape moments, and edges. Statistical analysis is useful because it provides
information about the image without fickle and costly human interaction.
<p>
Despite its appeal, automated image analysis suffers from drawbacks. The primary
problem with statistical analysis is that extracted features can
only support a very specific type of query. The features apply to
a particular domain, but they are not useful for posing general purpose
queries against diverse data sets.
<p>
Consider an image database indexed by color histogram. For each image, a feature vector is
generated such that each element of the vector represents the percentage of a color
quantum found in the image. A three-element vector could have quanta representing
red, green, and blue (in practice a color feature vector requires more than three elements).
The feature vector for an image contains the quantized
percentages of red, green, and blue. The more quanta available,
the greater the accuracy of the feature vector, and the greater the cost of indexing and
comparison.
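<p>
As a minimal sketch of the three-quantum case described above (and only that case),
the fragment below assigns each pixel to whichever of red, green, or blue dominates it;
a real system would quantize a color space far more finely.
<pre>
# Three-quantum color histogram, as in the example above.
def color_histogram(pixels):
    """pixels: iterable of (r, g, b) tuples with values in 0..255."""
    counts = [0, 0, 0]          # red, green, blue quanta
    total = 0
    for r, g, b in pixels:
        # Assign the pixel to its dominant primary (a crude quantization).
        counts[max(range(3), key=lambda i: (r, g, b)[i])] += 1
        total += 1
    return [c / total for c in counts] if total else counts

# Example: an image that is mostly red.
print(color_histogram([(200, 10, 10), (180, 40, 20), (30, 200, 40), (220, 5, 5)]))
# -> [0.75, 0.25, 0.0]
</pre>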
<p>
If the database contained fabric images, a color histogram would be a powerful way
to pose a query. A user interested in designing a men's casual shirt for spring
wants bright, spring-like colors. The query is posed with the desired color mix, and
all fabrics containing similar mixes of the specified colors are retrieved.
On the other hand, if the database contained news photographs as described earlier,
then color histograms would not be very useful. The semantics of the images in the
database do not correlate well with color histograms.
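<p>
One plausible way to realize the fabric query above is to compare the query's color mix
with each stored feature vector using an L1 distance and keep the closest matches; the
fabric names, vectors, and threshold below are illustrative assumptions, not data from
the project.
<pre>
# Rank fabrics by L1 distance between color histograms (smaller = more similar).
def l1_distance(h1, h2):
    return sum(abs(a - b) for a, b in zip(h1, h2))

def similar_fabrics(fabric_histograms, query_histogram, threshold=0.3):
    """fabric_histograms: dict mapping fabric name -> feature vector."""
    scored = sorted((l1_distance(h, query_histogram), name)
                    for name, h in fabric_histograms.items())
    return [name for dist, name in scored if dist <= threshold]

# A bright, spring-like query: mostly green with some red and blue.
query = [0.2, 0.6, 0.2]
fabrics = {"meadow_print": [0.25, 0.55, 0.20], "navy_twill": [0.05, 0.10, 0.85]}
print(similar_fabrics(fabrics, query))  # -> ['meadow_print']
</pre>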
<h3> <a name="Image Features">Image Features</a></h3>
An image feature is a piece of semantic information extracted from the image. There are
several properties for measuring the quality of a feature:<br>
<dl>
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Capacity</b>
<dd>The number of distinguishable images that can be
represented<a href="#ref7">[7]</a>.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Maximal Match Number</b>
<dd>The maximum number of images a query could possibly
retrieve<a href="#ref7">[7]</a>.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Complexity</b>
<dd>The amount of computation required to determine if two images are similar
for a particular feature.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Compactness</b>
<dd>The amount of space required to store and compare a feature.
</dl>
Image features can be categorized as either <em>primitive</em> or
<em>logical</em><a href="#ref1">[1]</a>.
A primitive feature is a low-level or statistical attribute of an image such as an object
boundary or color histogram. Primitive features are
automatically extracted directly from the image. A logical feature represents an abstract
attribute such as the label <em>grass</em> assigned to a region of an image.
Logical features rely on information beyond that contained in the image.
<p>
The delineation between primitive and logical features is not always clear. Consider