<!-- Source: http://www.tc.cornell.edu/visualization/education/cs718/fall1995/landis/index.html -->

an image which is a 2D representation of a 3D scene containing several objects. 
Features representing the objects may be either primitive or logical features. If
the extraction generates a feature containing edge information, it is
a primitive feature. On the other hand, if the extraction identifies the object by name,
say by utilizing a model-based approach, it is a logical feature.
<p>
Primitive features are often used as the basis for generating logical features. A common 
CBIR system architecture layers logical feature extraction on top of primitive 
feature extraction. Primitive
features are extracted directly from the image to generate a <em>segmented</em> image.
From this information, more abstract, logical features are generated<a href="#ref6">[6]</a>.
Segmentation is the process of dividing the image into regions that correspond to 
structural units of interest<a href="#ref10">[10]</a>. 

<h3> <a name="Indexing and Queries">Indexing and Queries</a></h3>

The goal of indexing is to create a compact summary of the database contents to
provide an efficient mechanism for retrieval of the data.
The summary data is based on feature vectors:

<blockquote>
Since in content based visual databases, all items (images or objects) are represented
by pre-computed visual features, the key attribute for each image will be a feature
vector which corresponds to a point in a multi-dimensional feature space; and search
will be based on similarities between the feature vectors. Therefore, to achieve a
fast and effective retrieval...requires an efficient multi-dimensional indexing
scheme<a href="#ref11">[11]</a>.
</blockquote>

Multiple indexing schemes may be required to support queries involving a 
combination of features. 
To utilize multiple indexes, a hierarchical approach is often used where each 
component of a query is applied against an appropriate index. A higher layer merges results
for presentation to the user.
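This merging layer can be sketched as follows. The code is a hypothetical illustration, not taken from any surveyed system: each per-index result set is assumed to map image identifiers to distances, and the merge keeps images returned by every index and combines their distances before ranking.

```python
def merge_index_results(per_index_hits):
    """Merge results from several feature indexes.

    per_index_hits: list of {image_id: distance} dicts, one per index.
    Keeps only images returned by every index, sums their distances,
    and ranks the survivors with the lowest combined distance first.
    """
    common = set(per_index_hits[0])
    for hits in per_index_hits[1:]:
        common &= set(hits)
    merged = {img: sum(hits[img] for hits in per_index_hits) for img in common}
    return sorted(merged, key=merged.get)

# One query component probed a color index, another a texture index.
color_hits = {"img1": 0.2, "img2": 0.7, "img3": 0.1}
texture_hits = {"img1": 0.3, "img3": 0.9}
print(merge_index_results([color_hits, texture_hits]))  # ['img1', 'img3']
```

Real systems may instead take a weighted union of the per-index results; intersection-plus-sum is simply one plausible merge policy.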
<p>
CBIR queries are posed in a fuzzy fashion. The user is typically interested in results
according to similarity rather than equality. This requirement influences the indexing 
scheme, the methods of feature comparison, and the means by which queries are 
solicited from the user. 
<p>
Image similarity is usually determined by computing a distance measure between the
query and the appropriate feature vectors in the index structure. Similar images
are ranked according to distance. Thresholding may be used to reduce the number of 
similar images presented to the user.
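A minimal sketch of this rank-and-threshold step, assuming feature vectors of equal length and an L1 distance (the metric choice here is mine, purely for illustration):

```python
def rank_by_distance(query_vec, index, threshold):
    """Return (image_id, distance) pairs within threshold, nearest first.

    index: {image_id: feature_vector}. Thresholding prunes the number
    of 'similar' images shown to the user.
    """
    def l1(a, b):
        return sum(abs(x - y) for x, y in zip(a, b))

    hits = [(img, l1(query_vec, vec)) for img, vec in index.items()]
    hits = [(img, d) for img, d in hits if d <= threshold]
    return sorted(hits, key=lambda pair: pair[1])

index = {"a": [0.5, 0.5], "b": [0.9, 0.1], "c": [0.0, 1.0]}
print(rank_by_distance([0.6, 0.4], index, threshold=0.5))
# only "a" falls within the threshold
```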
<p> 
A query is created by composing primitive and logical feature vectors. To present
a simple and structured query environment, CBIR systems define query classes.
Some typical query classes 
are<a href="#ref1">[1]</a>:<br>

<dl>
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Color</b>
	<dd> A partial histogram is created by specifying colors and 
		percentages<a href="#ref3">[3]</a><a href="#ref6">[6]</a><a href="#ref7">[7]</a><a href="#ref12">[12]</a><a href="#ref13">[13]</a>. 
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Texture</b>
	<dd> Texture features include directionality, periodicity, randomness, 
		roughness, regularity, coarseness, color distribution, contrast, and
		complexity<a href="#ref5">[5]</a><a href="#ref12">[12]</a><a href="#ref13">[13]</a>.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Sketch</b>
	<dd> The user creates a sketch representing an outline to be matched against
		dominant image edges<a href="#ref3">[3]</a><a href="#ref12">[12]</a>.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Shape</b>
	<dd> An example shape is created using simple painting tools. The shape is 
		compared to objects within images for similarity<a href="#ref3">[3]</a><a href="#ref4">[4]</a><a href="#ref12">[12]</a><a href="#ref13">[13]</a>.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Volume</b>
	<dd> Volumetric relationships are specified using 3D tools. Feature vectors contain
		3D information.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Spatial constraints</b>
	<dd> The feature vector contains topological relationships among the objects in an image.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Browsing</b>
	<dd> The user is presented with a structured method of viewing the entire database.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Objective features</b>
	<dd> Objective features are attributes such as date of image acquisition, 
		light direction, and view direction. These features lend themselves to the methods used in
		traditional databases<a href="#ref5">[5]</a><a href="#ref9">[9]</a>.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Subjective features</b>
	<dd> Feature extraction is manual or semi-automatic and is subject to human
		interpretation. Examples are region labels and 
		manual object identification<a href="#ref5">[5]</a><a href="#ref9">[9]</a>.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Motion</b>
	<dd> Motion is applicable to a series of images such as video segments. Motion features 
		measure movement of objects in the sequences or other movement such
		as camera viewpoint and camera focal point<a href="#ref3">[3]</a>.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Text</b>
	<dd> Either simple or complex text can be associated with images. For the 
		simple case, traditional database methods can be used. Complex
		systems use natural language processing and artificial intelligence to
		reason about text annotations<a href="#ref5">[5]</a>.
<dt><img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/bludot.gif"> <b>Domain concepts</b>
	<dd> Domain information lends itself to specific forms of feature vectors and
		queries. 
</dl>

Query classes provide a meaningful way for a user to create feature vectors that
correspond to their notion of image semantics.
Queries can be composed of multiple query classes. 
<p>
An alternative to user-composed queries is query by example. The user submits
a query in the form of a prototype image, and the system uses the feature vector(s)
of the appropriate query class(es). A session often
begins with user-composed queries, which are then refined through query by example.
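The refinement loop can be sketched as follows. This is my own illustration, not code from a surveyed system: the prototype image's stored feature vector becomes the query, and the nearest other images are returned for the next round of selection.

```python
def query_by_example(example_id, index, distance, k=3):
    """Reuse the stored feature vector of a prototype image as the query,
    returning the k database images most similar to it (excluding itself)."""
    query_vec = index[example_id]
    others = [img for img in index if img != example_id]
    return sorted(others, key=lambda img: distance(query_vec, index[img]))[:k]

def l1(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

index = {"a": [1, 0], "b": [1, 1], "c": [0, 5]}
print(query_by_example("a", index, l1, k=2))  # ['b', 'c']
```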
<p>
To run an interactive query on a system called Query by Image Content (QBIC), 
click
<A href="http://wwwqbic.almaden.ibm.com"> here.</A> 
<br> 
<br>

<IMG SRC="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/line_col.gif"><br>

<H2><a name="Current Research">Current Research</a></H2>

A large number of researchers are exploring CBIR-related topics. 
I focused on recent work, which has been the most productive. 
The following sections describe some of the important topics I studied.

<h3><a name="Feature Extraction">Feature Extraction</a> </h3>

Feature extraction is performed when an image is added to the database. CBIR systems
provide support for multiple query classes.
Pickard and Minka<a href="#ref5">[5]</a> use 6 different 
features to characterize images from the MIT Photobook image retrieval system.
<p>
The CORE system<a href="#ref6">[6]</a> is a retrieval engine that 
supports a wide range of features including
visual browsing, color similarity measures, and text. Primitive features are combined
to create higher level, logical features they call <em>concepts</em>. 
<p>
The QBIC system<a href="#ref3">[3]</a><a href="#ref12">[12]</a> extracts features
that support image query classes for color, texture, shape, sketching, location,
and text. The system also supports a set of video oriented query classes. 
<p>
For color histograms, many different extraction methods are used. The first issue
is the dimension of the color feature vector, e.g., the number of colors. Typical
numbers range from 64 to 256 dimensions (256 being the number of unique colors
representable with one byte).
The higher the dimension of the feature vector, the more finely it can distinguish colors.
<p>
The values in each bin of a color histogram are usually either the total number
of pixels, or the percentage of pixels for the given color in 
the entire image.
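A small sketch of both conventions, working from pixels that have already been quantized to color indices (the image and bin count here are invented for illustration):

```python
def histogram(color_indices, n_colors, as_percentage=True):
    """Build a color histogram whose bin values are either raw pixel
    counts or percentages of the whole image, per the two conventions
    described above."""
    hist = [0] * n_colors
    for c in color_indices:
        hist[c] += 1
    if as_percentage:
        return [count / len(color_indices) for count in hist]
    return hist

pixels = [0, 0, 2, 3]   # already-quantized color indices for a 4-pixel image
print(histogram(pixels, 4, as_percentage=False))  # [2, 0, 1, 1]
print(histogram(pixels, 4))                       # [0.5, 0.0, 0.25, 0.25]
```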
 
<h3><a name="Query Specification">Query Specification</a></h3>

The papers I read treated query specification as a secondary issue. 
Researchers recognized the need for simple ways to specify queries. Unlike
text-based databases, where the desired information is typically retrieved with
a single query, finding a suitable image may require many queries. 
CBIR systems typically return several of the <em>best</em> images for selection
by the user. Many systems allow the user to select one of these images as an
example for another query; this is an example of <em>query refinement</em>.
Researchers are exploring ways of making query refinement both easy and effective.
<p>
The use of multiple query classes to compose
a single query also interests researchers. 
Although many systems claim to support composite queries, few of the
papers explained how to combine query classes successfully.

<h3><a name="Distance Metrics">Distance Metrics</a></h3>

Most image query classes rely on similarity metrics rather than exact matching. 
Distance metrics produce a relative distance between two image feature vectors.
A threshold is used to determine if two features are similar.
In many cases, the user can control the threshold to relax or constrain a query.
<p>
Every distance metric has advantages and drawbacks. For example, 
Stricker<a href="#ref7">[7]</a> analyzes two common distance metrics, the L1 and
L2 (Euclidean) norms. The <a name="L1 norm">L1 norm</a> computes the distance <em>d</em> 
between two <em>n</em>-element color histograms <em>H</em> and <em>I</em> as:

<br>
<p align=center>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/L1.jpg">
</p>

And the <a name="L2 norm">L2 norm</a> is computed as:

<br>
<p align=center>
<img src="http://www.tc.cornell.edu/Visualization/Education/cs718/fall1995/landis/L2.jpg">
</p>

Stricker states that "Using the L1-metric results in false negatives, i.e., not all 
the images with similar color composition are retrieved because the L1-metric does
not take color similarity into account. Using a metric similar to the L2-metric
results in false positives, i.e., histograms with many non-zero bins are close to
any other histogram and thus are retrieved always."
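Assuming the standard definitions shown in the two formulas above (bin-wise absolute differences for L1, the root of summed squared differences for L2), the two metrics can be sketched directly:

```python
from math import sqrt

def l1_distance(H, I):
    """L1 norm: sum of absolute bin-wise differences between histograms."""
    return sum(abs(h - i) for h, i in zip(H, I))

def l2_distance(H, I):
    """L2 (Euclidean) norm: square root of the summed squared differences."""
    return sqrt(sum((h - i) ** 2 for h, i in zip(H, I)))

H = [4, 0, 2]
I = [1, 4, 2]
print(l1_distance(H, I))  # 7
print(l2_distance(H, I))  # 5.0
```

Neither metric compares different bins against each other, which is exactly the limitation Stricker's quote describes.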
<p>
The QBIC system<a href="#ref12">[12]</a> uses a 64- or 256-dimension color histogram where
the <em>i</em>-th element is the percentage of color <em>i</em>. The distance 
between query histogram <em>r</em> and database image histogram <em>q</em> is
computed as <em>(r - q)<sup>T</sup>A(r - q)</em>, where <em>T</em> is the transpose operator.
The entries <em>a(i,j)</em> in <em>A</em> contain the distance between color
<em>i</em> and color <em>j</em>.
<p>
IBM's Ultimedia Manager<a href="#ref13">[13]</a> uses a 64-dimensional vector of color
percentages. Each dimension represents a range in color space. At analysis time
the color of each pixel is quantized into one of the 64 ranges based on its location
in RGB space.
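The paper does not spell out the exact quantization, but a minimal sketch assuming a uniform split of each RGB channel into 4 levels (4 × 4 × 4 = 64 ranges) might look like:

```python
def rgb_bin(r, g, b):
    """Quantize an RGB pixel (0-255 per channel) into one of 64 ranges:
    4 uniform levels per channel, so bin = 16*R' + 4*G' + B'."""
    return 16 * (r // 64) + 4 * (g // 64) + (b // 64)

print(rgb_bin(255, 0, 0))      # 48 — saturated red
print(rgb_bin(10, 10, 10))     # 0  — near black
print(rgb_bin(255, 255, 255))  # 63 — white
```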

<h3><a name="Indexing">Indexing</a></h3>

The predominant CBIR research is in the area of image feature indexing, where there are
many difficult problems to solve. First, image features are typically high dimensional,
requiring complex, multi-dimensional indexing. Second, traditional indexing assumes
