{smcl}
{* 06apr2005}{...}
{cmd:help mdslong} {right:dialog: {bf:{dialog mdslong}}{space 11}}
{right:also see: {help mds postestimation}}
{hline}
{title:Title}
{p 4 22 2}
{hi:[MV] mdslong} {hline 2} Multidimensional scaling of proximity data in long format
{title:Syntax}
{p 8 24 2}
{cmd:mdslong} {depvar} {ifin}
{cmd:,} {opt id(var1 var2)} [ {it:options} ]
{synoptset 19 tabbed}{...}
{synopthdr}
{synoptline}
{syntab:Model}
{p2coldent:* {opt id(var1 var2)}}identify comparison pairs
(object1,object2){p_end}
{synopt:{cmd:s2d(}{cmdab:st:andard}{cmd:)}}convert similarity to
dissimilarity: d(ij) = sqrt(s(ii)+s(jj)-2s(ij)){p_end}
{synopt:{cmd:s2d(}{cmdab:one:minus}{cmd:)}}convert similarity to
dissimilarity: d(ij) = 1-s(ij){p_end}
{synopt:{opt force}}fix problems in proximity information{p_end}
{synopt:{opt dim:ension(#)}}configuration dimensions; default is
{cmd:dimension(2)}{p_end}
{synopt:{opt add:constant}}make distance matrix positive semidefinite{p_end}
{syntab:Reporting}
{p2col:{opt neig:en(#)}}maximum number of eigenvalues to display; default is
{cmd:neigen(10)}{p_end}
{p2col:{opt con:fig}}display table with configuration coordinates{p_end}
{p2col:{opt nopl:ot}}suppress configuration plot{p_end}
{synoptline}
{p2colreset}{...}
{p 4 6 2}
* {opt id()} is required.
{p_end}
{p 4 6 2}
{cmd:by} and {cmd:statsby} are allowed; see {help prefix}.
{p_end}
{p 4 6 2}
The maximum number of compared objects allowed is the maximum matrix size;
see {help matsize}.
{p_end}
{p 4 6 2}
See {help mds postestimation} for features available after estimation.
{p_end}
{title:Description}
{pstd}
{cmd:mdslong} performs classical metric multidimensional scaling (MDS) for
two-way proximity data in long format with an explicit measure of similarity
or dissimilarity between objects.
{pstd}
For MDS with two-way proximity data in a matrix, see {helpb mdsmat}. If you
are looking for MDS on a data set, based on dissimilarities between
observations over variables, see {helpb mds}.
{title:Options}
{dlgtab:Model}
{phang}{opt id(var1 var2)}
is required. The pair of variables {it:var1} and {it:var2} should uniquely
identify comparisons. {it:var1} and {it:var2} are string or numeric variables
that identify the objects to be compared. {it:var1} and {it:var2} should be
of the same data type; if value labeled, they should be labeled with the same
value label. Using value-labeled variables or string variables is generally
helpful in identifying the points in plots and tables.

{tab}Example data layout for {cmd:mdslong proxim, id(i1 i2)}.

{space 19}{cmd:proxim    i1    i2}
{space 19}{hline 18}
{space 19}     7     1     2
{space 19}    10     1     3
{space 19}    12     1     4
{space 19}     4     2     3
{space 19}     6     2     4
{space 19}     3     3     4
{space 19}{hline 18}

{phang}{cmd:s2d(standard}|{cmd:oneminus)}
specifies how similarities are converted into dissimilarities.
By default {cmd:mdslong} assumes dissimilarity data.
Specifying {opt s2d()} indicates that your proximity data are similarities.
{pmore}
Dissimilarity data should have zeros on the diagonal (i.e., an object is
identical to itself) and nonnegative off-diagonal values.
Dissimilarities need not satisfy the triangular inequality,
D(i,j) {ul:<} D(i,h) + D(h,j). Similarity data should have ones on the
diagonal (i.e., an object is identical to itself) and have off-diagonal values
between zero and one. In either case, proximities should be symmetric. See
option {cmd:force} if your data violate these assumptions.
{pmore}
The available {cmd:s2d()} options, {cmd:standard} and {cmd:oneminus}, are
defined as:
{p2colset 13 25 27 2}{...}
{p2col:{cmd:standard}}d(ij) = sqrt(s(ii)+s(jj)-2s(ij)) = sqrt(2(1-s(ij))){p_end}
{p2col:{cmd:oneminus}}d(ij) = 1-s(ij){p_end}
{p2colreset}{...}
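
{pmore}
As a worked illustration, take a similarity of s(ij) = 0.62 with unit
similarities on the diagonal. {cmd:standard} gives
d(ij) = sqrt(2(1-0.62)) = sqrt(0.76), roughly 0.872, and {cmd:oneminus} gives
d(ij) = 1-0.62 = 0.38. The value 0.62 is chosen only for illustration; the
arithmetic can be checked with {cmd:display}.
{p_end}

{cmd}{...}
{tab}. * standard: sqrt(s(ii)+s(jj)-2s(ij)) with s(ii) = s(jj) = 1
{tab}. display sqrt(2*(1 - 0.62))
{tab}. * oneminus: 1-s(ij)
{tab}. display 1 - 0.62
{txt}{...}
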
{phang}
{opt force} corrects problems with the supplied proximity information.
In the long format used by {cmd:mdslong}, multiple measurements on (i,j) may
be available. Including both (i,j) and (j,i) is treated as multiple
measurements. Without {cmd:force}, this is an error, even if the measures are
identical; with {cmd:force}, the mean of the measurements is used.
{cmd:force} also resolves problems on the diagonal, i.e., comparisons of
objects with themselves; these should have zero dissimilarity or unit
similarity. {cmd:force} does not resolve incomplete data, i.e., pairs (i,j)
for which no measurement is available. Out-of-range values are also not fixed.
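
{pmore}
Before resorting to {cmd:force}, it can help to see which pairs are measured
more than once. The following is a minimal sketch, assuming numeric
identifiers named {cmd:i1} and {cmd:i2} and a proximity variable named
{cmd:proxim}, as in the layout shown under {cmd:id()}. The variables
{cmd:pair_lo} and {cmd:pair_hi} are hypothetical helpers that put each pair in
a canonical order so that (i,j) and (j,i) are counted together.
{p_end}

{cmd}{...}
{tab}. * order each pair so that (i,j) and (j,i) look the same
{tab}. generate pair_lo = min(i1, i2)
{tab}. generate pair_hi = max(i1, i2)
{tab}. * list how many unordered pairs occur more than once
{tab}. duplicates report pair_lo pair_hi
{tab}. * if averaging the repeated measurements is acceptable
{tab}. mdslong proxim, id(i1 i2) force
{txt}{...}
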
{phang}{opt dimension(#)}
specifies the dimension of the approximating configuration. {it:#} defaults
to 2 and should not exceed the number of positive eigenvalues of the centered
distance matrix.
{phang}{opt addconstant}
specifies that if the double-centered distance matrix is not positive
semidefinite (psd), a constant should be added to the squared distances to
make it psd and, hence, Euclidean.
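
{pmore}
For instance, assuming the variables {cmd:proxim}, {cmd:i1}, and {cmd:i2} from
the layout shown under {cmd:id()}, the correction could be requested as
follows. This is only a usage sketch; whether the constant is actually needed
depends on your data.
{p_end}

{cmd}{...}
{tab}. * default two-dimensional configuration, adding a constant if needed
{tab}. mdslong proxim, id(i1 i2) addconstant
{txt}{...}
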
{dlgtab:Reporting}
{phang}{opt neigen(#)}
specifies the number of eigenvalues to be included in the table. The default
is {cmd:neigen(10)}. Specifying {cmd:neigen(0)} suppresses the table.
{phang}{opt config}
displays the table with the coordinates of the approximating configuration.
This table may also be displayed by the postestimation command
{cmd:estat config}; see {help mds postestimation}.
{phang}{opt noplot}
suppresses the graph of the approximating configuration. The graph
can still be produced later via {cmd:mdsconfig}, which also allows the standard
graphics options for fine-tuning the plot; see {help mds postestimation}.
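
{pmore}
The reporting options can be combined. As a sketch, again assuming the
variables {cmd:proxim}, {cmd:i1}, and {cmd:i2} from the layout shown under
{cmd:id()}, the following fits the model with a shorter eigenvalue table and
no graph and then requests the coordinates and the plot afterward.
{p_end}

{cmd}{...}
{tab}. * show at most 5 eigenvalues and skip the configuration plot
{tab}. mdslong proxim, id(i1 i2) neigen(5) noplot
{tab}. * display the coordinate table after estimation
{tab}. estat config
{tab}. * draw the configuration plot after estimation
{tab}. mdsconfig
{txt}{...}
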
{title:Remarks}
{pstd}
The purpose of multidimensional scaling (MDS) is to produce a
representation of a dissimilarity relation between a set of n objects by
Euclidean distances between a constructed configuration of points in a
low-dimensional Euclidean space, typically two-dimensional. If this
low-dimensional representation offers a good enough approximation, we may plot
the points in this low-dimensional space and interpret the (Euclidean,
straight-line) distance between the points as the dissimilarity between the
original objects. Points mapped close together are similar; points mapped
widely apart are dissimilar.
{pstd}
{it:depvar} specifies proximity data in either dissimilarity or similarity
form. The comparison pairs are identified by two variables specified in the
required option {cmd:id()}. Exactly one observation with a non-missing
{it:depvar} should be included for each pair (i,j). Pairs are unordered; do
not include observations for both (i,j) and (j,i). Observations for
comparisons of objects with themselves (i,i) are optional. See option
{cmd:force} if your data violate these assumptions.
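
{pstd}
Because each unordered pair appears exactly once, a complete dataset on n
objects without self-comparisons has n(n-1)/2 observations. For the four
objects in the layout shown under {cmd:id()}, that is 4*3/2 = 6. A quick
check, assuming the proximity variable is named {cmd:proxim} as in that
layout:
{p_end}

{cmd}{...}
{tab}. * with 4 objects and no self-comparisons, this should report 6
{tab}. count if !missing(proxim)
{txt}{...}
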
{title:Example}
{pstd}
A famous example in the MDS literature is the data on the percentage of times
that pairs of Morse code signals for two digits (1,...,9,0) were declared the
same by 598 subjects. We enter the Morse data in long format. The entries
are in the order 1,2,...,9,0.
{cmd}{...}
{tab}. input digit1 digit2 freqsame
{tab} 2 1 62
{tab} 3 1 16
{tab} 3 2 59
{tab} 4 1 6
{tab} (lines omitted)
{tab} 0 1 52
{tab} 0 2 18
{tab} (lines omitted)
{tab} 0 9 79
{tab}. end
{txt}{...}
{pstd}
Note that we entered the proximity of (2,1) but not (1,2). We may enter
either (1,2) or (2,1); it does not matter which. We did not enter proximities
between the same objects, e.g., (2,2).
{cmd}{...}
{tab}. gen sim = freqsame/100
{tab}. mdslong sim, id(digit1 digit2) s2d(standard)
{txt}{...}
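
{pstd}
As a sketch of what {cmd:s2d(standard)} does here, the conversion can also be
done by hand. Assuming the omitted self-comparisons are treated as unit
similarities, the formula reduces to sqrt(2(1-s(ij))). The variable name
{cmd:dissim} is hypothetical, and the resulting configuration should match the
one from the command above.
{p_end}

{cmd}{...}
{tab}. * convert similarities to dissimilarities manually
{tab}. generate dissim = sqrt(2*(1 - sim))
{tab}. * no s2d() option: the data are now dissimilarities
{tab}. mdslong dissim, id(digit1 digit2)
{txt}{...}
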
{title:Also see}
{psee}
Manual: {bf:[MV] mdslong}
{p_end}
{psee}
Online: {help mds postestimation};{break}
{helpb mds}, {helpb mdsmat};{break}
{helpb ca},
{helpb canon},
{helpb factor},
{helpb pca}
{p_end}