lrfclass classifies an image using the Localized Receptive
Field classifier (LRF). The Localized Receptive Field (LRF)
is based on a single layer of self-organizing, "localized
receptive field" units, followed by a single layer percep-
tron. The single layer of perceptron units use the LMS or
Adaline learning rule to adjust the weights. The weights
are adjusted or "trained" using the companion program,
"llrftrain". After training the weights, using the
"llrftrain" program, a number of similar images may be
quickly classified with this program based on the training
data set.
LRF network theory
The basic network model of the LRF consists of a two layer
topology. The first layer of "receptive field" nodes are
trained using a clustering algorithm, such as K-means, or
some other algorithm which can determine the receptive field
centers. Each node in the first layer computes a receptive
field response function, which should approach zero as the
distance from the center of the receptive field is
increased. The second layer of the LRF model sums the
weighted outputs of the first layer, which produces the out-
put or response of the network. A supervised LMS rule is
used to train the weights of the second layer nodes.
The response function of the LRF network is formulated as
follows:
f(x) = SUM(Ti * Ri(x))
where,
Ri(x) = Q( ||x - xi|| / Wi )
x - is a real valued vector in the input space,
Ri - is the ith receptive field response function,
Q - is a radially symmetric function with a single
maximum at the origin, decreasing to zero at
large radii,
xi - is the center of the ith receptive field,
Wi - is the width of the ith receptive field,
Ti - is the weight associated with each receptive
field.
The receptive field response functions ( Ri(x) ), should be
formulated such that they decrease rapidly with increasing
radii. This ensures that the response functions provide
highly localized representations of the input space. The
response function used in this algorithm is modeled after
the Gaussian, and uses the trace of the covariance matrix to
set the widths of the receptive field centers.
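The formulas above can be illustrated with a short sketch (not the library routine; the Gaussian form Q(r) = exp(-r^2), the bias term, and all names here are assumptions):

```python
import math

def lrf_response(x, centers, widths, weights, bias=0.0):
    """Evaluate f(x) = SUM( Ti * Ri(x) ) with a Gaussian response Q."""
    out = bias
    for xi, wi, ti in zip(centers, widths, weights):
        # ||x - xi||: Euclidean distance to the receptive field center
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, xi)))
        # Q(r) = exp(-r^2): maximum at the origin, falling to zero
        # at large radii, so the representation stays localized
        out += ti * math.exp(-(dist / wi) ** 2)
    return out
```

An input placed exactly on a receptive field center contributes that field's full weight Ti; contributions fall off rapidly with distance.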
The number of receptive field response nodes in the first
layer of the LRF is determined by the number of cluster
centers in the "cluster center" image. The number of output
classes, and hence the number of output nodes in the second
(i.e., last) layer of the LRF, is determined by the number of
desired classes that was specified in the "supervised" clas-
sification phase of the clustering. This information is
contained in the last band of the cluster center image. The
number of weights in the network is determined by the number
of receptive field response nodes and the number of output
nodes. That is,
#Wts = (#rf_response_nodes * #output_nodes) +
#output_nodes
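As a worked example of this count (the trailing #output_nodes term presumably being one bias or threshold weight per output node, which is an assumption):

```python
def num_weights(n_rf_response_nodes, n_output_nodes):
    # #Wts = (#rf_response_nodes * #output_nodes) + #output_nodes
    return n_rf_response_nodes * n_output_nodes + n_output_nodes

# e.g. 5 receptive field response nodes and 3 output nodes
print(num_weights(5, 3))  # 18
```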
The resulting output image is classified with the desired
number of classes specified in the last band of the "cluster
center" (-i2) image. The number of desired classes
corresponds to the number of output nodes in the last layer
of the LRF network. This classified image is of data
storage type INTEGER.
Input Arguments
The following arguments must be specified in the following
order when calling this lib routine:
image is the original input image, which may be a
multi-band image containing all of the feature
bands used in the classification. This image MUST
be of data storage type FLOAT.
cc_img is the "cluster center" image, which specifies
the cluster center locations in the feature space.
This image MUST contain the desired class informa-
tion, obtained from the supervised classification
step, as the last band in the image. Therefore
this image will contain one more band than the
input image. This image MUST be of data storage
type FLOAT.
AUTHOR
Tom Sauer and Charlie Gage
COPYRIGHT
Copyright 1991, University of New Mexico. All rights
reserved.
******************************************************************************
Documentation for avs module lrftrain
INPUT
image the input image used for training.
cc_img the cluster center image specifying the clus-
ter centers.
var_img the cluster variance image specifying the
variances.
cn_img the cluster number image specifying which
vector/pixel belongs to which cluster.
converge the convergence parameter.
meu the weight update parameter.
border the border width in pixels of the input
image.
max_iter the maximum number of iterations until termi-
nation.
prt_mse the iteration interval for printing the MSE
to the stats file.
delta_mse the minimum change in the MSE between itera-
tions, for termination.
OUTPUT
wt_img the resulting weight image after training.
printdev the file containing the training statistics.
This routine was written with the help of and ideas from Dr.
Don Hush, University of New Mexico, Dept. of EECE.
DESCRIPTION
llrftrain trains on an image for the weights used with the
Localized Receptive Field classifier (see llrfclass). The
Localized Receptive Field (LRF) is based on a single layer
of self-organizing, "localized receptive field" units, fol-
lowed by a single layer perceptron. The single layer of
perceptron units use the LMS or Adaline learning rule to
adjust the weights.
LRF network theory
The basic network model of the LRF consists of a two layer
topology. The first layer of "receptive field" nodes are
trained using a clustering algorithm, such as K-means, or
some other algorithm which can determine the receptive field
centers. Each node in the first layer computes a receptive
field response function, which should approach zero as the
distance from the center of the receptive field is
increased. The second layer of the LRF model sums the
weighted outputs of the first layer, which produces the out-
put or response of the network. A supervised LMS rule is
used to train the weights of the second layer nodes.
The response function of the LRF network is formulated as
follows:
f(x) = SUM(Ti * Ri(x))
where,
Ri(x) = Q( ||x - xi|| / Wi )
x - is a real valued vector in the input space,
Ri - is the ith receptive field response function,
Q - is a radially symmetric function with a single
maximum at the origin, decreasing to zero at
large radii,
xi - is the center of the ith receptive field,
Wi - is the width of the ith receptive field,
Ti - is the weight associated with each receptive
field.
The receptive field response functions ( Ri(x) ), should be
formulated such that they decrease rapidly with increasing
radii. This ensures that the response functions provide
highly localized representations of the input space. The
response function used here is modeled after the Gaussian,
and uses the trace of the covariance matrix to set the
widths of the receptive field centers.
The weights for the output layer are found using the LMS
learning rule. The weights are adjusted at each iteration
to minimize the total error, which is based on the differ-
ence between the network output and the desired result.
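A minimal sketch of one LMS (Adaline) step for an output node; the learning rate is named after the meu parameter above, and the exact update form used by llrftrain is an assumption:

```python
def lms_update(weights, bias, responses, desired, meu):
    """One LMS step: w <- w + meu * (d - y) * r, where r are the
    receptive field responses and (d - y) is the output error."""
    y = bias + sum(w * r for w, r in zip(weights, responses))
    err = desired - y
    new_w = [w + meu * err * r for w, r in zip(weights, responses)]
    new_b = bias + meu * err
    return new_w, new_b, err
```

Iterating this step over the training vectors drives the mean squared error down until the converge or delta_mse criterion is met.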
Prior to using the LRF algorithm, it is necessary to run
"vkmeans" on the input training image to fix the cluster
centers, followed by a supervised classification of the
clustered image, which assigns a desired class to each clus-
ter center. NOTE that the image resulting from the super-
vised classification MUST be appended to the "cluster
center" image before running the LRF. This is necessary
since it makes the appropriate desired class assignments to
the cluster centers for the training phase of the LRF.
The number of receptive field response nodes in the first
layer of the LRF is determined by the number of cluster
centers in the "cluster center" image. The number of output
classes, and hence the number of output nodes in the second
(i.e., last) layer, is determined by the number of desired
classes that was specified in the "supervised" classifica-
tion phase of the clustering. This information is contained
in the last band of the cluster center image. The number of
weights in the network is determined by the number of recep-
tive field response nodes and the number of output nodes.
That is,
#Wts = (#rf_response_nodes * #output_nodes) +
#output_nodes
Input Arguments
The following arguments must be specified in the following
order when calling this lib routine:
image is the original input image, which may be a
multi-band image containing all of the feature
bands used in the classification. This image MUST
be of data storage type FLOAT.
cc_img is the "cluster center" image, which specifies
the cluster center locations in the feature space.
This image MUST contain the desired class informa-
tion, obtained from the supervised classification
step, as the last band in the image. Therefore
this image will contain one more band than the
input image. This image MUST be of data storage
type FLOAT.
var_img is the "cluster variance" image, which specifies
the variances of the data associated with each
cluster center. This image should contain the
same number of data bands as the input image.
This image MUST be of data storage type FLOAT.
cn_img is the "cluster number" image, which specifies
which vector or pixel belongs to what cluster.
This image will be a single band image of the same
dimensions as the input image. This image MUST be
of data storage type INTEGER.
converge is a float value that specifies the convergence
value for the algorithm. When the current MSE
value reaches the specified convergence value, the
algorithm will terminate. This can be any float
value greater than or equal to zero.
meu is a float value that specifies the weight update
parameter for the LMS learning rule.
wt_img is the resulting weight image after training.
This image will be stored as data storage
type FLOAT.
printdev is a file specified by the printdev argument that
contains the statistics for the training phase of
the LRF.
This routine was written with the help of and ideas from Dr.
Don Hush, University of New Mexico, Dept. of EECE.
SEE ALSO
lrftrain(1), intro(3), vipl(3), verror(3), vutils(3),
llrfclass(3)
RESTRICTIONS
All input images except the "cluster number" image (cn_img)
MUST be of data storage type FLOAT. The "cluster number"
image (cn_img) MUST be of data storage type INTEGER. The
output "weight" image (wt_img) is of data storage type
FLOAT.
AUTHOR
Tom Sauer and Charlie Gage
COPYRIGHT
Copyright 1991, University of New Mexico. All rights
reserved.
*******************************************************************************
Documentation for avs module viso2
INPUT
*img input image
*cc_img input cluster center image
min_pts_allowed
min number of vectors per cluster
max_cl_allowed max number of clusters
n_clusters initial number of clusters
max_n_iters_iso
max iso data iterations
max_n_iters_kmeans
max kmeans data iterations
border image border
cc_flg initial cluster locations
*init_cluster[]
split_factor splitting factor
merge_factor merging factor
placement splitting placement factor
split_converge splitting convergence
merge_converge merging convergence
*printdev2 output ascii file descriptor
OUTPUT
*img output image
**outcc_img cluster center image
**outvar_img variance image
DESCRIPTION
viso2 converts an input image into vectors of equal size
and performs the iso2 clustering algorithm on the vectors
using the K initial cluster centers. The size of each vec-
tor is determined by the number of bands in the input image.
viso2 is a clustering algorithm which is derived from the
ISODATA clustering algorithm. viso2 initially clusters the
data using a standard K-means clustering algorithm. Then,
any cluster, along with its associated vectors, is discarded
if the membership of that cluster is below a user specified
threshold. Next, clusters exhibiting large variances are
split in two, and clusters that are too close together are
merged. After these operations, K-means is reiterated, and
the iso2 rules tested again. This sequence is repeated
until no clusters are discarded, split, or merged, or until
the maximum number of iterations has been met.
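That control flow can be summarized as a skeleton (structural sketch only; kmeans and the three rule functions are placeholders for the operations described below, not library routines):

```python
def iso2(vectors, centers, kmeans, discard_small, split_large, merge_close,
         max_iters):
    """Cluster with k-means, then apply the iso2 rules; repeat until
    nothing is discarded, split, or merged, or max_iters is reached."""
    membership = None
    for _ in range(max_iters):
        centers, membership = kmeans(vectors, centers)
        changed = False
        for rule in (discard_small, split_large, merge_close):
            centers, did_change = rule(vectors, centers, membership)
            changed = changed or did_change
        if not changed:
            break
    return centers, membership
```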
The merging and splitting rules used in iso2 are similar to
those used in ISODATA, except that several modifications
have been made which simplify parameter selection for the
user. The most significant change is that the splitting and
merging rules are now relative to the statistics of the data
rather than being dependent on absolute limits, and these
relative limits can be made to converge so that splitting
and merging are reduced as the algorithm progresses.
Secondly, the splitting rules are no longer dependent on a
user specified "desired number of clusters" nor on the
iteration index. Defining the splitting and merging
requirements so that they are relative to the data set
removes the need for the user to know the range of the data,
and makes characterization of the algorithm for the general
case possible. The methods of splitting and merging are
described below.
Splitting: relative standard deviation approach
The standard deviation in each dimension of each cluster is
computed, and the average of these standard deviations cal-
culated. The maximum standard deviation for each cluster is
then compared against the average standard deviation times a
user specified scaling factor (splitting factor). If the
maximum standard deviation is greater than that value, the
cluster is split in two in the dimension of maximum standard
deviation. Two other splitting conditions also exist: (1)
the number of points in the cluster must satisfy a minimum
condition, and (2) the average distance between members of a
cluster and their cluster center must be less than the aver-
age of this distance over all clusters. After all eligible
clusters have been split, the splitting factor is updated.
A cluster can only be split once per iteration.
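The eligibility test can be sketched as follows (illustrative names; the minimum point count and average-distance conditions, and the once-per-iteration rule, are left to the caller):

```python
def clusters_to_split(stds, split_factor):
    """stds: for each cluster, its standard deviation in each dimension.
    A cluster is eligible when its maximum per-dimension standard
    deviation exceeds the overall average standard deviation times the
    user's splitting factor."""
    all_stds = [s for per_cluster in stds for s in per_cluster]
    threshold = (sum(all_stds) / len(all_stds)) * split_factor
    return [i for i, per_cluster in enumerate(stds)
            if max(per_cluster) > threshold]
```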
Merging: relative distance approach
The merging process is similar to the splitting process
described above. The pairwise distances between cluster
centers are computed as is done in the original ISODATA
algorithm and, in addition, the average pairwise distance is
computed. If any distance between two cluster centers is
less than the average pairwise distance multiplied by a
scaling factor (merging factor), the two clusters are
merged. The user has the option to specify the maximum
number of cluster pairs that can be merged. Cluster pairs
with the smallest pairwise distances are merged first, and a
cluster can only be merged once per iteration. After all
eligible clusters are merged, the merging factor is updated.
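The relative-distance merge test can be sketched similarly (illustrative; the user's limit on merged pairs and the once-per-iteration rule are left to the caller):

```python
import math

def pairs_to_merge(centers, merge_factor):
    """Return index pairs of cluster centers whose pairwise distance is
    below the average pairwise distance times the merging factor,
    smallest distances first."""
    dists = [(math.dist(centers[i], centers[j]), i, j)
             for i in range(len(centers))
             for j in range(i + 1, len(centers))]
    avg = sum(d for d, _, _ in dists) / len(dists)
    return [(i, j) for d, i, j in sorted(dists) if d < avg * merge_factor]
```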
If the cc_flg is set to TRUE, then viso2 will use the
*cc_img image for the initial cluster center values;
otherwise it will use the *init_cluster[] array of x-y
image coordinates. viso2 takes the list of x-y coordinates,
maps them to those locations in the image, and uses the
pixel values found there as the initial cluster centers.
viso2 will output a cluster center image, which is an image
of 1 row by N columns, where N is the number of clusters.
The variance image is a 1 row by N column image that
specifies the variance for each cluster center. A cluster
number image is also computed; each pixel of this image
holds the number of the cluster to which the vector (pixel)
at that location in the original image now belongs. A
statistics file is also generated and output to the
printdev device.
Some Conventions:
viso2 can be run in 2 modes: one produces all valid centers
and numbers, and one produces a cluster number without a
corresponding center. If minimum points (min_pts_allowed)
is set such that some vectors (pixels) are discarded, then
these vectors (pixels) are not associated with any cluster.
These vectors (pixels) will be assigned to cluster 0, and
the cluster center and variance values will be set to zero.
This means that viso2 will produce N+1 clusters. This is
considered the incompatible mode; its output is still valid
input for other algorithms, but may give strange results.
To avoid this, use min_pts_allowed = 1. If no vectors
(pixels) are discarded, then cluster number 0 will be a
valid cluster and contain a non-zero center and variance.
In this case there will be N resulting clusters.
Even though this man page contains much of the information
found in the manual page for viso2, it is highly recommended
that one read the viso2 manual page. It further explains
parameters such as the merge and split convergence.
All input images must be of data storage type FLOAT.
struct xvimage *img
Pointer to the input image to be clustered. Must be
data storage type FLOAT.
struct xvimage **outcc_img
Address of image that will contain the resulting clus-
ter center image.
struct xvimage **outvar_img
Address of image that will contain the resulting clus-
ter variance image.
int min_pts_allowed
The minimum number of vectors per cluster allowed.
int max_cl_allowed
The maximum number of clusters desired.
int n_clusters
The initial number of cluster centers to start with.
int max_n_iters_iso
Maximum number of isodata iterations.
int max_n_iters_kmeans
Maximum number of k-means iterations per isodata
iteration.
int border
Specifies the border size, number of pixels, in the
image that will be ignored during clustering. The
border size is equal on all sides.
int cc_flg
This flag is set if the initial cluster centers will be
specified by the *cc_img.
struct xvimage *cc_img
Pointer to the image that will contain the initial
input cluster centers. This image must have the same
number of data bands as the *img does. cc_flg must be
set for this to be used.
int *init_cluster[]
A 2-d array containing the initial cluster center posi-
tions in the input image. The array is organized as a
list of x-y pairs that correspond to the positions of
the initial cluster centers in the input image. This is
used only if cc_flg is set to FALSE.
float split_factor
specifies the splitting factor.
float merge_factor
specifies the merge factor.
float placement
Specifies the distance between cluster centers when
they are split.
float split_converge
Specifies the splitting convergence.
float merge_converge
Specifies the merge convergence.
FILE * printdev2
Specifies the output device for the statistics informa-
tion.
SEE ALSO
viso2(1), intro(3), vipl(3), verror(3), vutils(3)
RESTRICTIONS
All input images must be of data storage type FLOAT.
AUTHOR
Tom Sauer, Donna Koechner
COPYRIGHT
Copyright 1991, University of New Mexico. All rights
reserved.
*******************************************************************************
Documentation for avs module vkmeans
INPUT
img input image structure.
img1_flag Output cluster centers in the "img1" image.
sqd_img_flag Output cluster variance in the "sqd_img"
image.
iter maximum number of iterations.
cluster number of desired clusters.
border specifies number of border pixels.
init_cluster array of initial cluster centers.
cc_flg indicator to use cluster center image as
input.
cc_img cluster center image structure.
map_flag If true output cluster center values in the
map of the cluster number image. If false do
not output cluster center values in the map
of the cluster number image.
OUTPUT
img1 image structure for cluster center image.
DESCRIPTION
vkmeans converts an input image into vectors of equal size
and performs the kmeans clustering algorithm on the vectors
using the K initial cluster centers. The size of each vec-
tor is determined by the number of bands in the input image.
The K-means algorithm is based on the minimization of the
sum of the squared distances from all points in a cluster to
a cluster center. The user chooses K initial cluster
centers and the image vectors are iteratively distributed
among the K cluster domains. New cluster centers are com-
puted from these results, such that the sum of the squared
distances from all points in a cluster to the new cluster
center is minimized.
Although the k-means algorithm does not really converge, a
practical upper limit is chosen for convergence. The user
has the option of specifying the maximum number of itera-
tions using the iter variable.
Results obtained by the k-means algorithm can be influenced
by the number and choice of initial cluster centers and the
geometrical properties of the data.
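The iteration described above can be sketched as follows (a minimal illustration of standard k-means, not the vkmeans routine itself):

```python
import math

def kmeans(vectors, centers, max_iters):
    """Assign each vector to its nearest center, recompute each center
    as the mean of its members, and repeat until the assignments stop
    changing or max_iters is reached."""
    assign = None
    for _ in range(max_iters):
        new_assign = [min(range(len(centers)),
                          key=lambda k: math.dist(v, centers[k]))
                      for v in vectors]
        if new_assign == assign:
            break
        assign = new_assign
        for k in range(len(centers)):
            members = [v for v, a in zip(vectors, assign) if a == k]
            if members:  # leave an emptied cluster's center in place
                centers[k] = [sum(c) / len(members) for c in zip(*members)]
    return centers, assign
```

With well-separated data the assignments stabilize quickly, which is why a practical iteration limit works despite the lack of a true convergence guarantee.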
All input images must be of data storage type FLOAT.
User variables for the k-means algorithm are passed in as:
cc_flg If the cc_flg is set, a cluster center image,
passed in as cc_img, is used as input for the
K initial cluster centers. This image MUST
contain the same number of data bands as the
input image (img).
cc_img an image of type struct xvimage that contains
the first K cluster center values. This image
must have the same number of data bands as
img and as cluster.
sqd_img an image of type struct xvimage that will
contain the variance of the cluster centers.
This image must have the same number of data
bands as img and as cluster.
sqd_img_flag If this flag is set to TRUE, the "sqd_img"
image will contain the variance of the clus-
ter centers. If set to false, this informa-
tion is not output.
img a multiband image of type struct xvimage that
contains the data to be clustered.
img1 The cluster centers for the output image are
returned in the struct xvimage structure
img1.
img1_flag If this flag is set to TRUE, the "img1" image
will contain the cluster centers. If set to
false, this information is not output.
map_flag If "map_flag" is set to TRUE, the cluster
centers are output in the map of the cluster
number image "img". The number of rows in
the map is equal to the number of desired
clusters, while the number of columns is
equal to the dimensionality of the data.
Also, the ispare1 field in the image header
is set to 0 and the ispare2 field in the
image header is set to the dimension of the
vector. By setting the "map_flag" to true,
the output of vkmeans is compatible with the
spectrum classification application.
iter the maximum number of iterations that the
algorithm will perform to achieve the clus-
tering.
cluster the number of desired clusters; MUST match
the number of clusters in the cluster center
image (cc_img) or in the array of initial
cluster centers (init_cluster[]).
border specifies the number of border pixels, encom-
passing the desired region of the image to be
clustered using the k-means algorithm. Class
zero is reserved for the border pixels, while
the remainder of the image uses classes one
through k. The number of classes corresponds
to the number of cluster centers specified in
the variable cluster.
*init_cluster[]
The array of initial cluster centers
(init_cluster[][]) must be passed in as x-y
pairs, which specify the coordinates of the
initial cluster centers for the k-means algo-
rithm.
printdev The cluster centers are written to a file
specified by the 'printdev' argument.
This routine was written with the help of and ideas from Dr.
Don Hush, University of New Mexico, Dept. of EECE.
SEE ALSO
vkmeans(1), intro(3), vipl(3), verror(3), vutils(3)
RESTRICTIONS
All input images must be of data storage type FLOAT.
AUTHOR
Tom Sauer, Charlie Gage
COPYRIGHT
Copyright 1991, University of New Mexico. All rights
reserved.
*******************************************************************************
Documentation for avs module label
INPUT
OUTPUT
DESCRIPTION
vlabel performs a labeling in a multiband image or a clus-
ter image by attempting to merge connected pixels.
The principle of the algorithm is as follows: A pixel
receives the same label as its neighbor if the likelihood
distance between the two pixels is acceptable. The label
process is propagated for a given region number until it is
no longer possible to find a candidate.
Three different types of labeling choices exist:
1 First choice: typ = 0 uses a single or multi band
image, where the data storage must be VFF_TYP_FLOAT.
The input image corresponds to the xvimage structure
ima.
The distance will be computed using all the bands of
the image.
2. Second choice: typ = 1 uses a cluster number and clus-
ter center image obtained from an algorithm like
vkmeans or vquant. For this case, the cluster center
represents the value of a class of pixels that have
been grouped together. Therefore, the distance between
two neighbors will be the distance between their clus-
ters.
This case will require less computation time because
the algorithm will only compute the inter-class dis-
tance, instead of computing the distance of two neigh-
bors for the entire image.
An additional advantage with this choice, is that the
output from algorithms such as vkmeans or vquant may be
utilized, which may lead to better results.
The cluster center image corresponds to the xvimage
structure cc and must be FLOAT.
The cluster number image corresponds to the xvimage
structure cn and must be VFF_TYP_4_BYTE compatible with
either cc or ima depending on typ.
3. Third choice: typ = 2 is to use a single or multi-
band input image associated with a cluster number
image.
The advantage of this choice is that the results of a
clustering algorithm are used to keep the neighbor pix-
els that have the same cluster number in the same
class, and to rely on the distance in the single or
multi band image to group two neighbors that do not
belong to the same cluster.
The algorithm also requires the following parameters:
Metric distance: There are 2 different metric distances that
can be used.
dist = 1 uses Euclidean distance: sqrt[(x-s)^2 + (y-t)^2].
dist = 2 uses City Block distance: |x-s| + |y-t|.
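For pixel values (x, y) and (s, t), the two metrics are simply (function names are illustrative):

```python
import math

def euclidean(x, y, s, t):
    # dist = 1: sqrt[(x-s)^2 + (y-t)^2]
    return math.sqrt((x - s) ** 2 + (y - t) ** 2)

def city_block(x, y, s, t):
    # dist = 2: |x-s| + |y-t|
    return abs(x - s) + abs(y - t)
```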
Connectivity: There are two possible neighborhoods:
connect = 1 uses 4 connectivity to link the pixels
together.
connect = 2 uses 8 connectivity.
Minimum size of a region:
surf: determines the number of pixels required for a
region to be retained. The minimum number of pixels is
equal to:
Total number of pixels in image * surf / 100.0 (surf
corresponds to a percentage of the total number of pix-
els in the image).
Border Size:
Each pixel in the image is updated except those outside of
the border. The size of the border is specified by the
parameter bord.
Merging Process:
When the labeling process is computed, the user can expect
that the small rejected regions will be merged together
into a bigger acceptable region or included inside another
connected region.
This choice is used by passing the parameter merge set to
TRUE. If merge is set to FALSE, the small regions will be
ignored and labeled as an UNDEFINED REGION (label number 0),
the same as the border.
The AUTOMATIC or MANUAL OPTION:
This option allows the user either to fix a threshold, or to
give an approximate number of regions. The user can choose
these options by passing a value TRUE or FALSE in the param-
eter opt_auto.
If the AUTOMATIC option is used, the algorithm will iterate
on the threshold until the number of regions labeled by the
process is comparable to the number of expected regions.
In fact, if the expected number is not reached after 30
iterations, the threshold that gives the closest number of
regions is used for the final labeling.
This option is easy to use; nevertheless, the function
number of regions = F(Threshold) is not a monotonically
increasing function, and convergence toward a solution
may not exist.
reg_num (only used when opt_auto equals 1 or TRUE)
determines the approximate final number of regions
expected.
split (only used when opt_auto equals 0 or FALSE)
determines the threshold used by the labeling process.
Decreasing this value will increase the number of
regions found during the labeling process, but these
regions will get smaller and could be rejected by the
minimum size threshold.
Increasing this value will decrease the number of
regions found during the labeling process. At the same
time the number of small regions will decrease, which
means that this part of the curve number of regions =
F(Threshold) is more stable than the other.
Once the user becomes accustomed to this routine, good
results are generally obtained. One way to become fam-
iliar with the routine, is to use the automatic option
and analyze the output ASCII file (Statistics on the
iteration process). This file contains the number of
regions labeled for each iteration, allowing the user
to see how the number of regions changes as the THRES-
HOLD is changed.
printdev File descriptor of the ascii file that will contain
all the information relative to the labeling process.
Output File lab will hold the labeled image, in which every
pixel's value is its region number. This image is of
VFF_TYP_4_BYTE data storage type. The region label numbers
run from 1 to N. Region number 0 is reserved as an UNDEFINED
label or for the border.
SEE ALSO
vlabel(1), intro(3), vipl(3), verror(3), vutils(3)
lvkmeans(3), lviso2(3), lvquant(3), vkmeans(1), viso2(1),
vquant(1).
RESTRICTIONS
ima must be of VFF_TYP_FLOAT data storage type.
cc must be of VFF_TYP_FLOAT data storage type and compatible
with cn.
cn must be of VFF_TYP_4_BYTE data storage type and compati-
ble with cc or ima.
AUTHOR
Pascal ADAM
COPYRIGHT
Copyright 1991, University of New Mexico. All rights
reserved.
*******************************************************************************
Documentation for avs module vmindis
INPUT
img the input image that is to be classified.
center the image containing the center values (or
prototype vectors) with the last data band
being the class image.
border the border width on the image. The border
corresponds to a no class assignment.
OUTPUT
img a single band classified image.
DESCRIPTION
vmindis is a simple minimum distance classifier. The
distance metric used is the Euclidean distance. This routine
accepts two images as input. The image that corresponds to
the variable img is the image to be classified. This image
can have as many data bands as necessary. The number of data
bands determines the dimensionality of the vectors: each
pixel in the image is a vector whose dimensionality is the
number of data bands.
The other input image which corresponds to the variable
center is the prototype image. The last data band of this
image is the class data band. This image must contain the
same number of data bands as the other input image plus an
extra data band that represents the class mapping. This
image would most likely have been created by vkmeans or some
other routine that will give cluster centers. This image
contains vectors that correspond to the prototype of each
class.
As stated above the center image's last data band is a class
data band. The class data band simply maps each vector in
the center image to the final class.
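The classification step can be sketched as follows (an illustration, not the library routine; each pixel vector is assigned the class of the nearest prototype):

```python
import math

def classify(pixel, prototypes, classes):
    """pixel: feature vector for one pixel; prototypes: one center
    vector per prototype; classes: the class label carried in the
    center image's last data band for each prototype."""
    nearest = min(range(len(prototypes)),
                  key=lambda i: math.dist(pixel, prototypes[i]))
    return classes[nearest]
```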
The border variable allows the user to specify a border
width, in pixels, encompassing the image. The border region
is skipped by vmindis when classification is performed. This
is useful if neighborhood operators have been used
previously and have not updated the edge pixels.
All input images must be of data storage type FLOAT.
img - struct xvimage the input image that is to be
classified.
center - struct xvimage the image containing the
center values (or prototype
vectors) with the last data
band being the class image.
border - int the border width on the image.
The border corresponds to a no
class assignment.
vmindis will return a 1 upon success and a 0 upon failure.
SEE ALSO
vmindis(1), intro(3), vipl(3), verror(3), vutils(3)
RESTRICTIONS
All input images must be of data storage type FLOAT.
AUTHOR
Tom Sauer
COPYRIGHT
Copyright 1991, University of New Mexico. All rights
reserved.
*******************************************************************************
Documentation for avs module vqerr
INPUT
img1 quantized VIFF image structure
img2 original VIFF image structure
img3 gating mask image
mflg a flag set (equal 1) if gating mask available
rms pointer to a DOUBLE to receive the RMS error
OUTPUT
rms is modified to contain the RMS quantization
error
Return Value: 1 on success, 0 on failure.
DESCRIPTION
vqerr computes the RMS quantization error between a quan-
tized image and the original image.
The first input image is the quantized image and should have
maps attached. The data type of the first image should be
BYTE or INT. The map should be of the same data type as the
data in the second image (the original). The original image
should have no maps attached.
The mask image must be the same data storage type and size
as the first input image.
The number of columns in the map of the first input image
should match the number of bands in the second (original)
input image.
The output is in text format, with the RMS quantization
error printed using the %e format.
SEE ALSO
vqerr(1), intro(3), vipl(3), verror(3), vutils(3)
RESTRICTIONS
vqerr will not operate on VFF_TYP_BIT, VFF_TYP_COMPLEX, or
VFF_TYP_DOUBLE data storage types.
COPYRIGHT
Copyright 1991, University of New Mexico. All rights
reserved.
*******************************************************************************
Documentation for avs module vquant
INPUT
i pointer to vector image to be quantized.
nvecs number of vectors to generate
mapflg output image map enable
ccimage cluster center output image structure
cvimage cluster variance output image structure
sp split point method: 0 is mid-span, 1 is mean
axis split axis: 1 is max-span axis, 2 is max
variance axis, 3 is principal eigenvector
OUTPUT
i the image pointed to by i on input is modified
to contain the single-band classification
pixel plane and the map is modified so that it
contains the representative vectors.
DESCRIPTION
vquant performs N-dimensional vector quantization, also
known as non-parametric vector classification. The technique
is based on Paul Heckbert's median cut, with the modifica-
tion that the histograms may have many independent axes, and
are stored in a sparse data structure. Additionally, the
splitting of a cluster is governed by the 2-norm of the sub-
space spanned by that cluster.
When splitting a cluster, the split axis is normally chosen
to be the axis with the largest span across the enclosed
data points, and the split point is the middle of the span.
By setting the variance split flag, the split axis is chosen
to be the axis with the largest variance of data across that
axis, which should give better results at the cost of more
computation. When the variance split flag is selected, the
split point is at the mean of the distribution of data
along the split axis.
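One split decision of the median-cut scheme can be sketched as
follows. This covers only the default max-span axis (with either the
mid-span or mean split point); the max-variance and eigenvector modes
follow the same pattern. All names are illustrative, and the points
are assumed to be stored row-major.

```c
/* Sketch of one median-cut split decision: pick the axis with the
 * largest span over the n points of dimension d, and report the split
 * point (mid-span when use_mean == 0, mean when use_mean == 1). */
int choose_split(const float *pts, int n, int d,
                 int use_mean, float *split_point)
{
    int axis = 0;
    float best_span = -1.0f;
    for (int a = 0; a < d; a++) {
        float lo = pts[a], hi = pts[a], sum = pts[a];
        for (int i = 1; i < n; i++) {
            float v = pts[i * d + a];
            if (v < lo) lo = v;
            if (v > hi) hi = v;
            sum += v;
        }
        if (hi - lo > best_span) {
            best_span = hi - lo;
            axis = a;
            *split_point = use_mean ? sum / (float)n
                                    : 0.5f * (lo + hi);
        }
    }
    return axis;
}
```

The cluster is then partitioned on this axis at the split point, and
the process repeats until nvecs clusters exist.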
SEE ALSO
vquant(1), intro(3), vipl(3), verror(3), vutils(3)
RESTRICTIONS
vquant will currently only work on images of type
VFF_TYP_FLOAT. Additionally, the output is currently res-
tricted to type VFF_TYP_1_BYTE, and this in turn restricts
the maximum number of representative vectors to 256.
Additionally, vquant suffers from the same statistical
influences that vgamut does; see vgamut(1) for more details.
AUTHOR
Scott Wilson
COPYRIGHT
Copyright 1991, University of New Mexico. All rights
reserved.
*******************************************************************************
Documentation for avs module vwmdd
INPUT
img the input image that is to be classified.
center the image containing the center values (or
prototype vectors) with the last data band
being the class image. This image must have n
+ 1 data bands where n is the number of data
bands in img.
varimg the image containing the variance values for
the clusters. Should have the same number of
data bands as the input image (img).
border the border width on the image. The border
corresponds to a no class assignment. i.e.
the border region is ignored during the clas-
sification process.
k_factor scalar, fudge factor.
use_avg 1 - use summing method 0 - use non-summing
method
OUTPUT
img a single band classified image.
This routine was written with the help of and ideas from Dr.
Don Hush, University of New Mexico, Dept. of EECE.
DESCRIPTION
vwmdd is a weighted minimum distance detector, or
classifier. The main idea behind this detector is that it
will distinguish a single class from the rest of the data.
However, it will also work to distinguish multiple classes
from each other.
Theory
To classify or detect spatial patterns in an image, we must
have a template to compare against. Creating a template can
be thought of as training on representative (ideal) data.
This can be done by using a clustering algorithm on the
representative data. Clustering routines such as vkmeans,
vquant, isodata, etc. can be used to create a template. The
idea here is to over-cluster the representative data and
quantize it into k classes. Pseudo color in xvdisplay and
vcustom can be used to create a class image. When the
clustering is performed, two items need to be computed: (1)
the cluster
center values, and (2) the variances for each axis in the
cluster. The variances can be obtained by looking at the
diagonal of the covariance matrix.
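The per-axis variances (the diagonal of the covariance matrix) for
one cluster can be sketched as follows. This is an illustrative
helper, not a library routine; it uses the population form (dividing
by n).

```c
/* Sketch: per-axis variances of one cluster of n points of dimension
 * d, i.e. the diagonal terms of its covariance matrix. */
void cluster_variances(const float *pts, int n, int d, float *var)
{
    for (int a = 0; a < d; a++) {
        double mean = 0.0, ss = 0.0;
        for (int i = 0; i < n; i++)
            mean += pts[i * d + a];
        mean /= (double)n;
        for (int i = 0; i < n; i++) {
            double dv = pts[i * d + a] - mean;
            ss += dv * dv;
        }
        var[a] = (float)(ss / (double)n);
    }
}
```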
Explained below is a method of creating the class assign-
ments for each cluster center. The class assignments may,
for example, quantize the space from 12 clusters to 2
classes.
Example
     Cluster Center            Class
      1 ---------------------    2
      2 ---------------------    2
      3 ---------------------    1
      4 ---------------------    2
      .
      .
      .
     12 ---------------------    1
This routine expects an image that contains the diagonal
terms of the covariance matrix, called the variance image,
and an image that contains the cluster center values and
class assignment for each cluster center. And of course the
image to be classified. This detector is to approximate the
following equation:
     (X - Mi)~ Inv(C) (X - Mi)                        (1)
where Mi is the mean, Inv(C) is the inverse of the covari-
ance matrix, X is the data point, and ~ denotes transpose.
This can be written as
                     -                 -
                     | Q1              |
                     |     Q2        0 |
     (d1 d2 ... dn)~ |        .        | (d1 d2 ... dn)  (2)
                     |          .      |
                     | 0          .    |
                     |              Qn |
                     -                 -
Which equals
     sq(d1)   sq(d2)           sq(dn)
     ------ + ------ + ..... + ------                 (3)
       Q1       Q2               Qn
Notation:
sq => square
Qi => the ith variance
di => the ith (X - Mi) value
the matrix is actually the inverse of C
Since the inverse of the covariance matrix cannot be easily
determined, we consider only the diagonal terms, or simply
the variance terms.
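The diagonal approximation of equation (3) can be sketched as
follows; this is an illustrative helper, with x the data point, m the
cluster center Mi, and q the variance terms Qi.

```c
/* Sketch of equation (3): squared differences weighted by the
 * per-axis variances Qi, the diagonal approximation to the
 * Mahalanobis distance (X - Mi)~ Inv(C) (X - Mi). */
double diag_mahalanobis_sq(const float *x, const float *m,
                           const float *q, int n)
{
    double sum = 0.0;
    for (int j = 0; j < n; j++) {
        double d = x[j] - m[j];
        sum += d * d / q[j];
    }
    return sum;
}
```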
There are two methods for detecting classes in the input
image. Each data point is put through a test based on the
ideas above. The detector works in two modes as follows:
(1) the summed method (-a option set to yes). This is an
approximation to method (2).
          sq( || X - Mi || )   <
     Vi = ------------------   >  1
              K x sq(Si)
     where: sq( || X - Mi || ) = sq(di)
                               = Euclidean distance
                                 squared
            sq(Si) = trace(C)
            the trace of C is the sum of the diagonal
            terms (variances).
            K is a constant
(2) the non-summed method (default).
          sq(d1)   sq(d2)           sq(dn)   <
     Vi = ------ + ------ + ..... + ------   >  1
          Q1 x K   Q2 x K           Qn x K
     where: sq(dj) = sq( || xj - Mi || )
                   = Euclidean distance squared of the
                     jth element of the vector X.
            Qj = the jth variance element of the variance
                 vector for the ith cluster center.
            K = a constant
In both cases the constant K is used. This is used to adjust
the variance value(s). If K is large, Qi can increase such
that Vi is always less than 1. There are no specific rules
as to the optimal value for K.
The data point X is put through the test for each cluster,
and X is then assigned to a class based on the following
criteria:
1. choose the smallest Vi value, where Vi is the result
for cluster i.
2. If Vi is greater than 1, assign X to the NULL class
(the NULL class is always 0). Otherwise assign X to the
class that corresponds to the ith cluster.
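The two tests above can be sketched for a single cluster as follows.
This is an illustrative helper, not the library routine: x is the
data point, m the cluster center, q the cluster's variance vector,
and k the k_factor; the caller would evaluate it for every cluster,
keep the smallest Vi, and assign the NULL class when that Vi exceeds 1.

```c
/* Sketch of the vwmdd test value Vi for one cluster.
 * Summed mode (use_sum == 1):  Vi = ||X - Mi||^2 / (K x trace(C))
 * Non-summed mode (default):   Vi = sum_j sq(dj) / (Qj x K)        */
double vwmdd_test(const float *x, const float *m, const float *q,
                  int n, double k, int use_sum)
{
    double v = 0.0;
    if (use_sum) {
        double d2 = 0.0, trace = 0.0;
        for (int j = 0; j < n; j++) {
            double d = x[j] - m[j];
            d2 += d * d;       /* Euclidean distance squared */
            trace += q[j];     /* trace(C): sum of variances */
        }
        v = d2 / (k * trace);
    } else {
        for (int j = 0; j < n; j++) {
            double d = x[j] - m[j];
            v += d * d / (q[j] * k);  /* per-axis weighted term */
        }
    }
    return v;
}
```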
The scale factor (k_factor) must be chosen by trial and
error. It has been found that if the summed method is being
used, a small value for k (1.0 - 3.0) is sufficient. If the
summed method is not being used, a larger value for k (8.0
- 10.0) is sufficient. Note, this may change based on the
data, so these ranges are only a starting point.
All input images must be of data storage type FLOAT.
img - struct xvimage the input image that is to be
classified. This is also used
as the output classified
image, so must be careful not
to overwrite important data.
center - struct xvimage the image containing the
center values (or prototype
vectors) with the last data
band being the class image.
This image must contain n + 1
data bands, where n = # of
data bands in the img image,
and 1 data band that specifies
the class assignments.
varimg - struct xvimage the image containing the vari-
ance values. This image must
have the same row and column
size of the center image, and
have the same number of data
bands as the input image
(img).
border - int the border width on the image.
The border corresponds to a no
class assignment. The border
area is ignored during the
classification.
k_factor - float the fudge factor, explained in
detail above.
use_avg - int if use_avg = 1, use the sum-
ming method, if use_avg = 0
use the non-summing method, as
explained above.
vwmdd will return a 1 upon success and a 0 upon failure.
SEE ALSO
vwmdd(1), intro(3), vipl(3), verror(3), vutils(3)
lvkmeans(3)
RESTRICTIONS
All input images must be of data storage type FLOAT.
AUTHOR
Tom Sauer
COPYRIGHT
Copyright 1991, University of New Mexico. All rights
reserved.
*******************************************************************************