PLEASE NOTE: This is the documentation for the avs module executable classify, which contains the following modules: lrfclass, lrftrain, viso2, vkmeans, vlabel, vmindis, vqerr, vquant, vwmdd. Any mention of xvimage is actually a "field 2D". Also, the INPUTs and OUTPUTs, which are mapped to avs parameters, inputs, and outputs, are for the khoros library routine.

*******************************************************************************

Documentation for avs module lrfclass

INPUT
    image      the input image to be classified.
    cc_img     the cluster center image from the training phase.
    var_img    the cluster variance image from the training phase.
    wt_img     the resulting weight image from the training phase.
    border     the border width in pixels of the input image.

OUTPUT
    class_img  the resulting classified image.

This routine was written with the help of and ideas from Dr. Don Hush, University of New Mexico, Dept. of EECE.

DESCRIPTION

lrfclass classifies an image using the Localized Receptive Field (LRF) classifier. The LRF is based on a single layer of self-organizing, "localized receptive field" units, followed by a single-layer perceptron. The perceptron units use the LMS or Adaline learning rule to adjust the weights. The weights are adjusted or "trained" using the companion program, "llrftrain". After training the weights with "llrftrain", a number of similar images may be quickly classified with this program based on the training data set.

LRF network theory

The basic network model of the LRF consists of a two-layer topology. The first layer of "receptive field" nodes is trained using a clustering algorithm, such as K-means, or some other algorithm that can determine the receptive field centers. Each node in the first layer computes a receptive field response function, which should approach zero as the distance from the center of the receptive field increases. The second layer of the LRF model sums the weighted outputs of the first layer, which produces the output or response of the network. A supervised LMS rule is used to train the weights of the second-layer nodes.

The response function of the LRF network is formulated as follows:

    f(x) = SUM( Ti * Ri(x) )

where

    Ri(x) = Q( ||x - xi|| / Wi )

    x  - a real-valued vector in the input space,
    Ri - the ith receptive field response function,
    Q  - a radially symmetric function with a single maximum at the
         origin, decreasing to zero at large radii,
    xi - the center of the ith receptive field,
    Wi - the width of the ith receptive field,
    Ti - the weight associated with each receptive field.

The receptive field response functions Ri(x) should be formulated so that they decrease rapidly with increasing radius. This ensures that the response functions provide highly localized representations of the input space. The response function used in this algorithm is modeled after the Gaussian, and uses the trace of the covariance matrix to set the widths of the receptive field centers.

The number of receptive field response nodes in the first layer of the LRF is determined by the number of cluster centers in the "cluster center" image. The number of output classes, and hence the number of output nodes in the second (i.e. last) layer of the LRF, is determined by the number of desired classes specified in the "supervised" classification phase of the clustering. This information is contained in the last band of the cluster center image. The number of weights in the network is determined by the number of receptive field response nodes and the number of output nodes. That is,

    #Wts = (#rf_response_nodes * #output_nodes) + #output_nodes

The resulting output image is classified with the desired number of classes specified in the last band of the "cluster center" (-i2) image. The number of desired classes corresponds to the number of output nodes in the last layer of the LRF network. This classified image is of data storage type INTEGER.

Input Arguments

The following arguments must be specified, in the following order, when calling this lib routine:

    image
        is the original input image, which may be a multi-band image
        containing all of the feature bands used in the classification.
        This image MUST be of data storage type FLOAT.

    cc_img
        is the "cluster center" image, which specifies the cluster center
        locations in the feature space. This image MUST contain the
        desired class information, obtained from the supervised
        classification step, as the last band in the image. Therefore
        this image will contain one more band than the input image. This
        image MUST be of data storage type FLOAT.

    var_img
        is the "cluster variance" image, which specifies the variances of
        the data associated with each cluster center. This image should
        contain the same number of data bands as the input image. This
        image MUST be of data storage type FLOAT.

    wt_img
        contains the "weights" for the LRF network after training on the
        input data using "lrftrain". The number of data bands in this
        image is equal to the number of (nodes + 1) in the LRF network.
        The number of columns in each band is equal to the number of
        desired/output classes for the LRF network. This image is stored
        as data storage type FLOAT.

    border
        is an integer that specifies the border width, in pixels,
        encompassing the desired region of the image to be classified.
        This region is ignored during the classification process.

Output Arguments

    class_img
        is the resulting classified image from the LRF classifier. This
        is a single-band image that contains the class assignment for
        each pixel in the original input image. Its dimensions are the
        same as those of the input image. The classification is based on
        the weights obtained from training on a representative image.
        This image is stored as data type INTEGER.

This routine was written with the help of and ideas from Dr. Don Hush, University of New Mexico, Dept. of EECE.

SEE ALSO
    lrfclass(1), intro(3), vipl(3), verror(3), vutils(3), llrftrain(3)

RESTRICTIONS
    All input images MUST be of data storage type FLOAT. The resulting
    classified image (-o) is of data storage type INTEGER.

AUTHOR
    Tom Sauer and Charlie Gage

COPYRIGHT
    Copyright 1991, University of New Mexico. All rights reserved.

*******************************************************************************

Documentation for avs module lrftrain

INPUT
    image      the input image used for training.
    cc_img     the cluster center image specifying the cluster centers.
    var_img    the cluster variance image specifying the variances.
    cn_img     the cluster number image specifying which vector/pixel
               belongs to which cluster.
    converge   the convergence parameter.
    meu        the weight update parameter.
    border     the border width in pixels of the input image.
    max_iter   the maximum number of iterations until termination.
    prt_mse    the iteration interval for printing the MSE to the stats
               file.
    delta_mse  the minimum change in the MSE between iterations, for
               termination.

OUTPUT
    wt_img     the resulting weight image after training.
    printdev   the file containing the training statistics.

This routine was written with the help of and ideas from Dr. Don Hush, University of New Mexico, Dept. of EECE.

DESCRIPTION

llrftrain trains on an image for the weights used with the Localized Receptive Field classifier (see llrfclass). The Localized Receptive Field (LRF) is based on a single layer of self-organizing, "localized receptive field" units, followed by a single-layer perceptron. The perceptron units use the LMS or Adaline learning rule to adjust the weights.

LRF network theory

The basic network model of the LRF consists of a two-layer topology. The first layer of "receptive field" nodes is trained using a clustering algorithm, such as K-means, or some other algorithm that can determine the receptive field centers. Each node in the first layer computes a receptive field response function, which should approach zero as the distance from the center of the receptive field increases. The second layer of the LRF model sums the weighted outputs of the first layer, which produces the output or response of the network. A supervised LMS rule is used to train the weights of the second-layer nodes.

The response function of the LRF network is formulated as follows:

    f(x) = SUM( Ti * Ri(x) )

where

    Ri(x) = Q( ||x - xi|| / Wi )

    x  - a real-valued vector in the input space,
    Ri - the ith receptive field response function,
    Q  - a radially symmetric function with a single maximum at the
         origin, decreasing to zero at large radii,
    xi - the center of the ith receptive field,
    Wi - the width of the ith receptive field,
    Ti - the weight associated with each receptive field.

The receptive field response functions Ri(x) should be formulated so that they decrease rapidly with increasing radius. This ensures that the response functions provide highly localized representations of the input space. The response function used here is modeled after the Gaussian, and uses the trace of the covariance matrix to set the widths of the receptive field centers.

The weights for the output layer are found using the LMS learning rule. The weights are adjusted at each iteration to minimize the total error, which is based on the difference between the network output and the desired result.
Prior to using the LRF algorithm, it is necessary to run "vkmeans" on the input training image to fix the cluster centers, followed by a supervised classification of the clustered image, which assigns a desired class to each cluster center. NOTE that the image resulting from the supervised classification MUST be appended to the "cluster center" image before running the LRF. This is necessary since it makes the appropriate desired class assignments to the cluster centers for the training phase of the LRF.

The number of receptive field response nodes in the first layer of the LRF is determined by the number of cluster centers in the "cluster center" image. The number of output classes, and hence the number of output nodes in the second (i.e. last) layer, is determined by the number of desired classes specified in the "supervised" classification phase of the clustering. This information is contained in the last band of the cluster center image. The number of weights in the network is determined by the number of receptive field response nodes and the number of output nodes. That is,

    #Wts = (#rf_response_nodes * #output_nodes) + #output_nodes

Input Arguments

The following arguments must be specified, in the following order, when calling this lib routine:

    image
        is the original input image, which may be a multi-band image
        containing all of the feature bands used in the classification.
        This image MUST be of data storage type FLOAT.

    cc_img
        is the "cluster center" image, which specifies the cluster center
        locations in the feature space. This image MUST contain the
        desired class information, obtained from the supervised
        classification step, as the last band in the image. Therefore
        this image will contain one more band than the input image. This
        image MUST be of data storage type FLOAT.

    var_img
        is the "cluster variance" image, which specifies the variances of
        the data associated with each cluster center. This image should
        contain the same number of data bands as the input image. This
        image MUST be of data storage type FLOAT.

    cn_img
        is the "cluster number" image, which specifies which vector or
        pixel belongs to which cluster. This is a single-band image of
        the same dimensions as the input image. This image MUST be of
        data storage type INTEGER.

    converge
        is a float value that specifies the convergence value for the
        algorithm. When the current MSE value reaches the specified
        convergence value, the algorithm terminates. This can be any
        float value greater than or equal to zero.

    meu
        is a float value that specifies the weight update parameter for
        the learning algorithm. This value can be adjusted from 0 to 1.
        NOTE: this parameter may have a significant effect on the rate of
        learning, and it may have to be adjusted several times to get a
        feel for the optimum learning rate.

    border
        is an integer that specifies the border width, in pixels,
        encompassing the desired region of the image to be classified.
        This region is ignored during the classification process.

    max_iter
        is an integer that specifies the maximum number of iterations the
        algorithm will run before terminating. It can be any integer
        value greater than zero.

    prt_mse
        is an integer specifying the iteration interval used to print the
        mean squared error (MSE) to the output statistics file. If this
        value is left at zero (the default), only the MSE of the first
        iteration is written to the file. Any other integer will cause
        the value of the MSE to be written to the statistics file at the
        iteration interval specified.

    delta_mse
        is a float value that specifies the minimum change in the MSE
        value from one iteration to the next. This parameter may be used
        to terminate the algorithm when the change in the MSE is zero or
        very small, but the MSE has not yet reached the specified
        convergence value (-cv). This may occur when the learning has
        reached a "plateau" or "bench" and is no longer learning.
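The weight-count formula above, together with the band/column layout of the weight image, can be restated as a small sketch. The helper names are hypothetical; they only encode the #Wts bookkeeping and the (nodes + 1) bands by output-classes columns layout described for wt_img.

```c
/* #Wts = (#rf_response_nodes * #output_nodes) + #output_nodes:
   one weight per receptive field node per class, plus one bias
   weight per class. */
int n_weights(int rf_nodes, int out_nodes)
{
    return rf_nodes * out_nodes + out_nodes;
}

/* The weight image stores these as (rf_nodes + 1) data bands, each
   band having one column per desired/output class. */
int wt_img_bands(int rf_nodes) { return rf_nodes + 1; }
int wt_img_cols(int out_nodes) { return out_nodes; }
```

For example, 5 receptive field nodes and 3 output classes give 18 weights, laid out as 6 bands of 3 columns.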
Output Arguments

    wt_img
        contains the resulting "weights" for the LRF network after
        training on the input data. The number of data bands in this
        image will be equal to the number of (nodes + 1) in the LRF
        network. The number of columns in each band will be equal to the
        number of desired/output classes for the LRF network. This image
        will be stored as data storage type FLOAT.

    printdev
        is a file specified by the printdev argument that contains the
        statistics for the training phase of the LRF.

This routine was written with the help of and ideas from Dr. Don Hush, University of New Mexico, Dept. of EECE.

SEE ALSO
    lrftrain(1), intro(3), vipl(3), verror(3), vutils(3), llrfclass(3)

RESTRICTIONS
    All input images except the "cluster number" image (cn_img) MUST be
    of data storage type FLOAT. The "cluster number" image (cn_img) MUST
    be of data storage type INTEGER. The output "weight" image (wt_img)
    is of data storage type FLOAT.

AUTHOR
    Tom Sauer and Charlie Gage

COPYRIGHT
    Copyright 1991, University of New Mexico. All rights reserved.

*******************************************************************************

Documentation for avs module viso2

INPUT
    *img                input image
    *cc_img             input cluster center image
    min_pts_allowed     minimum number of vectors per cluster
    max_cl_allowed      maximum number of clusters
    n_clusters          initial number of clusters
    max_n_iters_iso     maximum number of iso data iterations
    max_n_iters_kmeans  maximum number of kmeans iterations
    border              image border
    cc_flg              initial cluster locations flag
    *init_cluster[]     initial cluster center positions
    split_factor        splitting factor
    merge_factor        merging factor
    placement           splitting placement factor
    split_converge      splitting convergence
    merge_converge      merging convergence
    *printdev2          output ascii file descriptor

OUTPUT
    *img                output image
    **outcc_img         cluster center image
    **outvar_img        variance image

DESCRIPTION

viso2 converts an input image into vectors of equal size and performs the iso2 clustering algorithm on the vectors using the K initial cluster centers. The size of each vector is determined by the number of bands in the input image.

viso2 is a clustering algorithm derived from the ISODATA clustering algorithm. viso2 initially clusters the data using a standard K-means clustering algorithm. Then, any cluster, along with its associated vectors, is discarded if the membership of that cluster is below a user-specified threshold. Next, clusters exhibiting large variances are split in two, and clusters that are too close together are merged. After these operations, K-means is reiterated, and the iso2 rules are tested again. This sequence is repeated until no clusters are discarded, split, or merged, or until the maximum number of iterations has been met.

The merging and splitting rules used in iso2 are similar to those used in ISODATA, except that several modifications have been made which simplify parameter selection for the user. The most significant change is that the splitting and merging rules are now relative to the statistics of the data rather than being dependent on absolute limits, and these relative limits can be made to converge so that splitting and merging are reduced as the algorithm progresses. Secondly, the splitting rules are no longer dependent on a user-specified "desired number of clusters" nor on the iteration index. Defining the splitting and merging requirements so that they are relative to the data set removes the need for the user to know the range of the data, and makes characterization of the algorithm for the general case possible. The methods of splitting and merging are described below.

Splitting: relative standard deviation approach

The standard deviation in each dimension of each cluster is computed, and the average of these standard deviations is calculated. The maximum standard deviation for each cluster is then compared against the average standard deviation times a user-specified scaling factor (splitting factor). If the maximum standard deviation is greater than that value, the cluster is split in two in the dimension of maximum standard deviation. Two other splitting conditions also exist: (1) the number of points in the cluster must satisfy a minimum condition, and (2) the average distance between members of a cluster and their cluster center must be less than the average of this distance over all clusters. After all eligible clusters have been split, the splitting factor is updated. A cluster can only be split once per iteration.

Merging: relative distance approach

The merging process is similar to the splitting process described above. The pairwise distances between cluster centers are computed, as is done in the original ISODATA algorithm, and, in addition, the average pairwise distance is computed. If any distance between two cluster centers is less than the average pairwise distance multiplied by a scaling factor (merging factor), the two clusters are merged. The user has the option to specify the maximum number of cluster pairs that can be merged. Cluster pairs with the smallest pairwise distances are merged first, and a cluster can only be merged once per iteration. After all eligible clusters are merged, the merging factor is updated.

If cc_flg is set to TRUE, viso2 will use the *cc_img image for the initial cluster center values; otherwise it will use the *init_cluster[] array of x-y image coordinates. viso2 takes the list of x-y coordinates, maps them to those locations in the image, and uses the pixel values there as the initial cluster centers.

viso2 will output a cluster center image, which is an image of 1 row by N columns, where N refers to the number of clusters. The variance image is a 1 row by N column image that specifies the variance for each cluster center. A cluster number image is also computed. This image gives, for each location, the cluster number to which the vector (pixel) at that location in the original image now belongs.
A statistics file is also generated and output to the printdev device.

Some Conventions:

viso2 can be run in two modes: one which produces all valid centers and numbers, and one that produces a cluster number without a corresponding center. If minimum points (min_pts_allowed) is set such that some vectors (pixels) are discarded, then these vectors (pixels) are not associated with any cluster. These vectors (pixels) will be assigned to cluster 0, and the cluster center and variance values will be set to zero. This means that viso2 will produce N+1 clusters. This is considered the incompatible mode. This mode will still produce valid input for other algorithms, but will give strange results. To avoid this, use min_pts_allowed = 1. If no vectors (pixels) are discarded, then cluster number 0 will be a valid cluster and contain a non-zero center and variance. In this case there will be N resulting clusters.

Even though this man page contains much of the information that the manual page for viso2 contains, it is highly recommended that one read the viso2 manual page. It further explains parameters such as the merge and split convergence.

All input images must be of data storage type FLOAT.

    struct xvimage *img
        Pointer to the input image to be clustered. Must be data storage
        type FLOAT.

    struct xvimage **outcc_img
        Address of the image that will contain the resulting cluster
        center image.

    struct xvimage **outvar_img
        Address of the image that will contain the resulting cluster
        variance image.

    int min_pts_allowed
        The minimum number of vectors per cluster allowed.

    int max_cl_allowed
        The maximum number of clusters desired.

    int n_clusters
        The initial number of cluster centers to start with.

    int max_n_iters_iso
        Maximum number of isodata iterations.

    int max_n_iters_kmeans
        Maximum number of k-means iterations per isodata iteration.

    int border
        Specifies the border size, in pixels, of the region in the image
        that will be ignored during clustering. The border size is equal
        on all sides.

    int cc_flg
        This flag is set if the initial cluster centers are to be
        specified by *cc_img.

    struct xvimage *cc_img
        Pointer to the image that contains the initial input cluster
        centers. This image must have the same number of data bands as
        *img. cc_flg must be set for this to be used.

    int *init_cluster[]
        A 2-d array containing the initial cluster center positions in
        the input image. The array is organized as a list of x-y pairs
        that correspond to the positions of the initial cluster centers
        in the input image. This is used only if cc_flg is set to FALSE.

    float split_factor
        Specifies the splitting factor.

    float merge_factor
        Specifies the merge factor.

    float placement
        Specifies the distance between cluster centers when they are
        split.

    float split_converge
        Specifies the splitting convergence.

    float merge_converge
        Specifies the merge convergence.

    FILE *printdev2
        Specifies the output device for the statistics information.

SEE ALSO
    viso2(1), intro(3), vipl(3), verror(3), vutils(3)

RESTRICTIONS
    All input images must be of data storage type FLOAT.

AUTHOR
    Tom Sauer, Donna Koechner

COPYRIGHT
    Copyright 1991, University of New Mexico. All rights reserved.

*******************************************************************************

Documentation for avs module vkmeans

INPUT
    img           input image structure.
    img1_flag     output cluster centers in the "img1" image.
    sqd_img_flag  output cluster variance in the "sqd_img" image.
    iter          maximum number of iterations.
    cluster       number of desired clusters.
    border        specifies number of border pixels.
    init_cluster  array of initial cluster centers.
    cc_flg        indicator to use cluster center image as input.
    cc_img        cluster center image structure.
    map_flag      if true, output cluster center values in the map of the
                  cluster number image; if false, do not.

OUTPUT
    img1          image structure for cluster center image.

DESCRIPTION

vkmeans converts an input image into vectors of equal size and performs the k-means clustering algorithm on the vectors using the K initial cluster centers. The size of each vector is determined by the number of bands in the input image.

The K-means algorithm is based on the minimization of the sum of the squared distances from all points in a cluster to a cluster center. The user chooses K initial cluster centers, and the image vectors are iteratively distributed among the K cluster domains. New cluster centers are computed from these results, such that the sum of the squared distances from all points in a cluster to the new cluster center is minimized.

Although the k-means algorithm does not really converge, a practical upper limit is chosen for convergence. The user has the option of specifying the maximum number of iterations using the iter variable. Results obtained by the k-means algorithm can be influenced by the number and choice of initial cluster centers and the geometrical properties of the data.

All input images must be of data storage type FLOAT.

User variables for the k-means algorithm are passed in as:

    cc_flg
        If cc_flg is set, a cluster center image, passed in as cc_img, is
        used as input for the K initial cluster centers. This image MUST
        contain the same number of data bands as the input image (img).

    cc_img
        an image of type struct xvimage that contains the first K cluster
        center values. This image must have the same number of data bands
        as img, and must agree with cluster.

    sqd_img
        an image of type struct xvimage that will contain the variance of
        the cluster centers. This image must have the same number of data
        bands as img, and must agree with cluster.

    sqd_img_flag
        If this flag is set to TRUE, the "sqd_img" image will contain the
        variance of the cluster centers. If set to FALSE, this
        information is not output.

    img
        a multiband image of type struct xvimage that contains the data
        to be clustered.

    img1
        The cluster centers for the output image are returned in the
        struct xvimage structure img1.

    img1_flag
        If this flag is set to TRUE, the "img1" image will contain the
        cluster centers. If set to FALSE, this information is not output.

    map_flag
        If "map_flag" is set to TRUE, the cluster centers are output in
        the map of the cluster number image "img". The number of rows in
        the map is equal to the number of desired clusters, while the
        number of columns is equal to the dimensionality of the data.
        Also, the ispare1 field in the image header is set to 0 and the
        ispare2 field is set to the dimension of the vector. By setting
        "map_flag" to TRUE, the output of vkmeans is compatible with the
        spectrum classification application.

    iter
        the maximum number of iterations that the algorithm will perform
        to achieve the clustering.

    cluster
        the number of desired clusters; MUST match the number of clusters
        in the cluster center image (cc_img) or in the array of initial
        cluster centers (init_cluster[]).

    border
        specifies the number of border pixels encompassing the desired
        region of the image to be clustered using the k-means algorithm.
        Class zero is reserved for the border pixels, while the remainder
        of the image uses classes one through K. The number of classes
        corresponds to the number of cluster centers specified in the
        variable cluster.

    *init_cluster[]
        The array of initial cluster centers (init_cluster[][]) must be
        passed in as x-y pairs, which specify the coordinates of the
        initial cluster centers for the k-means algorithm.

    printdev
        The cluster centers are written to a file specified by the
        'printdev' argument.

This routine was written with the help of and ideas from Dr. Don Hush, University of New Mexico, Dept. of EECE.

SEE ALSO
    vkmeans(1), intro(3), vipl(3), verror(3), vutils(3)

RESTRICTIONS
    All input images must be of data storage type FLOAT.

AUTHOR
    Tom Sauer, Charlie Gage

COPYRIGHT
    Copyright 1991, University of New Mexico. All rights reserved.
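One assignment-and-update pass of the k-means scheme described above can be sketched as follows. This is a minimal illustration of the algorithm, not the lvkmeans implementation; all names are hypothetical. Vectors and centers are stored contiguously, dim values each.

```c
#include <math.h>

/* Squared Euclidean distance between a vector and a center. */
static double sq_dist(const double *a, const double *b, int dim)
{
    double d2 = 0.0;
    for (int k = 0; k < dim; k++) {
        double d = a[k] - b[k];
        d2 += d * d;
    }
    return d2;
}

/* One k-means pass over n vectors with K centers:
   assignment step, then center update. */
void kmeans_pass(const double *x, int n, int dim,
                 double *centers, int K, int *assign)
{
    /* Assignment: each vector goes to its nearest center. */
    for (int i = 0; i < n; i++) {
        int best = 0;
        double bd = sq_dist(x + i * dim, centers, dim);
        for (int c = 1; c < K; c++) {
            double d = sq_dist(x + i * dim, centers + c * dim, dim);
            if (d < bd) { bd = d; best = c; }
        }
        assign[i] = best;
    }
    /* Update: each center moves to the mean of its members, which
       minimizes the within-cluster sum of squared distances. */
    for (int c = 0; c < K; c++) {
        int cnt = 0;
        for (int k = 0; k < dim; k++) centers[c * dim + k] = 0.0;
        for (int i = 0; i < n; i++)
            if (assign[i] == c) {
                cnt++;
                for (int k = 0; k < dim; k++)
                    centers[c * dim + k] += x[i * dim + k];
            }
        if (cnt)
            for (int k = 0; k < dim; k++) centers[c * dim + k] /= cnt;
    }
}
```

Repeating this pass up to iter times gives the practical "convergence" the description refers to.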
******************************************************************************* Documentation for avs module label INPUT OUTPUT DESCRIPTION vlabel performs a labeling in a multiband image or a clus- ter image by attempting to merge connected pixels. The principal of the algorithm is as follows: A pixel receives the same label as its neighbor if the likelihood distance between the two pixels is acceptable. The label process is propagated for a given region number until it is no longer possible to find a candidate. Three different types of labeling choices exist: 1 First choice: typ = 0 uses a single or multi band image, where the data storage must be VFF_TYP_FLOAT. The input image corresponds to the xvimage structure ima. The distance will be computed using all the bands of the image. 2. Second choice: typ = 1 uses a cluster number and clus- ter center image obtained from an algorithm like vkmeans or vquant. For this case, the cluster center represents the value of a class of pixels that have been grouped together. Therefore, the distance between two neighbors will be the distance between their clus- ters. This case will require less computation time because the algorithm will only compute the inter-class dis- tance, instead of computing the distance of two neigh- bors for the entire image. An additional advantage with this choice, is that the output from algorithms such as vkmeans or vquant may be utilized, which may lead to better results. The cluster center image corresponds to the xvimage structure cc and must be FLOAT. The cluster number image corresponds to the xvimage structure cn and must be VFF_TYP_4_BYTE compatible with either cc or ima depending on typ. 3. Third choice: typ = 2 is to use a single or multi- band input image associated with a cluster number image. 
The advantage of this choice is that the results of a clustering algorithm are used to keep the neighbor pix- els that have the same cluster number in the same class, and to rely on the distance in the single or multi band image to group two neighbors that do not belong to the same cluster. The algorithmi also requires the following parameters: Metric distance: There are 2 different metric distances that can be used. dist = 1 uses Euclidean distance: sqrt[(x-s)^2 + (y- t)^2]. dist = 2 uses City Block distance: |x-s| + |y-t|. Connectivity: There are two possible neighborhoods: connect = 1 uses 4 connectivity to link the pixels together. connect = 2 uses 8 connectivity. Minimum size of a region: surf: determines the number of pixels required for a region to be retained. The minimum number of pixels is equal to: Total number of pixels in image * surf / 100.0 (surf corresponds to a percentage of the total number of pix- els in the image). Border Size: Each pixel in the image is updated except those outside of the border. The size of the border is specified by the parameter bord. Merging Process: When the labeling process is computed, the user can expect that the small rejected region will be merged together in a bigger acceptable region or will be included inside another connected region. This choice is used by passing the parameter merge set to TRUE. If merge is set to FALSE, the small regions will be ignored and labeled as an UNDEFINED REGION (label number 0), the same as the border. The AUTOMATIC or MANUAL OPTION: This option allows the user either to fix a threshold, or to give an approximate number of regions. The user can choose these options by passing a value TRUE or FALSE in the param- eter opt_auto. If the AUTOMATIC option is used, the algorithm will iterate on the threshold until the number of regions labeled by the process is comparable to the number of expected regions. 
In fact, if the expected number is not reached after 30 iterations, the threshold that gives the closest number of regions is used for the final labeling. This option is easy to use, nevertheless, the function: number of regions = F(Threshold) is not a monotoni- cally increasing function, and the convergence toward a solution may not exist. reg_num (only used when opt_auto equals 1 or TRUE) determines the approximate final number of regions expected. split (only used when opt_auto equals 0 or FALSE) determines the threshold used by the labeling process. Decreasing this value will increase the number of regions found during the labeling process, but these regions will get smaller and could be rejected by the minimum size threshold. Increasing this value will decrease the number of regions found during the labeling process. At the same time the number of small regions will decrease which means that this area of the curve number of region = F(Threshold) is more stable than the other one. Once the user becomes accustomed to this routine, good results are generally obtained. One way to become fam- iliar with the routine, is to use the automatic option and analyze the output ASCII file (Statistics on the iteration process). This file contains the number of regions labeled for each iteration, allowing the user to see how the number of regions changes as the THRES- HOLD is changed. printdev File descriptor of the ascii file that will contain all the information relative to the labeling process. Output File lab will hold the labeled image in which every pixel has as its region number a value. This image is of VFF_TYP_4_BYTE data storage type. The region label numbers are 1 to N. The region number 0 is reserved as an UNDEFINED label or for the border. SEE ALSO vlabel(1), intro(3), vipl(3), verror(3), vutils(3) lvkmeans(3), lviso2(3), lvquant(3), vkmeans(1), viso2(1), vquant(1). RESTRICTIONS ima must be of VFF_TYP_FLOAT data storage type. 
cc must be of VFF_TYP_FLOAT data storage type and compatible with cn. cn must be of VFF_TYP_4_BYTE data storage type and compatible with cc or ima.

AUTHOR Pascal ADAM

COPYRIGHT Copyright 1991, University of New Mexico. All rights reserved.

*******************************************************************************

Documentation for avs module vmindis

INPUT
img the input image that is to be classified.
center the image containing the center values (or prototype vectors) with the last data band being the class image.
border the border width on the image. The border corresponds to a no-class assignment.

OUTPUT
img a single band classified image.

DESCRIPTION vmindis is a simple minimum distance classifier. The distance metric used is the Euclidean distance. This routine accepts two images as input. The image that corresponds to the variable img is the image that needs to be classified. This image can have as many data bands as necessary. The number of data bands gives the dimensionality of the vectors: each pixel in the image is a vector whose dimensionality is the number of data bands. The other input image, which corresponds to the variable center, is the prototype image. The last data band of this image is the class data band. This image must contain the same number of data bands as the other input image, plus an extra data band that represents the class mapping. This image would most likely have been created by vkmeans or some other routine that produces cluster centers. This image contains vectors that correspond to the prototype of each class. As stated above, the center image's last data band is a class data band. The class data band simply maps each vector in the center image to the final class. The border variable allows the user to specify a border width, in pixels, encompassing the image. The border region is skipped by vmindis when classification is performed.
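The minimum distance rule described above can be sketched as follows. This is a plain-Python illustration, not the khoros C routine; here pixel stands for one pixel vector of img, and prototypes/classes stand for the vector bands and class band of the center image:

```python
def classify_pixel(pixel, prototypes, classes):
    """Assign the pixel vector to the class of the nearest
    prototype vector, using squared Euclidean distance."""
    best, best_d = None, float("inf")
    for proto, cls in zip(prototypes, classes):
        d = sum((p - q) ** 2 for p, q in zip(pixel, proto))
        if d < best_d:
            best, best_d = cls, d
    return best
```

Since only the ordering of distances matters, the square root can be omitted, which is a common optimization for minimum distance classifiers.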
Skipping the border is useful if neighborhood operators have been used previously and have not updated edge pixels. All input images must be of data storage type FLOAT.

img - struct xvimage the input image that is to be classified.
center - struct xvimage the image containing the center values (or prototype vectors) with the last data band being the class image.
border - int the border width on the image. The border corresponds to a no-class assignment.

vmindis will return a 1 upon success and a 0 upon failure.

SEE ALSO vmindis(1), intro(3), vipl(3), verror(3), vutils(3)

RESTRICTIONS All input images must be of data storage type FLOAT.

AUTHOR Tom Sauer

COPYRIGHT Copyright 1991, University of New Mexico. All rights reserved.

*******************************************************************************

Documentation for avs module vqerr

INPUT
img1 quantized VIFF image structure
img2 original VIFF image structure
img3 gating mask image
mflg a flag set (equal to 1) if a gating mask is available
rms pointer to a DOUBLE to receive the RMS error

OUTPUT
rms is modified to contain the RMS quantization error

Return Value: 1 on success, 0 on failure.

DESCRIPTION vqerr computes the RMS quantization error between a quantized image and the original image. The first input image is the quantized image and should have maps attached. The data type of the first image should be BYTE or INT. The map should be of the same data type as the data in the second image (the original). The original image should have no maps attached. The mask image must be the same data storage type and size as the first input image. The number of columns in the map of the first input image should match the number of bands in the second (original) input image. The output is in text format, with the RMS quantization error printed using the %e format.

SEE ALSO vqerr(1), intro(3), vipl(3), verror(3), vutils(3)

RESTRICTIONS vqerr will not operate on VFF_TYP_BIT, VFF_TYP_COMPLEX, or VFF_TYP_DOUBLE data storage types.
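The RMS quantization error computed by vqerr is the root of the mean squared difference between the reconstructed (de-mapped) pixel vectors and the originals. A minimal sketch, assuming the quantized image has already been passed through its map so that both inputs are plain lists of pixel vectors (the de-mapping step is omitted here):

```python
import math

def rms_error(quantized, original):
    """Root-mean-square error between corresponding pixel vectors."""
    n = 0
    total = 0.0
    for q, o in zip(quantized, original):
        for qc, oc in zip(q, o):
            total += (qc - oc) ** 2
            n += 1
    return math.sqrt(total / n)
```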
COPYRIGHT Copyright 1991, University of New Mexico. All rights reserved.

*******************************************************************************

Documentation for avs module vquant

INPUT
i pointer to vector image to be quantized.
nvecs number of vectors to generate
mapflg output image map enable
ccimage cluster center output image structure
cvimage cluster variance output image structure
sp split point method: 0 is mid-span, 1 is mean
axis split axis: 1 is max-span axis, 2 is max variance axis, 3 is principal eigenvector

OUTPUT
i the image pointed to by i on input is modified to contain the single-band classification pixel plane, and the map is modified so that it contains the representative vectors.

DESCRIPTION vquant performs N-dimensional vector quantization, also known as non-parametric vector classification. The technique is based on Paul Heckbert's median cut, with the modification that the histograms may have many independent axes and are stored in a sparse data structure. Additionally, the splitting of a cluster is governed by the 2-norm of the subspace spanned by that cluster. When splitting a cluster, the split axis is normally chosen to be the axis with the largest span across the enclosed data points, and the split point is the middle of the span. By setting the variance split flag, the split axis is chosen to be the axis with the largest variance of data across that axis, which should give better results at the cost of more computation. When the variance split flag is selected, the split point is at the mean of the distribution of data along the split axis.

Inputs:
i pointer to vector image to be quantized.
nvecs number of vectors to generate
mapflg output image map enable
ccimage cluster center output image structure
cvimage cluster variance output image structure
sp split point method: 0 is mid-span, 1 is mean
axis split axis: 1 is max-span axis, 2 is max variance axis, 3 is principal eigenvector

Outputs:
i the image pointed to by i on input is modified to contain the single-band classification pixel plane, and the map is modified so that it contains the representative vectors.

SEE ALSO vquant(1), intro(3), vipl(3), verror(3), vutils(3)

RESTRICTIONS vquant will currently only work on images of type VFF_TYP_FLOAT. Additionally, the output is currently restricted to type VFF_TYP_1_BYTE, and this in turn restricts the maximum number of representative vectors to 256. Additionally, vquant suffers from the same statistical influences that vgamut does; see vgamut(1) for more details.

AUTHOR Scott Wilson

COPYRIGHT Copyright 1991, University of New Mexico. All rights reserved.

*******************************************************************************

Documentation for avs module vwmdd

INPUT
img the input image that is to be classified.
center the image containing the center values (or prototype vectors) with the last data band being the class image. This image must have n + 1 data bands, where n is the number of data bands in img.
varimg the image containing the variance values for the clusters. Should have the same number of data bands as the input image (img).
border the border width on the image. The border corresponds to a no-class assignment; i.e. the border region is ignored during the classification process.
k_factor scalar fudge factor.
use_avg 1 - use summing method; 0 - use non-summing method

OUTPUT
img a single band classified image.

This routine was written with the help of and ideas from Dr. Don Hush, University of New Mexico, Dept. of EECE.

DESCRIPTION vwmdd is a weighted minimum distance detector, or classifier.
The main idea behind this detector is that it will distinguish a single class from the rest of the data. However, it will also work to distinguish multiple classes from each other.

Theory

To classify or detect spatial patterns in an image, we must have a template to compare against. Creating a template can be thought of as training on representative (ideal) data. This can be done by using a clustering algorithm on the representative data. Clustering routines such as vkmeans, vquant, isodata, etc. can be used to create a template. The idea here is to over-cluster the representative data and quantize that into k classes. Pseudo color in xvdisplay and vcustom can be used to create a class image. When the clustering is performed, two items need to be computed: (1) the cluster center values, and (2) the variances for each axis in the cluster. The variances can be obtained by looking at the diagonal of the covariance matrix.

Explained below is a method of creating the class assignments for each cluster center. The class assignments may, for example, quantize the space from 12 clusters to 2 classes.

Example:

    Cluster Center            Class
     1 ---------------------    2
     2 ---------------------    2
     3 ---------------------    1
     4 ---------------------    2
     .
     .
     .
    12 ---------------------    1

This routine expects an image that contains the diagonal terms of the covariance matrix, called the variance image, and an image that contains the cluster center values and the class assignment for each cluster center. And, of course, the image to be classified. This detector approximates the following equation:

    (X - Mi)~ inv(C) (X - Mi)                                (1)

where Mi is the mean, inv(C) is the inverse of the covariance matrix, X is the data point, and ~ denotes transpose. Keeping only the diagonal of inv(C), this can be written as

                     -                   -
                     | 1/Q1              |
                     |      1/Q2     0   |
    (d1 d2 ... dn)   |           .       |  (d1 d2 ... dn)~  (2)
                     |   0          .    |
                     |              1/Qn |
                     -                   -

which equals

    sq(d1)   sq(d2)         sq(dn)
    ------ + ------ + ... + ------                           (3)
      Q1       Q2             Qn

Notation:
    sq => square
    Qi => the ith variance
    di => the ith (X - Mi) value

Since the inverse of the covariance matrix cannot be easily determined, we only consider the diagonal terms, or simply the variance terms. There are two methods for detecting classes in the input image. Each data point is put through a test based on the ideas above. The detector works in two modes as follows:

(1) The summed method (-a option set to yes). This is an approximation to method (2).

         sq( || X - Mi || )
    Vi = ------------------   compared against 1
             K x sq(Si)

    where:
    sq( || X - Mi || ) = the squared Euclidean distance
    sq(Si) = trace(C); the trace of C is the sum of the diagonal terms (variances)
    K is a constant

(2) The non-summed method (default).

         sq(d1)   sq(d2)         sq(dn)
    Vi = ------ + ------ + ... + ------   compared against 1
         Q1 x K   Q2 x K         Qn x K

    where:
    sq(dj) = sq( || xj - Mi || ) = the squared Euclidean distance of the jth element of the vector X
    Qj = the jth variance element of the variance vector for the ith cluster center
    K = a constant

In both cases the constant K is used to adjust the variance value(s). If K is large, the scaled variances K x Qi can grow such that Vi is always less than 1. There are no specific rules as to the optimal value for K. The data point X is put through the test for each cluster, and X is assigned to a class based on the following criteria:

1. Choose the smallest Vi value, where Vi is the result for cluster i.

2. If that Vi is greater than 1, assign X to the NULL class (the NULL class is always 0). Otherwise assign X to the class that corresponds to the ith cluster.

The scale factor (k_factor) must be chosen by trial and error. It has been found that if the summing method is being used, a small value for k (1.0 - 3.0) is sufficient. If the summing method is not being used, a larger value for k (8.0 - 10.0) is sufficient. Note, this may change based on the data, so these ranges are only a starting point.
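The two tests and the assignment criteria described above can be sketched as follows. This is a plain-Python illustration of the detector equations, not the khoros routine; the function names are ours, and centers, variances, and classes stand in for the bands of the center and variance images:

```python
def vwmdd_score(x, center, variances, k, use_avg):
    """Compute Vi for one cluster; Vi <= 1 means x passes the test."""
    d2 = [(xi - mi) ** 2 for xi, mi in zip(x, center)]
    if use_avg:
        # summed method: sq(||X - Mi||) / (K x trace(C))
        return sum(d2) / (k * sum(variances))
    # non-summed method: sum over j of sq(dj) / (Qj x K)
    return sum(dj / (qj * k) for dj, qj in zip(d2, variances))

def vwmdd_classify(x, centers, variances, classes, k, use_avg):
    """Smallest Vi wins; Vi > 1 maps to the NULL class 0."""
    scores = [vwmdd_score(x, c, v, k, use_avg)
              for c, v in zip(centers, variances)]
    i = min(range(len(scores)), key=scores.__getitem__)
    return classes[i] if scores[i] <= 1.0 else 0
```

Note how k enters the denominator in both modes: a larger k shrinks every Vi, making points farther from a center still pass the test.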
All input images must be of data storage type FLOAT.

img - struct xvimage the input image that is to be classified. This is also used as the output classified image, so one must be careful not to overwrite important data.
center - struct xvimage the image containing the center values (or prototype vectors) with the last data band being the class image. This image must contain n + 1 data bands, where n = # of data bands in the img image, and 1 data band that specifies the class assignments.
varimg - struct xvimage the image containing the variance values. This image must have the same row and column size as the center image, and have the same number of data bands as the input image (img).
border - int the border width on the image. The border corresponds to a no-class assignment. The border area is ignored during the classification.
k_factor - float the fudge factor, explained in detail above.
use_avg - int if use_avg = 1, use the summing method; if use_avg = 0, use the non-summing method, as explained above.

vwmdd will return a 1 upon success and a 0 upon failure.

This routine was written with the help of and ideas from Dr. Don Hush, University of New Mexico, Dept. of EECE.

SEE ALSO vwmdd(1), intro(3), vipl(3), verror(3), vutils(3), lvkmeans(3)

RESTRICTIONS All input images must be of data storage type FLOAT.

AUTHOR Tom Sauer

COPYRIGHT Copyright 1991, University of New Mexico. All rights reserved.

*******************************************************************************