Journal Paper Abstracts
Sudeep Sarkar
- S. Borra and S. Sarkar, ``A Framework for Performance Characterization
of Intermediate Level Grouping Modules,'' IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 19, no. 11,
pp. 1306--1312, Nov. 1997.
We present five performance measures to evaluate grouping
modules in the context of constrained search and indexing based object
recognition. Using these measures, we demonstrate a sound
experimental framework, based on statistical ANOVA tests, to compare
and contrast three edge based organization modules, namely those of
Etemadi et al., Jacobs, and Sarkar-Boyer in the domain of aerial
objects using 50 images. With adapted parameters, the Jacobs module
performs overall the best for constraint based recognition. For fixed
parameters, the Sarkar-Boyer module is the best in terms of
recognition accuracy and indexing speedup. Etemadi et al.'s module
performs equally well with fixed and adapted parameters while the
Jacobs module is most sensitive to fixed and adapted parameter
choices. The overall performance ranking of the modules is Jacobs,
Sarkar-Boyer, and Etemadi et al.
- M. Heath, S. Sarkar, T. Sanocki, and K.W. Bowyer,
``A Robust Visual Method for Assessing the Relative Performance of
Edge Detection Algorithms,'' IEEE Transactions on Pattern Analysis
and Machine Intelligence, vol. 19, no. 12, pp. 1338--1359, Dec. 1997.
A new method for evaluating edge detection algorithms is presented and
applied to measure the relative performance of algorithms by Canny,
Nalwa-Binford, Iverson-Zucker, Bergholm and Rothwell. The basic
measure of performance is a visual rating score which indicates the
perceived quality of the edges for identifying an object. The process
of evaluating edge detection algorithms with this performance measure
requires the collection of a set of grey-scale images, optimizing the
input parameters for each algorithm, conducting visual evaluation
experiments and applying statistical analysis methods. The novel
aspect of this work is the use of a visual task and real images of
complex scenes in evaluating edge detectors. The method is appealing
because, by definition, the results agree with visual evaluations of
the edge images.
- L. Tsap, D. B. Goldgof, and S. Sarkar, ``Efficient Nonlinear
Finite Element Modeling of Nonrigid Objects via Optimization of Mesh
Models,'' to appear in the special issue of Computer Vision
and Image Understanding on CAD-Based Computer Vision, accepted 1997.
In this paper we propose a new general framework for the application
of the Nonlinear Finite Element Method (FEM) to nonrigid motion
analysis. We construct the models by integrating image data and prior
knowledge, using well-established techniques from computer vision,
structural mechanics and computer-aided design (CAD). These techniques
guide the process of optimization of mesh models. Linear FEM proved to
be a successful physically-based modeling tool in solving limited
types of nonrigid motion problems. However, linear FEM cannot handle
nonlinear materials or large deformations. Application of nonlinear
FEM to nonrigid motion analysis has been restricted by difficulties
with high computational complexity and noise sensitivity.
We tackle the problems associated with nonlinear FEM by changing the
parametric description of the object to allow easy automatic control
of the model, using physically motivated analysis of the possible
displacements to address the worst effects of the noise, applying mesh
control strategies and utilizing multiscale methods. The combination
of these methods represents a new systematic approach to a class of
nonrigid motion applications for which sufficiently precise and
flexible FEM models can be built.
The results from the skin elasticity experiments demonstrate the
success of the proposed method. The model allows us to objectively
detect the differences in elasticity between normal and abnormal skin.
Our work demonstrates the possibility of accurate computation of point
correspondences and force recovery from range image sequences
containing nonrigid objects and large motion.
- S. Sarkar and Kim L. Boyer, ``Quantitative Measures of Change
based on Feature Organization: Eigenvalues and Eigenvectors,'' to
appear in Computer Vision and Image Understanding, accepted 1997.
One important task of site monitoring is change detection from aerial
images. Change, in general, can be of various types. In this paper we
address the problem of developmental change at a site. For instance,
we would like to know about new construction at a previously
undeveloped site and possibly monitor its progress. Model based
approaches are not suited for this kind of change as it usually
happens in unmodelled areas. Since it is difficult to infer
construction activity by predicting and verifying specific local
features, we rely on more global statistical indicators.
The thesis of this paper is that the change induced by human activity
can be inferred from changes in the organization among the visual
features. Not only will the attributes of the individual image
features change but also the relationships among these features will
evolve. With the progress of construction we expect to see increased
structure among the image features. We exploit this emerging
structure, or organization, to infer change. In this paper, we propose
four measures to quantify the global statistical properties of the
individual features and the relationships among them. We base these
measures on the theory of graph spectra. We provide extensive analysis
of the robustness of these measures under various imaging conditions
and demonstrate the ability of these organization based measures to
detect coarsely incremental developmental changes.
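As a rough illustration of how a graph-spectral quantity can respond to increasing structure among features, the sketch below computes the leading adjacency eigenvalue of a small relational graph by power iteration. It is only an assumed stand-in: the abstract does not define the four proposed measures, and the function name `leading_eigenvalue` and the toy graphs are hypothetical.

```python
import math


def leading_eigenvalue(adj, iterations=200):
    """Estimate the largest eigenvalue of a symmetric 0/1 adjacency matrix.

    Power iteration is run on (A + I) so that bipartite graphs do not
    oscillate; the Rayleigh quotient with A is returned at the end.
    """
    n = len(adj)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        # One multiplication by (A + I), followed by normalization.
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    av = [sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * av[i] for i in range(n))


# Four features related only in a chain (little organization).
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# Four features that are all mutually related (more organization).
complete_graph = [
    [0, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
]

sparse_value = leading_eigenvalue(path_graph)
dense_value = leading_eigenvalue(complete_graph)
```

In this toy setting the densely related graph has the larger leading eigenvalue, which is the kind of monotone response to emerging structure that a spectral change measure could exploit.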
- M. Heath, S. Sarkar, T. Sanocki, and K.W. Bowyer,
``Edge Detector Comparison: Initial Study and Methodology,'' to appear
in Computer Vision and Image Understanding, accepted 1997.
Because of the difficulty of obtaining ground truth for real images,
the traditional technique for comparing low-level vision algorithms is
to present image results, side by side, and to let the reader
subjectively judge the quality. This is not a scientifically
satisfactory strategy. However, human rating experiments can be done
in a more rigorous manner, to provide useful quantitative conclusions.
We present a paradigm based on experimental psychology and statistics,
in which humans rate the output of low level vision algorithms. We
demonstrate the proposed experimental strategy by comparing four well
known edge detectors: Canny, Nalwa-Binford, Sarkar-Boyer, and
Sobel. We answer the following questions: Is there a statistically
significant difference in edge detector outputs as perceived by humans
when considering an object recognition task? Do the edge detection
results of an operator vary significantly with the choice of its
parameters? For each detector, is it possible to choose a single set
of optimal parameters for all the images without significantly
affecting the edge output quality? Does an edge detector produce
edges of the same quality for all images, or does the edge quality
vary with the image?
- S. Sarkar and K. L. Boyer, ``Perceptual
Organization in Computer Vision: A Review and a Proposal for a
Classificatory Structure,'' IEEE Transactions on Systems,
Man, and Cybernetics, vol. 23, no. 2, pp. 382--399, Mar. 1993.
The evolution of perceptual organization in biological
vision, and its necessity in advanced computer vision systems, arises
from the characteristic that perception, the extraction of meaning
from sensory input, is an intelligent process. This is particularly
so for high order organisms and, analogically, for more sophisticated
computational models. In this paper we explore the role of perceptual
organization in computer vision systems. We do this from four vantage
points. First, we offer a brief history of perceptual organization
research in both humans and computer vision. Next, we propose a
classificatory structure in which to cast perceptual organization
research to clarify both the nomenclature and the relationships among
the many contributions. Thirdly, we review the perceptual
organization work in computer vision in the context of this
classificatory structure. Finally, we survey the array of
computational techniques applied to perceptual organization problems
in computer vision.
- S. Sarkar and K. L. Boyer, ``A Computational Structure for
Preattentive Perceptual Organization: Graphical Enumeration and Voting
Methods,'' IEEE Transactions on Systems, Man, and
Cybernetics, vol. 24, no. 2, pp. 246--267, Feb. 1994.
We present an efficient computational structure for preattentive
perceptual organization. By perceptual organization we refer to the
ability of a vision system to organize features detected in an image
based on viewpoint consistency and other Gestaltic perceptual
phenomena. This usually has two components, a primarily bottom up
preattentive part and a top down attentive part, with meaningful
features emerging in a synergistic fashion from the original set of
(very) primitive features. In this work we advance a computational
structure for preattentive perceptual organization. We propose a
hierarchical approach, using voting methods to build associations
through consensus and relational graphs to represent the organization
at each level. The voting method is very efficient in terms of time
and space and performs impressively for a wide range of organizations.
The graphical representation allows the ready extraction of
higher order features, or perceptual tokens, because the relational
information is rendered explicit.
- S. Sarkar and K. L. Boyer, ``Integration, Inference, and Management of
Spatial Information Using Bayesian Networks: Perceptual
Organization,'' IEEE Transactions on Pattern Analysis and
Machine Intelligence (Special Section on Probabilistic Reasoning),
vol. 15, no. 3, pp. 256--274, Mar. 1993.
The use of knowledge bases has been advocated by many researchers to
make computer vision more stable and reliable. The formalism of
Bayesian networks provides a very elegant solution, in a probabilistic
framework, to the problem of integrating top down and bottom up visual
processes as well as serving as a knowledge base. We modify the formalism
to handle spatial data and thus extend the applicability of Bayesian
networks to visual processing. We call the modified form the
Perceptual Inference Network (PIN). We present the theoretical
background of a PIN and demonstrate its viability in the context of
perceptual organization. Perceptual organization imparts robustness,
efficiency, and a qualitative and holistic nature to vision. So far the
approaches to the problem of perceptual organization have been purely
bottom up, without much top down knowledge base influence and hence
entirely dependent on the inputs, which may be imperfect. The knowledge
base, besides coping with such input imperfection, also allows us to
integrate multiple organizations and form a composite organization
hypothesis. The Perceptual Inference Network imparts an active
inferential and integrating nature to perceptual organization in an
elegant probabilistic framework.
- S. Sarkar and K. L. Boyer, ``Using Perceptual Inference Networks to
Manage Vision Processes,'' Computer Vision,
Graphics, and Image Processing: Image Understanding, vol. 62, no. 1,
pp. 27--46, July 1995.
We provide a probabilistic framework, based on Perceptual Inference
Networks, for the management of computational resources such as special
purpose modules, feature detectors, and highly domain dependent
algorithms. Since these resources tend to be computationally expensive
and have limited applicability, judicious management is warranted. The
resources are used to build a comprehensive description of the scene.
Resources are selected in an information theoretic framework with the
maximization of information gain per unit of computation as the
optimality criterion. The viability of the algorithm is demonstrated
in perceptual organization tasks.
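The selection criterion described above, information gain per unit of computation, can be sketched for a discrete hypothesis set as follows. This is a minimal sketch under stated assumptions, not the Perceptual Inference Network algorithm itself: the resource names, the likelihood tables, and the costs are all hypothetical, and each resource is modeled simply as a noisy observation of the hypothesis.

```python
import math


def entropy(distribution):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in distribution if p > 0)


def expected_information_gain(prior, likelihood):
    """Expected entropy reduction from observing one resource's output.

    likelihood[o][h] is the assumed probability of output o given hypothesis h.
    """
    gain = entropy(prior)
    for row in likelihood:
        p_output = sum(l * p for l, p in zip(row, prior))
        if p_output > 0:
            posterior = [l * p / p_output for l, p in zip(row, prior)]
            gain -= p_output * entropy(posterior)
    return gain


def select_resource(prior, resources):
    """Pick the resource with the largest expected gain per unit cost.

    resources maps a name to a (likelihood, cost) pair.
    """
    def gain_per_cost(name):
        likelihood, cost = resources[name]
        return expected_information_gain(prior, likelihood) / cost

    return max(resources, key=gain_per_cost)


prior = [0.5, 0.5]
resources = {
    # A perfectly discriminating but expensive module (hypothetical).
    "expensive_detector": ([[1.0, 0.0], [0.0, 1.0]], 10.0),
    # A noisy but cheap module (hypothetical).
    "cheap_detector": ([[0.8, 0.2], [0.2, 0.8]], 1.0),
}
chosen = select_resource(prior, resources)
```

Dividing by cost means a cheap, moderately informative module can be preferred over an expensive, fully informative one, which matches the stated goal of managing computationally expensive resources judiciously.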
- S. V. Raman, S. Sarkar, and K. L. Boyer, ``Hypothesizing
Structures in Edge Focused Cerebral Magnetic Resonance Images Using
Graph Theoretic Cycle Enumeration,'' Computer Vision,
Graphics, and Image Processing: Image Understanding, vol. 57,
no. 1, pp. 81--98, Jan. 1993.
We present a novel method for the automatic generation of structure
hypotheses suitable for recognition in medical images. We base the
approach on segment-based edge-focusing to precisely delineate
significant boundaries, and graph-theoretic cycle enumeration to
produce natural closures and, therefore, plausible tissue structures
of interest from incomplete boundary information. An efficient edge
focusing algorithm selects significant fine scale boundaries as those
natural descendants (in scale space) of prominent coarse scale edges.
The fine scale representation provides the localization precision
necessary, while the focusing ensures that only significant contours
surviving over a range of scales are considered and so eliminates much
of the ``clutter'' associated with a fine scale edge map. The spatial
relationships among the edge segments are stored in the form of a
directed graph. Possible extensions (closures) of broken edge
segments are searched using time- and space-efficient voting methods.
Cycle enumeration techniques for directed graphs then generate the
structure hypotheses. The overall paradigm is fairly general and can
be used in other problem domains, certainly for images of other parts
of the anatomy. We demonstrate the effectiveness of the method with
extensive experimental results on various magnetic resonance images of
the human brain.
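The cycle enumeration step can be illustrated with a small depth-first search over a directed graph. This is a generic textbook-style sketch, not the enumeration technique used in the paper; the function name `simple_cycles` and the toy graph, in which nodes stand for edge segments and arcs for possible closures, are assumptions made for illustration.

```python
def simple_cycles(graph):
    """Enumerate simple cycles of a directed graph given as {node: successors}.

    Each cycle is reported once, rooted at its smallest node: the search
    from a start node only visits nodes larger than that start node.
    """
    cycles = []

    def extend(start, node, path, on_path):
        for successor in graph.get(node, ()):
            if successor == start:
                # Closed a loop back to the root: record the cycle.
                cycles.append(list(path))
            elif successor > start and successor not in on_path:
                path.append(successor)
                on_path.add(successor)
                extend(start, successor, path, on_path)
                path.pop()
                on_path.remove(successor)

    for start in sorted(graph):
        extend(start, start, [start], {start})
    return cycles


# Hypothetical segment graph: an arc u -> v means segment v can continue u.
segment_graph = {0: [1], 1: [2], 2: [0, 3], 3: [1]}
hypotheses = simple_cycles(segment_graph)  # [[0, 1, 2], [1, 2, 3]]
```

Each returned cycle corresponds to one closed chain of segments, that is, one candidate structure hypothesis in the sense described above.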
- S. V. Raman, S. Sarkar, and K. L. Boyer, ``Tissue boundary
refinement in magnetic resonance images using contour-based scale
space matching,'' IEEE Transactions on Medical Imaging,
vol. 10, pp. 109--121, June 1991.
Precise measurement or shape analysis of structures or lesions is
required from cerebral magnetic resonance images for statistical
studies of possible relationships between structural malformation and
neuropsychiatric disorders, such as schizophrenia. Unfortunately,
individual delineation of such image features by human operators is so
cumbersome that it is done in only a few research centers. Tissue
boundaries in magnetic resonance images can be identified as
photometric edges in the visual representation. In recent years, the
Laplacian-of-Gaussian (LoG) and Canny's edge detector have proven most
interesting from the standpoint of mathematical optimality. These edge
detection filters incorporate a scaling parameter allowing the
tradeoff between edge localization and error rate to be made
appropriately. At large scales, edges are inaccurately located with
respect to the true underlying edge, but the boundaries detected arise
from significant physical events. At small scales, edges are precisely
located, but many false positive responses and excessive detail
emerge, producing an overly rich edge image with a great deal of
``clutter.'' Point based scale space searching has been proposed as a
mechanism for circumventing this problem but, to date, no robust,
efficient algorithms have been reported. In contrast, we have
developed a novel, whole contour based technique for tracing edges
selected at a coarse scale into successively finer scales to recover
the needed precision. The tracing algorithm builds consensus through a
fast pixel voting scheme. We present a rigorous approach to setting
the refinement schedule (quantizing the scale space) according to the
information redundancy between adjacent filters by defining a
similarity functional, which has broad applications. This has
particular application to the mensuration of various structures in
images of the brain. Although the LoG is used for many of the
experiments, we also present results using a new edge detector which
is mathematically superior to and faster to compute than the LoG and
for which fewer steps are required to traverse the same effective span
in scale space. We present experimental results on real data and
outline other potential applications.
- S. Sarkar and K. L. Boyer, ``On optimal infinite impulse
response edge detection filters,'' IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 13, no. 11,
pp. 1154--1171, Nov. 1991.
In this paper we outline the design of an optimal, computationally
efficient, infinite impulse response edge detection filter. We
compute the optimal filter based on Canny's high signal to noise
ratio, good localization criteria, and a criterion on the spurious
response of the filter to noise. In our design procedure we
incorporate an expression for the width of the filter, appropriate for
infinite length filters, directly in the expression for spurious
responses. The three criteria are maximized using the variational
method and non-linear constrained optimization. The optimal filter
parameters are tabulated for various values of the filter performance
criteria. A complete methodology for implementing the optimal filters
using approximating recursive digital filtering is presented. The
approximating recursive digital filter is separable into two linear
filters operating in two orthogonal directions. The implementation is
very simple and computationally efficient. It has a constant time of
execution for different sizes of the operator and is readily amenable
to real time hardware implementation.
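The constant execution time noted above follows from the recursive form: each output sample depends on a fixed number of previous values, whatever the scale. The sketch below shows this with a plain first-order exponential smoother applied forward and backward. It is an assumed stand-in and not the optimal filter derived in the paper; the function names and the parameter `alpha` are hypothetical.

```python
def recursive_smooth(signal, alpha):
    """Smooth a 1-D signal with a causal and an anti-causal recursive pass.

    A smaller alpha gives a wider effective operator, yet every sample
    still costs the same fixed number of multiply-adds.
    """
    n = len(signal)
    forward = [0.0] * n
    acc = signal[0]
    for i in range(n):
        acc = alpha * signal[i] + (1.0 - alpha) * acc
        forward[i] = acc
    smoothed = [0.0] * n
    acc = forward[-1]
    for i in range(n - 1, -1, -1):
        acc = alpha * forward[i] + (1.0 - alpha) * acc
        smoothed[i] = acc
    return smoothed


def edge_location(signal, alpha):
    """Index i at which |smoothed[i + 1] - smoothed[i]| is largest."""
    smoothed = recursive_smooth(signal, alpha)
    differences = [abs(smoothed[i + 1] - smoothed[i]) for i in range(n_minus_one(smoothed))]
    return max(range(len(differences)), key=differences.__getitem__)


def n_minus_one(values):
    """Number of adjacent sample pairs in a sequence."""
    return len(values) - 1


# Ideal step edge between samples 19 and 20.
step = [0.0] * 20 + [1.0] * 20
fine_scale = edge_location(step, alpha=0.5)
coarse_scale = edge_location(step, alpha=0.2)
```

For a two-dimensional image, the same pair of passes could be run along rows and then along columns, in line with the separable implementation the abstract describes.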
- S. Sarkar and K. L. Boyer, ``Optimal infinite impulse
response zero crossing based edge detectors,'' Computer
Vision, Graphics, and Image Processing: Image Understanding,
vol. 54, no. 2, pp. 224--243, Sept. 1991.
We present formal optimality criteria and a complete design
methodology for a family of zero crossing based, infinite impulse
response (recursive) edge detection filters. In particular, we adapt
the optimality criteria proposed by Canny to filters designed to
respond with a zero crossing in the output at an edge location and
additionally to impulse responses which are (allowed to be)
infinite in extent. The spurious response criterion is captured
directly by means of an appropriate measure of filter spatial extent
for infinite responses. Infinite duration impulse responses may be
implemented efficiently with recursive filtering techniques and so
require constant computation time with respect to scale. As we will
show, designing directly for an infinite impulse response achieves both
better performance and higher speed than any of the
proposed finite duration approaches. We also show that the optimal
filter which responds with a zero crossing in its output may not
be implemented by designing the optimal peak responding filter
(similar to Canny) and taking an additional derivative. It is
necessary to formulate the criteria and design for a zero crossing
response from the outset, else optimality is sacrificed. Filter
parameters and performance criteria are presented for several designs,
and experimental results are presented on a variety of images which
demonstrate the behavior in the presence of very adverse noise, with
respect to scale, and as compared to other ``optimal'' IIR filters
which have been reported.
- K. L. Boyer and S. Sarkar, ``Comments on `On the
Localization Performance Measure and Optimal Detection','' 
IEEE Transactions on Pattern Analysis and Machine Intelligence,
vol. 16, no. 1, pp. 106--107, Jan. 1994.
In a recent paper, Tagare and deFigueiredo present a localization
performance measure for edge detectors (PAMI-1990). They point out a
flaw in Canny's formulation (subsequently used by Sarkar and Boyer) of
the localization criterion and motivate their form of the localization
criterion from a different line of reasoning. In this correspondence
we show that although Canny's derivation was wrong, the final form of
the criterion is adequate and can in fact be derived from Tagare and
deFigueiredo's formulation of the problem. We also point out
disadvantages of using the form of Tagare and deFigueiredo's
localization criterion.
- K. L. Boyer, D. M. Wuescher, and S. Sarkar, ``Dynamic edge
warping: An experimental system for recovering disparity maps in
weakly constrained systems,'' IEEE Transactions on Systems,
Man, and Cybernetics, vol. 21, pp. 143--158, Jan. 1991.
A new technique called dynamic edge warping (DEW), for recovering
reasonably accurate disparity maps from uncalibrated stereo
image pairs, is presented. That is, no precise knowledge of the
epipolar camera geometry is assumed. The technique is embedded in a
system including structural stereopsis on the front end and robust
estimation in digital photogrammetry on the other for the purpose of
self-calibrating stereo image pairs. Once the relative camera
orientation is known, the epipolar geometry is computed and the system
may use the information to refine its representation of the object
space. Such a system will find application in the autonomous
extraction of terrain maps from stereo aerial photographs, for which
camera position and orientation are unknown a priori, and for on-line
autonomous calibration maintenance for robotic vision applications, in
which the cameras are subject to vibration and other physical
disturbances after calibration. This work forms a component of an
intelligent system that begins with a pair of images and, having only
vague knowledge of the conditions under which they are acquired,
produces an accurate, dense, relative depth map. The resulting
disparity map may also be used directly in some high level application
involving qualitative scene analysis, spatial reasoning, and
perceptual organization of object space. The system as a whole
substitutes high level information and constraints for precise
geometric knowledge in driving and constraining the early
correspondence process.
Submitted Journal Paper Abstracts
- S. Sarkar, ``Context Dependent Perceptual Organization: Graph Spectral
Partitioning and Learning Automata,'' submitted to IEEE
Transactions on Pattern Analysis and Machine Intelligence, Dec. 1997.
Perceptual organization using Gestalt principles offers an elegant
framework to group low level features that are likely to come from a
single object. We offer a novel strategy to adapt this grouping
process to an object and its context in a scene. Given a set of
training images of an object in context, the associated learning
process decides on the relative importance of the basic Gestalt
relationships such as proximity, parallelness, similarity, symmetry,
closure, and common region towards segregating the object from the
background. This learning is accomplished using a team of stochastic
automata in an N-player cooperative game framework. The grouping
process, which is based on graph partitioning, is able to form
large groups from relationships defined over a small set of
primitives and is fast. We demonstrate the robust performance of the
grouping system on a variety of real images. Among the interesting
conclusions is the significant role of photometric attributes in
grouping and the ability to perform figure-ground segmentation from a
set of local relations, each defined over a small number of
primitives.
- L. Tsap, D. B. Goldgof, S. Sarkar, and P. Powers,
``A vision-based technique for objective assessment of burn scars,''
submitted to IEEE Transactions on Medical Imaging, Oct. 1997.
In this paper we propose a method for the objective assessment of burn
scars. The quantitative measures developed in this research provide an
objective way to calculate scar elasticity. The approach combines
range data and the mechanics and motion dynamics of human
tissues. Active contours are employed to locate regions of interest
and to find displacements of feature points using automatically
established correspondences. We are able to evaluate the changes in
strain distribution over time. Given images at two time instances and
their corresponding features, we use the Finite Element Method (FEM) to
synthesize strain distributions of the underlying tissues. This
results in a physically-based framework for motion and strain
analysis. Elasticity of the burn scar is then recovered using
iterative descent search for the best nonlinear finite element model
that approximates stretching behavior of the region containing the
burn scar. The results from the skin elasticity experiments illustrate
the ability to objectively detect differences in elasticity between
normal and abnormal tissue. These estimated differences in elasticity
are correlated against the subjective judgments of physicians, which
are the current practice.
- L. Tsap, D. B. Goldgof, and S. Sarkar, ``Accurate tracking of
non-rigid motion through iterative refinement of finite element
models,'' submitted to IEEE Transactions on Pattern Analysis and
Machine Intelligence, Nov. 1997.
In this paper we propose new algorithms for accurate nonrigid motion
tracking. Given only a set of sparse correspondences and incomplete
or missing information about geometry or material properties, we can
recover dense motion vectors using finite element models. The method
is based on the iterative analysis of the differences between the
actual and predicted behavior. Large differences indicate that an
object's properties are not captured properly by the model describing
it. Feedback from the images during the motion allows the refinement
of the model by minimizing the error between the expected and true
position of the object's points. These errors are due to flaws in the
estimation of model parameters such as geometry and material properties.
Unknown parameters are recovered using an iterative descent search for
the best nonlinear finite element model that approximates nonrigid
motion of the given object. During this search process we not only
estimate material properties, but also infer dense point
correspondences from our initial set of sparse correspondences. Thus,
during tracking the model is refined which, in turn, improves tracking
quality. As a result, we obtain a more precise description of nonrigid
motion.
Experimental results demonstrate the success of the proposed
algorithm. The method was applied to man-made elastic materials and
human skin to recover unknown elasticity, to complex 3-D objects to
find details of their geometry, and to a hand motion analysis
application. Our work demonstrates the possibility of accurate
quantitative analysis of nonrigid motion in range image sequences with
objects consisting of multiple materials and 3-D volumes.
- N. Saxena, S. Sarkar, and N. Ranganathan, ``Mapping and Parallel
Implementation of Bayesian Belief Networks,'' submitted to IEEE
Transactions on Parallel and Distributed Systems, Apr. 1996.
Bayesian belief networks are used for graphically representing
uncertainty and probabilistic dependence. Bayesian networks are
applied in computer vision, object recognition, feature detection,
medicine, CAM, troubleshooting and other applications wherein
decisions are conditionally dependent on many controlling factors.
Since most real time applications require fast response,
parallelization of Bayesian networks becomes important. This paper
presents an efficient technique for mapping polytree structured
Bayesian belief networks onto the hypercube parallel machine
architecture. The proposed mapping is deadlock free since all the
messages are received and processed in the order of the structural
hierarchy of the nodes in a tree. The mapping scheme maintains
parent-child adjacency and single hop message passing throughout the
computation. The scheme was implemented and verified on a 64 node
nCUBE. The task allocation is static and is done at the beginning of
the computation. The proposed scheme allows for efficient mapping of
arbitrarily large trees onto a fixed size hypercube. It is shown that
the overall speed up corresponds to the height of the tree.
- T. K. Das and S. Sarkar, ``Optimal Preventive
Maintenance in a Single Machine Production Inventory System,''
submitted to IIE Transactions on Quality, revised 1997.
In this paper we consider a production inventory system. The system
produces a single product type of which inventory is maintained
according to an (S, s) policy. Exogenous demand for the product
arrives according to a random process. Unsatisfied demands are not
back ordered. Such a make-to-stock production inventory policy is
found very commonly in the discrete part manufacturing industry, e.g.,
automotive spare parts manufacturing.
It is assumed that the demand arrival process is Poisson, and the
production time of a unit has a general probability distribution. The
system is failure prone and the time between failures has a general
probability distribution. We conjecture that, for any such system, the
down time due to failures can be reduced through preventive
maintenance resulting in possible increases in the service level (%
of satisfied demands) and the cost benefit. We develop a mathematical
model for systems whose repair time and maintenance time have general
probability distributions. Subsequently we develop expressions for
system performance measures. These performance measures, which are
functions of the preventive maintenance parameters, are used as a basis
for optimal determination of the maintenance parameters. The
optimization approach is exemplified through a numerical example
problem. Interesting results from our numerical study together with an
outline of the solution procedure are presented to motivate and
facilitate the application of the modeling approach. Exact numerical
results obtained from the example problem can be used to benchmark
performance of other computationally attractive (perhaps, non optimal)
solution approaches.
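The service level of such a make-to-stock system can be estimated by a small event-driven simulation. The sketch below makes several simplifying assumptions that are not in the abstract: production times are exponential (the paper allows a general distribution), machine failures and preventive maintenance are left out entirely, and production is assumed to start when inventory falls to s and stop when it reaches S. All names and parameter values are hypothetical.

```python
import random


def simulate_service_level(S, s, demand_rate, production_rate, horizon, seed=0):
    """Fraction of demands satisfied under an (S, s) policy with lost sales."""
    rng = random.Random(seed)
    inventory = S
    producing = False
    satisfied = 0
    lost = 0
    next_demand = rng.expovariate(demand_rate)
    next_unit = float("inf")  # completion time of the unit in production
    while True:
        now = min(next_demand, next_unit)
        if now > horizon:
            break
        if next_demand <= next_unit:
            # A demand arrives: serve it from stock or lose it.
            if inventory > 0:
                inventory -= 1
                satisfied += 1
            else:
                lost += 1
            next_demand = now + rng.expovariate(demand_rate)
            if not producing and inventory <= s:
                producing = True
                next_unit = now + rng.expovariate(production_rate)
        else:
            # A unit is completed: keep producing until stock reaches S.
            inventory += 1
            if inventory >= S:
                producing = False
                next_unit = float("inf")
            else:
                next_unit = now + rng.expovariate(production_rate)
    total = satisfied + lost
    return satisfied / total if total else 1.0


fast_machine = simulate_service_level(S=5, s=2, demand_rate=1.0,
                                      production_rate=5.0, horizon=10000.0)
slow_machine = simulate_service_level(S=5, s=2, demand_rate=1.0,
                                      production_rate=0.5, horizon=10000.0)
```

Adding failure and maintenance events to the same loop would be one way to compare maintenance settings numerically, though the paper itself develops analytical expressions for the performance measures.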
- S. Sarkar and K. L. Boyer, ``Automated Design of Bayesian
Perceptual Inference Networks,'' Tech. Rep. SAMPL-93-03, SAMP-Lab,
Dept. of EE, OSU, March 1993; presented at the International
Conference on Computer Vision and Pattern Recognition, 1994.
In our previous work we presented the
Perceptual Inference Network (PIN), a formalism based on Bayesian
Networks, to reason among a set of object or feature hypotheses and to
integrate multiple sources of information in the context of perceptual
organization. The design of a PIN requires knowledge of the
dependency structure among the organizations of interest and the
specification of the conditional probabilities. Heretofore, this
design was done manually. In this paper we present an algorithm based
on structural entropic measures and Random Parametric Structural
Descriptions (RPSDs) to design a PIN automatically. Experimental
results present evidence of the robustness of the algorithm and make
performance comparisons on real image data with a manually structured
PIN. Since PINs are a form of Bayesian Network, we hope that this work
will also prove useful towards structuring Bayesian Networks in other
computer vision contexts.
Sudeep Sarkar
Last modified: Wed Feb 25 09:29:50 EST 1998