US National Science Foundation and IEEE Computer Society Workshop on

# Perceptual Organization in Computer Vision

### Sept 20 and 21, 1999 (pre ICCV'99), Corfu, Greece

Program for September 20, 1999

| Time | Speaker | Topic |
|---|---|---|
| 8:30am-8:45am | Kim Boyer and Sudeep Sarkar | Introductory remarks |
| 8:45am-9:30am | Dr. Steve Lehar, Schepens Eye Research Institute, affiliate of Harvard Medical School | Computational implications of biological vision: A Gestalt model of spatial perception |
| 9:30am-9:50am | Jonas August and Steven Zucker, Yale University | Organizing curve elements with an indicator random field on the (unit) tangent bundle |
| 9:50am-10:10am | Ben Kimia, I Frankel, A Popescu, Brown University | Euler spiral for shape completion |
| 10:10am-10:30am | BREAK | BREAK |
| 10:30am-11:15am | Dr. Jitendra Malik, University of California, Berkeley | Cue Combination and Aggregation in Grouping |
| 11:15am-11:35am | P. Vasseur, EM Mouaddib, C Pegard, A Dupuis, Universite de Picardie, Amiens, France | Object grouping by multiprimitive preattentive perceptual organization |
| 11:35am-11:55am | MS Lee, CK Tang, Gerard Medioni, University of Southern California | A unified computational framework for feature inference and segmentation |
| 12:01pm-1:30pm | LUNCH | LUNCH |
| 1:30pm-2:15pm | Dr. Zili Liu, Rutgers University | The role of convexity in perceptual completion: beyond good continuation |
| 2:15pm-2:30pm | Kim Boyer and Sudeep Sarkar | Guidelines for breakout sessions |
| 2:30pm-5:30pm | BREAKOUT SESSIONS | Performance evaluation of perceptual organization techniques; Perceptual organization in image sequences; Perceptual organization principles; Computational models and complexity issues; New applications for perceptual organization; Learning and perceptual organization |

Program for September 21, 1999

| Time | Speaker | Topic |
|---|---|---|
| 8:30am-9:15am | Dr. Michael Kubovy, Dept. of Psychology, University of Virginia | From Gestalt Principles to Gestalt Laws |
| 9:15am-9:35am | David W. Jacobs, NEC Research | What makes viewpoint invariant properties perceptually salient? |
| 9:35am-9:55am | Daniel Crevier, Opthalmos Systems, Canada | Bayesian extraction of collinear segment chains from digital images |
| 9:55am-10:15am | M Lindenbaum and A Berengolts, Technion, Israel | A probabilistic interpretation of the saliency network |
| 10:15am-10:30am | | BREAK |
| 10:30am-11:15am | Dr. Ram Nevatia, University of Southern California | Perceptual Organization for Object Description and Recognition |
| 11:15am-11:35am | Karvel Thornber and Lance Williams, NEC Research | Closed curves in the analysis and segmentation of images |
| 11:35am-11:55am | Dr. Steve Lehar, Schepens Eye Research Institute, affiliate of Harvard Medical School | Harmonic resonance theory: An alternative computational paradigm to address Gestalt properties of perception |
| 12:01pm-1:30pm | | LUNCH |
| 1:30pm-2:15pm | Dr. Eric Saund, Xerox PARC | Toward richer labels for visual structure |
| 2:15pm-2:45pm | | Summary of breakout session on Performance Evaluation of Perceptual Organization Techniques |
| 2:45pm-3:15pm | | Summary of breakout session on Perceptual Organization in Image Sequences |
| 3:15pm-3:30pm | | BREAK |
| 3:30pm-4:00pm | | Summary of breakout session on Perceptual Organization Principles |
| 4:00pm-4:30pm | | Summary of breakout session on Computational Models and Complexity |
| 4:30pm-5:00pm | | Summary of breakout session on Learning and Perceptual Organization |
| 5pm- | | Closing remarks |

### WORKSHOP REGISTRATION:

You can register for the workshop (and/or the main conference, ICCV) HERE.

### HOTEL BOOKING:

The preferred method of making hotel reservations is by direct fax to the hotel. To reserve a room, please fax your requirements to:
Mr. Byron Tsonakis,
General Manager,
Corfu Holiday Palace
P.O. Box 124, Kanoni,
Corfu 49 100,
Greece
FAX: +30-661-45933
Include in your note that you will be attending "IEEE ICCV'99" in order to receive the following rates:
Single Room: $90.00 US/night
Double Room: $110.00 US/night

The papers presented at the workshop and a summary of the discussion will be published as a book by Kluwer Academic Publishers. All attendees will receive a copy of the book, which is included in the registration.

### TALK ABSTRACTS:

Closed Curves in the Analysis and Segmentation of Images
Karvel Thornber and Lance Williams

Through evolution, vision takes advantage of persistent regularities, especially the closed boundaries of objects. Once an object's boundary is found, its segmentation from the image is greatly facilitated. The key question is the nature of the information processing that can pick out closed curves from large sets of edge segments. In particular, can we understand this processing independently of the specific algorithms, mechanisms, or structures that implement it? We have discovered a principled, quantitative theory of closure which we have demonstrated can 1) identify the boundaries of unknown objects against textured backgrounds more accurately than other methods, 2) explain a well-known illusory contour phenomenon, and 3) segment the boundaries of unknown objects in real images at an average rate of 10 sec/object. We have made this theory transparent by framing it as the solution to a previously unsolved problem in geometry (characterizing the distribution of closed curves through any set of edge segments in any dimension) and then by identifying the saliency of sequences of image edge segments with their probability of being included in a closed contour.

Computational implications of biological vision: A Gestalt model of spatial perception
Steve Lehar

There has been considerable cross-fertilization of ideas in recent decades between theories of biological and machine vision. However, this has been largely a case of the blind leading the blind, for neither domain of knowledge can yet claim to have solved the problem of vision, or even to have discovered the most fundamental principles responsible for the incredible visual performance observed in even the simplest animals, such as the common house fly. I propose an alternative to the neural network, or the feature detection paradigm common in machine vision. I propose a perceptual modeling approach, i.e. a quantitative model of the subjective experience of vision independent of neurophysiological assumptions. This approach has implications for both biological and machine vision, for it reveals a unique computational strategy evident in spatial perception unlike anything proposed in either domain.

Harmonic resonance theory: An alternative computational paradigm to address Gestalt properties of perception
Steve Lehar

The computational transformations in biological vision revealed by the properties of subjective experience indicate a holistic, global style of computation unlike anything devised by man, and certainly unlike the atomistic sequential paradigm of computation expressed in its purest form in the digital computer. I propose that these holistic aspects of perception, which have been so problematic for conventional concepts of visual processing, are explained by a harmonic resonance, or pattern of standing waves in the neural substrate. The harmonic resonance model exhibits exactly those elusive Gestalt properties such as emergence, reification, and invariance, not as specialized circuits contrived to account for those properties individually, but as natural properties of the resonance itself.

From Gestalt Principles to Gestalt Laws
Michael Kubovy

The Gestalt principles of grouping (proximity, similarity, good continuation, common fate, and so on) are an essential foundation of psychology. Yet they have remained fairly vague, experimentally intractable, and unquantified. In my talk I will describe progress my students and I have made in the quest for clarity, lawfulness and precision in the formulation of these principles. Although most of my talk will be devoted to grouping by proximity and similarity, I will also show how we can generalize our new research techniques to the study of another important Gestalt phenomenon: apparent motion.

Toward Richer Labels for Visual Structure
Eric Saund

The descriptive vocabulary of proximity groups, curvilinear alignment, parallels, corners, closed regions, and coherent texture regions covers important components of visual structure in images, but it doesn't go far enough. Labels for visual structure can become richer in at least two ways. They can reflect more domain knowledge, and they can hinge on more complex computational processes than the grouping and partitioning operations that are the stock-in-trade of the field. This talk will suggest two research domains pushing enriched visual labels for Perceptual Organization:

• document images: While printed text can now be successfully transcribed by OCR systems, sketches, annotations, and handwritten notes totally confuse the strong domain models behind existing techniques. Yet the tremendous visual structure effortlessly perceived by the human eye challenges Perceptual Organization to extend its weaker (and therefore more general) models.
• posterized images of 3D scenes: While interpretation of photographic imagery remains beyond the reach of computer vision, we can distill important components by working with simplified, "posterized" images of scenes. These permit the development of appropriate vocabularies and soft constraint propagation techniques for labeling surface occlusion relationships at L- and T-junctions. Then, considerations of surface coherence, transparency, motion, etc., can be mixed in incrementally.

These avenues of investigation draw us inexorably toward the crucial issues of learning, and managing computational resources through visual attention.

A Unified Computational Framework for Feature Inference and Segmentation
Mi-Suen Lee, Chi-Keung Tang, and Gerard Medioni

We have developed a unified computational framework for the inference of multiple salient structures such as junctions, curves, regions, and surfaces from any combination of points, curve elements, and surface elements, in 2-D and 3-D. A book summarizing our research effort over the past seven years is now in print, along with a companion software system available to the community for experimentation and evaluation. The methodology is grounded in two elements: tensor calculus for representation, and voting for data communication. The proposed methodology is non-iterative, requires no initial guess or thresholding, and can handle the presence of multiple curves, regions, and surfaces in a large amount of noise while still preserving discontinuities. The only free parameter is scale. We will demonstrate the approach on a number of examples, both in 2-D and 3-D, using the software.

The role of convexity in perceptual completion: beyond good continuation
Zili Liu (joint with David Jacobs and Ronen Basri)

Since the seminal work of the Gestalt psychologists, there has been great interest in understanding what factors determine the perceptual organization of images. While the Gestaltists demonstrated the significance of grouping cues such as similarity, proximity, and good continuation, it has not been well understood whether their catalog of grouping cues is complete, in part due to the paucity of effective methodologies for examining the significance of various grouping cues. We describe a novel, objective method to study perceptual grouping of planar regions separated by an occluder. We demonstrate that the stronger the grouping between two such regions, the harder it will be to resolve their relative stereoscopic depth. We use this new method to call into question many existing theories of perceptual completion that are based on Gestalt grouping cues by demonstrating that convexity plays a strong role in perceptual completion. In some cases convexity dominates the effects of the well-known Gestalt cue of good continuation. While convexity has been known to play a role in figure/ground segmentation, this is the first demonstration of its importance in perceptual completion.

What Makes Viewpoint Invariant Properties Perceptually Salient?
David Jacobs

It has been noted that many of the perceptually salient image properties identified by the Gestalt psychologists, such as collinearity, parallelism, and good continuation, are viewpoint invariant. That is, there exist scene structures that always produce images with these properties regardless of viewpoint, while other scene structures virtually never produce these properties. This correlation between salience and invariance has suggested that the perceptual salience of viewpoint invariants is due to the leverage they provide for inferring 3-D properties of objects and scenes. However, we show that viewpoint invariance is not sufficient to distinguish these Gestalt properties; one can define an infinite number of viewpoint invariant properties that are not perceptually salient. This leads to the question of what else the Gestalt properties might have in common that contributes to their perceptual salience.

We then show that generally, the perceptually salient viewpoint invariant properties are *minimal*, in the sense that they can be derived using less image information than non-salient properties. For example, given four image dots one can derive an infinite number of viewpoint invariants, but two or three dots produce the minimal viewpoint invariants collinearity and identity. These are also perceptually salient. We show that connectedness, closure, corners, trihedral vertices, collinearity, parallelism and convexity can be naturally characterized as minimal viewpoint invariants. We also discuss the salience of horizontal and vertical lines, right angles, and various types of symmetry, as being possibly derived from minimal viewpoint invariants. We then point out that computations with minimal features are more tractable than those requiring higher order properties. This provides support for the hypothesis that the biological relevance of an image property is determined both by the extent to which it provides information about the world and by the ease with which this property can be computed.

Bayesian Extraction of Collinear Segment Chains from Digital Images
Daniel Crevier

We present a probabilistic method for extracting chains of collinear segments. We start by defining a quantitative measure of the deviation of a two-segment junction from perfect collinearity. From simple assumptions for the distributions of segment lengths, orientations and positions, we compute, as a function of this measure, a probability density for the accidental occurrence of junctions. Perceptual rules allow the extraction of a representative population of non-accidental junctions from an image, from which a probability density for the non-accidental occurrence of deviations is computed. From these two distributions, we perform the Bayesian identification of likely non-accidental junctions. These are probabilistically combined into chains, through a procedure that takes the interdependence of junctions into account. This procedure is to our knowledge original, and represents a practical and accurate simplification of an otherwise exponentially complex analysis. Successive iterations allow the bridging of larger gaps. The method uses both geometric and photometric information, allows for segment curvature, and automatically extracts statistics for natural image contours. Examples are presented.
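
The Bayesian identification step can be illustrated with a minimal sketch. It assumes a scalar deviation measure `d`, a density for accidental junctions, a density for non-accidental junctions, and a prior. None of these are specified in the abstract: the exponential and uniform densities, the prior value, and all function names below are hypothetical placeholders, not the author's actual model.

```python
import math

def posterior_non_accidental(d, density_non_acc, density_acc, prior_non_acc):
    """Bayes' rule: probability that a junction with deviation d is non-accidental."""
    weighted_non_acc = density_non_acc(d) * prior_non_acc
    weighted_acc = density_acc(d) * (1.0 - prior_non_acc)
    total = weighted_non_acc + weighted_acc
    # If neither hypothesis can produce d, fall back to the prior.
    return weighted_non_acc / total if total > 0.0 else prior_non_acc

def density_non_acc(d, scale=0.1):
    """Hypothetical density: non-accidental junctions have small deviations."""
    return math.exp(-d / scale) / scale

def density_acc(d, d_max=1.0):
    """Hypothetical density: accidental deviations are uniform on [0, d_max]."""
    return 1.0 / d_max if 0.0 <= d <= d_max else 0.0

PRIOR_NON_ACC = 0.2  # assumed fraction of junctions that are non-accidental

posterior_small = posterior_non_accidental(0.02, density_non_acc, density_acc, PRIOR_NON_ACC)
posterior_large = posterior_non_accidental(0.80, density_non_acc, density_acc, PRIOR_NON_ACC)
print(posterior_small, posterior_large)
```

Under these assumed densities, a nearly collinear junction receives a much higher posterior than a strongly deviating one. The chaining procedure that accounts for the interdependence of junctions is not sketched here.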

Cue Combination and Aggregation in Grouping
Jitendra Malik

The two central issues in grouping are (1) the use of multiple factors and cues, and (2) obtaining global perceptual organization from local measurements. Both issues were raised by the Gestaltists in the early part of this century. I will present a (partial) solution in the normalized cut framework.
(Based on joint work with Jianbo Shi, Serge Belongie and Thomas Leung.)
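
The abstract gives no details of the framework, so the following is only a minimal sketch of the normalized cut criterion as it is commonly defined: for a partition of the nodes into A and B, Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V). To stay self-contained it uses exhaustive search over bipartitions of a toy graph, not the spectral approximation used on real images. The affinity matrix `W` and the function names are hypothetical.

```python
from itertools import combinations

def ncut(W, group_a):
    """Normalized cut value of the bipartition (group_a, remaining nodes)."""
    n = len(W)
    a = set(group_a)
    b = set(range(n)) - a
    cut = sum(W[i][j] for i in a for j in b)
    assoc_a = sum(W[i][j] for i in a for j in range(n))
    assoc_b = sum(W[i][j] for i in b for j in range(n))
    if assoc_a == 0 or assoc_b == 0:
        return float("inf")  # a side with no connections is not a valid group
    return cut / assoc_a + cut / assoc_b

def best_bipartition(W):
    """Exhaustively find the bipartition with the smallest normalized cut."""
    n = len(W)
    best_value, best_group = float("inf"), None
    # Groups of up to n // 2 nodes cover every bipartition at least once.
    for size in range(1, n // 2 + 1):
        for group in combinations(range(n), size):
            value = ncut(W, group)
            if value < best_value:
                best_value, best_group = value, set(group)
    return best_value, best_group

# Toy affinities: two tight triangles {0,1,2} and {3,4,5} joined by one weak link.
W = [[0.0] * 6 for _ in range(6)]
for cluster in ((0, 1, 2), (3, 4, 5)):
    for i, j in combinations(cluster, 2):
        W[i][j] = W[j][i] = 1.0
W[2][3] = W[3][2] = 0.1

value, group = best_bipartition(W)
print(value, group)
```

Because the cut is normalized by each group's total association, splitting off a single node is penalized and the search separates the two triangles at the weak link. Multiple cues would enter through the way the entries of `W` are computed, which this sketch does not model.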

Perceptual Grouping for Object Description and Recognition
R. Nevatia

It will be argued that object description and recognition is a key goal for perceptual organization. We will examine a hypothesize-and-verify approach to feature grouping. A key issue is the set of rules that should be used for grouping: they may come from specific shape models, generic shape models or attempt to be completely general. Some examples of each will be shown. Another important issue is how to combine diverse and uncertain evidence in selecting among possible grouping hypotheses. We will describe some recent work on this.

Sudeep Sarkar