Semantic Analysis for Image Resizing
With the wide popularity of portable display devices, images are often viewed on small displays of various resolutions and aspect ratios. On the other hand, images are typically captured for large displays. Naively resizing the whole image results in over-squeezing of prominent image content and loss of detail. Several image resizing methods have been proposed to selectively reduce less salient image regions while preserving the salient ones. However, the definition of saliency is usually vague and based on low-level image features that may not be related to the semantics.
Although high-level semantic analysis (such as automatic recognition of arbitrary objects) is infeasible in the near future, analysis of certain middle-level semantics, such as symmetry and foreshortening, is feasible. In this project, we propose a semantics-aware retargeting framework. Based on the detected semantics, the image can then be retargeted in a more sensible way that preserves them. In particular, we first analyze one type of middle-level semantics, translational symmetry. By understanding the symmetry in an image, we may retarget the image by deleting or replicating the repeated “cells” in the symmetry, instead of deleting or replicating semantics-less pixels as in existing approaches. In other words, such retargeting can be regarded as summarization. Knowing the semantics not only allows us to resize images and videos more intelligently, but also opens up a new space for retargeting. This new space of cell-by-cell processing provides much flexibility to avoid over-squeezing, over-stretching, and undesirable bending of prominent symmetric structures.
The major challenge of this project is to identify meaningful “cells” efficiently, as our goal is interactive applications. Identifying feature points is not sufficient for identifying meaningful cells, as a single cell may contain a collection of feature points. Instead, we need to identify suitable region-based features (blobs) efficiently and effectively. Understanding cells is probably only the first step towards high-level semantic understanding. To further understand the structure these cells form, we plan to approach this challenging semantic analysis problem based on the Gestalt principles of human visual perception. By solving these challenges, we believe the outcome (the knowledge, publications, and algorithms) of the proposed project should motivate further study by the community along the direction of semantics-aware resizing. Moreover, the developed algorithms should be directly applicable to mobile computing and movie production.
In the first stage of development, we analyze translational symmetry, which exists in many real-world images. By detecting the symmetric lattice in an image, we can summarize, instead of only distorting or cropping, the image content. This opens a new space for image resizing that allows us to manipulate not only image pixels, but also the semantic cells in the lattice. As a general image contains both symmetric and non-symmetric regions, whose natures differ, we propose to resize symmetric regions by summarization and non-symmetric regions by warping. The difference in resizing strategy induces discontinuity at their shared boundary, and we demonstrate how to reduce this artifact. To achieve practical resizing for general images, we developed a fast symmetry detection method that can detect multiple disjoint symmetric regions, even when the lattices are curved and perspectively viewed. Comparisons to state-of-the-art resizing techniques and a user study were conducted to validate the proposed method, and convincing visual results demonstrate its effectiveness.
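The cell-by-cell summarization idea can be illustrated with a minimal sketch. The function below is hypothetical (it is not the project's actual algorithm, which handles curved and perspectively viewed lattices): it assumes the lattice detector has already found a horizontally repeating strip with a known cell width, shrinks or grows the strip by deleting or replicating whole cells, and then applies a uniform nearest-neighbour warp to reach the exact target width.

```python
import numpy as np

def retarget_symmetric_strip(img, cell_width, target_width):
    """Resize a horizontally repeating strip by whole-cell
    summarization (a simplified, hypothetical sketch).

    img          -- H x W x C image array
    cell_width   -- width in pixels of one repeated cell
                    (assumed already found by lattice detection)
    target_width -- desired output width in pixels
    """
    h, w, _ = img.shape
    # Split the strip into its repeated cells.
    cells = [img[:, i:i + cell_width] for i in range(0, w, cell_width)]
    # Number of cells that best matches the target width.
    n_keep = max(1, int(round(target_width / cell_width)))
    if n_keep <= len(cells):
        kept = cells[:n_keep]                       # delete cells to shrink
    else:
        kept = cells + [cells[0]] * (n_keep - len(cells))  # replicate to grow
    strip = np.concatenate(kept, axis=1)
    # Uniform nearest-neighbour warp to hit the exact target width.
    cols = np.linspace(0, strip.shape[1] - 1, target_width).astype(int)
    return strip[:, cols]
```

Because entire cells are removed or duplicated, the surviving repeated structure keeps its original proportions; only the small residual adjustment is absorbed by warping, which is exactly what avoids the over-squeezing of pixel-level resizing.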
To further achieve high-level semantic analysis, we move forward to study the computational modeling of Gestalt phenomena. The well-known Gestalt rules summarize how forms, patterns, and semantics are perceived by humans from bits and pieces of geometric information. Although computational models for individual rules have been extensively studied, modeling a conjoint of Gestalt rules remains a challenge. In this work, we develop a computational framework that models Gestalt rules and, more importantly, their complex interactions. Since the Gestalt phenomenon is closely related to the human ability of visual abstraction, as a first attempt at computational Gestalt modeling we evaluate our model on 2D vector graphics. In particular, we apply conjoining rules to line drawings to detect groups of objects and repetitions that conform to Gestalt principles. We summarize and abstract such groups in ways that maintain structural semantics, by displaying only a reduced number of repeated elements or by replacing them with simpler shapes. We show an application of our method to line drawings of architectural models of various styles. We believe that with Gestalt analysis we can adopt a much more intelligent strategy in image resizing.
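As a flavor of what a single Gestalt rule looks like computationally, the sketch below groups drawing elements by the proximity rule alone: elements whose centroids lie within a chosen radius are merged into one group via union-find. This is a hypothetical single-rule illustration, not the project's framework, which conjoins several rules (proximity, similarity, continuity, ...) and models their interactions.

```python
import numpy as np

def group_by_proximity(centroids, radius):
    """Group element centroids using the Gestalt proximity rule:
    elements closer than `radius` fall into the same group.
    A hypothetical single-rule sketch using union-find.

    centroids -- list of (x, y) element centers
    radius    -- grouping distance threshold
    Returns a list of groups, each a list of element indices.
    """
    n = len(centroids)
    parent = list(range(n))

    def find(i):                       # union-find with path compression
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(np.asarray(centroids[i], float)
                               - np.asarray(centroids[j], float))
            if d <= radius:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

Once such groups are found, each group of repetitions can be summarized by keeping only a reduced number of its elements or replacing them with a simpler proxy shape, in the spirit described above.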
Prof. Tien-Tsin Wong, Department of Computer Science & Engineering
Last Updated: 25 Nov 2011