Conference Paper

Photo and Video Quality Evaluation: Focusing on the Subject.

DOI: 10.1007/978-3-540-88690-7_29 Conference: Computer Vision - ECCV 2008, 10th European Conference on Computer Vision, Marseille, France, October 12-18, 2008, Proceedings, Part III
Source: DBLP

ABSTRACT Traditionally, distinguishing between high-quality professional photos and low-quality amateurish photos has been a human task. Automatically assessing the quality of a photo in a way that is consistent with human perception is a challenging topic in computer vision. Many differences exist between photos taken by professionals and amateurs because of the photography techniques used. Previous methods mainly use features extracted from the entire image. In this paper, based on professional photography techniques, we first extract the subject region from a photo, and then formulate a number of high-level semantic features based on this subject/background division. We test our features on a large and diverse photo database and compare our method with the state of the art. Our method performs significantly better, with a classification rate of 93% versus 72% for the best existing method. In addition, we conduct the first study on high-level video quality assessment. Our system achieves a precision of over 95% at a reasonable recall rate for both photo and video assessment. We also show excellent application results in web image search re-ranking.
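The subject/background division described above lends itself to simple contrast-style features. As a minimal illustrative sketch (not the paper's actual formulation), assuming the subject region has already been extracted as a bounding box, a lighting-difference feature could look like this:

```python
import numpy as np

def brightness_contrast_feature(gray, subject_box):
    """Lighting difference between the subject region and the background.

    gray: 2-D array of pixel intensities in [0, 255].
    subject_box: (top, left, bottom, right) of the extracted subject region
                 (the subject-extraction step itself is not shown here).
    """
    t, l, b, r = subject_box
    mask = np.zeros(gray.shape, dtype=bool)
    mask[t:b, l:r] = True
    subject_mean = gray[mask].mean()
    background_mean = gray[~mask].mean()
    # Professionals often light the subject more brightly than the
    # background, so a large positive log-ratio hints at a professional shot.
    return np.log(subject_mean + 1.0) - np.log(background_mean + 1.0)
```

A bright subject on a dark background yields a positive value; a uniformly lit frame yields roughly zero.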



Available from: Xiaoou Tang, Feb 22, 2015
  • Source
    • "A notable exception, and closely related to our work, is the work by Jiang et al. [4], in which one of the three main criteria that characterize a salient object is that "it is most probably placed near the center of the image" [4]. The authors justify this characterization with the "rule of thirds", one of the best-known principles of photographic composition (see, e.g., [17]), and use a Gaussian distance metric as a model. We go beyond following the rule of thirds and show that the distribution of the objects' centroids correlates strongly and positively with a 2-dimensional Gaussian distribution. "
    ABSTRACT: It has become apparent that a Gaussian center bias can serve as an important prior for visual saliency detection, which has been demonstrated for predicting human eye fixations and salient object detection. Tseng et al. have shown that the photographer's tendency to place interesting objects in the center is a likely cause for the center bias of eye fixations. We investigate the influence of the photographer's center bias on salient object detection, extending our previous work. We show that the centroid locations of salient objects in photographs of Achanta and Liu's data set in fact correlate strongly with a Gaussian model. This is an important insight, because it provides an empirical motivation and justification for the integration of such a center bias in salient object detection algorithms and helps to understand why Gaussian models are so effective. To assess the influence of the center bias on salient object detection, we integrate an explicit Gaussian center bias model into two state-of-the-art salient object detection algorithms. This way, first, we quantify the influence of the Gaussian center bias on pixel- and segment-based salient object detection. Second, we improve the performance in terms of F1 score, Fβ score, area under the recall-precision curve, area under the receiver operating characteristic curve, and hit-rate on the well-known data set by Achanta and Liu. Third, by debiasing Cheng et al.'s region contrast model, we exemplarily demonstrate that implicit center biases are partially responsible for the outstanding performance of state-of-the-art algorithms. Last but not least, as a result of debiasing Cheng et al.'s algorithm, we introduce a non-biased salient object detection method, which is of interest for applications in which the image data is not likely to have a photographer's center bias (e.g., image data of surveillance cameras or autonomous robots).
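The explicit Gaussian center-bias integration described above can be sketched generically as follows. This is an illustration only, not the cited algorithms' exact model; `sigma_frac` and the blending `weight` are hypothetical parameters that the cited works would fit to data:

```python
import numpy as np

def gaussian_center_bias(height, width, sigma_frac=0.25):
    """2-D Gaussian weight map peaking at the image center.

    sigma_frac (hypothetical parameter): standard deviation as a fraction
    of each image dimension; the cited works fit their own values.
    """
    ys = np.arange(height) - (height - 1) / 2.0
    xs = np.arange(width) - (width - 1) / 2.0
    gy = np.exp(-ys ** 2 / (2.0 * (sigma_frac * height) ** 2))
    gx = np.exp(-xs ** 2 / (2.0 * (sigma_frac * width) ** 2))
    return np.outer(gy, gx)  # separable product of two 1-D Gaussians

def add_center_bias(saliency, weight=0.5):
    """Blend a saliency map with the center-bias prior (one simple scheme)."""
    bias = gaussian_center_bias(*saliency.shape)
    return (1.0 - weight) * saliency + weight * bias
```

Because the map is a separable product of two 1-D Gaussians, it is cheap to compute at full image resolution; debiasing an algorithm amounts to dividing such an implicit weighting back out rather than blending it in.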
  • Source
    • "The works of Datta et al. [13] and Ke et al. [24] stand as the first efforts to infer aesthetic quality by applying machine learning techniques to such features, showing that aesthetics can be successfully inferred to some extent. Later works extended and improved the features [14] [22] [31], offered insights for handling images in specific corpora (e.g., paintings [28], images with faces [29]), and integrated image-enhancing systems [5]. Although those works yield good and interpretable results, custom-designed features cannot be exhaustive given the diversity of both perceptual attributes and image corpora. "
    ABSTRACT: Little is known about how visual content affects popularity on social networks, despite images now being ubiquitous on the Web and accounting for a considerable fraction of all shared content. Existing work on image sharing focuses mainly on non-visual attributes. In this work we take a complementary approach and investigate resharing from a mainly visual perspective. Two sets of visual features are proposed, encoding both aesthetic properties (brightness, contrast, sharpness, etc.) and semantic content (concepts represented by the images). We collected data from a large image-sharing service (Pinterest) and evaluated the predictive power of different features on popularity (number of reshares). We found that visual properties have low predictive power compared to that of social cues. However, after factoring out social influence, visual features show considerable predictive power, especially for images with higher exposure, with over 3:1 accuracy odds when classifying highly exposed images between very popular and unpopular.
    ACM Web Science Conference (WebSci), Bloomington, USA; 06/2014
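The aesthetic cues named in the abstract above (brightness, contrast, sharpness) can be computed in many ways; a minimal sketch using common textbook definitions, not necessarily the cited study's exact formulas:

```python
import numpy as np

def aesthetic_features(gray):
    """Brightness, contrast, and sharpness of a grayscale image.

    gray: 2-D float array in [0, 1]. These are common textbook definitions,
    not necessarily the formulas used in the cited study.
    """
    brightness = gray.mean()                 # average intensity
    contrast = gray.std()                    # intensity spread
    gy, gx = np.gradient(gray)               # finite-difference gradients
    sharpness = np.hypot(gx, gy).mean()      # mean gradient magnitude
    return {"brightness": brightness, "contrast": contrast,
            "sharpness": sharpness}
```

A perfectly uniform image scores zero on both contrast and sharpness, while a crisply focused, high-dynamic-range photo scores high on both.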
  • Source
    • "Considering that professional photographers often skillfully differentiate the main subject of a photo from the background, features such as lighting difference and subject composition were adopted in [25] for photo quality classification. A visual attention model based on saliency maps is deployed for photo assessment in [26]. "
    ABSTRACT: The popularity of mobile devices equipped with various cameras has revolutionized modern photography. People are able to take photos and share their experiences anytime and anywhere. However, taking a high-quality photograph with a mobile device remains a challenge for mobile users. In this paper we investigate a photography model to assist mobile users in capturing high-quality photos by using both the rich context available from mobile devices and crowdsourced social media on the Web. The photography model is learned from community-contributed images on the Web and depends on the user's social context. The context includes the user's current geo-location, time (i.e., time of day), and weather (e.g., clear, cloudy, foggy, etc.). Given a wide view of a scene, our socialized mobile photography system is able to suggest the optimal view enclosure (composition) and appropriate camera parameters (aperture, ISO, and exposure time). Extensive experiments have been performed for eight well-known landmark locations where sufficient context-tagged photos can be obtained. Through both objective and subjective evaluations, we show that the proposed socialized mobile photography system can effectively suggest proper composition and camera parameters to help the user capture high-quality photos.
    IEEE Transactions on Multimedia 01/2014; 16(1):184-200. DOI:10.1109/TMM.2013.2283468 · 1.78 Impact Factor