Conference Paper

Photo and Video Quality Evaluation: Focusing on the Subject.

DOI: 10.1007/978-3-540-88690-7_29 Conference: Computer Vision - ECCV 2008, 10th European Conference on Computer Vision, Marseille, France, October 12-18, 2008, Proceedings, Part III
Source: DBLP

ABSTRACT Traditionally, distinguishing between high quality professional photos and low quality amateurish photos has been a human task. Automatically assessing the quality of a photo in a way that is consistent with human perception is a challenging topic in computer vision. Photos taken by professionals and amateurs differ in many ways because of the photography techniques used. Previous methods mainly use features extracted from the entire image. In this paper, based on professional photography techniques, we first extract the subject region from a photo, and then formulate a number of high-level semantic features based on this subject and background division. We test our features on a large and diverse photo database, and compare our method with the state of the art. Our method performs significantly better, with a classification rate of 93% versus 72% for the best existing method. In addition, we conduct the first study on high-level video quality assessment. Our system achieves a precision of over 95% at a reasonable recall rate for both photo and video assessments. We also show excellent application results in web image search re-ranking.
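The subject/background division described above can be illustrated with a minimal sketch. This is not the paper's actual feature set: it assumes the subject region is already given as a binary mask, and the two features computed here (lighting contrast between subject and background, and subject area ratio) are simplified stand-ins for the paper's high-level semantic features.

```python
import numpy as np

def subject_features(image, subject_mask):
    """Toy subject/background features (illustrative names, not the
    paper's formulation): lighting contrast between the subject region
    and the background, plus the subject's area ratio."""
    subject = image[subject_mask]
    background = image[~subject_mask]
    lighting = float(subject.mean() - background.mean())  # brightness contrast
    area = float(subject_mask.mean())                     # fraction of pixels in subject
    return lighting, area

# Synthetic example: a bright 20x20 subject patch on a dark 100x100 background.
img = np.full((100, 100), 0.2)
mask = np.zeros((100, 100), dtype=bool)
mask[40:60, 40:60] = True
img[mask] = 0.9

lighting, area = subject_features(img, mask)
```

In the paper's setting, such per-photo feature vectors would then be fed to a classifier trained to separate professional from amateurish photos.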

  • Source
    ABSTRACT: Example-based texture synthesis (ETS) has been widely used to generate high quality textures of desired sizes from a small example. However, not all textures are equally well reproducible that way. We predict how synthesizable a particular texture is by ETS. We introduce a dataset (21,302 textures) of which all images have been annotated in terms of their synthesizability. We design a set of texture features, such as 'textureness', homogeneity, repetitiveness, and irregularity, and train a predictor using these features on the data collection. This work is the first attempt to quantify this image property, and we find that texture synthesizability can be learned and predicted. We use this insight to trim images to parts that are more synthesizable. Also, we suggest which texture synthesis method is best suited to synthesize a given texture. Our approach can be seen as 'winner-uses-all': picking one method among several alternatives, ending up with an overall superior ETS method. Such a strategy could also be considered for other vision tasks: rather than building an even stronger method, choose from existing methods based on some simple preprocessing.
    IEEE Conference on Computer Vision and Pattern Recognition; 06/2014
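The synthesizability prediction described in this abstract can be sketched as a learned scoring function over texture features. The feature values and weights below are invented for illustration, not the trained model from the paper; the point is only the shape of the predictor.

```python
# Hedged sketch: a linear synthesizability predictor over hand-crafted
# texture features, in the spirit of the abstract above. Weights would
# in practice be learned from the annotated texture dataset.
def synthesizability_score(features, weights, bias=0.0):
    """Weighted sum of named feature values -> real-valued score."""
    return sum(features[name] * w for name, w in weights.items()) + bias

# Made-up weights: homogeneity and repetitiveness help, irregularity hurts.
weights = {"homogeneity": 0.8, "repetitiveness": 0.6, "irregularity": -0.9}

regular_texture = {"homogeneity": 0.9, "repetitiveness": 0.8, "irregularity": 0.1}
chaotic_texture = {"homogeneity": 0.2, "repetitiveness": 0.1, "irregularity": 0.9}

# A regular, repetitive texture should score as more synthesizable
# than a chaotic, irregular one.
regular = synthesizability_score(regular_texture, weights)
chaotic = synthesizability_score(chaotic_texture, weights)
```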
  • Source
    ABSTRACT: The popularity of mobile devices equipped with various cameras has revolutionized modern photography. People are able to take photos and share their experiences anytime and anywhere. However, taking a high quality photograph via mobile device remains a challenge for mobile users. In this paper we investigate a photography model to assist mobile users in capturing high quality photos by using both the rich context available from mobile devices and crowdsourced social media on the Web. The photography model is learned from community-contributed images on the Web, and depends on the user's social context. The context includes the user's current geo-location, time (i.e., time of the day), and weather (e.g., clear, cloudy, foggy, etc.). Given a wide view of a scene, our socialized mobile photography system is able to suggest the optimal view enclosure (composition) and appropriate camera parameters (aperture, ISO, and exposure time). Extensive experiments have been performed for eight well-known hot spot landmark locations where sufficient context-tagged photos can be obtained. Through both objective and subjective evaluations, we show that the proposed socialized mobile photography system can indeed effectively suggest proper composition and camera parameters to help the user capture high quality photos.
    IEEE Transactions on Multimedia 01/2014; 16(1):184-200.
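The context-based suggestion idea above can be sketched as a nearest-context lookup over context-tagged community photos. The context encoding, distance function, and example database below are invented for illustration; the actual system learns a model rather than doing a raw lookup.

```python
# Hedged sketch: suggest camera parameters by matching the user's context
# (hour of day, weather) against context-tagged community photos.
def suggest_parameters(context, database):
    """Return the parameters of the database photo whose context is
    closest: exact weather match preferred, then nearest hour of day."""
    def distance(entry):
        weather_penalty = 0 if entry["weather"] == context["weather"] else 10
        return weather_penalty + abs(entry["hour"] - context["hour"])
    best = min(database, key=distance)
    return best["params"]

# Toy database of context-tagged photos (values are made up).
database = [
    {"hour": 8,  "weather": "clear",  "params": {"iso": 100, "exposure": "1/250"}},
    {"hour": 19, "weather": "clear",  "params": {"iso": 400, "exposure": "1/60"}},
    {"hour": 12, "weather": "cloudy", "params": {"iso": 200, "exposure": "1/125"}},
]

# A clear evening shot should borrow parameters from the clear 19:00 photo.
suggestion = suggest_parameters({"hour": 18, "weather": "clear"}, database)
```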
  • Source
    ABSTRACT: Most works on image retrieval from text queries have addressed the problem of retrieving semantically relevant images. However, the ability to assess the aesthetic quality of an image is an increasingly important differentiating factor for search engines. In this work, given a semantic query, we are interested in retrieving images which are semantically relevant and score highly in terms of aesthetics/visual quality. We use large-margin classifiers and rankers to learn statistical models capable of ordering images based on the aesthetic and semantic information. In particular, we compare two families of approaches: while the first one attempts to learn a single ranker which takes into account both semantic and aesthetic information, the second one learns separate semantic and aesthetic models. We carry out a quantitative and qualitative evaluation on a recently published large-scale dataset and we show that the second family of techniques significantly outperforms the first one.
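The second family of approaches described in this abstract can be sketched as follows: two separately trained models each produce a score, and images are ranked by a combination of the two. The images, scores, and mixing weight below are invented for illustration, and a plain convex combination stands in for whatever combination rule the paper actually uses.

```python
# Hedged sketch of ranking with separate semantic and aesthetic models.
def rank_images(images, semantic_score, aesthetic_score, alpha=0.5):
    """Sort images (best first) by a convex combination of the scores
    produced by two independent models."""
    def combined(image):
        return alpha * semantic_score(image) + (1 - alpha) * aesthetic_score(image)
    return sorted(images, key=combined, reverse=True)

# Toy per-image scores from the two hypothetical models.
semantic = {"a": 0.9, "b": 0.8, "c": 0.3}
aesthetic = {"a": 0.2, "b": 0.9, "c": 0.95}

ranking = rank_images(["a", "b", "c"], semantic.get, aesthetic.get, alpha=0.5)
```

Image "a" is most relevant but least attractive, so under an even mixing weight it drops behind the images that balance both criteria.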
