3. Data Sets
In this study, 1,044 CGIs and 1,114 photographs were used. Of these, 200 CGIs and 200 photographs were held out as the test set, and the remaining 844 CGIs and 914 photographs were used as the training set. The definition of CGI in our context is given in section 1.
These pictures were downloaded from open sources. Approximately 80% of the photographs came from the Charles W. Cushman Photograph Collection of Indiana University, and the rest were from [20]. (For my own study I used exactly the same sources in the same ratio.) Their pixel dimensions range from 375×500 to 528×1024.
The photorealistic images were downloaded from [21] and [22]. (I have had trouble finding 1,044 CGIs using only these two sources, which is why I have not yet finished assembling the whole data set: 635 CGIs as of today.) Their dimensions range from 270×315 to 1600×1200 pixels. Both sources contain a variety of content such as human subjects, landscapes, indoor scenes, etc.
The image set contained only colour pictures because the present work focuses on features obtained from colour images.
4. Visual Features
We seek a good feature set that can discriminate photographs from CGIs. Choosing from the large number of available visual features is, however, quite a challenge. The first step in our study was to select four visual features and evaluate them.
4.1 Feature Selection
What are the key visual features that can readily distinguish photographs from CGIs? Rademacher [2] proposed that Surface Smoothness and Shadow Softness are two important factors when humans distinguish photographs from CGIs by eye. Cutzu et al. [6] proposed a set of features that are good for distinguishing photographs from paintings. Based upon our visual inspection of some photographs and photorealistic images, we found that several features described by Cutzu et al. [6] can be used in our algorithms to distinguish photographs from CGIs.
• Number of unique colours:
Computer Generated Images tend to have fewer unique colours than photographs. In a previous study, Cutzu et al. [6] found that paintings appear to have more colours than photographs, but our observation about CGIs is quite the opposite: in our study, the average number of colours in photographs was about 25% higher than in CGIs. Although CGI software provides a large colour palette, as does a real palette, drawing a picture on a computer is still different from painting on canvas. For example, if we draw a line with a brush on canvas, the colour fades from the beginning of the line to the end. If we do the same with computer software, the line keeps exactly the colour we chose unless we do something to change it afterwards. (In other words, for the same number of coloured pixels, there is a greater chance of finding the exact same colour repeated in a CGI than in a photograph, unless a special brush is used for the whole creation.) Photographs, by contrast, may contain the full range of natural colours, so CGIs do not have as many unique colours as photographs do. The richness of the colour palette of an image, U, as described in [6], can be calculated by normalizing the total number of unique colours by the total number of pixels. Given an image with an <R,G,B> triplet for each pixel, we say that two pixels have the same colour if and only if they have the same R, G and B components. To reduce the impact of noise, a colour triplet is counted only if it appears in more than 10 pixels.
(http://img19.imageshack.us/img19/427/triplets.png)
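As a rough illustration of the definition of U above, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function name colour_richness, the min_pixels parameter, and the assumption that the image is already loaded as an H×W×3 array of 8-bit R, G, B values are all choices made for this example.

```python
import numpy as np

def colour_richness(image, min_pixels=10):
    """Return U = (number of counted unique colours) / (number of pixels).

    image      : array of shape (H, W, 3) or (H, W, 4) with 8-bit channels
                 (assumed input format; any alpha channel is ignored).
    min_pixels : a colour is counted only if it appears in MORE than
                 this many pixels (10 in the text above).
    """
    # Keep only R, G, B and flatten to one row per pixel.
    pixels = np.asarray(image)[..., :3].astype(np.uint32).reshape(-1, 3)
    # Pack each <R,G,B> triplet into a single integer, so two pixels are
    # equal if and only if all three components are equal.
    packed = (pixels[:, 0] << 16) | (pixels[:, 1] << 8) | pixels[:, 2]
    # Count how many pixels carry each distinct colour.
    _, counts = np.unique(packed, return_counts=True)
    # Discard rare colours, which are treated as noise.
    n_unique = int(np.count_nonzero(counts > min_pixels))
    return n_unique / packed.size
```

For a real picture, the array could be obtained with any image library that returns RGB data, and the resulting U values for the photograph set and the CGI set could then be compared. Whether [6] applies the 10-pixel threshold before or after normalization is not stated here, so the sketch simply follows the order in which the text describes it.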