The reliability of television audience ratings
Television audience ratings are extremely important for the evaluation of television programmes, stations and commercials. It is argued that, as a result of various sample and other characteristics of the Dutch people meter panel, daily ratings of programmes may be highly unstable. The characteristics that most affect sample reliability are: sample size; the rating level of the programme; the variables used for weighting the sample; the size of the weight factors; behavioural characteristics of the public, i.e. correlated viewing patterns of members within households; and the way ratings are used (daily ratings, aggregated ratings or differences of ratings). Several sources of empirical data have been used: Dutch people meter ratings of the programmes of one day, gross rating points (GRPs) aggregated per hour and per quarter, and the GRPs of two advertising campaigns. Examples are given of the separate and simultaneous effects of these characteristics on the reliability of ratings for several target groups. It is found that sample stratification improves reliability only slightly, because the stratification variables have a low correlation with individual viewing patterns. Correlated viewing patterns within households decrease reliability dramatically. The combined results show that daily ratings of most programmes lack the reliability needed for evaluation. This is even more the case when small target groups are considered. The summed or aggregated GRPs of large advertising campaigns comprise many measurements and, even though they are measured in a panel, they are much more reliable. Specific examples of the reliability of programme ratings and advertising campaigns are given, one with highly stable ratings and the other with less stable ratings. Differences of ratings of a panel are much more reliable than sum scores. Difference scores are not used extensively, though.
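To make the interplay of sample size, rating level, weighting and within-household correlation concrete, the following is a minimal sketch, not the method used in the study. It assumes the standard textbook approximations: the binomial variance of a proportion, Kish's design effect for clustering, 1 + (m - 1) * rho, and Kish's design effect for unequal weights, 1 + CV^2. The function names, parameter names and all example values (panel size, household size, correlation, weight variation) are hypothetical and chosen only for illustration.

```python
import math


def rating_standard_error(rating, n_persons, avg_household_size=1.0,
                          intra_household_corr=0.0, weight_cv=0.0):
    """Approximate standard error of a rating given as a proportion (0..1).

    Assumptions (textbook approximations, not taken from the source):
      - simple-random-sampling variance p * (1 - p) / n
      - clustering design effect 1 + (m - 1) * rho (Kish)
      - weighting design effect 1 + CV^2 of the weights (Kish)
    """
    srs_variance = rating * (1.0 - rating) / n_persons
    cluster_deff = 1.0 + (avg_household_size - 1.0) * intra_household_corr
    weight_deff = 1.0 + weight_cv ** 2
    return math.sqrt(srs_variance * cluster_deff * weight_deff)


def confidence_interval(rating, standard_error, z=1.96):
    """Normal-approximation interval, clipped to the valid range 0..1."""
    lower = max(0.0, rating - z * standard_error)
    upper = min(1.0, rating + z * standard_error)
    return lower, upper


if __name__ == "__main__":
    # Hypothetical values: a 5% rating measured on 2000 panel members.
    rating, n = 0.05, 2000
    se_independent = rating_standard_error(rating, n)
    se_clustered = rating_standard_error(
        rating, n,
        avg_household_size=2.3,     # hypothetical persons per household
        intra_household_corr=0.5,   # hypothetical viewing correlation
        weight_cv=0.4,              # hypothetical variation of weights
    )
    for label, se in (("independent persons", se_independent),
                      ("clustered and weighted", se_clustered)):
        low, high = confidence_interval(rating, se)
        print(f"{label}: SE = {se:.4f}, 95% CI = [{low:.4f}, {high:.4f}]")
```

Under these assumptions the standard error grows with the square root of each design effect, so a strong correlation of viewing within households widens the interval around a daily rating even when the number of panel members is unchanged, and a smaller target group (a smaller `n_persons`) widens it further.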