Statistics may be defined as the collection, analysis and interpretation of numerical data. As market research is concerned mostly with counting and measuring, it is not surprising that the theory of statistics can play an important part in assisting researchers to collect valid samples of data, and in helping them to draw correct conclusions from those data. In this chapter, the emphasis will be on the analysis and interpretation aspects of statistics dealing mainly with simple descriptive measures calculated from survey data and the testing of hypotheses about those data. The techniques and significance tests described below are those which have been found most useful in interpreting survey tabulations. This is not an exhaustive list, by any means, and the reader who wishes to know more about statistical analysis in market research may consult the texts given in the references at the end of the chapter.
Factor Analysis (like the similar technique of Principal Components Analysis) is a multivariate statistical method used primarily for data reduction. It is usually employed to reduce the columns of our data matrix, that is, the variables measured for each respondent. The reduction is done by grouping together those variables which are intercorrelated, as measured by the coefficient of correlation.
In our data matrix we have one variable of particular interest and we wish to see how it is affected by movements in a single explanatory variable. For example, the variable we might be interested in is Sales in £mn and we may wish to know how it is affected by advertising expenditure in £000's. The rows of our data matrix may consist of the last seven years of data for these two variables.
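As an illustration of the kind of relationship described above, the following is a minimal sketch of fitting a straight line by ordinary least squares. The function name `fit_line` and the seven yearly figures are hypothetical and are not taken from the text; they only show the shape of the calculation.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # change in y per unit change in x
    a = mean_y - b * mean_x    # value of y when x is zero
    return a, b


# Hypothetical data for the last seven years (illustrative only)
advertising = [100, 120, 140, 160, 180, 200, 220]   # in £000's
sales = [5.1, 5.8, 6.2, 7.1, 7.4, 8.3, 8.6]         # in £mn

intercept, slope = fit_line(advertising, sales)
print(f"Sales (£mn) = {intercept:.3f} + {slope:.4f} x Advertising (£000's)")
```

The slope estimates the change in sales (£mn) associated with each extra £1,000 of advertising. With only seven observations, such an estimate should be read with caution.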
In survey research it is very rare for all respondents in a given population to be interviewed. We usually take a sample of that population. We can do this because a sample can give us, if not the accuracy of a census (or full count), then sufficient accuracy for prediction purposes, provided the sample is representative of the population from which it is drawn. There are various sampling methods that can be used to obtain a representative sample. Such samples give results to a known level of precision, which depends mainly on the size of the sample.
'How should I decide on the sample size for a survey?' is a question often posed by survey researchers to statisticians. It is difficult to answer simply because market research surveys usually carry a large number of different questions. Some questions may be more important than others and hence need to be answered with a higher level of precision. A good starting point, therefore, is to consider the most important item to be measured by the proposed survey. For the moment we will assume that the survey is to be carried out using a Simple Random Sample and that the survey result is a percentage.
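Under the assumptions just stated (a Simple Random Sample and a result expressed as a percentage), one common way to arrive at a sample size is to work back from the margin of error that can be tolerated. The sketch below uses the standard large-population approximation n = z² × p × (100 − p) / e². The function name, the default 95% confidence multiplier z = 1.96 and the example inputs are assumptions made for illustration, not figures from the text.

```python
import math


def sample_size_for_percentage(p, margin, z=1.96):
    """Sample size needed to estimate a percentage p to within +/- margin.

    p      : anticipated survey percentage (0-100)
    margin : tolerated margin of error, in percentage points
    z      : normal multiplier (1.96 is assumed here for 95% confidence)

    Assumes a Simple Random Sample from a large population.
    """
    return math.ceil(z ** 2 * p * (100 - p) / margin ** 2)


# A result near 50% measured to within +/- 5 percentage points
print(sample_size_for_percentage(50, 5))   # -> 385
```

Since p × (100 − p) is largest at p = 50, using 50% gives the most cautious sample size when the likely result is unknown. Halving the margin of error roughly quadruples the sample required.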
The simplest form of data analysis is the survey tabulation. This is done by counting the number (and percentage) of people that fall into the predefined categories of our questionnaire. The basic tool for the survey analyst is the cross tabulation, in which one or more questions on the questionnaire form the rows and one or more different items form the columns. The simplest form would be where one question forms the rows and one demographic forms the columns.
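The simplest case described above, one question against one demographic, can be sketched as follows. The records, the field names `buys_brand` and `sex`, and the function `cross_tabulate` are all hypothetical; percentages are calculated on the column totals, which is one common convention.

```python
from collections import Counter


def cross_tabulate(records, row_key, col_key):
    """Return {row: {col: (count, column percentage)}} for two survey fields."""
    counts = Counter((r[row_key], r[col_key]) for r in records)
    col_totals = Counter(r[col_key] for r in records)
    table = {}
    for (row, col), n in counts.items():
        table.setdefault(row, {})[col] = (n, 100.0 * n / col_totals[col])
    return table


# Hypothetical respondents (illustrative only)
respondents = [
    {"buys_brand": "Yes", "sex": "Male"},
    {"buys_brand": "Yes", "sex": "Male"},
    {"buys_brand": "No", "sex": "Male"},
    {"buys_brand": "No", "sex": "Male"},
    {"buys_brand": "Yes", "sex": "Female"},
    {"buys_brand": "No", "sex": "Female"},
    {"buys_brand": "No", "sex": "Female"},
    {"buys_brand": "No", "sex": "Female"},
]

table = cross_tabulate(respondents, "buys_brand", "sex")
for answer, cells in table.items():
    for sex, (count, pct) in cells.items():
        print(f"{answer:>3} / {sex:<6}: {count} ({pct:.0f}%)")
```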
Discriminant Analysis Models are very similar to regression models, but differ in one important respect. In Discriminant Analysis the key variable of interest Y is categorical (e.g. Buyer = 1, Non-buyer = 0) rather than continuous, like sales.
Clustering normally operates on the rows of the data matrix. It seeks to find groupings or clusters of respondents who exhibit similar patterns in terms of the variables measured.
In this technique we are interested in how one key variable relates to other survey variables. The technique of AID is particularly useful when applied to survey tabulations. Let us suppose that we have interviewed 1,000 smokers and asked them, 'How many cigarettes do you smoke in an average day?' We would normally analyse the survey by such things as age, sex and region, and by interlaced categories of them. The idea would be to isolate age/sex/region groups which have very high and very low average smoking levels. In other words, we are looking for differences in the key dependent variable, smoking consumption: categories of our survey for which we can predict high and low smoking levels.
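One step of this kind of search can be sketched as follows. AID is commonly expanded as Automatic Interaction Detection, and it is usually described as repeatedly splitting the sample into two groups so as to maximise the between-group sum of squares of the dependent variable. The sketch below performs a single such split under that assumption; the function `best_split`, the field names and the six smokers are hypothetical, and a full analysis would go on to split each resulting group again.

```python
def best_split(records, target, predictors):
    """Find the two-way split that best separates high and low values of target.

    Returns (predictor, categories in the low group, between-group sum of squares).
    """
    overall_mean = sum(r[target] for r in records) / len(records)
    best = None
    for pred in predictors:
        # Collect the target values for each category of this predictor
        groups = {}
        for r in records:
            groups.setdefault(r[pred], []).append(r[target])
        # Order categories by their mean, then try each cut point in that order
        ordered = sorted(groups, key=lambda c: sum(groups[c]) / len(groups[c]))
        for cut in range(1, len(ordered)):
            low = [v for c in ordered[:cut] for v in groups[c]]
            high = [v for c in ordered[cut:] for v in groups[c]]
            bss = (len(low) * (sum(low) / len(low) - overall_mean) ** 2
                   + len(high) * (sum(high) / len(high) - overall_mean) ** 2)
            if best is None or bss > best[2]:
                best = (pred, ordered[:cut], bss)
    return best


# Hypothetical smokers and their daily consumption (illustrative only)
smokers = [
    {"sex": "M", "age": "16-34", "cigs": 10},
    {"sex": "F", "age": "16-34", "cigs": 12},
    {"sex": "M", "age": "35-54", "cigs": 20},
    {"sex": "F", "age": "35-54", "cigs": 22},
    {"sex": "M", "age": "55+", "cigs": 24},
    {"sex": "F", "age": "55+", "cigs": 20},
]

predictor, low_group, bss = best_split(smokers, "cigs", ["sex", "age"])
print(predictor, low_group, bss)
```

In this made-up data the two sexes have the same average consumption, so the split falls on age, separating the youngest group from the rest.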
In this chapter, the emphasis will be on the analysis and interpretation aspects of statistics dealing mainly with simple descriptive measures calculated from survey data and the testing of hypotheses about those data. The more advanced statistical techniques used for interpreting market research data will be covered in chapter 13. The techniques and significance tests described below are those which have been found most useful in interpreting survey tabulations. This is not an exhaustive list, by any means, and the reader who wishes to know more about statistical analysis in market research may consult the texts given in the references at the end of the book.
The paper considers the current state of multivariate analysis when the data available only take the form of binary and classified variables. This restriction has meant that researchers have had to develop new techniques to deal with problems of defining and segmenting markets. An encouraging amount of work has been done in this field in the last twenty years and now there are statistical methods for handling the various types of dependence and interdependence analyses. The paper reviews some of the literature that shows how market analysts proceed with model building when the data are not normally distributed measured variables. Among the techniques cited are those under the general umbrella of Log Linear Modelling, which enable regression and correlation type procedures to be employed and new ways of clustering respondents. The paper concludes that this growing area will continue to flourish especially as computer programs are available for many of the techniques.