Testing data fusion
Data fusion is a process for combining two or more surveys to form a database that can be analysed as if it came from a single survey, e.g. to enable a cross-analysis of daily newspaper readership against peak-time television viewership from two surveys, one on readership and the other on viewership. JICNARS has been involved in investigating the feasibility of fusing the National Readership Survey (NRS) with a survey on financial matters conducted by Financial Research Services (FRS). This has raised the question of how a data fusion can be tested in the absence of a validation sample, i.e. a single survey covering the whole range of topics. Two issues are discussed in relation to different types of estimate: accuracy, which can be considered in terms of the effective sample size of a fused database, and bias. The paper discusses the tests applied to the FRS/NRS fusion to set upper limits on the effective sample size of the fused database, the practical difficulties that were encountered, and the methods that might be applied to test such fusions in the future.