In a multilingual survey it is necessary to ensure that the data are comparable, no matter in which language they were collected. For purely closed-ended (multiple-choice) questions this is relatively straightforward, and depends largely on an accurate translation of the questionnaire. For open-ended questions, however, in which a free discursive or verbatim response is invited, answers given in different languages can be compared only after some form of translation. This translation is normally achieved by referring the verbatim responses back to a multilingual code-frame. The translation of this code-frame may well lie on the critical path of the data analysis: it is certainly a translation that cannot be completed until all the data are in. Advancing technology, particularly in the field of Computational Linguistics, allows us to consider another approach: to translate the verbatim responses themselves automatically before coding. Machine Translation (MT) has had some spectacular failures in the past, and is only just beginning to give useful results in particular restricted contexts.
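As an illustration of what referring verbatim responses back to a multilingual code-frame involves, the following is a minimal sketch only, not the procedure used in any actual survey. The code-frame contents, the code labels and the function name `code_response` are hypothetical, and the simple keyword matching stands in for what is in practice a human coding judgement.

```python
import re

# Hypothetical multilingual code-frame: every code carries a keyword list
# for each survey language. The entries are invented for illustration.
CODE_FRAME = {
    "PRICE": {
        "en": ["price", "expensive", "cheap"],
        "it": ["prezzo", "caro", "costoso"],
    },
    "QUALITY": {
        "en": ["quality", "reliable"],
        "it": ["qualità", "affidabile"],
    },
}


def code_response(text, lang):
    """Return the sorted list of codes whose keywords occur in a verbatim response."""
    # Lower-case the response and split it into words, dropping punctuation.
    words = set(re.findall(r"\w+", text.lower()))
    return sorted(
        code
        for code, keywords in CODE_FRAME.items()
        if words & set(keywords.get(lang, []))
    )


# Responses in different languages end up with language-independent codes,
# which is what makes them comparable.
english_codes = code_response("Too expensive, but reliable.", "en")
italian_codes = code_response("Il prezzo è troppo caro", "it")
```

The sketch also shows why the code-frame sits on the critical path: a keyword list for each language can be finalised only once the responses in that language have been seen.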
SISDATA, the Statistical Information System, designed and maintained by Slamark International, an Italian information system and marketing consultancy firm, deals with secondary statistical data, that is, data on the main sectors of a country's economic and social activity that are normally surveyed in official statistics. The system has two facets: a socio-economic analysis function applied to extremely disaggregated data covering a considerable time span, and a dynamic trend observer dealing with the most recent aggregated data in the form of time series. The system is currently envisaged on a European scale: the Italian archive is already operational and will be joined by the French, Federal German and British sectors. The system is, moreover, multilingual: the descriptions of the statistical tables (rows and columns) are provided in the language of the country surveyed, with English as the interface language.