Abstract:
Market research is embracing insightful new sources of data. Among them, behavioural data is one of the most promising: it has proved to have an edge over survey data because it is not subject to the limits of human memory or to insincere answers. The challenge in sharing clickstream data with third parties, however, is to avoid violating individuals' privacy rights, as defined in the GDPR. To overcome this difficulty, we developed our first "PII Filter", based on an intuitive principle: public web sites can be accessed by anyone; therefore, the URLs of public pages should be visited by several people. A new PII Filter has since been developed based on a more Aristotelian principle: learning from experience. This new PII Filter relies on a supervised predictive classifier: a rule-based algorithm that learns from a labelled data set of URLs.
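The intuitive principle behind the first PII Filter can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `public_urls`, the `(panellist_id, url)` input format, the example URLs and the threshold `min_visitors=3` are all assumptions made for illustration. The sketch keeps a URL only if enough distinct people visited it.

```python
from collections import defaultdict


def public_urls(visits, min_visitors=3):
    """Return the URLs visited by at least `min_visitors` distinct people.

    `visits` is an iterable of (panellist_id, url) pairs. The threshold is a
    hypothetical value, not one taken from the paper.
    """
    visitors_by_url = defaultdict(set)
    for panellist_id, url in visits:
        # A set ignores repeat visits by the same person.
        visitors_by_url[url].add(panellist_id)
    return {
        url
        for url, visitors in visitors_by_url.items()
        if len(visitors) >= min_visitors
    }


# Hypothetical clickstream: one page seen by three people, one by a single person.
visits = [
    ("p1", "https://example.com/news"),
    ("p2", "https://example.com/news"),
    ("p3", "https://example.com/news"),
    ("p1", "https://example.com/news"),
    ("p1", "https://example.com/account?user=p1"),
]
shareable = public_urls(visits, min_visitors=3)
```

Under these assumptions, a URL seen for only one person, such as the account page above, is withheld, while the page visited by three people is kept.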