From stream to structure

Date of publication: June 15, 2016

Author: Andrew Jeavons


One of the biggest problems any researcher faces today is reading: we cannot read quickly enough. For numeric data there is an ocean of visualisation and analytics tools available; we can watch trends unfold in seven colours over three dimensions on a smartphone. Textual data is wholly different. Even small research projects can generate thousands of lines of text, which we cannot hope to read thoroughly, let alone summarise and understand. We need different forms of analysis and different visualisations for text. Through an analysis of tweets from US presidential candidates, this paper shows how topic analysis using Latent Dirichlet Allocation (LDA) can be used to generate segments within unstructured information streams. LDA can also be applied to very large datasets, which matters given the volume of text that research now produces.
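
To make the idea of segmenting a text stream by topic concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling, using only the Python standard library. It is an illustration under stated assumptions, not the paper's implementation: the function names (`lda_gibbs`, `segment`), the hyperparameters `alpha` and `beta`, the choice of two topics, and the toy texts are all hypothetical and invented for this example. Each text is assigned to the topic with the largest estimated share, which yields one segment label per text.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling.

    docs is a list of token lists. Returns one topic-proportion row per
    document. alpha and beta are assumed symmetric Dirichlet priors.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    word_id = {w: i for i, w in enumerate(vocab)}

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start by giving every token a random topic.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token,
                # up to a constant factor.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                # Record the newly sampled assignment.
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed topic proportions for each document.
    theta = []
    for d, doc in enumerate(docs):
        denom = len(doc) + num_topics * alpha
        theta.append([(doc_topic[d][k] + alpha) / denom for k in range(num_topics)])
    return theta


def segment(theta):
    """Label each document with its highest-probability topic."""
    return [max(range(len(row)), key=row.__getitem__) for row in theta]


# Invented toy texts standing in for a stream of short messages.
texts = [
    "jobs economy taxes jobs wages",
    "taxes economy trade jobs",
    "border security immigration border",
    "immigration security border wall",
]
docs = [t.lower().split() for t in texts]
theta = lda_gibbs(docs, num_topics=2)
segments = segment(theta)
print(segments)
```

The sampler is stochastic, so the topic numbering is arbitrary and depends on the seed; only the grouping of texts is meaningful. For the very large datasets mentioned above, an established library implementation would be a more practical choice than this pure-Python loop.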

