This article presents a mixed-methods approach for analysing text and image relations in violent extremist discourse. The approach integrates multimodal discourse analysis with data mining and information visualisation, yielding theoretically informed empirical techniques for the automated analysis of text and image relations in large datasets. The approach is illustrated by a study that aims to analyse how violent extremist groups use language and images in online propaganda materials to legitimise their views, incite violence, and influence recruits, and how images from these materials are re-used across different media platforms in ways that support and resist violent extremism. The approach contributes to what promises to be one of the key areas of research in the coming decades, namely the interdisciplinary study of big (digital) datasets of human discourse and the implications of such study for terrorism analysis and research.