Library Guides: Text mining & text analysis: Introduction

Text mining & Text analysis - what is the difference?

Text mining began in the computational and information management fields (e.g. database searching and information retrieval), whereas text analysis began in the humanities with the manual analysis of text (e.g. Bible concordances and newspaper indexes). More recently, the two terms have become synonymous and now generally refer to the use of computational methods to search, retrieve, and analyse text data.

"Text mining or text analytics is an umbrella term describing a range of techniques that seek to extract useful information from document collections through the identification and exploration of interesting patterns in the unstructured textual data of various types of documents – such as books, web pages, emails, reports or product descriptions." (Truyens & van Eecke, 2014)

Manual vs computational text analysis

Researchers have been analysing texts for centuries, and manual text analysis techniques are still valid, and often preferred, for analysing text collections of a manageable size (say, fewer than 100,000 words). However, with the accessibility of powerful computers, software, and programming languages, many text analysis techniques have been automated for use on collections of text data that are too large to be read, interpreted, and coded manually.
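
As a rough illustration of what automating a text analysis technique can look like, the sketch below counts word frequencies, a task that would traditionally be done by hand when compiling a concordance or index. It is a minimal example using only the Python standard library; the function name word_frequencies, the sample sentence, and the simple tokenisation rule (lowercase letters and apostrophes) are assumptions made for illustration, not a recommended method for real research corpora.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=3):
    """Return the top_n most common word tokens in text, lowercased."""
    # Hypothetical tokenisation rule: runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common(top_n)


# Illustrative sample text (not taken from any real collection).
sample = "Text mining finds patterns in text. Text analysis interprets text."
print(word_frequencies(sample, top_n=1))
```

The same function could be applied to a collection far too large to read manually, although real projects usually need more careful tokenisation and the removal of very common words.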

How does text mining work? (YouTube, 1m:35s). This video is an introduction to text mining and how it can be used in research.

Text Mining 101
What is text mining, how does it work and why is it useful? This article will help you understand the basics in just a few minutes.

Library Resources that provide an introduction to text mining and analysis

Text Mining: a guidebook for the social sciences by Gabriel (Gabe) Ignatow; Rada F. Mihalcea
Publication Date: 2016
This book brings together a broad range of contemporary qualitative and quantitative methods to provide strategic and practical guidance on analysing large text collections. It surveys the fast-changing landscape of data sources, programming languages, software packages, and methods of analysis available today. Suitable for novice and experienced researchers alike.
Social Research Methods by Alan Bryman
Publication Date: 2016
This introduction to research methods provides students and researchers with unrivalled coverage of both quantitative and qualitative methods, making it invaluable for anyone embarking on social research. This new edition provides engaging examples and practical tips to equip students with the tools and knowledge needed for them to complete their own research projects.
Computer Supported Qualitative Research edited by António Pedro Costa, et al.
Publication Date: 2016
The book features seven main subjects: Rationale and Paradigms of Qualitative Research (theoretical studies); Systematisation of approaches with Qualitative Studies (literature review, integrating results, aggregation studies, meta-analysis, etc.); Qualitative and Mixed Methods Research (research processes that build on mixed methodologies but with priority to qualitative approaches); Data Analysis Types (content analysis, discourse analysis, thematic analysis, narrative analysis, etc.); Innovative processes of Qualitative Data Analysis (design analysis, different sources of data - images, audio, video); Qualitative Research in Web Context (eResearch, virtual ethnography, interaction analysis, latent corpus on the internet, etc.); Qualitative Analysis with Support of Specific Software (usability studies, user experience, etc.).
Text Mining, Web Mining, and Visualization Use Cases Using Open Source Tools by Markus Hofmann (Editor)
Publication Date: 2016
Provides an introduction to text mining using some of the most popular and powerful open-source tools: KNIME, RapidMiner, Weka, R, and Python. The contributors--all highly experienced with text mining and open-source software--explain how text data are gathered and processed from a wide variety of sources, including books, server access logs, websites, social media sites, and message boards. Each chapter presents a case study that you can follow as part of a step-by-step, reproducible example.
NVivo 11 Essentials by Bengt Edhlund; Allan McDougall
Publication Date: 2016
NVivo 11 Essentials is a comprehensive guide to the world's most popular qualitative data analysis software. Provides instruction to NVivo users of all skill levels and experience with both qualitative data analysis and qualitative research methods. Provides practical, anecdotal advice for using NVivo 11 for every stage of your research project.
Analysis of Images, Social Networks and Texts by Wil M.P. van der Aalst
Publication Date: 2018
This book constitutes the proceedings of the 6th International Conference on Analysis of Images, Social Networks and Texts, AIST 2017, held in Moscow, Russia, in July 2017. The 29 full papers and 8 short papers were carefully reviewed and selected from 127 submissions. The papers are organized in topical sections on natural language processing; general topics of data analysis; analysis of images and video; optimization problems on graphs and network structures; analysis of dynamic behavior through event data; social network analysis.
A practical guide to sentiment analysis by Erik Cambria
Publication Date: 2017
This edited work presents studies and discussions that clarify the challenges and opportunities of sentiment analysis research. While sentiment analysis research has become very popular in the past ten years, most companies and researchers still approach it simply as a polarity detection problem. In reality, sentiment analysis is a ‘suitcase problem’ that requires tackling many natural language processing subtasks, including microtext analysis, sarcasm detection, anaphora resolution, subjectivity detection and aspect extraction.