Library Guides: Text mining and text analysis: Introduction

Text mining and text analysis - what is the difference?

Text mining began in the computational and information management fields (e.g. database searching and information retrieval), whereas text analysis began in the humanities with the manual analysis of text (e.g. Bible concordances and newspaper indexes). More recently, the two terms have become synonymous, and now generally refer to the use of computational methods to search, retrieve, and analyse text data.

"Text mining or text analytics is an umbrella term describing a range of techniques that seek to extract useful information from document collections through the identification and exploration of interesting patterns in the unstructured textual data of various types of documents – such as books, web pages, emails, reports or product descriptions." (Truyens & van Eecke, 2014)

Manual vs computational text analysis

Researchers have been analysing texts for centuries, and manual text analysis techniques are still valid, and often preferred, for text collections of a manageable size (say, fewer than 100,000 words). However, with the accessibility of powerful computers, software, and programming languages, many text analysis techniques have been automated for use on collections of text data that are too large to be read, interpreted and coded manually.
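
As a rough illustration of what automating a manual technique can look like, the short Python sketch below counts word frequencies and builds a simple keyword-in-context listing, a computational cousin of the printed concordance. It is a minimal sketch only: the function names (tokenize, word_frequencies, concordance), the sample sentence, and the very simple tokenising rule are assumptions made for this example, and real projects would normally rely on a dedicated text analysis tool or library.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text, top_n=5):
    """Return the top_n most frequent words with their counts."""
    return Counter(tokenize(text)).most_common(top_n)


def concordance(text, keyword, window=3):
    """List each occurrence of keyword with up to `window` words on either side."""
    tokens = tokenize(text)
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Hypothetical sample text, used only to show the functions in action.
sample = "Text mining finds patterns in text. Text analysis interprets text."

print(word_frequencies(sample, top_n=3))
for line in concordance(sample, "text", window=2):
    print(line)
```

The same two functions work unchanged on a text of a few sentences or a few million words, which is the practical advantage of the computational approach for large collections.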

How does text mining work? (YouTube, 1m:35s). This video is an introduction to text mining and how it can be used in research.

Text Mining 101
What is text mining, how does it work and why is it useful? This article will help you understand the basics in just a few minutes.

Library resources that provide an introduction to text mining and analysis

Text Mining: a guidebook for the social sciences by Gabriel (Gabe) Ignatow; Rada F. Mihalcea
Publication Date: 2016

This book brings together a broad range of contemporary qualitative and quantitative methods to provide strategic and practical guidance on analysing large text collections. It surveys the fast-changing landscape of data sources, programming languages, software packages, and methods of analysis available today. Suitable for novice and experienced researchers alike.
Social Research Methods by Alan Bryman
Publication Date: 2016

This introduction to research methods provides students and researchers with unrivalled coverage of both quantitative and qualitative methods, making it invaluable for anyone embarking on social research. This new edition provides engaging examples and practical tips to equip students with the tools and knowledge needed for them to complete their own research projects.
Computer Supported Qualitative Research by Jaime Ribeiro (Editor), et al.
Publication Date: 2024

This book aims to bring together researchers, academics, and professionals, promoting the sharing and discussion of knowledge, new perspectives, experiences, and innovations in qualitative research. It includes a selection of the articles accepted for presentation and discussion at WCQR2024, held January 23 to 25, 2024 (face-to-face and virtual conference). WCQR2024 featured four main application fields (Education, Health, Social Sciences, and Engineering/Technology) and seven main subjects: Rationale and Paradigms of Qualitative Research; Systematization of Approaches with Qualitative Studies; Qualitative and Mixed Methods Research; Data Analysis Types; Innovative Processes of Qualitative Data Analysis; Qualitative Research in Web Context; Qualitative Analysis with Software Support.
Text Mining and Visualization Use Cases Using Open Source Tools by Markus Hofmann (Editor)
Publication Date: 2016

Provides an introduction to text mining using some of the most popular and powerful open-source tools: KNIME, RapidMiner, Weka, R, and Python. The contributors--all highly experienced with text mining and open-source software--explain how text data are gathered and processed from a wide variety of sources, including books, server access logs, websites, social media sites, and message boards. Each chapter presents a case study that you can follow as part of a step-by-step, reproducible example.
NVivo 12 Essentials by Bengt Edhlund; Allan McDougall
Publication Date: 2019

NVivo 12 Essentials is a comprehensive guide to the world's most popular qualitative data analysis software. Provides instruction to NVivo users of all skill levels and experience with both qualitative data analysis and qualitative research methods. Provides practical, anecdotal advice for using NVivo 12 for every stage of your research project.
Analysis of Images, Social Networks and Texts by Dmitry I. Ignatov (Editor), et al.
Publication Date: 2024

This book constitutes revised selected papers from the thoroughly refereed proceedings of the 11th International Conference on Analysis of Images, Social Networks and Texts, AIST 2023, held in Yerevan, Armenia, during September 28-30, 2023. The 24 full papers included in this book were carefully reviewed and selected from 93 submissions. They were organized in topical sections as follows: natural language processing; computer vision; data analysis and machine learning; network analysis; and theoretical machine learning and optimization. The book also contains one invited talk in full paper length.
A practical guide to sentiment analysis by Erik Cambria
Publication Date: 2017

This edited work presents studies and discussions that clarify the challenges and opportunities of sentiment analysis research. While sentiment analysis research has become very popular in the past ten years, most companies and researchers still approach it simply as a polarity detection problem. In reality, sentiment analysis is a ‘suitcase problem’ that requires tackling many natural language processing subtasks, including microtext analysis, sarcasm detection, anaphora resolution, subjectivity detection and aspect extraction.

Reuse this guide

This guide is licensed under a Creative Commons Attribution-NonCommercial 4.0 International Licence, except where otherwise noted.