Library Guides: Text mining & text analysis: Programming resources

Programming resources

Glossary of Computer Science
A list of definitions of terms and concepts used in computer science, its sub-disciplines, and related fields, including terms relevant to software, data science, and computer programming.
Language Technology and Data Analysis Laboratory (LADAL)
A school-based, collaborative support infrastructure for digital and computational humanities established and maintained by the School of Languages and Cultures at the University of Queensland. The LADAL assists with data processing, visualization, and analysis and offers guidance on matters relating to language technology and digital research tools.

LinkedIn Learning
LinkedIn Learning (formerly Lynda.com) is one of the largest software and skills training websites and is free for UQ students and staff.

Codecademy
Learn to code interactively in one of 15 popular programming languages for free (sign-up required).
The Programming Historian
The Programming Historian offers novice-friendly, peer-reviewed tutorials that help humanists learn a wide range of digital tools, techniques, and workflows to facilitate their research.
The Digital Orientalist (topic = coding)
Run by a dedicated team of scholars, librarians, and students who share their experience using digital tools in the Humanities, especially as it relates to day-to-day workflow.

Text Mining with R: A Tidy Approach by Julia Silge; David Robinson
Publication Date: 2020
Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you'll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr.
Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from your Data, 2nd edition by Dipanjan Sarkar
Publication Date: 2016
The second edition of this book shows you how to use state-of-the-art Natural Language Processing frameworks, coupled with machine learning and deep learning, to work through real-world case studies in Python.
Python for Linguists by Michael Hammond
Publication Date: 2020
Specifically designed for linguists, this book provides an introduction to programming using Python for those with little to no experience of coding. More experienced users of Python will also benefit from the advanced chapters on graphical user interfaces and functional programming.
Text Analysis with R for Students of Literature by Matthew L. Jockers; Rosamond Thalken
Publication Date: 2020
Provides a practical introduction to computational text analysis using the open source programming language R. Each chapter builds on its predecessor as readers move from small scale "microanalysis" of single texts to large scale "macroanalysis" of text corpora, and each concludes with a set of practice exercises that reinforce and expand upon the chapter lessons. Text Analysis with R is written with students and scholars of literature in mind but will be applicable to other humanists and social scientists wishing to extend their methodological toolkit to include quantitative and computational approaches to the study of text.

Text Analytics for Beginners using NLTK and Python
NLTK is a powerful Python package that provides a diverse set of natural language processing algorithms. It is free, open source, easy to use, and well documented, and it has a large user community. NLTK includes the most common algorithms, such as tokenizing, part-of-speech tagging, stemming, sentiment analysis, topic segmentation, and named entity recognition.
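
To give a feel for two of the steps named above, tokenizing and stemming, here is a minimal sketch that uses only the Python standard library. It does not use NLTK itself: the helper names tokenize and crude_stem and the sample sentence are made up for illustration, and the suffix stripping is far cruder than the stemmers a package like NLTK provides.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical helper)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def crude_stem(token):
    """Strip a few common English suffixes.

    This is an illustrative stand-in, not a real stemming algorithm.
    """
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


# Invented sample sentence, used only to exercise the helpers.
text = "The miners mined the texts, and mining texts pleased the miners."

tokens = tokenize(text)                   # split the sentence into words
stems = [crude_stem(t) for t in tokens]   # reduce each word to a rough stem
counts = Counter(stems)                   # count how often each stem occurs

print(tokens)
print(counts.most_common(3))
```

Counting stems rather than raw tokens groups related forms such as "mined" and "mining" together, which is the usual reason for stemming before frequency analysis.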