A school-based, collaborative support infrastructure for digital and computational humanities established and maintained by the School of Languages and Cultures at the University of Queensland. The LADAL assists with data processing, visualization, and analysis and offers guidance on matters relating to language technology and digital research tools.
Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you'll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr.
The second edition of this book shows you how to use state-of-the-art frameworks in Natural Language Processing, coupled with Machine Learning and Deep Learning, to work through real-world case studies in Python.
Specifically designed for linguists, this book provides an introduction to programming using Python for those with little to no experience of coding. More experienced users of Python will also benefit from the advanced chapters on graphical user interfaces and functional programming.
NLTK is a powerful Python package that provides a diverse set of natural language processing algorithms. It is free, open source, easy to use, and well documented, and it has a large community. NLTK covers the most common tasks, such as tokenizing, part-of-speech tagging, stemming, sentiment analysis, topic segmentation, and named entity recognition.
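As a rough illustration of two of the tasks named above, the following minimal sketch tokenizes a sentence and stems the resulting tokens. It assumes NLTK is installed (`pip install nltk`). The sample sentence is made up for this example. `wordpunct_tokenize` and `PorterStemmer` are used here only because they need no extra data downloads; other NLTK tokenizers and stemmers may suit real projects better.

```python
# Minimal sketch: tokenizing and stemming with NLTK.
# Assumption: the nltk package is installed; no corpus downloads are needed
# for the two components used here.
from nltk.tokenize import wordpunct_tokenize
from nltk.stem import PorterStemmer

# Hypothetical sample sentence, chosen only for illustration.
sentence = "NLTK splits text into tokens."

# Split the sentence into alphabetic and punctuation tokens.
tokens = wordpunct_tokenize(sentence)
print(tokens)  # ['NLTK', 'splits', 'text', 'into', 'tokens', '.']

# Reduce each token to its stem with the Porter algorithm
# (the stemmer lowercases its input by default).
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]
print(stems)
```

Tasks such as part-of-speech tagging and named entity recognition follow a similar pattern but typically require downloading additional NLTK data first.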