Library Guides: Text mining & text analysis: Research methods

Machine learning

Text analysis often relies on machine learning, a branch of computer science that trains computers to recognise patterns. There are two kinds of machine learning used in text analysis: supervised learning, where a human helps to train the pattern-detecting model, and unsupervised learning, where the computer finds patterns in text with little human intervention. An example of supervised learning is Naive Bayes Classification. See Natural Language Processing and Topic Modeling for examples of unsupervised machine learning.
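To illustrate supervised learning, here is a minimal sketch of a Naive Bayes text classifier written with only the Python standard library. The training sentences, the labels ("hours" and "loans"), and the function names are hypothetical, chosen for illustration. The sketch assumes simple whitespace tokenisation and add-one (Laplace) smoothing; established libraries offer more robust implementations.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace.
    return text.lower().split()


def train(labelled_docs):
    """Count words per label from (text, label) pairs labelled by a human."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in labelled_docs:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical hand-labelled training data.
TRAINING = [
    ("the library opens at nine", "hours"),
    ("opening hours on weekends", "hours"),
    ("renew a borrowed book online", "loans"),
    ("how many books can I borrow", "loans"),
]

model = train(TRAINING)
prediction = classify("when does the library open on weekends", model)
print(prediction)
```

The human contribution is the labelled training data: the model only learns which words tend to appear under each label.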

Machine learning - reference entry - Encyclopedia of the Sciences of Learning
Naive Bayes Classification

Natural language processing

Natural language processing, a kind of machine learning, is the attempt to use computational methods to extract meaning from free text. Among other things, natural language processing algorithms can identify names of people and places, dates, sentiment, and parts of speech.
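As a rough illustration of extracting dates and sentiment from free text, the following toy sketch uses a regular expression and two small word lists. It is not how full natural language processing toolkits work (they typically rely on trained statistical models); the word lists, the date format, and the sample sentence are all assumptions made for this example.

```python
import re

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
# Assumed date format: day, month name, four-digit year (e.g. "3 March 2021").
DATE_PATTERN = re.compile(rf"\b\d{{1,2}} (?:{MONTHS}) \d{{4}}\b")

# Hypothetical, deliberately tiny sentiment lexicons.
POSITIVE = {"good", "excellent", "helpful"}
NEGATIVE = {"poor", "slow", "confusing"}


def find_dates(text):
    """Return all substrings that match the assumed date format."""
    return DATE_PATTERN.findall(text)


def sentiment_score(text):
    """Count positive words minus negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    return positives - negatives


sample = (
    "The workshop on 3 March 2021 was excellent, "
    "but the sign-up page was slow and confusing."
)
dates = find_dates(sample)
score = sentiment_score(sample)
print(dates, score)
```

A score below zero suggests more negative than positive words; a real analysis would need a much larger lexicon or a trained model.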

Natural Language Processing
The Stanford Natural Language Processing Group
Natural Language Processing software available to everyone.

Topic modelling

Topic modelling, a form of machine learning, is a way of identifying patterns and themes in a body of text. It is carried out by statistical algorithms, such as Latent Dirichlet Allocation, which group words into "topics" based on which words frequently co-occur in a text.
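The idea of words that "frequently co-occur" can be sketched by counting how often pairs of words appear in the same document. This is not Latent Dirichlet Allocation, only the co-occurrence signal that such algorithms build on; the stopword list and the four sample documents are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical stopword list; real projects use much longer ones.
STOPWORDS = {"the", "a", "of", "and", "in", "to"}


def cooccurrence_counts(documents):
    """Count how many documents each pair of words appears in together."""
    pairs = Counter()
    for doc in documents:
        # Use a sorted set so each pair is counted once per document
        # and always in the same (alphabetical) order.
        words = sorted({w for w in doc.lower().split() if w not in STOPWORDS})
        pairs.update(combinations(words, 2))
    return pairs


docs = [
    "the reef and the coral in warm water",
    "coral bleaching of the reef",
    "the election and the vote",
    "vote counting in the election",
]

counts = cooccurrence_counts(docs)
top_pairs = counts.most_common(2)
print(top_pairs)
```

In this small sample, the pairs that co-occur most often hint at two themes, one about reefs and one about elections, which is the kind of grouping a topic model tries to find at scale.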

Topic Modeling

Network analysis

Network analysis is a method for finding connections between nodes representing people, concepts, sources, and more. These networks are usually visualised as graphs that show the interconnectedness of the nodes.

Social network analysis - the process of investigating social structures through the use of networks and graph theory. It characterises networked structures in terms of nodes (individual actors, people, or things within the network) and the edges, or links (relationships or interactions) that connect them.

Semantic network analysis - the study of networks that represent semantic relations (meanings) between concepts. A semantic network is often used as a form of knowledge representation. It is a graph consisting of nodes, which represent concepts, and edges, which represent semantic relations between concepts.
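The node-and-edge structure described above can be sketched with a plain dictionary. The example below builds an undirected network from a list of edges and computes degree centrality (the share of other nodes each node is directly linked to), one common way to measure interconnectedness. The names and relationships are invented for illustration.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected graph as a dict mapping node -> set of neighbours."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


# Hypothetical edges: each pair is a relationship between two people.
EDGES = [
    ("Ana", "Ben"),
    ("Ana", "Chen"),
    ("Ana", "Dee"),
    ("Ben", "Chen"),
]

graph = build_graph(EDGES)
centrality = degree_centrality(graph)
most_connected = max(centrality, key=centrality.get)
print(most_connected, centrality)
```

The same structure works for a semantic network if the nodes are concepts and the edges are relations between them.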

Social Network Analysis
Semantic Networks

Visualisations

Text visualisation is a way to "see" your data. Visualising the results of text mining can help researchers see relationships between concepts. Examples include word clouds, graphs, maps, and other graphics that produce a visual depiction of the data.
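Many text visualisations, word clouds included, start from word frequencies. The sketch below counts words in a short made-up sentence and prints a plain-text bar chart; the sample text and function names are assumptions for illustration, and a real project would normally pass the counts to dedicated visualisation software.

```python
import re
from collections import Counter


def word_frequencies(text, top_n=2):
    """Return the top_n most frequent words as (word, count) pairs."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words).most_common(top_n)


def bar_chart(frequencies):
    """Render (word, count) pairs as a plain-text bar chart."""
    width = max(len(word) for word, _ in frequencies)
    lines = [
        f"{word.ljust(width)} {'#' * count} {count}"
        for word, count in frequencies
    ]
    return "\n".join(lines)


# Hypothetical sample text.
text = "Text mining turns text into data, and data helps explain text."

top = word_frequencies(text)
chart = bar_chart(top)
print(chart)
```

The bar lengths play the same role as font size in a word cloud: more frequent words take up more space.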

Various Text Analysis Projects with Visualisations

With Criminal Intent - currently unavailable
The state of our union is... dumber
Novel Views: Les Miserables
Tolkien's Books Analysed

Word Frequency Visualisations

Google n-gram viewer - word frequencies over time
Historical culturomics of pronoun frequencies - pronoun frequencies by gender over time
The Words They Used - bubble cloud of words from national convention speeches, with size and color coding
Ye Shall Know Them By Their Words - word frequencies by topic for presidential nomination speeches (additional description)
Mining Books to Map Emotions - frequencies of sentiment terms over time

Text Visualisation
UQ Library - Data Visualisation Software