An increasing number of publishers allow text and data mining of their licensed resources by members of subscribing institutions, as well as of their open access material. This access is generally governed by existing usage terms and conditions and by existing copyright provisions.
Some publishers will require you to use tools they provide to mine their content, or will conduct the process for you. In this way they can manage the quantity of data being accessed and the impact on their servers.
Downloading large amounts of data can trigger automatic lockouts and prevent other users from accessing resources. In some instances the publisher may charge a fee for additional usage that falls outside our existing agreement.
Please consult with the Librarian team when using UQ-subscribed databases as a source of data.
Checking Library database license information for text mining provisions
Individual database records within UQ Library Search contain links to license information.
Clicking the Show License button displays the license details. If text and data mining is allowed for this database, the information will be provided here.
Library databases with built-in text analysis tools
Digital Scholar Lab is an online tool for collecting data sets composed of content from UQ's Gale Primary Sources subscriptions. Those data sets can then be analyzed using the text analysis and visualization tools built into the Digital Scholar Lab. Digital humanities analysis methods include Named Entity Recognition, Topic Modelling, Parts of Speech, and more.
JSTOR Labs projects take some of the methods and technologies in the digital humanities -- such as topic modeling or linked open data -- and apply them to the rich JSTOR corpus.
Analysis Hub provides text analysis across subscribed Wiley Digital Archive records, for example Term Frequency, Term Popularity, Collocations, Concordance, Word Cloud and Frequency Distribution.
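To give a sense of what measures such as term frequency, collocations and concordance involve, the following is a minimal Python sketch that uses only the standard library. It does not reproduce how Analysis Hub or any other vendor tool computes its results. The function names, the sample text and the use of adjacent word pairs as a stand-in for collocations are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_frequency(tokens, top_n=5):
    """Return the top_n most common tokens with their counts."""
    return Counter(tokens).most_common(top_n)


def collocations(tokens, top_n=5):
    """Count adjacent word pairs (bigrams), a simplified view of collocation."""
    return Counter(zip(tokens, tokens[1:])).most_common(top_n)


def concordance(tokens, keyword, window=3):
    """Return each occurrence of keyword with `window` tokens of context on either side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical sample text, not taken from any subscribed database.
sample = (
    "The library provides access to archives. "
    "The archives contain letters, and the letters describe the library."
)
tokens = tokenize(sample)

print(term_frequency(tokens, 3))
print(collocations(tokens, 3))
for line in concordance(tokens, "archives"):
    print(line)
```

A sketch like this is only suitable for text you are permitted to mine, such as content exported under a license that allows text and data mining. Vendor tools typically add steps such as stop-word removal and statistical weighting that are omitted here.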