Depending on how the mining is conducted, e.g. whether the material is copied, reformatted or digitised without permission, it could be considered a copyright infringement. The ability to data mine relies heavily on 'copy-reliant' technologies, which must make copies of the data in order to analyse it. Currently, the Copyright Act 1968 makes no specific exemption for text or data mining.
Limited text mining might be covered by the fair dealing exceptions; however, if an entire dataset needed to be copied, this would clearly exceed a 'reasonable portion' of the work.
While copyright does not apply to raw data or factual information, it does cover the arrangement of data within a database and the 'expression' of data, e.g. its presentation in a table.
Each data provider has its own standards and procedures that you must follow in order to use its data legally. It’s essential to ensure, from the outset of your project, that both the activities you intend to perform during your data mining and the subsequent publication of your research results comply with the applicable licensing terms and conditions.
For example, many data providers license their data to be mined for research purposes only, and either prohibit data mining with potential commercial applications or require special negotiation for it. If you have any questions about licensing conditions, or about negotiating permission with data providers for potential commercial applications of data mining, please contact your Librarian.
Online mining etiquette
Even if the licence permits it, some approaches to text and data mining are considered poor etiquette because of the inconvenience they can cause to data providers. For example, bulk scraping or programmatic querying via APIs without rate limiting can place a significant burden on data providers’ servers, causing slow response times or even downtime for other users. Best practice is to check the requirements of the data provider and comply with their preferences regarding data mining activities.
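As an illustration of rate-limited querying, the following is a minimal Python sketch that enforces a pause between consecutive requests. The names RateLimiter and fetch_politely are hypothetical and do not come from any data provider's toolkit, and the one-second interval in the usage note is an assumed placeholder. Always use the limits your data provider actually specifies.

```python
import time
from typing import Callable, Iterable, List, Optional


class RateLimiter:
    """Enforce a minimum interval (in seconds) between consecutive requests."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if min_interval < 0:
            raise ValueError("min_interval must be non-negative")
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        # Time of the previous request; None until the first request is made.
        self._last_request: Optional[float] = None

    def wait(self) -> float:
        """Pause until the next request is allowed; return the seconds slept."""
        delay = 0.0
        if self._last_request is not None:
            elapsed = self._clock() - self._last_request
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last_request = self._clock()
        return delay


def fetch_politely(
    items: Iterable[str],
    fetch: Callable[[str], str],
    limiter: RateLimiter,
) -> List[str]:
    """Call fetch(item) for each item, pausing between calls as required."""
    results = []
    for item in items:
        limiter.wait()  # never send requests faster than the agreed rate
        results.append(fetch(item))
    return results
```

In use, you would pass your own request function as fetch, for example fetch_politely(record_ids, my_fetch, RateLimiter(1.0)), where my_fetch and the one-second interval are placeholders. The first request is sent immediately and each later one waits until the interval has passed. The clock and sleep parameters exist only so that the pacing logic can be checked without real delays.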