JISC, a UK think tank that advises higher education and research institutions on the use of information technology, published an extensive report on data mining in the UK on March 14th.
To tap the full potential of data mining for innovation and scientific development, the authors call for a copyright exception in line with the recommendations of the Hargreaves review. As Professor Hargreaves noted, “current UK copyright laws are restricting [the] use of text mining” and there is evidence to suggest the existence of a market failure in this area. To address this problem, the authors suggest a text and data analytics exception designed to promote the non-commercial exploitation of works “in ways which do not directly trade on the underlying creative and expressive purpose of the work”.
There is no question that data mining, the practice of examining large databases in order to generate new information, is an exciting area of scientific research and promises to lead to important new discoveries in a number of areas. For example, data mining of human DNA sequences has helped identify an individual's risk of developing diseases such as cancer. Extensive research using data mining techniques on the role of social media during the London riots helped dispel the myth that social media, in particular Twitter, incited the riots. Data mining can even help discern systematic human rights violations: the examination of large sets of government records over long time periods allowed researchers to find patterns of fraudulent legal records by different government agencies.
Data mining helps develop new hypotheses by identifying similar patterns between seemingly disparate topics and extracting valuable new information from them. To do that, the underlying data needs to be accessed, analysed and linked to existing information. It is precisely this process, the restructuring of information and shifting to new formats, that gives rise to conflicts with copyright law.
The current legal framework has led to a situation where it is often unclear what types of data mining activities are permitted, which exposes research institutions to liability. One of the examples the report uses to illustrate this problem is that of a single researcher who conducted a series of text mining experiments without realising they might be illegal. The incident prompted the content provider to block the entire institution from accessing a complete set of journals for several days, even though it was not clear whether the contract actually prohibited data mining. In addition, libraries and research institutions face rising costs to obtain the necessary licences for their research. According to one researcher cited in the report, “establishing permission to digitise alone” requires one full-time employee per year.
A copyright exception for text and data analytics, as suggested by the authors of the report, would ensure that universities and research institutions have the necessary legal protections to engage in large-scale, non-commercial data mining activities. The Wellcome Trust, a global charitable foundation that supports medical research, said of the report: “This is a complete no-brainer. This is scholarly research funded from the public purse, largely from taxpayer and philanthropic organisations. The taxpayer has the right to have maximum benefit extracted and that will only happen if there is maximum access to it.”
You can find the full report here: http://www.jisc.ac.uk/publications/reports/2012/value-and-benefits-of-text-mining.aspx
The press release is here: http://www.jisc.ac.uk/news/stories/2012/03/textmining.aspx