Artificial Intelligence (AI) products are creating incredible solutions to our modern-day problems. Last year, we saw DALL-E 2 create art from natural language prompts with stunningly beautiful output in unbelievable style and detail. Now it is ChatGPT, an AI chatbot, that is taking the internet by storm, further demonstrating the potential power of AI technology. ChatGPT can write music in almost any genre, create content in multiple languages and write blocks of functional software code.
Before heralding a new era of AI-assisted working practices, however, we need to consider legal questions around the ownership of AI-generated works and whether there are risks or limitations on their use.
DOES UK LAW HELP RESOLVE ISSUES OF OWNERSHIP?
The UK is one of only a handful of countries to protect computer-generated works, that is works generated by a computer where there is no human author. Under the Copyright, Designs and Patents Act 1988, the owner will usually be deemed to be the person by whom “the arrangements necessary for the creation of the work are undertaken”. So, how does that apply to generative AI where there is a human creator inputting natural language commands as well as an organisation developing the AI? Potentially, both seem to have a claim to ownership.
Also, to benefit from copyright protection, a work must meet the ‘originality test’, which requires that a protected work be ‘original’, meaning “skill, labour or judgement” was expended by the author. It’s unclear how this applies to an AI system and whether a machine could meet these criteria.
The UK’s Intellectual Property Office (UKIPO) recently conducted consultations to address the uncertainty regarding the scope and effect of the UK’s copyright and patent protection for AI. However, whilst AI offerings are still in their early stages, the UKIPO is wary that changing the law could result in unintended consequences. It concluded that there are currently no plans to change the law, so, for now, this uncertainty will remain.
WHAT ABOUT OTHER REGIONS?
Looking at the US, the U.S. Copyright Office takes the position that copyright law only protects works made by a human being, not by an AI model. Current copyright law only provides protection to “the fruits of intellectual labor” that “are founded in the creative powers of the [human] mind”. This seems contrary to rewarding and supporting an innovative environment and could be changed by Congress.
German legislation takes a similar approach to the US: a work is protectable only if it qualifies as the “author’s own intellectual creation”, and only works that originate in the human mind are considered to fulfil this requirement. So, creations made by computers are excluded from such protection.
There is considerable international divergence, and intellectual property regimes will need to keep up as generative AI offerings evolve.
RISKS FROM TRAINING MATERIALS
Using AI-generated outputs carries substantial potential risk because of the way AI tools are trained.
Generative AI systems usually require vast quantities of example materials (such as artworks, photographs, code etc.) to learn from and develop their algorithms. Without transparency of what content is incorporated in such training materials, users of an AI tool can’t be certain that the AI developers had the right to use all the materials for training purposes. If an AI output replicated a substantial part of one of the works in the original training set, the owner of that original work could assert a copyright infringement claim if the AI output is used.
This January, stock image provider Getty Images commenced litigation in the UK against Stability AI, the company behind Stable Diffusion, a text-to-image AI model. Getty alleges that Stability AI processed its images when training the AI and that such use infringes Getty’s copyright. The claim appears to be supported by Getty’s watermark appearing on numerous images generated by Stable Diffusion.
In a different strategic approach, Shutterstock has partnered with OpenAI, meaning that Shutterstock images can be used to train DALL-E 2 and that Shutterstock can offer DALL-E images to its customers. When the new tool is launched, Shutterstock’s AI-generated images will offer a competitive alternative to the very photographs used to train DALL-E. Whilst this might at first seem like an odd partnership, it eliminates the risk of a copyright infringement claim, as ownership rights in the training materials are clear and accounted for – and the arrangement offers new business opportunities to both sides.
The UKIPO considered the issues around training materials in the recent consultations and proposed to extend the current copyright exception to allow text and data mining for any purpose, which is currently limited to non-commercial research. However, this is highly controversial, with the creative sector pushing back. In this unclear situation, we will have to await the progress of Getty’s claim in the hope that we obtain guidance from the Court.
AI TOOLS THAT WRITE SOFTWARE CODE
ChatGPT is not the first AI tool to offer a coding solution. GitHub Copilot can provide developers with improved code suggestions or add new code blocks. However, it has also had its fair share of controversy, which stems from GitHub’s use of code hosted on GitHub repositories to train and develop Copilot’s coding skills.
Last November, a class action was filed in San Francisco against GitHub, Microsoft and OpenAI on behalf of GitHub users. The complaint argues that the defendants have violated the legal rights of a vast number of creators who posted code under open source licences on GitHub. As the litigation progresses, many questions need to be answered: was code copied from GitHub to be used as training data? Was the code hosted in GitHub used for purposes permitted under the doctrine of fair use? Have the open source software (OSS) licences been breached by use as training data? And the list goes on…
Another key issue is the extent to which original licence terms governing the training materials can be respected when using AI coding tools. Some OSS is subject to a “copyleft” licence provision, meaning that any derivative work incorporating that OSS must become subject to the same licence terms. Alternatively, other licences require attribution or acknowledgement of the origin.
If Copilot or ChatGPT provides a solution that is complex or lengthy, there is a risk that copyrightable training material is being reproduced. Reported examples of Copilot output have shown obvious regurgitation, sometimes even including comments from the original materials. In such cases, the original licence terms need to be respected. However, Copilot will not notify you of the source used or the related licence requirements. Effectively, you are flying blind, in contravention of licence compliance.
Such issues need resolutions before AI-based tools like Copilot and ChatGPT can be used in a risk-free manner to produce code, text, images and other content. In the meantime, IP lawyers and all involved can only speculate as to the legal consequences of using these tools.
Head of Legal, Risk and Disputes

Hannah joined our legal team in 2021 and, based in London, she is responsible for managing areas of risk across Endava. She leads on intellectual property, disputes, licence compliance and legal issues around cybersecurity and artificial intelligence. Hannah has worked in legal services for 13 years, managing IP risk and disputes for clients across numerous industries. Outside of work, and when not running after small children, Hannah enjoys watching Star Trek and eating excessive amounts of dark chocolate.