IPR and Using AI in Product Creation
June 01, 2023 by Santtu Ahonen
Software development has come a long way in the past few decades. With technological advancements, the use of machine learning and artificial intelligence (AI) in software development tools has become increasingly common. AI has the potential to revolutionize the way software is developed, making it faster, more efficient, and less prone to errors. However, as with any new technology, there are important considerations that must be taken into account when using AI in product and software creation. A key consideration is the impact on intellectual property rights (IPR), especially for the ownership and licensing of AI-generated outputs.
In this blog post, I will explore the use of AI in software product creation and discuss the IPR considerations that arise when using such tools. Three abbreviations recur throughout this post:
1) AI, which I use as a general term for machine learning and artificial intelligence based solutions,
2) IPR for intellectual property rights, and
3) T&C for terms and conditions under which the IPR can be used.
Who owns the AI engine?
The ownership of the AI engine used in the software development tool is an important part of the IPR palette. The party who owns the engine software can define key aspects of the intellectual property rights (IPR) associated with it. This means both the use of the engine and any output generated by the AI engine, including code snippets or comments, may be subject to licensing T&C set by the vendor of the engine.
We can also foresee situations where the provider of the engine and the provider of the service for using the engine are not the same party. The engine itself is delivered under certain T&C, and the service for using the engine is under another set of T&C. Understanding the impact of these terms is critical for any business wishing to integrate and/or use an AI engine.
A side note: reading the T&C of some AI services reveals unfair, outright predatory, or possibly even illegal terms across the wide array of services. For example, some services claim perpetual bragging rights (read: marketing rights) from you merely touching the service. Or how about claiming full IPR ownership of anything you post in the prompt? Reading the small print really matters.
Who owns the data used for training the AI engine?
The IPR of the data, be that source code, text, images, or other content, used to train the AI engine is also a key aspect when considering the IPR for the outputs generated by the AI engine.
The original data T&C typically do not cover machine learning or AI engine training use cases. Those T&C are typically aimed at users, either companies or individuals. Having said that, the T&C are still valid and apply, even if they may not be a good fit. The company or individual who owns the IPR for the data has a say in how the data can be used downstream. This means that any output generated by the AI engine, based on that data, may also be subject to licensing agreements.
Also, AI is often referred to as a smart parrot. A smart parrot does not really understand what it is doing, but it knows what and when to copy-paste as an answer. This kind of AI creates very little that is new, and the IPR concerns are high because the content is copied from somewhere and its owner may have a say in how it is used.
Then there is the smart butler, who can learn the patterns of, e.g., English or a programming language and generate new content based on what it has learned. With a smart butler, the ownership of the content, from learning data to produced output, gets blurred. It is a similar situation to people learning to code: when does copying end and new content creation begin?
…and therefore, who owns the output generated by the AI engine?
If we focus on source code, we can discuss the IPR ownership through examples. While there are very permissive open source licenses that allow for relatively free re-use and copying of software, such as the MIT or Apache licenses, there are still limitations to the use of the code, such as keeping the license headers and author credits along with the code. Another example could be a strong copyleft open source license such as GPLv3. The GPLv3 requires all derivative works to also be under GPLv3, and spinning the material through a smart parrot does not change that fact.
The smart parrot may forget to tell the user the source, or the T&C that apply to it. If the original data was under GPLv3 or some other license that sets specific rules for reusing the code, a copyright violation is created.
In other words, if the data used to train the AI engine is licensed under a specific open-source license, the output generated by the AI engine, especially if it is a direct parroted copy, may also be subject to the terms and conditions of the open source license.
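One partial safeguard against the parroting scenario described above is to scan AI-generated snippets for license markers before accepting them. The following is a minimal sketch, assuming the snippet still carries an SPDX-License-Identifier tag or a recognizable license phrase. The helper name `find_license_markers` and the short phrase list are hypothetical illustrations, not part of any existing tool. Note that an empty result proves nothing, since a parrot may have dropped the header along the way.

```python
import re

# "SPDX-License-Identifier: <expression>" is the standard SPDX short-form tag.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*([^\n*]+)")

# Illustrative phrases only (an assumption); a real check needs a longer list.
LICENSE_PHRASES = {
    "GNU General Public License": "GPL family",
    "Licensed under the Apache License": "Apache",
    "Permission is hereby granted, free of charge": "MIT-style",
}


def find_license_markers(snippet: str) -> list[str]:
    """Return license hints found in a generated snippet (hypothetical helper)."""
    # Collect explicit SPDX tags first.
    hits = [m.group(1).strip() for m in SPDX_TAG.finditer(snippet)]
    # Then look for well-known license phrases, ignoring case.
    lowered = snippet.lower()
    for phrase, label in LICENSE_PHRASES.items():
        if phrase.lower() in lowered:
            hits.append(label)
    return hits


generated = """\
// SPDX-License-Identifier: GPL-3.0-or-later
int add(int a, int b) { return a + b; }
"""
print(find_license_markers(generated))
```

Any hit is a signal to stop and check the T&C of the original source before the snippet goes into your product.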
If the data or source code used to train the AI engine is proprietary or closed source, then the owner of that data or source code dictates the terms and conditions under which it can be used. This means that any output generated by the AI engine, based on that proprietary data or source code, may also be subject to additional licensing terms and conditions.
Different use cases and content generated with AI
Based on the above, generating source code with AI can be a minefield unless one of three things holds: you can be sure of how the AI engine works (you have the sources for the engine itself), any parroted output includes details of its source, or the provider of the engine offers sufficient protection against possible IPR claims. Typically, engine providers' T&C provide defense against third-party claims only up to what you paid for using the engine, which is insufficient for pretty much any business use case.
In addition to the IPR considerations for source code, similar considerations apply to other content and use cases. There is no IPR owner for natural language itself, such as English grammar, but there are still copyrights for published texts. Say you wanted your documentation and marketing materials in a specific style of a famous writer, such as Stephen King. Mr. King may or may not have a say in this.
Developers are typically not big fans of adding and especially maintaining comments. Writing and maintaining code comments is an attractive use case for AI. One usually cannot parrot comments for new code from some existing source, so as long as the commenting uses general language patterns and your code is not a copy from somewhere else, the IPR aspects here depend on the T&C of the engine and its use.
AI can assist with other source code management tasks, such as version control, conflict resolution, merging changes, and asset and variable realignment. In these cases, there is only a small risk of parroting, so you will need a smart butler. The ownership of the output generated by the AI engine may be subject to the T&C of the AI, and perhaps also to those of the sources it learned its skills from.
An AI can be an ace at finding dependencies, relations, and patterns, e.g. for bugs. It can also find and pinpoint performance issues, discover poorly written code, or flag code that does not meet the code review or system architecture requirements. Finding and pinpointing issues is IPR-wise straightforward and depends on the AI T&C. But if the engine also provides new code suggestions and improvements, we wander into the parrot-butler swamp of problem areas in source code management.
Human testers are not very reliable or redundant, which is why we have a wide range of quality assurance tools available from Qt. AI can add value in testing. It can create test cases, maintain them, and run them. It can provide suggestions based on test case results, and it can auto-generate bug reports, too. But just as you can get very accurately lost with a GPS, you can also test vigorously and still never be as innovative as your first real user out in the wild.
Overall, using AI in the software development lifecycle has huge potential to multiply productivity without sacrificing quality. At the same time, the whole industry is based on a complex network of IPR rules and ownership that pays the salaries of the people working in it. AI is both a disruption and an opportunity.
This blog post was generated with ChatGPT-3. I, a human Product Manager, had some fun, too.