Essential Knowledge
Intellectual property (IP) is a work or invention that results from creativity and over which its creator has rights.
Today, intellectual property can be shared easily through the Internet. This raises concerns about who owns something that was created digitally.
Copyright protects your IP and keeps anyone else from using it unless they have permission. For anything you create, like a picture you took or a digital drawing, you automatically own an “All rights reserved” copyright. If we pretend that someone else's material is our own, it becomes plagiarism.
There are ways to create material for others to use:
Creative Commons
Open Source
Open Access
A - The browser Firefox is fully open-source. Mozilla has stated that this improves customer choice and innovation in the browser market.
No matter how you use something, it’s still important to cite information and give credit where it belongs. There is so much information at our fingertips, and though open-source programs may be created with good intentions, people will sometimes modify them to harm individuals or groups.
There are legal concerns regarding computing devices that collect and analyze data by monitoring individual activity. Digital media downloads can sometimes include viruses and other harmful programs.
Some software may include algorithms with bias. The training data for the software might underrepresent certain groups or demographics. Lack of fairness in algorithmic decision-making can perpetuate societal inequalities.
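To make the idea of underrepresentation in training data concrete, here is a minimal sketch of one way it could be checked. The function name `representation_report`, the 10% threshold, and the group labels are all made up for illustration; real fairness audits look at much more than raw counts.

```python
from collections import Counter

def representation_report(groups, threshold=0.10):
    """Return each group's share of the data and whether it falls below the threshold.

    `threshold` is an arbitrary cutoff chosen for this example, not a standard value.
    """
    counts = Counter(groups)
    total = sum(counts.values())
    return {
        group: {
            "share": count / total,
            "underrepresented": count / total < threshold,
        }
        for group, count in counts.items()
    }

# Hypothetical group labels for 100 training examples (invented data)
training_groups = ["A"] * 90 + ["B"] * 8 + ["C"] * 2

report = representation_report(training_groups)
for group, info in sorted(report.items()):
    status = "underrepresented" if info["underrepresented"] else "ok"
    print(group, f"{info['share']:.0%}", status)
```

In this invented dataset, groups B and C each make up less than 10% of the examples, so a model trained on it would see very little data about them.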
The digital divide is the unequal distribution of access to technology. Although the Internet provides all these resources for us to use, certain groups of people do not have the technology readily available to them. These databases and online resources are only beneficial to those who have the opportunity and means to understand and access technology.
A - Another ethical concern for intellectual property is the discussion surrounding dataset harvesting for LLMs and other modern AI systems. Some creators whose work has been used for LLM training without their permission have raised complaints of IP/copyright violations and plagiarism, while others have sought to prevent their work from being used.
Complete these five free response questions and send them to Taj on Slack by December 21, 11:59 pm. You will be graded on how much effort you put into your answers.
What is the difference between open source and open access?
What is the importance of copyright?
What is a way people can harm individuals or groups through open source data?
What term describes the unequal distribution of access to technology?
How do you see intellectual property in 20 years (be creative)?
Open source is when the code of the application can be found and downloaded by anyone, such as with Python packages. Open access is when the output of the software is free to use, but the internals are not available for editing, such as Google (you can use Google for free but cannot modify its code). Open access may also refer to various non-code IPs that are freely available, like open-access databases (NASA Exoplanet Archive, NIH PubMed) or publications (Astronomy & Astrophysics, Proceedings of the Royal Society).
Copyright prevents others from taking work that you created and passing it off as their own or using it for financial gain without compensating you for creating it.
Open source programs may be modified to become malicious, either by inserting malicious code or by misusing open-source code for malicious means. For instance, you could use open-source LLM code to synthesize large quantities of misinformation that you then spread online.
The digital divide refers to some groups being underrepresented online, usually because they lack financial access to digital devices or the Internet.
Assuming we don’t blow ourselves up, continued advancement of database-trained AI and the increasing difficulty of tracking down sources online will probably make source verification and IP protection much more important than they are today. Within that time we may get to a state where there is so much LLM-generated garbage out there that we have to rely on vetted organizational IPs, be they open-access or not (seriously, trying to get anything cutting-edge out of ChatGPT is a fool’s errand). The susceptibility of open-access material to plagiarism in such an environment may also reduce its importance.