On September 22, Rep. Anna Eshoo (D-CA) called on the National Security Advisor (NSA) and the Office of Science and Technology Policy (OSTP) to restrict access to open-source generative AI models in response to the release of Stable Diffusion by Stability AI. Stable Diffusion is an open-source text-to-image AI that allows for the creation of high-quality images simply from text descriptions, similar to proprietary competitors such as DALL-E 2 and Midjourney. The images can range from artistic to photographic in quality, depending on how the desired output is described.
In her letter, Congresswoman Eshoo attempts to cast Stability AI as an irresponsible firm in comparison to competitor OpenAI, which places tight content restrictions on its users, acting as a benevolent gatekeeper to prevent users from generating harmful content such as “unfiltered imagery of a violent or sexual nature.” Her statement cites the use of Stability AI’s open-source text-to-image AI to generate degrading pornography on 4chan as a reason to limit access to the platform.
The line of reasoning here, that public access to open-source AI models must be limited given their capacity to generate nefarious content, is wildly contrary to decades of policy that helped cement the U.S. as a technology leader. It is impossible to think of a general-purpose tool that does not have unintended side effects, but this does not mean that the tools themselves should be restricted ex ante. The Internet allows for nefarious content to be created and shared, but few argue that it should have gatekeepers to prevent the rise of malicious sites. Further, the First Amendment provides broad protections for many kinds of offensive, pornographic, or otherwise repugnant speech.
Rather, if someone uses a tool to conduct illegal activity, unless that tool was intentionally made to enable such harm, the individual should be held responsible for his or her actions ex post. That Stability AI’s models have allowed people to create grotesque content does not mean that this is what they were meant to do or all they are capable of, any more than it would for Photoshop, iMovie, or other commonplace software tools.
Stability has released to the public, at low cost, an API for its text-to-image model that restricts the production of malicious or insalubrious content. This has significantly lowered the barrier to using state-of-the-art AI to build new products. A plethora of novel products built atop its API have already appeared in less than two months. Easy-access image generation can improve numerous industries, from marketing to visual effects to game development to even scientific research.
In addition to its API, Stability has released access to its code, allowing people to remove any filtering and run the model on their own hardware. This has allowed others to experiment and innovate on top of its work, with people rebuilding many of the features of DALL-E 2, OpenAI’s text-to-image model that Rep. Eshoo praises, at a much lower cost.
The reality is that these open-source models provide evidence that economic means no longer restrict access to AI models. The era of “whoever owns the most compute wins at AI” is coming to an end, and the fact that high-quality models can be built cheaply and distributed widely has resulted in a revival of attention to the field from the public, technologists, investors, and, sadly, government. Restricting development at this stage risks undermining the vast potential of generative AI, which can drastically speed up the creation of new products and companies by producing high-quality outputs quickly and cheaply.
This is not to deny that there are legitimate enforcement questions to be asked concerning illegal or otherwise harmful content. However, imposing onerous top-down rules at this early stage would stifle the field’s development.
The same technology that was used to generate infringing and illicit content can be used to detect and remove it if that focus is encouraged. Initiatives to prove the authenticity of a digital image, to determine whether images are drawn from copyrighted work, or to detect the presence of problematic content in an image all benefit from greater access to high-quality tooling. Supporting the private-sector initiatives already developing to ensure that AI is used in a pro-social manner can help prevent any potential illicit applications. Rather than taking a reactionary approach and trying to limit the shift away from a world where AI is gatekept by large tech firms with capital and computational power, Rep. Eshoo should direct her attention to nurturing a world where democratized AI is directed at the good.
Ryan Khurana is Chief of Staff at WOMBO.ai, a generative AI company.