Open Source AI Funding Will Dry Up

Free, open-source generative AI is going to run out of funding. Generative AI models are expensive and legally risky to create. Open-source models can also be deployed by anyone, anywhere, which is an advantage but also enables misuse. In this article, I explain my reasoning, focusing on the abuses and the costs associated with free open-source generative AI.

Fraud/CSAM

First, open-source generative AI is increasingly being abused. It is being used to commit fraud and to generate CSAM (Child Sexual Abuse Material).

AI’s role in facilitating fraud is well documented. It excels at mimicking voices and rewriting text to enhance the credibility of email scams. As large companies have improved their monitoring tools to combat these abuses, fraudulent actors have turned to open-source models to evade oversight. Consequently, there will be mounting pressure on major corporations to avoid creating models that can be exploited for fraud, leading them to discontinue open-source work.

More troubling is the use of open-source models in generating non-consensual pornography. Beyond creating explicit images of adults, these models are being used to produce CSAM and images of celebrities altered to appear as children. No corporation wants to be associated with such content. While the public may tolerate the creation of explicit images of celebrities, it will not accept the production and distribution of CSAM.

References:
404 Media - https://www.404media.co/google-image-search-ai-results-have-opened-a-portal-to-hell/

Stanford Report - https://stacks.stanford.edu/file/druid:jv206yg3793/20230624-sio-cg-csam-report.pdf

PCMag - https://www.pcmag.com/news/man-arrested-for-creating-ai-child-sexual-abuse-material-using-stable-diffusion

Forbes - https://www.forbes.com/sites/forbestechcouncil/2024/06/14/the-weaponization-of-ai-the-new-breeding-ground-for-bec-attacks/

Vice - https://www.vice.com/en/article/dy7axa/how-i-broke-into-a-bank-account-with-an-ai-generated-voice

Business Model

Second, open-source generative AI development is costly and lacks a clear business model. There are scenarios where open-sourcing some models and AI code makes sense. For instance, Meta's Llama license lets most users run its models for free but requires a separate license from Meta once a product exceeds 700 million monthly active users. However, few organizations are in a position to use that approach.

Current expenses include significant compute and processing costs. Additionally, developing these models requires highly skilled, expensive personnel. Emerging issues, such as licensing the data used to train models, will add further costs that open-source projects cannot easily absorb. Unfortunately, good financial data is not available for most of these projects, because the models are developed either inside larger companies or by privately held ones.

Counterpoint & Ways Forward

It’s important to consider both sides of the argument. Some approaches might mitigate these issues.

In terms of preventing abusive AI systems, Anthropic’s Golden Gate Claude, an interpretability research demo, showed that specific internal features of a model can be identified and turned up or down. That suggests safeguards could be built into the model itself, which may prove more effective than merely filtering prompts. Such an approach would restrict harmful outputs at a fundamental level rather than just prohibiting certain discussions. Anthropic - https://www.anthropic.com/news/golden-gate-claude.

What seems more difficult are the cases where people use AI to write more believable phishing emails. Those emails are written in normal language, which makes them hard to filter or identify by content alone. We will probably see something like an arms race here, where defenders look for better metadata-level signals while the people committing fraud become more and more aggressive.
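To make "meta signals" concrete, here is a minimal sketch of what a defender might check when the body text of an email looks perfectly normal. It uses only Python's standard library. The function names, the two checks chosen, and the sample message are my own assumptions for illustration; real mail filters rely on many more signals than this.

```python
from email import message_from_string
from email.utils import parseaddr


def domain_of(header_value):
    """Return the lowercased domain of an address header, or '' if absent."""
    _, addr = parseaddr(header_value or "")
    if "@" not in addr:
        return ""
    return addr.rpartition("@")[2].lower()


def metadata_signals(raw_message):
    """Collect simple header-level warning signs; the body text is ignored.

    Hypothetical helper: the two checks below are illustrative assumptions,
    not a complete or recommended phishing filter.
    """
    msg = message_from_string(raw_message)
    from_domain = domain_of(msg.get("From"))
    reply_domain = domain_of(msg.get("Reply-To"))

    signals = []
    # Replies routed to a different domain than the apparent sender.
    if reply_domain and reply_domain != from_domain:
        signals.append("reply-to domain differs from sender domain")
    # The receiving server recorded a failed SPF check for the sender.
    auth_results = (msg.get("Authentication-Results") or "").lower()
    if "spf=fail" in auth_results:
        signals.append("SPF check failed")
    return signals


# Made-up sample message using reserved example domains.
SAMPLE = (
    "From: Payroll <payroll@example.com>\n"
    "Reply-To: payroll@example-payments.test\n"
    "Authentication-Results: mx.example.org; spf=fail\n"
    "Subject: Updated bank details\n"
    "\n"
    "Please update your records.\n"
)

if __name__ == "__main__":
    for signal in metadata_signals(SAMPLE):
        print(signal)
```

The point of the sketch is that neither check reads the message text, so it does not matter how polished the AI-written wording is. The obvious limitation is that attackers can adapt to any published signal, which is the arms race described above.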

From a business perspective, some large corporations open-source their implementations to counteract competitors. For example, Meta might release a model as open source if doing so pressures rivals without undermining its premium offerings. It is possible that this kind of competition will keep some open-source projects alive.

Conclusions

The business of building AI without an ROI is going to come to an end at some point. The data risks and costs are high, and the benefits are questionable.

Long Tailed Leopard Blog