Expanding Our Model Safety Bug Bounty Program

PRESS RELEASE

The rapid advancement of AI model capabilities demands an equally swift advancement in safety protocols. As we work on developing the next generation of our AI safeguarding systems, we're expanding our bug bounty program to introduce a new initiative focused on finding flaws in the mitigations we use to prevent misuse of our models.

Bug bounty programs play a vital role in strengthening the security and safety of technology systems. Our new initiative is focused on identifying and mitigating universal jailbreak attacks: exploits that could allow consistent bypassing of AI safety guardrails across a wide range of areas. By targeting universal jailbreaks, we aim to address some of the most significant vulnerabilities in critical, high-risk domains such as CBRN (chemical, biological, radiological, and nuclear) and cybersecurity.

We're eager to work with the global community of security and safety researchers on this effort and invite applicants to apply to our program and evaluate our new safeguards.

Our approach

To date, we've operated an invite-only bug bounty program in partnership with HackerOne that rewards researchers for identifying model safety issues in our publicly released AI models. The bug bounty initiative we're announcing today will test the next-generation system we've developed for AI safety mitigations, which we haven't yet deployed publicly. Here's how it will work:

  • Early Access: Participants will be given early access to test our latest safety mitigation system before its public deployment. As part of this, participants will be challenged to identify potential vulnerabilities or ways to circumvent our safety measures in a controlled environment.

  • Program Scope: We're offering bounty rewards of up to $15,000 for novel, universal jailbreak attacks that could expose vulnerabilities in critical, high-risk domains such as CBRN (chemical, biological, radiological, and nuclear) and cybersecurity. As we've written about previously, a jailbreak attack in AI refers to a method used to circumvent an AI system's built-in safety measures and ethical guidelines, allowing a user to elicit responses or behaviors from the AI that would normally be restricted or prohibited. A universal jailbreak is a type of vulnerability in AI systems that lets a user consistently bypass the safety measures across a wide range of topics. Identifying and mitigating universal jailbreaks is the key focus of this bug bounty initiative. If exploited, these vulnerabilities could have far-reaching consequences across a variety of harmful, unethical, or dangerous areas. A jailbreak will be considered universal if it can get the model to answer a defined number of specific harmful questions. Detailed instructions and feedback will be shared with program participants.

Get involved

This model safety bug bounty initiative will begin as invite-only in partnership with HackerOne. While it will be invite-only to start, we plan to expand the initiative more broadly in the future. This initial phase will allow us to refine our processes and respond to submissions with timely and constructive feedback. If you're an experienced AI security researcher or have demonstrated expertise in identifying jailbreaks in language models, we encourage you to apply for an invitation through our application form by Friday, August 16. We'll follow up with selected applicants in the fall.

In the meantime, we actively welcome reports of model safety concerns so we can continuously improve our current systems. If you've identified a potential safety issue in our current systems, please report it to [email protected] with sufficient detail for us to replicate the issue. For more information, please refer to our Responsible Disclosure Policy.

This initiative aligns with commitments we've signed onto with other AI companies for developing responsible AI, such as the Voluntary AI Commitments announced by the White House and the Code of Conduct for Organizations Developing Advanced AI Systems developed through the G7 Hiroshima Process. Our goal is to help accelerate progress in mitigating universal jailbreaks and strengthen AI safety in high-risk areas. If you have expertise in this area, please join us in this important work. Your contributions could play a key role in ensuring that as AI capabilities advance, our safety measures keep pace.
