How to use generative AI like ChatGPT – In a safe way!

Wed, 14th Jun 2023

FYI, this story is more than a year old

By Bob Janssen, Global Head of Innovation, Delinea

ChatGPT has undoubtedly set the cat among the pigeons regarding Artificial Intelligence (AI).

"Using AI safely and responsibly is a balancing act the whole world is grappling with at the moment," said Ed Husic, Australian Minister for Industry and Science, releasing a new government paper kicking off a national discussion to ensure appropriate safeguards are in place.

Organisations looking to benefit from AI don't have time to wait for a national discussion. For many, the focus is turning to the cybersecurity implications and, specifically, the key security considerations before utilising generative AI solutions.

So, what are the security implications?

Firstly, ChatGPT is designed to generate responses based on the data it has been trained on, which could include sensitive or confidential information. Enterprises need appropriate measures to protect the privacy and confidentiality of the data they pass through the API.

The accuracy and reliability of ChatGPT's responses may not be 100% guaranteed, and the technology may generate incorrect or biased responses based on the data it has been exposed to. Organisations should carefully evaluate the accuracy and reliability of the answers before using them in any critical business processes.

Organisations may become dependent on an AI service's reliability and availability. ChatGPT is a third-party service, so if there are any issues or the service is discontinued, it can affect the enterprise's ability to use the API.

If the API is not adequately secured, it can be vulnerable to misuse and abuse by attackers who could launch attacks against the enterprise's systems or harvest sensitive data. So, organisations should have appropriate measures to protect the API from misuse and abuse.

Finally, depending on the industry and data involved, enterprises may need to comply with specific regulations, such as APRA CPS-234, the Security of Critical Infrastructure Act, the Privacy Act or the New Zealand Information Security Manual. Enterprises need appropriate policies and procedures to comply with these regulations when using ChatGPT as an API.

What if there is another security breach?

While any security breach is concerning, OpenAI's response to its recent security incident has been prompt and transparent. The company has been open about the breach, its cause, and the steps taken to address the issue. OpenAI's quick action to secure its systems and notify the affected parties demonstrates its commitment to security and privacy.

That said, it's worth looking at OpenAI's security protections:

Access control: OpenAI implements strict access controls to ensure that only authorised personnel have access to its systems and data.
Encryption: All data transmitted between OpenAI's systems and customers is encrypted using industry-standard encryption protocols.
Network security: OpenAI has implemented various measures, such as firewalls and intrusion detection and prevention systems, to protect its systems from external threats.
Regular security assessments: OpenAI conducts security assessments to identify and mitigate potential security risks.
Data protection: OpenAI uses various measures, such as data masking and access controls, to protect the confidentiality, integrity and availability of customer data.
Incident response: OpenAI has a well-defined incident response process to respond quickly to security incidents and minimise their impact.
Compliance with industry standards: OpenAI follows industry-standard security best practices and complies with various international regulations to ensure the security and privacy of its customers' data.

Tools and safety recommendations to consider

There are several tools and recommendations to consider when it comes to safely using AI solutions.

OpenAI has a free-to-use Moderation API that can help reduce the frequency of unsafe content in completions. Alternatively, you could develop a custom content filtration system tailored to specific use cases.

"Red-teaming" applications are recommended to ensure they are robust to adversarial input. Test products over a wide range of inputs and user behaviours, both a representative set and those reflective of someone trying to 'break' the application. Does it wander off-topic? Can someone easily redirect the feature via prompt injections, e.g. "Ignore the previous instructions and do this instead"?

Wherever possible, it is recommended to have someone review outputs before they are used in practice. This is especially critical in high-stakes domains and for code generation. Staff should be aware of the system's limitations and have access to any information needed to verify the outputs. For example, if the application summarises notes, someone should have easy access to the original notes to refer to.

"Prompt engineering" can help constrain the topic and tone of output text. This reduces the chance of producing undesired content, even if a user tries. Providing additional context to the model (such as giving a few high-quality examples of desired behaviour prior to the new input) can make it easier to steer model outputs in desired directions.

Users should generally be required to register and log in to access services. Linking this service to an existing account such as Gmail, LinkedIn, or Facebook may help, although it may not always be appropriate. Requiring a credit card or ID card further reduces risk. Limiting the amount of text a user can input into the prompt helps avoid prompt injection. Limiting the number of output tokens helps reduce the chance of misuse. Narrowing the ranges of inputs or outputs, especially those drawn from trusted sources reduces the extent of misuse possible within an application.

Allowing user inputs through validated dropdown fields (e.g., a list of movies on Wikipedia) can be more secure than allowing open-ended text inputs. Returning outputs from a validated set of materials on the backend, where possible, can be safer than returning novel-generated content (for instance, routing a customer query to the best-matching existing customer support article rather than attempting to answer the query from scratch).

Users should generally have a readily available method for reporting improper functionality or other concerns about application behaviour (listed email address, ticket submission method, etc.). This method should be monitored by a human and responded to as appropriate.

Something to watch: democratised AI

Finally, a trend being closely monitored at Delinea Labs is the advent of democratised AI.

It's already incredibly easy to replicate and re-train your privately owned ChatGPT-like model. For example, Stanford's Alpaca is surprisingly good.

The claim is that the training can be done in five hours on a single RTX 4090. This can mitigate the risk of privacy, confidentiality, compliance and third-party reliance. But other risks remain. Staying ahead of this trend and AI security generally will help organisations leverage these new solutions safely and effectively.

Generative AI solutions like ChatGPT have captured the public's attention, and organisations are already putting them to work. In the meantime, regulators are still playing catch-up, and it is up to organisations to consider the cybersecurity implications and business risks, and implement appropriate measures, to ensure the technology is used safely.

Share on: