Prompt Sanitization for ChatGPT

Enhancing Privacy with Arcs DLP: A Three-Layered Approach to Prompt Sanitization

1. Introduction

In today's digital landscape, Large Language Models (LLMs) have become an integral part of online applications, powering everything from chatbots to search engines and translation tools. These models, trained on vast datasets, offer users powerful capabilities without the need for expensive hardware. However, the convenience of cloud-based LLM services comes with significant privacy concerns. Users have limited control over how their data is processed, stored, and potentially shared with third-party providers.

To address these privacy issues, we introduce Arcs DLP, a browser extension designed to protect user privacy when interacting with online LLM services. Unlike solutions that rely on cloud-based PII redaction, Arcs DLP performs all data sanitization locally on the user's device, ensuring that sensitive information never leaves the user's control.

2. How Arcs DLP Works

Arcs DLP is a lightweight, efficient browser extension that operates entirely on the user's device, so no prompt data is sent to remote servers during sanitization.

It uses a three-layered sanitization mechanism comprising:

Rule-Based Filtering

The first layer of Arcs DLP employs rule-based filtering with seven predefined detectors of private data, all enabled by default, covering the following entities: credit card numbers, names, email addresses, phone numbers, US addresses, US Social Security numbers, and secret keys.
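As an illustration of how such a rule-based layer works, the sketch below implements detectors for three of the seven entity types using regular expressions. The patterns and function names are illustrative, not the actual detectors shipped with Arcs DLP:

```javascript
// Minimal sketch of a rule-based detection layer. Patterns are
// simplified examples, not the extension's real detectors.
const DETECTORS = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  usSsn: /\b\d{3}-\d{2}-\d{4}\b/g,
  phone: /\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b/g,
};

// Returns a list of { type, match, index } findings for a prompt.
function runRuleBasedLayer(prompt) {
  const findings = [];
  for (const [type, pattern] of Object.entries(DETECTORS)) {
    for (const m of prompt.matchAll(pattern)) {
      findings.push({ type, match: m[0], index: m.index });
    }
  }
  return findings;
}
```

Each finding records which detector fired and where, which is enough for the later layers to redact the span or warn the user.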

Named Entity Detection

The second layer uses a fine-tuned BERT model (110 million parameters) for named-entity recognition. This model (bert-base-uncased-finetuned-conll03-english) is trained on the CoNLL-2003 Named Entity Recognition dataset, which includes 6,600 person names, 7,140 locations, 6,321 organizations, and 3,438 miscellaneous examples. We use Transformers.js to load it within our browser extension. Arcs DLP sends user prompts to the local Transformers.js interface and synchronously receives a list of recognized entities, each tagged with its type and a confidence score. This score allows entities to be filtered by confidence: Arcs DLP reports only entities above a default threshold of 75 out of 100, which users can adjust as needed.
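The thresholding step can be sketched as follows. The entity objects mirror the general shape Transformers.js returns for token-classification pipelines (`entity`, `score`, `word`), with the 75-out-of-100 default mapped onto the 0–1 scale of model confidence scores; the function name is illustrative:

```javascript
// Default threshold of 75/100, expressed on the 0-1 scale of model scores.
const DEFAULT_THRESHOLD = 0.75;

// Keeps only entities whose confidence meets the threshold.
// `entities` is the list produced by the NER pipeline, e.g.
// [{ entity: "B-PER", score: 0.98, word: "alice" }, ...]
function filterEntities(entities, threshold = DEFAULT_THRESHOLD) {
  return entities.filter((e) => e.score >= threshold);
}
```

Passing a user-chosen threshold in place of the default is how the configurable setting described above would plug in.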

Local WebGPU-Based LLM for Enhanced Redaction

The final layer of Arcs DLP uses a local LLM to identify sensitive topics that might not be captured by the first two layers. This is achieved by leveraging the WebLLM framework. The WebLLM framework offers significant flexibility, allowing for easy switching between different LLM models, which users can select from the configuration page based on their specific needs.

Currently, Arcs DLP uses the Llama 3 8B model, chosen for its balance of performance, accuracy, and resource consumption.
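The source does not detail how this layer queries the local model, so the sketch below shows one plausible shape: a yes/no classification prompt built around the user's text, and a parser for the model's reply. The prompt wording, function names, and parsing rule are assumptions for illustration; the call into the WebLLM engine itself is omitted:

```javascript
// Builds a classification prompt asking the local model whether the
// user's text touches a sensitive topic (illustrative wording only,
// not the prompt Arcs DLP actually ships).
function buildClassificationPrompt(userText) {
  return (
    "Does the following text mention privacy-sensitive topics? " +
    'Answer with exactly "YES" or "NO".\n\n' +
    userText
  );
}

// Interprets the model's reply; anything starting with "YES" flags
// the prompt as sensitive.
function isSensitive(modelReply) {
  return modelReply.trim().toUpperCase().startsWith("YES");
}
```

In this shape, swapping the underlying model (as the WebLLM configuration page allows) changes only the engine call, not the prompt-building or parsing logic.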

Figure-1: Architecture

3. Integrating Arcs DLP with ChatGPT

Arcs DLP operates as a browser extension, analyzing prompts before they are submitted to LLM services. Users can activate prompt analysis with a simple keystroke combination, such as Control-Enter, to ensure that their data is sanitized before submission.

When sensitive information is detected, Arcs DLP alerts the user through a popup message, providing details about the privacy risks involved. This alert system helps users make informed decisions about sharing their data.

Figure-2: Plugin configuration page

We have tailored Arcs DLP to work with OpenAI’s ChatGPT interface. Our browser extension locates the prompt input box in the ChatGPT web interface and analyzes user prompts before they are submitted. Due to limitations in the Chrome extension API, we cannot intercept the input prompt right before it is sent through the Enter key or submission button click. To address this, we have two options:

1. Analyze the prompt on every keystroke.

2. Introduce a new keystroke combination, such as Control-Enter, to trigger the analysis.

For our prototype, we chose the second option, which is more user-friendly and avoids the performance overhead of analyzing each keystroke. Although users might forget to press Control-Enter, we can enforce this by blocking prompt submission until the user presses the combination. The Chrome extension API allows us to intercept the submission event and prevent the prompt from being sent until our conditions are met, although we cannot modify the prompt content before submission.
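The interception logic above can be sketched as a small decision function over keyboard events. The `key` and `ctrlKey` fields follow the standard DOM KeyboardEvent; the function name and the three-way return value are a simplification for illustration, not the extension's actual handler:

```javascript
// Decides what to do with a keydown event in the prompt box.
// Returns "analyze" for Control-Enter (run the three-layered check),
// "block" for a plain Enter before analysis has run (submission is
// held, per the enforcement option described above), "pass" otherwise.
function handlePromptKeydown(event, analysisDone) {
  if (event.key === "Enter" && event.ctrlKey) return "analyze";
  if (event.key === "Enter" && !analysisDone) return "block";
  return "pass";
}
```

In the real extension, "block" would correspond to calling `preventDefault()` on the submission event, which the Chrome extension API permits even though modifying the prompt content at that point is not possible.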

When Control-Enter is pressed, Arcs DLP analyzes the prompt using its three-layered filtering mechanism. If any privacy risks are detected, Arcs DLP uses a window alert to notify the user of the risk, as shown in Figure 3. This popup includes a warning message and requires user acknowledgment before proceeding. Users can disable alerts for PII detection, allowing automatic redaction, but alerts for privacy-sensitive topics remain enabled for maximum user awareness.

Figure-3: Prompt Sanitization Popup

Future Developments

Arcs DLP is continuously evolving, with ongoing efforts to enhance its capabilities. The local WebGPU-based LLM component is at an early alpha stage, and future updates will further improve its accuracy and efficiency.

Related Work

This section provides a broad overview of the technologies and methods that can enhance privacy and security in the context of using large language models and GPUs. For those interested in the technical details, the referenced works offer deeper insights.

Trusted and Confidential GPU Computing

Trusted computing is a technology that helps protect sensitive applications from being accessed by unauthorized parties. Companies like Intel and AMD have developed features that allow programs to run in secure areas, known as enclaves, within a computer. This technology can also be applied to GPUs (graphics processing units), which are often used for complex computations. While Arcs DLP doesn't currently use these methods, incorporating them could enhance privacy and security by ensuring that even sensitive data processed on a user's device remains protected. For more technical details, you can refer to the works by McKeen et al. (2013), Volos et al. (2018), and others.

Browser GPU Stack Security

The GPU stack in web browsers is a potential target for security attacks, as vulnerabilities in the GPU's code can be exploited by malicious websites. Technologies like WebGL and WebGPU enable websites to use the GPU for rendering graphics, which can introduce security risks. Research has shown that a significant number of bugs in operating systems are found in device drivers, which are critical for GPU operation. Arcs DLP uses WebGPU to speed up local processing but relies on the browser's built-in security features to protect users. For more insights, check out studies by Peng et al. (2023) and Yao et al. (2018).

Client-Side Filtering

Client-side filtering involves removing sensitive information from user inputs before they reach a server. This approach is commonly used in applications like voice assistants and remote desktop software to protect users' privacy. Arcs DLP applies this concept to online LLM services, ensuring that personal data is filtered out before it is sent to external servers. This method is effective in preventing sensitive data from being exposed to third parties. For further reading, see works by Seyed-Talebi et al. (2021) and Liu et al. (2023).

Privacy-Preserving Machine Learning

Privacy-preserving machine learning focuses on protecting data privacy during the training and use of machine learning models. Techniques like federated learning and homomorphic encryption allow data to be processed while keeping it private. Federated learning involves training models across multiple devices without sharing raw data, while homomorphic encryption allows computations on encrypted data. However, these techniques are not yet widely applicable to large language models like those used in Arcs DLP. For more technical exploration, refer to research by McMahan et al. (2017) and Gentry (2009).

Conclusion

Arcs DLP represents an advancement in privacy protection for users of cloud-based LLM services. By performing all data sanitization processes locally, Arcs DLP ensures that sensitive information remains under the user's control, offering a robust solution to the privacy challenges posed by modern LLM applications. With its comprehensive privacy protection, lightweight design, and seamless compatibility, Arcs DLP empowers users to interact with LLM services confidently and securely.
