
Can You Use Support Transcripts to Train AI? (Maybe Not.)

Your support ticket archive is a goldmine for AI training data. But GDPR, CCPA, and class action lawyers say you might not be allowed to use it. The legal picture is murky and changing fast.


You have 50,000 support transcripts. Each one contains a customer's question, your agent's response, and the resolution. Perfect training data for an AI that could handle those same questions automatically.

Can you use it?

The answer depends on where your customers are, what your privacy policy says, what consent you obtained, and which regulatory body is watching. And the answer is changing faster than most companies' legal teams can keep up.

The GDPR View

Under GDPR (which applies if any of your customers are in the EU/EEA), using personal data to train AI requires a legal basis. The two most commonly cited are:

Legitimate interest (Article 6(1)(f)). You can argue that improving your support system is a legitimate business interest that benefits both the company and the customer. But this requires a balancing test: your interest vs the customer's privacy rights. If the training data includes sensitive information (health data, financial details, personal complaints), the balance may tip against you.

Consent (Article 6(1)(a)). You can ask customers for explicit consent to use their support interactions for AI training. But consent must be specific, informed, freely given, and withdrawable. "By using our support, you agree to everything" buried in a ToS doesn't count.

The GDPR also requires data minimization: you can only use data that's necessary for the stated purpose. Training an AI model on 50,000 full transcripts when 5,000 anonymized excerpts would suffice is hard to justify under data minimization.

The right to erasure (Article 17) creates an additional complication. If a customer requests deletion of their data, and their conversations are embedded in your AI training dataset, can you actually delete them? Once data is used to train a model, extracting a specific individual's contribution is technically challenging (sometimes impossible with current methods).

The CCPA/CPRA View

California's privacy law gives consumers the right to opt out of the "sale or sharing" of their personal information. The California Privacy Protection Agency has indicated that using customer data for AI training may constitute "sharing" under the law, depending on the specifics.

If you use a third-party AI service (like an LLM API) and send customer support data to it for training, that's likely "sharing" personal information with a third party. You can only do this for customers who haven't opted out.

If you train an in-house model on your own servers using your own data, the analysis is different (no third-party sharing), but the data still needs to comply with purpose limitation: did you tell customers their support interactions would be used for AI training?

Class Action Risk

Several class action lawsuits filed in 2024 and 2025 challenged companies' use of customer data for AI training. The specific legal theories vary (breach of contract, invasion of privacy, unjust enrichment), but the common thread is: customers didn't expect their support conversations to be used to train AI systems, and companies didn't clearly disclose it.

These cases haven't been fully resolved, so the legal precedent is still forming. But the direction is clear: using customer data for AI training without clear disclosure and appropriate consent is increasingly risky.

What You Can Do Safely

Anonymize aggressively. Strip all personally identifiable information (names, emails, account numbers, addresses, phone numbers) from transcripts before using them for training. Anonymized data is generally outside GDPR scope (if truly anonymized, not just pseudonymized). Use Named Entity Recognition (NER) to automatically detect and redact PII.
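As a minimal illustration of the redaction step, a regex-based pass like the sketch below catches structured PII (emails, phone numbers, account references). The patterns are illustrative, not exhaustive; a production pipeline would layer an NER model on top to catch names and addresses that regexes miss.

```python
import re

# Illustrative patterns only -- real redaction pipelines combine
# regexes like these with an NER model for names and addresses.
PATTERNS = {
    "[EMAIL]": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "[PHONE]": re.compile(
        r"\b(?:\+?\d{1,2}[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b"
    ),
    "[ACCOUNT]": re.compile(r"\b(?:acct|account)[#:\s]*\d{4,}\b", re.IGNORECASE),
}

def redact(text: str) -> str:
    """Replace matched PII spans with placeholder tokens."""
    for placeholder, pattern in PATTERNS.items():
        text = pattern.sub(placeholder, text)
    return text
```

Note that redaction like this produces pseudonymized data at best; whether the result counts as truly anonymized under GDPR depends on whether individuals remain identifiable from what's left.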

Use synthetic data. Generate training data that mimics the patterns of real support conversations without using actual customer messages. This eliminates the privacy issue entirely.
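The simplest form of synthetic generation fills templated intents with invented values, so no real customer text ever enters the training set. The sketch below uses made-up template strings, labels, and plan names purely for illustration; a more realistic setup might prompt an LLM with schema constraints instead.

```python
import random

# Hypothetical templates and plan names -- no real customer data.
TEMPLATES = [
    ("I can't log in to my {product} account.", "login_issue"),
    ("How do I cancel my {product} subscription?", "cancel_subscription"),
    ("I was charged twice for {product} this month.", "billing_dispute"),
]
PRODUCTS = ["Basic", "Pro", "Enterprise"]

def generate(n: int, seed: int = 0) -> list[dict]:
    """Produce n labeled synthetic support messages."""
    rng = random.Random(seed)  # seeded for reproducible datasets
    return [
        {
            "text": template.format(product=rng.choice(PRODUCTS)),
            "label": label,
        }
        for template, label in (rng.choice(TEMPLATES) for _ in range(n))
    ]
```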

Update your privacy policy. If you intend to use support data for AI improvement, say so explicitly. "We may use anonymized support interactions to improve our automated support systems." This gives you a clearer legal basis and reduces surprise.

Offer opt-out. Let customers opt out of having their interactions used for training. CCPA requires this for California residents. GDPR requires it under legitimate interest (they can object). Making it available to all customers is the safest approach.
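Enforcing opt-out means the training pipeline itself must gate on consent flags, not just the signup form. A minimal sketch, assuming hypothetical `opted_out` and `training_consent` fields on each transcript record:

```python
def training_eligible(transcripts: list[dict]) -> list[dict]:
    """Keep only transcripts that may be used for model training.

    Requires explicit consent and no opt-out -- absent flags default
    to excluding the record, which is the safer failure mode.
    """
    return [
        t for t in transcripts
        if not t.get("opted_out", False) and t.get("training_consent", False)
    ]
```

Defaulting missing flags to exclusion matters: under GDPR's consent standard, silence is not consent.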

Use your own infrastructure. Sending customer data to third-party AI APIs (OpenAI, Anthropic, Google) for training raises third-party sharing concerns under CCPA and cross-border transfer issues under GDPR. Training models in-house on your own servers keeps the data within your control.

The Supp Approach

Supp's classifier was not trained on customer data. No customer PII in the training set. No consent needed for training. No deletion requests that conflict with model weights. No cross-border data transfer for training purposes.

When Supp classifies a new message, the message is processed for classification (a legitimate operational purpose) but not stored for model training unless the customer explicitly opts in.

The broader point: the legal environment for AI training on customer data is tightening, not loosening. Companies that built their AI on customer data without consent in 2023 are now scrambling to retrofit compliance. Companies that designed for privacy from the start (synthetic data, anonymization, purpose limitation) don't have this problem.

If you're planning to use support transcripts for AI training, talk to a privacy lawyer before you start. The cost of legal advice upfront is a fraction of the cost of a class action later.

Learn About Supp

$5 in free credits. No credit card required. Set up in under 15 minutes.
