test-data-management
Strategic test data generation, management, and privacy compliance. Use when creating test data, handling PII, ensuring GDPR/CCPA compliance, or scaling data generation for realistic testing scenarios.
What this skill does
# Test Data Management
<default_to_action>
When creating or managing test data:
1. NEVER use production PII directly
2. GENERATE synthetic data with faker libraries
3. ANONYMIZE production data if used (mask, hash)
4. ISOLATE test data (transactions, per-test cleanup)
5. SCALE with batch generation (10k+ records/sec)
**Quick Data Strategy:**
- Unit tests: Minimal data (just enough)
- Integration: Realistic data (full complexity)
- Performance: Volume data (10k+ records)
**Critical Success Factors:**
- 40% of test failures from inadequate data
- GDPR fines up to €20M for PII violations
- Never store production PII in test environments
</default_to_action>
## Quick Reference Card
### When to Use
- Creating test datasets
- Handling sensitive data
- Performance testing with volume
- GDPR/CCPA compliance
### Data Strategies
| Type | When | Size |
|------|------|------|
| **Minimal** | Unit tests | 1-10 records |
| **Realistic** | Integration | 100-1000 records |
| **Volume** | Performance | 10k+ records |
| **Edge cases** | Boundary testing | Targeted |
---
## Data Anonymization
```javascript
// Masking
function maskEmail(email) {
const [user, domain] = email.split('@');
return `${user[0]}***@${domain}`;
}
// [email protected] → j***@example.com
function maskCreditCard(cc) {
return `****-****-****-${cc.slice(-4)}`;
}
// 4242424242424242 → ****-****-****-4242
// Anonymize production data
const anonymizedUsers = prodUsers.map(user => ({
id: user.id, // Keep ID for relationships
email: `user-${user.id}@example.com`, // Fake email
firstName: faker.person.firstName(), // Generated
phone: null, // Remove PII
createdAt: user.createdAt // Keep non-PII
}));
```
---
## Database Transaction Isolation
```javascript
// Best practice: use transactions for cleanup
beforeEach(async () => {
await db.beginTransaction();
});
afterEach(async () => {
await db.rollbackTransaction(); // Auto cleanup!
});
test('user registration', async () => {
const user = await userService.register({
email: '[email protected]'
});
expect(user.id).toBeDefined();
// Automatic rollback after test - no cleanup needed
});
```
---
## Agent-Driven Data Generation
```typescript
// High-speed generation with constraints
await Task("Generate Test Data", {
schema: 'ecommerce',
count: { users: 10000, products: 500, orders: 5000 },
preserveReferentialIntegrity: true,
constraints: {
age: { min: 18, max: 90 },
roles: ['customer', 'admin']
}
}, "qe-test-data-architect");
// GDPR-compliant anonymization
await Task("Anonymize Production Data", {
source: 'production-snapshot',
piiFields: ['email', 'phone', 'ssn'],
method: 'pseudonymization',
retainStructure: true
}, "qe-test-data-architect");
```
---
## Agent Coordination Hints
### Memory Namespace
```
aqe/test-data-management/
├── schemas/* - Data schemas
├── generators/* - Generator configs
├── anonymization/* - PII handling rules
└── fixtures/* - Reusable fixtures
```
### Fleet Coordination
```typescript
const dataFleet = await FleetManager.coordinate({
strategy: 'test-data-generation',
agents: [
'qe-test-data-architect', // Generate data
'qe-test-executor', // Execute with data
'qe-security-scanner' // Validate no PII exposure
],
topology: 'sequential'
});
```
---
## Related Skills
- [database-testing](../database-testing/) - Schema and integrity testing
- [compliance-testing](../compliance-testing/) - GDPR/CCPA compliance
- [performance-testing](../performance-testing/) - Volume data for perf tests
---
## Remember
**Never use production PII directly.** Always use synthetic data or properly anonymized production snapshots.
**With Agents:** `qe-test-data-architect` generates 10k+ records/sec with realistic patterns, relationships, and constraints. Agents ensure GDPR/CCPA compliance automatically and eliminate test data bottlenecks.
Related in specialized-testing
security-testing
IncludedScans for security vulnerabilities including XSS, SQL injection, CSRF, and auth flaws using OWASP Top 10 methodology. Use when conducting SAST/DAST scans, auditing authentication flows, testing authorization rules, or implementing security test automation.
mutation-testing
IncludedTest quality validation through mutation testing, assessing test suite effectiveness by introducing code mutations and measuring kill rate. Use when evaluating test quality, identifying weak tests, or proving tests actually catch bugs.
performance-testing
IncludedProfiles application performance under load using k6, Artillery, or JMeter to measure latency, throughput, and error rates. Use when planning load tests, stress tests, soak tests, benchmarking APIs, or identifying performance bottlenecks.
accessibility-testing
IncludedWCAG 2.2 compliance testing, screen reader validation, and inclusive design verification. Use when ensuring legal compliance (ADA, Section 508), testing for disabilities, or building accessible applications for 1 billion disabled users globally.
a11y-ally
IncludedUse when running comprehensive WCAG accessibility audits with axe-core + pa11y + Lighthouse, generating context-aware remediation, or testing video accessibility. Supports 3-tier browser cascade with graceful degradation.
chaos-engineering-resilience
IncludedChaos engineering principles, controlled failure injection, resilience testing, and system recovery validation. Use when testing distributed systems, building confidence in fault tolerance, or validating disaster recovery.