Risk Analysis
This page provides insights into the effectiveness of anonymization using the VEIL.AI Anonymization Engine, based on two core evaluation strategies:
- Membership Inference Attack
- Re-Identification Risk Attack
These analyses help determine how well the anonymization prevents an attacker from inferring whether a record belonged to the original dataset, or from correctly matching anonymized records to individuals.
Membership Inference Attack
This attack tests whether an individual’s presence in a dataset can be inferred after anonymization. To keep the evaluation efficient, the attack is configured with a sample size, an attack set size, and a prior knowledge threshold.
Summary of Results
- A significant drop in True Positives (TP) for anonymized data shows that the anonymization reduces the likelihood of accurately identifying members.
- False Positives (FP) tend to increase in anonymized data, which is desirable from a privacy perspective as it reduces the precision of the attack.
- As the Hamming threshold increases (from 0 to 5), both TP and FP generally increase, but the growing FP count further undermines the attacker’s accuracy.
- F1 Score and True Positive Rate (TPR) are notably lower in anonymized data, while False Positive Rate (FPR) and False Discovery Rate (FDR) are higher. Together, these shifts indicate effective anonymization.
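The effect of the Hamming threshold on the TP and FP counts can be illustrated with a minimal sketch. It assumes, purely for illustration, that records are equal-length tuples of categorical values and that the attacker claims membership whenever some released record lies within the threshold. The names `hamming` and `membership_attack` and the toy data are hypothetical; this is not the engine's actual implementation.

```python
from typing import Sequence, Tuple

Record = Sequence[object]


def hamming(a: Record, b: Record) -> int:
    """Count the attribute positions where two equal-length records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def membership_attack(
    attack_set: Sequence[Record],
    is_member: Sequence[bool],
    released: Sequence[Record],
    threshold: int,
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for a threshold-based membership guess.

    Assumption: the attacker guesses "member" when any released record
    is within `threshold` Hamming distance of the target record.
    """
    tp = fp = fn = tn = 0
    for record, member in zip(attack_set, is_member):
        guess = any(hamming(record, r) <= threshold for r in released)
        if guess and member:
            tp += 1
        elif guess and not member:
            fp += 1
        elif member:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Toy data (hypothetical): two released records, three attack targets.
released = [("30-39", "F", "A"), ("40-49", "M", "B")]
attack_set = [("30-39", "F", "A"), ("40-49", "M", "C"), ("50-59", "F", "C")]
is_member = [True, True, False]

for t in (0, 1, 2):
    print(t, membership_attack(attack_set, is_member, released, t))
# 0 (1, 0, 1, 1)
# 1 (2, 0, 0, 1)
# 2 (2, 1, 0, 0)
```

In this toy example, raising the threshold from 0 to 2 lifts TP from 1 to 2 but also introduces a false positive, which mirrors the trend described in the summary.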
Re-Identification Risk Attack
This attack goes beyond membership and tries to re-identify specific individuals based on anonymized data.
Summary of Results
- For anonymized datasets, True Positives are reduced to near zero, even at low Hamming distances.
- False Positives are high, and the combination of high FDR and low precision makes accurate re-identification unlikely.
- Across all tested Hamming thresholds, anonymized data yields consistently low TPR and F1 scores, confirming robust privacy protection.
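A re-identification attempt can be sketched in a similar way. The example below assumes an attacker who holds background records keyed by identity and links each released record to the single closest identity within the Hamming threshold, abstaining on ties. The function names, the linking rule, and the toy data are illustrative assumptions, not a description of how the engine runs this attack.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Record = Sequence[object]


def hamming(a: Record, b: Record) -> int:
    """Count the attribute positions where two records differ."""
    return sum(x != y for x, y in zip(a, b))


def reidentify(
    released: Sequence[Record],
    known: Dict[str, Record],
    threshold: int,
) -> List[Optional[str]]:
    """Guess an identity per released record, or None when abstaining.

    Assumption: a guess is made only if exactly one known identity is
    closest and its distance does not exceed `threshold`.
    """
    guesses: List[Optional[str]] = []
    for rec in released:
        dists = {who: hamming(rec, attrs) for who, attrs in known.items()}
        best = min(dists.values(), default=None)
        winners = [who for who, d in dists.items() if d == best]
        if best is not None and best <= threshold and len(winners) == 1:
            guesses.append(winners[0])
        else:
            guesses.append(None)
    return guesses


def score(guesses: Sequence[Optional[str]], truth: Sequence[str]) -> Tuple[int, int]:
    """Return (tp, fp): correct and incorrect identity guesses."""
    made = [(g, t) for g, t in zip(guesses, truth) if g is not None]
    tp = sum(g == t for g, t in made)
    return tp, len(made) - tp


# Toy data (hypothetical): "*" marks a suppressed attribute value.
known = {"alice": ("30-39", "F", "A"), "bob": ("40-49", "M", "B")}
released = [("30-39", "F", "A"), ("40-49", "*", "B"), ("40-49", "F", "A")]
truth = ["alice", "bob", "bob"]

print(score(reidentify(released, known, 0), truth))  # (1, 0)
print(score(reidentify(released, known, 1), truth))  # (2, 1)
```

The third released record is closer to the wrong identity, so a looser threshold produces a false positive alongside the extra true positive.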
Evaluating Anonymized Data Quality
Metrics such as F1 Score, True Positive Rate (TPR), False Positive Rate (FPR), and False Discovery Rate (FDR) are used to evaluate the quality of anonymization. These metrics range from 0 to 1.
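For reference, these metrics follow the standard confusion-matrix definitions. The sketch below computes them from TP, FP, FN, and TN counts; the function name `attack_metrics` is hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not something the source specifies.

```python
from typing import Dict


def attack_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute attack-quality metrics from confusion-matrix counts."""

    def ratio(num: float, den: float) -> float:
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    return {
        "tpr": ratio(tp, tp + fn),             # true positive rate (recall)
        "fpr": ratio(fp, fp + tn),             # false positive rate
        "precision": ratio(tp, tp + fp),       # share of positive guesses that are correct
        "fdr": ratio(fp, tp + fp),             # false discovery rate = 1 - precision
        "f1": ratio(2 * tp, 2 * tp + fp + fn), # harmonic mean of precision and recall
    }


# Example with hypothetical counts.
metrics = attack_metrics(tp=8, fp=2, fn=2, tn=8)
print(metrics)
```

Because FDR equals 1 minus precision whenever the attacker makes at least one positive guess, a high FDR and a low precision describe the same outcome from two sides.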
Strong privacy protection in anonymized data:
- Low F1 Score and True Positive Rate (TPR): Indicate a strong anonymization level, with few correct re-identifications.
- High False Discovery Rate (FDR) and False Positive Rate (FPR): Indicate many false alarms, which dilute the accuracy of any re-identification attempt and so enhance privacy.
Poor privacy protection in anonymized data:
- High F1 Score and TPR: Suggest that the anonymization process may not be robust enough, because a significant number of correct re-identifications are occurring.
- Low FDR and FPR: Indicate that the anonymization does not sufficiently confuse re-identification attempts, which can lead to privacy breaches.
Effective anonymization reduces the performance of both Membership Inference and Re-Identification Risk Attacks, as demonstrated by lower F1 scores and TPR in anonymized data. If your anonymized data shows contrary results, consider adjusting the anonymization parameters to enhance data privacy.
Privacy Risk Assessment
A precision-prioritized framework is used to classify risk levels based on:
- Precision
- AUC (Area Under the Curve) approximation
- Hamming distance tolerance
| Precision | AUC > 0.75 | AUC 0.60–0.75 | AUC 0.50–0.60 | AUC ≤ 0.50 |
| --- | --- | --- | --- | --- |
| > 0.75 | High Risk | High Risk | Medium Risk | Medium Risk |
| 0.60–0.75 | High Risk | High Risk | Medium Risk | Low Risk |
| 0.50–0.60 | Medium Risk | Medium Risk | Low Risk | No Risk |
| ≤ 0.50 | Low Risk | Low Risk | No Risk | No Risk |
Based on analysis results, well-anonymized datasets consistently fall into the “Low Risk” or “No Risk” categories.
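The risk table above can be read as a simple lookup. The following sketch encodes it under two assumptions that the source does not state: a value exactly on a shared boundary (0.60 or 0.75) falls into the lower band, and the Hamming distance tolerance is ignored because the table does not show how it enters the classification. The names `band` and `classify_risk` are hypothetical.

```python
import bisect

# Inclusive upper bounds of the three lower bands (assumed boundary handling).
_EDGES = [0.50, 0.60, 0.75]

# Rows: precision band, columns: AUC band.
# Band 0 is <= 0.50, band 1 is 0.50-0.60, band 2 is 0.60-0.75, band 3 is > 0.75.
_RISK = [
    ["No Risk", "No Risk", "Low Risk", "Low Risk"],
    ["No Risk", "Low Risk", "Medium Risk", "Medium Risk"],
    ["Low Risk", "Medium Risk", "High Risk", "High Risk"],
    ["Medium Risk", "Medium Risk", "High Risk", "High Risk"],
]


def band(value: float) -> int:
    """Map a value in [0, 1] to its band index (0 = lowest, 3 = highest)."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must be between 0 and 1")
    return bisect.bisect_left(_EDGES, value)


def classify_risk(precision: float, auc: float) -> str:
    """Look up the risk level for a precision and AUC pair."""
    return _RISK[band(precision)][band(auc)]


print(classify_risk(0.80, 0.80))  # High Risk
print(classify_risk(0.65, 0.45))  # Low Risk
print(classify_risk(0.40, 0.55))  # No Risk
```

Precision carries more weight than AUC in this matrix: for example, a precision at or below 0.50 never exceeds "Low Risk", whereas an AUC at or below 0.50 can still yield "Medium Risk" when precision is above 0.75.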
Takeaways
The anonymization process demonstrates strong protection by consistently reducing the ability of attacks to correctly identify individuals or confirm membership.
If your anonymized data shows an unexpectedly high TPR or F1 Score, or an unexpectedly low FDR and FPR, consider adjusting your anonymization parameters (for example, reducing epsilon or increasing k) to improve privacy.
Need help interpreting your risk analysis? Contact our support team.
We are continuously enhancing our anonymity verification tools. Stay tuned for improvements within the VEIL.AI Native App.