Risk Analysis

This page shows how effective our anonymization techniques are by presenting the results of two attacks: Membership Inference Attacks and Re-Identification Risk Attacks.

Membership Inference Attack

This attack assesses how well the anonymization process hides whether an individual's data was used in the dataset. The evaluation is configured with a sample size, an attack set size, and a prior knowledge threshold.
Attack Results Summary:
  • True Positives (TP): High TP in original data suggests effective identification of members; a decrease in TP in anonymized data indicates successful anonymization.
  • False Positives (FP): Low FP in anonymized data demonstrates precision in the attack, minimizing incorrect member flags.
  • True Negatives (TN) and False Negatives (FN): High TN and low FN in anonymized data reflect robust privacy protection.
  • Hamming Distance 0 to 5: Examines the progression from identical to significantly altered records. Effective anonymization results in a decrease in True Positives (TP) and an increase in False Negatives (FN), indicating strong anonymization measures.
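To make the Hamming distance buckets concrete, here is a minimal sketch in Python. It assumes records are compared attribute by attribute and that the distance is the number of attributes whose values differ. The function name `hamming_distance` and the sample records are hypothetical and are not part of the product.

```python
def hamming_distance(record_a, record_b):
    """Count the attribute positions where two equal-length records differ."""
    if len(record_a) != len(record_b):
        raise ValueError("records must have the same number of attributes")
    return sum(a != b for a, b in zip(record_a, record_b))


# Hypothetical records: (age, zip code, gender, diagnosis)
original = (34, "10115", "F", "asthma")
anonymized = (30, "101**", "F", "asthma")

print(hamming_distance(original, original))    # 0: identical records
print(hamming_distance(original, anonymized))  # 2: age and zip code differ
```

Under this reading, a distance of 0 means the two records match on every attribute, and each step up to 5 allows one more attribute to differ.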

Re-Identification Risk Attack

This attack goes further by trying to match anonymized records with actual individuals, offering a deeper look at potential re-identification risks.
Attack Results Summary:
  • True Positives (TP): A reduction in TP for anonymized data shows effective prevention of accurate re-identification.
  • False Positives (FP), True Negatives (TN), and False Negatives (FN): As with the Membership Inference Attack, desirable results include low FP and high TN in anonymized data, indicating strong anonymization.
  • Hamming Distance 0 to 5: Examines the progression from identical to significantly altered records. Effective anonymization results in a decrease in True Positives (TP) and an increase in False Negatives (FN), indicating strong anonymization measures.

Evaluating Anonymized Data Quality

Metrics such as F1 Score, True Positive Rate (TPR), False Positive Rate (FPR), and False Discovery Rate (FDR) are used to evaluate the quality of anonymization. These metrics range from 0 to 1.
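These four metrics follow the standard confusion-matrix definitions. The sketch below is an illustration in Python, not the product's implementation. The function name `attack_metrics` and the example counts are hypothetical, and returning 0.0 when a denominator is zero is an assumption made here for convenience.

```python
def attack_metrics(tp, fp, tn, fn):
    """Compute F1, TPR, FPR, and FDR from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Assumption: report 0.0 instead of failing when a denominator is zero.
        return numerator / denominator if denominator else 0.0

    return {
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "tpr": ratio(tp, tp + fn),  # share of actual positives the attack found
        "fpr": ratio(fp, fp + tn),  # share of actual negatives wrongly flagged
        "fdr": ratio(fp, fp + tp),  # share of flagged records that were wrong
    }


# Hypothetical counts for one attack run
metrics = attack_metrics(tp=10, fp=30, tn=70, fn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these example counts the attack finds only 10 of 100 actual positives, and 30 of the 40 records it flags are wrong.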
Strong privacy protection in Anonymized Data:
  • Low F1 Score and True Positive Rate (TPR): Indicates a strong anonymization level, with few correct re-identifications.
  • High False Discovery Rate (FDR) and False Positive Rate (FPR): Suggests many false alarms, which dilutes the accuracy of any attempts to re-identify data, enhancing privacy.
Poor privacy protection in Anonymized Data:
  • High F1 Score and TPR: Suggests that the anonymization process may not be robust enough, as a significant number of correct re-identifications are occurring.
  • Low FDR and FPR: Indicates that the anonymization is not effective enough to confuse the re-identification attempts, leading to potential privacy breaches.
Effective anonymization reduces the performance of both Membership Inference and Re-Identification Risk Attacks, as demonstrated by lower F1 scores and TPR in anonymized data. If your anonymized data shows contrary results, consider adjusting the anonymization parameters to enhance data privacy.
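The comparison described above can be expressed as a simple check. This is a hedged sketch: the function name `attack_weakened`, the dictionary keys, and the example values are hypothetical, and it assumes you already have the F1 score and TPR for the same attack run against the original and the anonymized data.

```python
def attack_weakened(original, anonymized):
    """Return True if both F1 and TPR are lower on the anonymized data."""
    return (
        anonymized["f1"] < original["f1"]
        and anonymized["tpr"] < original["tpr"]
    )


# Hypothetical metric values for one attack
original_metrics = {"f1": 0.82, "tpr": 0.90}
anonymized_metrics = {"f1": 0.14, "tpr": 0.10}

if attack_weakened(original_metrics, anonymized_metrics):
    print("Attack performance dropped after anonymization.")
else:
    print("Contrary result: consider adjusting the anonymization parameters.")
```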
If you would like help interpreting the Risk Analysis results, please contact us.
We are developing better tools for anonymity verification in our Native App.