ZFancy/TARF

Decoupling the Class Label and the Target Concept in Machine Unlearning

Abstract

Machine unlearning, an emerging research topic driven by data regulations, aims to adjust a trained model to approximate one retrained without a portion of the training data. Previous studies showed that class-wise unlearning succeeds in forgetting the knowledge of a target class, through gradient ascent on the forgetting data or fine-tuning with the remaining data. However, while these methods are useful, they are insufficient because the class label and the target concept are often assumed to coincide. In this work, we expand the scope by considering label domain mismatch and investigate three problems beyond conventional all-matched forgetting, namely target mismatch, model mismatch, and data mismatch forgetting. We systematically analyze the new challenges in restrictively forgetting the target concept and reveal crucial forgetting dynamics at the representation level for realizing these tasks. Based on that, we propose a general framework, TARget-aware Forgetting (TARF), which enables these additional tasks to actively forget the target concept while maintaining the rest, by simultaneously conducting annealed gradient ascent on the forgetting data and selective gradient descent on the hard-to-affect remaining data. Empirically, extensive experiments under the newly introduced settings demonstrate the effectiveness of TARF.
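The two ingredients named in the abstract (annealed gradient ascent on the forgetting data, selective descent on hard-to-affect remaining data) can be sketched as a combined objective. This is a minimal illustrative sketch, not the paper's implementation; `anneal_coeff`, `select_hard_remaining`, and `tarf_objective` are hypothetical helper names, and the linear annealing schedule and top-loss selection rule are assumptions.

```python
def anneal_coeff(epoch, total_epochs):
    """Linearly annealed weight for the gradient-ascent term (assumed schedule)."""
    return max(0.0, 1.0 - epoch / total_epochs)

def select_hard_remaining(losses, k=0.02):
    """Pick the top-k fraction of remaining samples with the highest loss,
    i.e., the 'hard-to-affect' samples that forgetting perturbs most."""
    n_keep = max(1, int(len(losses) * k))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return ranked[:n_keep]

def tarf_objective(forget_loss, remain_losses, epoch, total_epochs, k=0.02):
    """Minimizing this performs annealed ascent on the forgetting data
    (negative sign) and descent on the selected remaining data."""
    gamma = anneal_coeff(epoch, total_epochs)
    hard = select_hard_remaining(remain_losses, k)
    remain_term = sum(remain_losses[i] for i in hard) / len(hard)
    return -gamma * forget_loss + remain_term
```

As the annealing coefficient decays to zero, the objective smoothly reduces to plain fine-tuning on the selected remaining data.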

Overview

This repository contains the source code of our paper in ICLR'26: Decoupling the Class Label and the Target Concept in Machine Unlearning.

In this work, we explore unlearning under label domain mismatch among the forgetting data, the target concept, and the model representation.

Label domain mismatch settings

Figure 1. Illustrations of decoupling the class label and the target concept.

Taking the CIFAR-100 dataset with its classes and superclasses (two different label domains, modeling different taxonomies of the unlearning and pre-training tasks) as an example, we instantiate four tasks given the same forgetting data with the class labels “boy” and “girl”: a) all matched forgetting (the conventional scenario): unlearn “boy” and “girl” with a model trained on the classes; b) target mismatch forgetting: unlearn “people” with a model trained on the classes; c) model mismatch forgetting: unlearn “boy” and “girl” with a model trained on the superclasses; d) data mismatch forgetting: unlearn “people” with a model trained on the superclasses.
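The running example can be made concrete with the CIFAR-100 class-to-superclass mapping (the “people” superclass contains baby, boy, girl, man, and woman). The snippet below is only an illustration of how the mismatch arises; the variable names are ours, not the repository's.

```python
# Fragment of the CIFAR-100 class -> superclass mapping ("people" only).
SUPERCLASS = {"baby": "people", "boy": "people", "girl": "people",
              "man": "people", "woman": "people"}  # 95 other classes omitted

forgetting_labels = {"boy", "girl"}   # class labels of the forgetting data
target_concept = "people"             # concept to forget in tasks (b) and (d)

# In target/data mismatch forgetting, samples of the same superclass that are
# absent from the forgetting set carry the target concept but look like
# retaining data -- the "false retaining data" discussed below.
false_retaining = {c for c, s in SUPERCLASS.items()
                   if s == target_concept and c not in forgetting_labels}
```

Here `false_retaining` contains “baby”, “man”, and “woman”: classes that share the target concept with the forgetting data but are labeled as retaining data.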

Challenges in mismatch forgetting

Figure 2. The challenges of restrictive unlearning with the mismatched label domains.

We run various unlearning methods on the four tasks. In conventional all-matched forgetting, all methods perform similarly to the retrained model. In contrast, model mismatch forgetting is affected by the trained model, which couples the behaviors on the forgetting data and the affected retaining data and leaves a smaller accuracy gap between them. In target or data mismatch forgetting, the class labels cannot fully represent the target concept, leaving false retaining data (which belongs to the target concept) incompletely forgotten.

Representation-level forgetting dynamics

Figure 3. Forgetting dynamics on entangled/under-entangled feature representations of trained model.

We show t-SNE visualizations of the features learned by the model pre-trained on (left) the superclasses and (right) the classes, along with the averaged loss of the forgetting data, the concept-/class-aligned data, and the remaining data during GA on the two representations.
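The per-group loss curves in Figure 3 amount to averaging per-sample losses within each group at every step of gradient ascent. A generic bookkeeping sketch (our own helper, not the repository's code) could look like:

```python
from collections import defaultdict

def track_group_losses(per_sample_losses, groups):
    """Average per-sample losses within each group, e.g. 'forget',
    'aligned' (concept-/class-aligned), and 'remain'."""
    sums, counts = defaultdict(float), defaultdict(int)
    for loss, group in zip(per_sample_losses, groups):
        sums[group] += loss
        counts[group] += 1
    return {g: sums[g] / counts[g] for g in sums}
```

Calling this once per GA step and plotting the returned averages reproduces the kind of forgetting-dynamics curves shown in the figure.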

Requirements

Basic prerequisites

  • python 3.8
  • pytorch 1.8+
  • einops
  • datasets

More environment details can be found in requirements.txt.

Code Structure

The source code is organized as follows:

evaluation: contains MIA evaluation code.

models: contains the model definitions.

pretrain_base: saves the model pre-trained by classes or superclass.

dataset.py: contains the dataset constructions.

utils.py: contains the utility functions.

main_forget_partial.py: contains the main executable code for unlearning.

Command

For the experiments, we provide specific commands with detailed trial parameters in run_cifar10_automo.sh and run_cifar100_people.sh.

Original-train

Classes

python -u main_imp.py --data ~/dataset/cifar-10 --dataset cifar10 --num_classes 10 --arch resnet18 --prune_type rewind_lt --rewind_epoch 8 --save_dir ./pretrain_base/pretrain_cifar10_omp0 --rate 0.0 --pruning_times 1 --num_workers 8

python -u main_imp.py --data ~/dataset/cifar-100 --dataset cifar100 --num_classes 100 --arch resnet18 --prune_type rewind_lt --rewind_epoch 8 --save_dir ./pretrain_base/pretrain_cifar100_omp0 --rate 0.0 --pruning_times 1 --num_workers 8

Superclass

python -u main_imp.py --data ~/dataset/cifar-10 --dataset cifar10_sub --num_classes 5 --arch resnet18 --prune_type rewind_lt --rewind_epoch 8 --save_dir ./pretrain_base/pretrain_cifar10_sub_omp0 --rate 0.0 --pruning_times 1 --num_workers 8

python -u main_imp.py --data ~/dataset/cifar-100 --dataset cifar100_sub --num_classes 20 --arch resnet18 --prune_type rewind_lt --rewind_epoch 8 --save_dir ./pretrain_base/pretrain_cifar100_sub_omp0 --rate 0.0 --pruning_times 1 --num_workers 8

Unlearning

FT

python -u main_forget_partial.py --data ~/dataset/cifar-10 --dataset cifar10 --save_dir ${save_dir} --mask ${mask_path} --unlearn FT --num_indexes_to_replace 4500 --class_to_replace 1 --unlearn_lr 0.01 --unlearn_epochs 10

GA

python -u main_forget_partial.py --data ~/dataset/cifar-10 --dataset cifar10 --save_dir ${save_dir} --mask ${mask_path} --unlearn GA --num_indexes_to_replace 4500 --class_to_replace 1 --unlearn_lr 0.0001 --unlearn_epochs 5

TARF

python -u main_forget_partial.py --data ~/dataset/cifar-10 --dataset cifar10 --save_dir ${save_dir} --mask ${mask_path} --unlearn TARF --num_indexes_to_replace 4500 --class_to_replace 1 --unlearn_lr 0.01 --unlearn_epochs 10 --k=0.02 --no-l1-epochs 2

Citation

@inproceedings{
zhu2026decoupling,
title={Decoupling the Class Label and the Target Concept in Machine Unlearning},
author={Jianing Zhu and Bo Han and Jiangchao Yao and Jianliang Xu and Gang Niu and Masashi Sugiyama},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026}
}

Acknowledgement

The code is developed based on L1-Sparse [1] (MIT license). We thank the authors for their open-source implementation and instructions on data preparation.

[1] Model Sparsity Can Simplify Machine Unlearning. NeurIPS'23 spotlight.
