In this tutorial, we demonstrate a practical data poisoning attack by manipulating labels in the CIFAR-10 dataset and observing the impact on model behavior. We build a clean and a poisoned training pipeline side by side, using a ResNet-style convolutional network to ensure stable, comparable learning dynamics. By selectively flipping a fraction of samples from a target class to a malicious class during training, we show how subtle corruption in the data pipeline can propagate into systematic misclassification at inference time.
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, Dataset
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report

CONFIG = {
    "batch_size": 128,
    "epochs": 10,
    "lr": 0.001,
    "target_class": 1,      # class whose training labels are flipped
    "malicious_label": 9,   # label assigned to the poisoned samples
    "poison_ratio": 0.4,    # fraction of target-class samples to flip
    # Use the GPU when available, otherwise fall back to the CPU
    "device": torch.device("cuda" if torch.cuda.is_available() else "cpu"),
}

torch.manual_seed(42)
np.random.seed(42)

We set up the core environment required for the experiment and define all global configuration parameters in a single place. We ensure reproducibility by fixing random seeds across PyTorch and NumPy. We also explicitly select the compute device so the tutorial runs efficiently on both CPU and GPU.
class PoisonedCIFAR10(Dataset):
    def __init__(self, original_dataset, target_class, malicious_label, ratio, is_train=True):
        self.dataset = original_dataset
        # np.array copies the label list, so the wrapped dataset is never modified
        self.targets = np.array(original_dataset.targets)
        self.is_train = is_train
        if is_train and ratio > 0:
            # Pick a random subset of the target class and relabel it
            indices = np.where(self.targets == target_class)[0]
            n_poison = int(len(indices) * ratio)
            poison_indices = np.random.choice(indices, n_poison, replace=False)
            self.targets[poison_indices] = malicious_label

    def __getitem__(self, index):
        # Keep the original image; only the label may differ
        img, _ = self.dataset[index]
        return img, int(self.targets[index])

    def __len__(self):
        return len(self.dataset)

We implement a custom dataset wrapper that enables controlled label poisoning during training. We selectively flip a configurable fraction of samples from the target class to a malicious class while keeping the test data untouched. We preserve the original image data so that only label integrity is compromised.
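To make the flipping step concrete, here is a minimal, dependency-free sketch of the same idea on a toy label list. The helper `flip_labels` and the toy labels are illustrative assumptions rather than part of the tutorial code, and the sketch uses Python's `random` module where the wrapper uses NumPy.

```python
import random


def flip_labels(labels, target_class, malicious_label, ratio, seed=42):
    """Return a copy of `labels` with a fraction of `target_class` entries relabeled.

    Hypothetical helper for illustration; it mirrors the logic of the dataset wrapper.
    """
    rng = random.Random(seed)
    flipped = list(labels)  # copy so the original labels stay intact
    target_idx = [i for i, y in enumerate(labels) if y == target_class]
    n_poison = int(len(target_idx) * ratio)  # truncates, like the wrapper
    for i in rng.sample(target_idx, n_poison):  # sample without replacement
        flipped[i] = malicious_label
    return flipped


# Toy data: 10 samples each of classes 1, 9, and 0.
toy_labels = [1] * 10 + [9] * 10 + [0] * 10
poisoned = flip_labels(toy_labels, target_class=1, malicious_label=9, ratio=0.4)

print(toy_labels.count(1), poisoned.count(1))  # 10 6
print(toy_labels.count(9), poisoned.count(9))  # 10 14
```

The images themselves are never touched: the only change is that four of the ten class-1 labels now read 9, which is exactly the kind of corruption the wrapper applies at CIFAR-10 scale.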
def get_model():
    model = torchvision.models.resnet18(num_classes=10)
    # Adapt the stem to 32x32 inputs: smaller first convolution, no max pooling
    model.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
    model.maxpool = nn.Identity()
    return model.to(CONFIG["device"])

def train_and_evaluate(train_loader, description):
    model = get_model()
    optimizer = optim.Adam(model.parameters(), lr=CONFIG["lr"])
    criterion = nn.CrossEntropyLoss()
    for epoch in range(CONFIG["epochs"]):
        model.train()
        running_loss = 0.0
        for images, labels in train_loader:
            images = images.to(CONFIG["device"])
            labels = labels.to(CONFIG["device"])
            optimizer.zero_grad()
            outputs = model(images)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        # Report the mean batch loss so the two runs can be compared
        print(f"{description} | epoch {epoch + 1}: loss {running_loss / len(train_loader):.4f}")
    return model

We define a lightweight ResNet-based model tailored for CIFAR-10 and implement the full training loop. We train the network using standard cross-entropy loss and Adam optimization to ensure stable convergence. We keep the training logic identical for clean and poisoned data to isolate the effect of data poisoning.
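One way to see why the first layers are replaced is to work out the feature-map size after the stem. The sketch below applies the usual output-size formula for convolution and pooling layers; the helper `conv_out_size` is an illustrative assumption, and the stock stem parameters are those of torchvision's default ResNet-18.

```python
def conv_out_size(n, kernel, stride, padding):
    """Spatial output size of a conv or pooling layer (floor mode, dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1


cifar_size = 32  # CIFAR-10 images are 32x32

# Stock torchvision ResNet-18 stem: 7x7 conv (stride 2, padding 3),
# followed by a 3x3 max pool (stride 2, padding 1).
stock = conv_out_size(cifar_size, kernel=7, stride=2, padding=3)
stock = conv_out_size(stock, kernel=3, stride=2, padding=1)

# Modified stem from get_model(): 3x3 conv (stride 1, padding 1), no pooling.
modified = conv_out_size(cifar_size, kernel=3, stride=1, padding=1)

print(stock, modified)  # 8 32
```

With the stock stem, a 32x32 image shrinks to 8x8 before the first residual block. The modified stem keeps the full 32x32 resolution, which is the usual motivation for this adaptation on small images.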
def get_predictions(model, loader):
    model.eval()
    preds, labels_all = [], []
    with torch.no_grad():
        for images, labels in loader:
            images = images.to(CONFIG["device"])
            outputs = model(images)
            # The predicted class is the index of the largest logit
            _, predicted = torch.max(outputs, 1)
            preds.extend(predicted.cpu().numpy())
            labels_all.extend(labels.numpy())
    return np.array(preds), np.array(labels_all)

def plot_results(clean_preds, clean_labels, poisoned_preds, poisoned_labels, classes):
    fig, ax = plt.subplots(1, 2, figsize=(16, 6))
    for i, (preds, labels, title) in enumerate([
        (clean_preds, clean_labels, "Clean Model Confusion Matrix"),
        (poisoned_preds, poisoned_labels, "Poisoned Model Confusion Matrix")
    ]):
        # Rows are true classes, columns are predicted classes
        cm = confusion_matrix(labels, preds)
        sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", ax=ax[i],
                    xticklabels=classes, yticklabels=classes)
        ax[i].set_title(title)
    plt.tight_layout()
    plt.show()

We run inference on the test set and collect predictions for quantitative analysis. We compute confusion matrices to visualize class-wise behavior for both clean and poisoned models. We use these visual diagnostics to highlight targeted misclassification patterns introduced by the attack.
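As a rough illustration of what the heatmaps encode, the following sketch builds a confusion matrix by hand in plain Python. The helper `confusion_counts` and the toy predictions are made up for illustration and are not results from the experiment; the row and column convention matches scikit-learn's `confusion_matrix` (rows are true labels, columns are predictions).

```python
def confusion_counts(y_true, y_pred, n_classes):
    """Build an n_classes x n_classes matrix: rows are true labels, columns are predictions."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


# Made-up labels for a 3-class toy problem in which class 1 is
# frequently mistaken for class 2.
y_true = [0, 0, 1, 1, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 1, 2, 2]

cm = confusion_counts(y_true, y_pred, n_classes=3)
for row in cm:
    print(row)
# [2, 0, 0]
# [0, 2, 2]
# [0, 0, 2]
```

A targeted label-flipping attack would be expected to show up the same way: extra mass in the single off-diagonal cell at the target-class row and the malicious-label column, while the other rows stay mostly diagonal.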
transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),
                         (0.2023, 0.1994, 0.2010))
])

base_train = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
base_test = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)
# Human-readable class names for the plots and reports
classes = base_train.classes

clean_ds = PoisonedCIFAR10(base_train, CONFIG["target_class"], CONFIG["malicious_label"], ratio=0)
poison_ds = PoisonedCIFAR10(base_train, CONFIG["target_class"], CONFIG["malicious_label"], ratio=CONFIG["poison_ratio"])

clean_loader = DataLoader(clean_ds, batch_size=CONFIG["batch_size"], shuffle=True)
poison_loader = DataLoader(poison_ds, batch_size=CONFIG["batch_size"], shuffle=True)
test_loader = DataLoader(base_test, batch_size=CONFIG["batch_size"], shuffle=False)

clean_model = train_and_evaluate(clean_loader, "Clean Training")
poisoned_model = train_and_evaluate(poison_loader, "Poisoned Training")

c_preds, c_true = get_predictions(clean_model, test_loader)
p_preds, p_true = get_predictions(poisoned_model, test_loader)

plot_results(c_preds, c_true, p_preds, p_true, classes)

# Report only the targeted class; target_names must line up with labels
target = CONFIG["target_class"]
print(classification_report(c_true, c_preds, labels=[target], target_names=[classes[target]]))
print(classification_report(p_true, p_preds, labels=[target], target_names=[classes[target]]))

We prepare the CIFAR-10 dataset, construct clean and poisoned dataloaders, and execute both training pipelines end to end. We evaluate the trained models on a shared test set to ensure a fair comparison. We finalize the analysis by reporting class-specific precision and recall to expose the impact of poisoning on the targeted class.
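The per-class numbers in the report can also be computed directly, which makes it easier to add an attack-specific measure. The sketch below is a plain-Python illustration on made-up predictions; the helper `class_metrics` and the "flip rate" (the share of true target-class samples predicted as the malicious label) are assumptions introduced here, not metrics reported by the tutorial.

```python
def class_metrics(y_true, y_pred, target_class, malicious_label):
    """Return (precision, recall, flip_rate) for one class.

    flip_rate is a hypothetical attack measure: the fraction of true
    target-class samples that the model assigns to the malicious label.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target_class and p == target_class)
    fp = sum(1 for t, p in pairs if t != target_class and p == target_class)
    fn = sum(1 for t, p in pairs if t == target_class and p != target_class)
    redirected = sum(1 for t, p in pairs if t == target_class and p == malicious_label)
    support = tp + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / support if support else 0.0
    flip_rate = redirected / support if support else 0.0
    return precision, recall, flip_rate


# Made-up labels: five true samples of class 1, two of which are predicted as 9.
y_true = [1, 1, 1, 1, 1, 9, 9, 9, 0, 0]
y_pred = [1, 1, 9, 9, 0, 9, 9, 9, 0, 1]

precision, recall, flip_rate = class_metrics(y_true, y_pred, target_class=1, malicious_label=9)
print(f"precision={precision:.2f} recall={recall:.2f} flip_rate={flip_rate:.2f}")
# precision=0.67 recall=0.40 flip_rate=0.40
```

Comparing the flip rate of the clean and poisoned models separates errors caused by the attack from ordinary confusion between the two classes.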
In conclusion, we observed how label-level data poisoning degrades class-specific performance without necessarily destroying overall accuracy. We analyzed this behavior using confusion matrices and per-class classification reports, which reveal targeted failure modes introduced by the attack. This experiment reinforces the importance of data provenance, validation, and monitoring in real-world machine learning systems, especially in safety-critical domains.
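As one small example of the validation mentioned above, a label-count audit can catch this particular attack, because the CIFAR-10 training set contains exactly 5,000 images per class. The sketch below is a hedged illustration: the helper `audit_class_counts` and its 10% tolerance are assumptions, and the label list is simulated from the tutorial's configuration instead of being read from the dataset.

```python
from collections import Counter


def audit_class_counts(labels, expected_per_class, tolerance=0.1):
    """Flag classes whose count deviates from the expected count by more than `tolerance`."""
    counts = Counter(labels)
    flagged = {}
    for cls, expected in expected_per_class.items():
        observed = counts.get(cls, 0)
        if abs(observed - expected) > tolerance * expected:
            flagged[cls] = observed
    return flagged


# Simulate the poisoned label distribution: 40% of class 1 relabeled as class 9.
n_poison = int(5000 * 0.4)
poisoned_labels = []
for cls in range(10):
    count = 5000
    if cls == 1:
        count -= n_poison
    if cls == 9:
        count += n_poison
    poisoned_labels += [cls] * count

expected = {cls: 5000 for cls in range(10)}
print(audit_class_counts(poisoned_labels, expected))  # {1: 3000, 9: 7000}
```

A check like this only works when the expected class balance is known and the attacker does not rebalance the data, so it complements provenance tracking and monitoring without replacing them.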
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform boasts over 2 million monthly views, illustrating its popularity among audiences.