AI Classification Project
Colorized MNIST experiments test whether a Keras classifier relies on digit geometry, color relationships, or brittle distribution cues.
- Institution: Arts et Metiers / ENSAM
- Team: Maxime Hache, Hassan Osman

- Test & Validation
- AI & Machine Learning

Overview
The team colorized grayscale MNIST digits by assigning random background colors and selected digit colors, then trained a Keras model on the 60,000 training images and evaluated it on the 10,000 test images.
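The colorization step can be sketched as follows. This is a minimal illustration, not the team's code: the linear blend between digit and background color, the function name `colorize`, and the synthetic rectangle standing in for a real digit are all assumptions.

```python
import numpy as np

def colorize(gray, digit_color, background_color):
    """Blend a 28x28 grayscale digit into a 28x28x3 RGB image.

    gray: float array in [0, 1], where 1 means full digit ink.
    digit_color, background_color: RGB triples in [0, 1].
    Assumption: pixels are a linear mix of the two colors.
    """
    ink = gray[..., np.newaxis]                      # shape (28, 28, 1)
    fg = np.asarray(digit_color, dtype=np.float32)
    bg = np.asarray(background_color, dtype=np.float32)
    return ink * fg + (1.0 - ink) * bg               # shape (28, 28, 3)

rng = np.random.default_rng(0)
gray = np.zeros((28, 28), dtype=np.float32)
gray[10:18, 12:16] = 1.0                             # stand-in for a real digit
background = rng.random(3)                           # random background color
rgb = colorize(gray, digit_color=(1.0, 0.0, 0.0), background_color=background)
```

With real data, `gray` would be an MNIST image scaled to [0, 1], and the function would be applied per image with a freshly drawn background color.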
Challenge
The central question was interpretability: when the model classifies digits, does it learn digit geometry, or does it learn color relationships that fail under distribution shift?
Process
The report compares normal colorized classification, color inversion, unseen digit-color tests, color-label classification, and grayscale conversion.
Engineering Details
Python, Keras, MNIST, RGB image generation, train/test datasets, confusion matrices, epochs, and grayscale conversion using CIE-inspired coefficients.
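A grayscale conversion of this kind can be sketched as a weighted sum over the color channels. The report does not list its coefficients, so the Rec. 709 luminance weights below are an assumption chosen as one common CIE-based option, and `to_grayscale` is a hypothetical helper name.

```python
import numpy as np

# Assumed weights (Rec. 709 luminance); the report's exact
# coefficients are not given.
LUMA_WEIGHTS = np.array([0.2126, 0.7152, 0.0722], dtype=np.float32)

def to_grayscale(rgb):
    """Collapse (..., 3) RGB arrays to (...) luminance values."""
    return rgb @ LUMA_WEIGHTS

batch = np.zeros((2, 28, 28, 3), dtype=np.float32)
batch[0] = (1.0, 0.0, 0.0)    # pure red image
batch[1] = (0.0, 1.0, 0.0)    # pure green image
gray = to_grayscale(batch)    # shape (2, 28, 28)
```

Because the weights differ per channel, equally saturated colors map to different gray levels, so some color information can survive the conversion as brightness.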
Implementation
Images were expanded from 28 x 28 grayscale matrices to 28 x 28 x 3 RGB inputs for the color experiments.
Testing
The report states 98.61 percent accuracy on the first colorized digit task, 33 percent after swapping foreground and background colors, 90.56 percent on a fixed unseen-color split, 100 percent on color classification, and about 97.5 percent on grayscale inversion.
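The foreground/background swap used in the inversion test can be illustrated with a short sketch. It assumes each image is a linear blend of one digit color and one background color, which the report does not confirm; `swap_colors` is a hypothetical helper and the rectangle is a stand-in for a digit.

```python
import numpy as np

def swap_colors(rgb, digit_color, background_color):
    """Exchange digit and background colors of a linearly blended image.

    If rgb = ink * fg + (1 - ink) * bg, then the swapped image
    ink * bg + (1 - ink) * fg equals fg + bg - rgb.
    """
    fg = np.asarray(digit_color, dtype=np.float32)
    bg = np.asarray(background_color, dtype=np.float32)
    return fg + bg - rgb

fg, bg = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)    # red digit on blue background
rgb = np.empty((28, 28, 3), dtype=np.float32)
rgb[:] = bg
rgb[10:18, 12:16] = fg                        # stand-in digit
swapped = swap_colors(rgb, fg, bg)            # blue digit on red background
```

The digit's shape is unchanged by the swap, so a classifier that relied on geometry alone would be expected to keep its accuracy on the swapped images.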
Outcomes
The model performed well on familiar distributions but was sensitive to color inversion in RGB, reinforcing the need to test dataset shifts instead of trusting headline accuracy alone.
Next steps are to explore alternate network structures, more systematic color splits, early stopping, and visual explanations before drawing stronger interpretability conclusions.
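One way to make shift testing routine is to report accuracy per evaluation split from a confusion matrix rather than a single headline number. The sketch below uses made-up labels and predictions purely for illustration; they are not the project's results, and the split names are hypothetical.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes=10):
    """Rows are true labels, columns are predicted labels."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (y_true, y_pred), 1)    # count each (true, predicted) pair
    return cm

def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    return np.trace(cm) / cm.sum()

# Toy data: one sample per digit class, two hypothetical splits.
y_true = np.arange(10)
splits = {
    "in_distribution": np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),
    "colors_swapped":  np.array([0, 1, 7, 3, 9, 5, 0, 7, 3, 4]),
}
report = {name: accuracy(confusion_matrix(y_true, pred))
          for name, pred in splits.items()}
```

In practice the predictions for each split would come from the trained model, and the off-diagonal entries show which digits are confused under each shift.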




