mcottondesign

Loving Open-Source One Anonymous Function at a Time.

Going Big by Going Small

I am winding down the first phase of a customer pilot and wanted to celebrate a little before getting wrapped up in phase two.

This project needs some AI, which in practice means data science, computer vision, and machine learning. Several data sources need to be integrated into a dashboard and data explorer. The interesting part is that this is a great use case for my earlier classifier work.

The idea behind my classifier project is this: instead of a single large model trained on a huge dataset, what about a small model for each camera? A huge training dataset is normally what lets a model generalize and avoid overfitting.

A security camera's view doesn't change often (if at all), so its model has less need to generalize. By embracing overfitting you can get great results with very modest training data. In fact, when training a binary classifier, you can get great results starting with tens of verified images in each class.
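To make that concrete, here is a minimal sketch of what a per-camera binary classifier could look like. It is my own illustration, not the pilot's actual code: it assumes PyTorch/torchvision, a frozen pretrained ResNet-18 backbone with a new two-class head, and a made-up folder layout of `data/<camera_id>/{positive,negative}/`.

```python
# Sketch: train a tiny per-camera binary classifier from a few dozen verified
# images per class. Assumes data_dir contains two subfolders, one per class.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

def train_camera_classifier(data_dir: str, epochs: int = 20) -> nn.Module:
    tfm = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
    ])
    ds = datasets.ImageFolder(data_dir, transform=tfm)   # two folders -> two classes
    loader = DataLoader(ds, batch_size=8, shuffle=True)

    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    for p in model.parameters():                          # freeze the backbone;
        p.requires_grad = False                           # only the new head learns
    model.fc = nn.Linear(model.fc.in_features, 2)         # binary classification head

    opt = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            opt.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            opt.step()
    return model
```

With only the small head being trained, a few dozen images per class is often enough to lock onto the fixed scene of a single camera, which is exactly the overfitting being embraced here.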

If you correct misclassifications, retrain, and repeat, after several iterations you end up with a semi-supervised classifier.
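Here is a hedged sketch of that review loop. The helpers `review_and_label()` and `retrain()` are hypothetical stand-ins for whatever the real pipeline does; the idea is just to score new frames, queue the uncertain ones for a human, and fold the corrections back into the training set.

```python
# Sketch of the semi-supervised iteration: flag low-confidence frames for human
# review, then retrain on the corrected labels. Helper callables are hypothetical.
import torch
import torch.nn.functional as F

def review_loop(model, new_frames, review_and_label, retrain, threshold=0.9):
    queued = []
    model.eval()
    with torch.no_grad():
        for frame in new_frames:                   # frame: a (1, 3, 224, 224) tensor
            probs = F.softmax(model(frame), dim=1)
            confidence, predicted = probs.max(dim=1)
            if confidence.item() < threshold:      # uncertain -> send to human review
                queued.append((frame, predicted.item()))
    corrections = review_and_label(queued)         # human verifies or fixes the labels
    return retrain(model, corrections)             # fold corrections into training data
```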

What are the downsides of this approach? By embracing overfitting, you risk misclassifications the first time the model encounters something new. There is no telling what it will do the first time it sees rain or bugs. This can be mitigated by continuing the semi-supervised approach: anomalies and outliers get reviewed, corrected, and folded back into the training data.