Where is my cat?
YOLO, say the data scientists. You only look once because you only live once. Grab a coffee, kick the cat off the couch, and join us in the world of computer vision.
The Origin of the World
Almost.
The origin of the current bubble, yes.
What bubble?
The artificial intelligence bubble we are living in right now, of course.
The performance of computer vision tools had been improving for a few years.
It really took off in 2012.
AlexNet
You don't know Alex Krizhevsky.
Like Yann LeCun, he worked under the supervision of Geoffrey Hinton.
The Geoffrey Hinton: British-Canadian, 2018 Turing Award winner, formerly of Google.
Alex was a doctoral student in 2012. He was working on deep neural networks. AlexNet won the ImageNet computer vision competition by crushing the field.
What a poser!
My uncle used to design world champion Othello programs.
Comp'oth, he called it.
It was in the 1980s.
Do I make a big deal out of it?
Of course not.
This AlexNet, honestly, what a poser! Moreover, it was inspired by LeNet.
LeNet
In 1998, soccer player Zidane became an idol.
Yann LeCun was at AT&T Labs, in the United States.
He was not yet Chief AI Scientist at Meta.
He had not yet won the 2018 Turing Award.
He was not trolling on Twitter.
Was he watching football matches?
No, he was working.
He was working on character recognition. He designed a program capable of reading the amounts written on checks. That program was a neural network. In 1998, LeCun published LeNet-5.
What's it for?
Deep neural networks are the right tools for detecting and recognizing objects in images. They can also label the elements of an image.
They are very good at recognizing patterns across huge numbers of images. Their performance beats that of humans. These programs are therefore superhuman. On these tasks. And only on these tasks.
I'm going easy so as not to set off Laurent Alexandre.
What's it for me?
Do you need to count a few cats in an image?
Have a human do it.
Do you need to do it on a million images?
Have a deep neural network do it.
Between the two, it's more complicated.
I invite you to listen to Antonio Casilli talk about micro-work, in this video for example.
It's Important
Detection, identification, recognition.
Many real-world applications benefit from these capabilities.
The big internet companies make extensive use of them, especially on social networks.
They help doctors analyze X-ray images, interpret MRIs, and so on.
And you, too, use them every day.
These algorithms help us take beautiful photos.
And I suspect some TVs embed these programs.
It. Just. Works.
The field has evolved very rapidly.
There are software components available, off-the-shelf, ready to plug into applications and products. Data scientists use the YOLO models. You only look once.
A glance is enough to analyze your images.
These are pre-trained models.
With genuinely impressive performance.
The Nirvana of computer vision.
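To make it concrete, here is a minimal sketch of the cat-counting job from earlier, assuming the ultralytics Python package and its COCO pre-trained yolov8n.pt weights. The image paths are made up, and any detector exposing class labels and confidence scores would do just as well.

from ultralytics import YOLO

# A small model pre-trained on COCO, which happens to include a "cat" class.
model = YOLO("yolov8n.pt")

def count_cats(image_paths, min_conf=0.5):
    """Total number of cats detected across the given images."""
    total = 0
    for result in model(image_paths):          # one Results object per image
        for box in result.boxes:               # one entry per detected object
            label = model.names[int(box.cls)]  # class id -> human-readable label
            if label == "cat" and float(box.conf) >= min_conf:
                total += 1
    return total

print(count_cats(["living_room.jpg", "garden.jpg"]))

Point it at a million images and it will not complain. It will not ask for a coffee break either.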
Come as You Are
The tools are recent.
The YOLO series dates back to 2016.
An eternity in the field of machine learning!
Today, systems are easily set up.
The project owner provides their data and details their needs.
Data technicians just need to plug in the right algorithms.
Well, almost.
Come as You Were
The algorithms are pre-trained.
They don't start from scratch.
They still need to be adapted to the case at hand.
It's better to provide training data to the algorithms.
Data that's representative of the target data.
Designing a system to detect apples in an orchard does not require the same work as designing a system to detect scratches on a car.
You will have guessed as much.
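For the apples in the orchard, adapting a pre-trained model might look like the sketch below. It assumes the same ultralytics package and a hypothetical apples.yaml file pointing at annotated images that actually come from the orchard. A sketch under those assumptions, not a turnkey recipe.

from ultralytics import YOLO

# Start from pre-trained weights: the model does not start from scratch.
model = YOLO("yolov8n.pt")

# Fine-tune on data representative of the target: apples, trees, real orchard light.
# "apples.yaml" is a hypothetical dataset description (image paths + labels).
model.train(data="apples.yaml", epochs=50, imgsz=640)

# The adapted weights (best.pt) are written to the training run directory
# and can be reloaded later with YOLO("path/to/best.pt").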
As I Want You to Be
Are the training steps over?
Are you twitching with impatience to go into production?
Hold on, padawan.
Have you verified that the performance is good?
Have you identified the biases in your model?
Have you set up a process for monitoring model drift?
Do you have a way to guard against regressions?
It's a difficult art.
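By way of illustration, here is a minimal go/no-go check before shipping, assuming the fine-tuned model from the previous sketch and a validation split described in the same hypothetical apples.yaml. The threshold is an arbitrary example, and passing it says nothing about biases or drift.

from ultralytics import YOLO

# Load the fine-tuned model produced by the previous sketch.
model = YOLO("runs/detect/train/weights/best.pt")

# Evaluate on the held-out validation data.
metrics = model.val(data="apples.yaml")
map50 = metrics.box.map50                  # mean average precision at IoU 0.5
print(f"mAP@0.5 = {map50:.3f}")

# Crude regression guard: refuse to ship a model worse than the last release.
PREVIOUS_MAP50 = 0.80                      # hypothetical baseline from the previous model
assert map50 >= PREVIOUS_MAP50, "Performance regression: do not deploy."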
Limits?
Computer vision works well.
Very well, even.
There's a hitch.
Slight.
We still don't understand what these systems learn, or how. Where is the description of a cat encoded in a neural network? Hard to say.
Cats love to hide, and they adore dark places.