Facebook has introduced a new project called “Learning from Videos” to help its AI learn more like humans do. The company wants to stop relying on labeled data, which it describes as a fundamental bottleneck on the pace of AI innovation. Labeled data is not only hard to gather but also limited in supply. With that in mind, Facebook is trying to leverage publicly available videos.
Every day, people around the globe share lots of videos on Facebook products, and Facebook wants to train its AI on that data. Doing so will not only improve the performance of its models but also help them learn many new things about users, such as their country’s culture, language, hobbies, interests, preferences in brands and clothes, and other personal details. Facebook already has access to such information through its current ad-targeting operation, but videos could be an especially rich and invasive source of data.
In fact, Facebook is already making use of them. Within six months of developing a state-of-the-art, self-supervised framework for video understanding, the company says it was able to improve its computer vision and speech recognition systems. It has deployed an AI model in Instagram’s Reels recommendation system that showed positive results in A/B testing, and it reports a 20 percent reduction in speech recognition errors.
Facebook says popular videos often share common patterns: they feature the same music and dance moves, but are created and performed by different people. Self-supervised models automatically learn these “themes” and group similar videos together, so the system can recommend better clips based on what you recently watched while filtering out duplicates. This requires a model to learn the relationship between the sound and the images in a video, so Facebook is also improving its speech recognition technology, which will help with automatic video captioning and with flagging harmful content such as hate speech.
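To make the idea of grouping by theme and filtering duplicates more concrete, here is a minimal sketch in Python. It is not Facebook’s actual system: it simply assumes that some model has already turned each clip into a numeric embedding vector, and every name, vector, and threshold below (`recommend`, `dup_threshold`, the toy clips) is hypothetical. Clips whose embeddings point in a similar direction are treated as sharing a theme, and clips that are almost identical to something already seen are treated as duplicates.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(watched, candidates, top_k=2, dup_threshold=0.999):
    """Rank candidate clips by similarity to recently watched clips.

    `watched` and `candidates` map clip ids to embedding vectors.
    The embeddings and the threshold are illustrative assumptions.
    """
    # Average the watched embeddings into a crude "theme" profile.
    dim = len(next(iter(watched.values())))
    profile = [sum(v[i] for v in watched.values()) / len(watched) for i in range(dim)]

    ranked = sorted(candidates.items(),
                    key=lambda item: cosine(profile, item[1]),
                    reverse=True)

    picked = []
    for clip_id, emb in ranked:
        # Skip clips that nearly match something already watched or picked.
        seen = list(watched.values()) + [candidates[p] for p in picked]
        if any(cosine(emb, s) >= dup_threshold for s in seen):
            continue
        picked.append(clip_id)
        if len(picked) == top_k:
            break
    return picked


# Toy data: three dance clips, one exact re-upload, one unrelated clip.
watched = {"dance_a": [0.9, 0.1, 0.0]}
candidates = {
    "dance_a_reupload": [0.9, 0.1, 0.0],
    "dance_b": [0.8, 0.2, 0.1],
    "cooking": [0.0, 0.1, 0.9],
    "dance_c": [0.7, 0.3, 0.0],
}
print(recommend(watched, candidates))
```

In this toy run the re-upload is dropped as a duplicate, the cooking clip ranks last because it does not match the dance theme, and the two other dance clips are recommended. A production system would learn the embeddings from audio and video jointly and use far more robust duplicate detection.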
But according to a recent MIT Technology Review report, social networks’ content recommendation algorithms, with their emphasis on growth and user engagement, lead to the spread of misinformation and encourage political polarization. As Technology Review puts it: “The machine learning models that maximize engagement also favor controversy, misinformation, and extremism.”
Recommendation is not Facebook’s only goal, however. With AI models trained on users’ videos, Facebook wants to make recalling memories as easy as capturing them. Just as smartphone cameras have made it simple to take photos and videos on the go, upcoming wearables such as AR smart glasses, which Facebook is planning to release later this year, will make capturing things even easier.
Facebook says that recalling the right data from a user’s vast bank of digital memories will require smarter AI systems that can understand what is happening in videos in more detail. For instance, if someone says “show me every time we sang happy birthday to Grandma,” the AI can only handle that instruction if it has been taught to match phrases like “happy birthday” to cakes, candles, people singing various birthday songs, and more. The huge volume of publicly available user videos on its platforms from around the globe will help Facebook train its AI to do this.
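One way to picture the “happy birthday” example is as a search over a shared embedding space, where a spoken phrase and a video clip both become vectors and related ones land close together. The sketch below is only an illustration under that assumption; the article does not describe Facebook’s method, and the function name `search_memories`, the toy vectors, and the `min_score` cutoff are all hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)


def search_memories(query_vec, clips, min_score=0.8):
    """Return ids of all clips that match the query, oldest first.

    Assumes a hypothetical model has placed the query phrase and each clip
    in the same embedding space; `min_score` is an illustrative cutoff.
    """
    q = normalize(query_vec)
    hits = []
    for clip in clips:
        # Dot product of unit vectors equals cosine similarity.
        score = sum(a * b for a, b in zip(q, normalize(clip["embedding"])))
        if score >= min_score:
            hits.append((clip["date"], clip["id"]))
    return [clip_id for _, clip_id in sorted(hits)]


# Toy vectors; the axes loosely stand for (cake and candles, singing, outdoors).
happy_birthday = [0.7, 0.7, 0.0]
clips = [
    {"id": "grandma_2021", "date": "2021-05-02", "embedding": [0.6, 0.7, 0.2]},
    {"id": "beach_trip", "date": "2020-07-11", "embedding": [0.0, 0.1, 0.9]},
    {"id": "karaoke", "date": "2020-12-31", "embedding": [0.1, 0.9, 0.3]},
    {"id": "grandma_2019", "date": "2019-05-02", "embedding": [0.8, 0.6, 0.1]},
]
print(search_memories(happy_birthday, clips))
```

Unlike a top-k recommendation, the request “every time” calls for returning all clips above a similarity cutoff. Here the karaoke clip contains singing but no cake or candles, so it falls below the cutoff, while both birthday clips are returned in date order.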