The next (and somewhat disturbing) challenge for Facebook’s AI is to understand and remember the world seen through our eyes
We know that Facebook is investing heavily in smart glasses and augmented reality, but for now its glasses can do little beyond recording video and taking photos.
That may change thanks to artificial intelligence. How? One possibility is to give AI the ability to understand what we see and hear from our own point of view, so that it can "remember scenes" (answering questions like "where did I leave my keys?") or "remember words" (answering questions like "what did Juan say the other day?"). It is a striking project that also raises a number of doubts.
Ego4D, the project that helps us understand what Facebook wants to achieve in the future
Facebook today announced Ego4D, a project from its artificial intelligence department that, in collaboration with 13 universities around the world, has created a dataset for teaching AI to understand the images and videos typically recorded by first-person devices.
Algorithms are usually trained on datasets of videos and photos taken from a distance, as an observer would see them. Facebook wants to anticipate a situation in which first-person video is far more common. The problem is that while an AI can identify an image of a Ferris wheel, it has a much harder time when the image is taken by the person riding it. Something similar happens in any situation where the scene is not viewed from a distance.
“Next-generation artificial intelligence systems will need to learn from a completely different type of data: videos that show the world from the center of the action, rather than on the sidelines,” explains Kristen Grauman, a Facebook researcher.
Ego4D has published a first study that analyzes more than 2,200 hours of first-person video, in which 700 participants carry out day-to-day tasks. According to Facebook, this is 20 times the amount of material previously available for training such algorithms.
What does Facebook think first-person AI training could be used for?
What would be the use of an AI that could see and hear everything we do in the first person? The Facebook team suggests five possibilities:
Episodic memory: asking when something happened. With a recording of our life, the AI could answer questions that in many cases only we can answer.
Forecasting: anticipating the next steps of a routine. For example, when following a recipe, the AI could warn us if we have skipped a step.
Manipulation of objects: the AI could guide us through certain actions, for example when playing an instrument, or give instructions on how to position our body.
Audiovisual diaries: with a record of what we see and hear, it would be possible to ask who said what and when. We could, for example, recall a figure the teacher mentioned or the time that was agreed on.
Social interaction: with a better first-person understanding, algorithms could help enhance what we see or hear.
Huge potential … at the cost of showing it our point of view
Applying artificial intelligence to what we see through our own eyes opens up many possibilities, but it also raises serious privacy concerns. For now the project sits entirely within the academic world, but it already hints at commercial uses we could see in the future.
“It means advancing from our ability to analyze stacks of photos and videos that were taken by humans for a very special purpose to this fluid and continuous first-person visual flow that augmented reality systems and robots need to understand in the context of activity,” explains Grauman, who points out that not only Facebook's glasses but all kinds of assistants could make use of these capabilities.
The Ego4D project is a powerful demonstration of the impact artificial intelligence can have. With these algorithms, the more information we provide and the further we let them into our daily lives, the more accurate their answers can be. Deciding where to draw the line is an important debate that will need to be addressed.
More information | Facebook AI Blog