New AI Can Predict When Two People Will Kiss

MIT researchers made a set of computer systems watch over 600 hours of shows such as “The Office,” “Desperate Housewives,” and “The Big Bang Theory,” training the systems to recognize when two characters would kiss, hug, high-five, or shake hands. So far the algorithm has been accurate 43% of the time, whereas humans gave correct predictions 71% of the time.

Sometimes a lean is just a lean, and sometimes a lean leads to a kiss. A new deep-learning algorithm can predict the difference.

MIT researchers have trained a neural network, a set of connected computer systems, to understand body language patterns so that it can guess how two people will interact.

The school’s Computer Science and Artificial Intelligence Lab made one of its “neural networks” watch 600 hours of shows like “Desperate Housewives” and “The Office.”

Researchers then gave the algorithm new videos to watch and asked it to predict what people in the videos would do: hug, kiss, high-five or shake hands.

The deep-learning program predicted the correct action more than 43% of the time when the video was paused one second before the real action.

That’s far from perfect: humans, by comparison, predicted the correct action 71% of the time, according to the researchers.
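The task described above, pausing a video one second before the action and asking the system to pick one of four labels, is at heart a four-way classification problem. As an illustration only (the feature representation and the nearest-centroid rule here are assumptions for the sketch, not the MIT team's actual deep network), here is a toy classifier that averages per-frame feature vectors into a clip embedding and assigns the closest action centroid:

```python
ACTIONS = ["hug", "kiss", "high-five", "shake hands"]

def clip_embedding(frames):
    """Average a clip's per-frame feature vectors into one vector."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(f[i] for f in frames) / n for i in range(dim)]

def train_centroids(labeled_clips):
    """Compute one mean embedding per action label from (frames, label) pairs."""
    sums, counts = {}, {}
    for frames, label in labeled_clips:
        emb = clip_embedding(frames)
        if label not in sums:
            sums[label] = [0.0] * len(emb)
            counts[label] = 0
        sums[label] = [s + e for s, e in zip(sums[label], emb)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}

def predict(frames, centroids):
    """Pick the action whose centroid is closest to the clip embedding."""
    emb = clip_embedding(frames)
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(emb, c))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))
```

Measuring accuracy is then just counting how often `predict` matches the held-out label, which is exactly the 43%-versus-71% comparison the researchers report.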

“We wanted to show that just by watching large amounts of video, computers can gain enough knowledge to consistently make predictions about their surroundings,” said Carl Vondrick, a PhD student in computer science and artificial intelligence.

Humans use their experiences to anticipate actions, but computers have a tougher time translating the physical world into data they can process and use.

Vondrick and his team programmed their algorithm not only to study and learn from visual data, but also to produce freeze-frame-like images showing how a scene might play out.
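One way to read the freeze-frame idea is that the system first predicts what the near future will look like in some feature space, and then interprets that prediction. A minimal sketch, assuming scalar one-dimensional features and a simple least-squares linear map (an illustrative stand-in for the paper's deep network, not its actual method):

```python
def fit_linear(pairs):
    """Least-squares fit of future = a * present + b
    from (present_feature, future_feature) training pairs."""
    n = len(pairs)
    mean_p = sum(p for p, _ in pairs) / n
    mean_f = sum(f for _, f in pairs) / n
    var = sum((p - mean_p) ** 2 for p, _ in pairs)
    cov = sum((p - mean_p) * (f - mean_f) for p, f in pairs)
    a = cov / var
    b = mean_f - a * mean_p
    return a, b

def anticipate(present_feature, model):
    """Predict the feature of the frame one second in the future."""
    a, b = model
    return a * present_feature + b
```

The predicted future feature could then be fed to a classifier like the one above, which separates "what will the scene look like?" from "what action is that?".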

Even though MIT’s algorithms aren’t accurate enough yet for real-world application, the study is another example of how technologists are trying to improve artificial intelligence.


Ultimately, MIT’s research could help develop robots for emergency response, helping the robot assess a person’s actions to determine if they are injured or in danger.

“There’s a lot of subtlety to understanding and forecasting human interactions,” said Vondrick, whose work was funded in part by research awards from Google. “We hope to be able to work off of this example to be able to soon predict even more complex tasks.”

Companies like Google, Facebook and Microsoft have been working on some form of deep-learning image recognition software for the past few years. Their main objectives at the moment are to help people categorize and organize personal photos. But eventually, as more photos are analyzed and accuracy improves, the systems could be used to do detailed image captioning for the visually impaired.

6 thoughts on “New AI Can Predict When Two People Will Kiss”

  1. Very true, I have had a similar experience. TV shows and soaps are often too predictable. No use for AI when even common people could predict so well.

  2. I wonder how the algorithm works and what its inputs are. Is it based on 1. body language, 2. expression in the eyes, 3. the way they communicate? Is it also based on 1. the age group of the two individuals, 2. mother/child (related family members can also walk together, which will be rightly perceived by humans, but can AI reach the same level of perception?), 3. opposite genders / same gender? Interesting to visualize how our minds think. But how is AI programmed? How does AI perceive the inputs through interfaces? How does AI calculate the probability? How does AI predict the result and its success factors? When there is a failed attempt, how will it understand the failure causes? Can it determine the real reasons behind such failures? Can it store the experiences? Lots of questions in mind; got to carry out my regular work…

  3. OK. Personally, I found TV shows in the 1950s to be extremely predictable, with totally one-dimensional characters. I would suggest that as a place to start. (In 1980s repeats of the first-generation Superman shows, I was even able to talk along with the characters’ lines, even though I hadn’t seen the show for 25-30 years. I was familiar enough with it to predict the next line, word for word.)
