Artificial intelligence can create a 3D model of a person—from just a few seconds of video
Artificial intelligence has been used to create 3D models of people’s bodies for virtual reality avatars, surveillance, visualizing fashion, or movies. But doing so typically requires special camera equipment to detect depth or to view someone from multiple angles. A new algorithm creates 3D models from standard video footage shot from a single angle.
The system has three stages. First, it analyzes a video a few seconds long of someone moving (preferably turning 360° to show all sides) and for each frame creates a silhouette separating the person from the background. Using machine learning techniques, in which computers learn a task from many examples, it then roughly estimates the 3D body shape and the location of joints in each frame. In the second stage, it “unposes” the virtual human created from each frame, making them all stand with arms out in a T shape, and combines information about the T-posed people into a single, more accurate model. Finally, in the third stage, it applies color and texture to the model based on the recorded hair, clothing, and skin.
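For readers who think in code, the three stages can be loosely sketched as follows. This is a toy illustration, not the researchers' implementation: it assumes a known empty background image for the silhouette step, assumes the per-frame body estimates have already been "unposed" into the shared T shape, and uses plain averaging where the real system fits a learned body model. All function names are hypothetical.

```python
import numpy as np

def silhouette(frame, background, threshold=30.0):
    """Stage 1 (toy stand-in): mark pixels that differ from a known background."""
    diff = np.abs(frame.astype(float) - background.astype(float)).sum(axis=-1)
    return diff > threshold  # boolean mask, True where the person is

def fuse_unposed_shapes(unposed_vertices):
    """Stage 2 (toy stand-in): average per-frame vertex estimates.

    Each array is assumed to hold the same vertices, already in the T-pose.
    """
    return np.mean(np.stack(unposed_vertices), axis=0)

def fuse_vertex_colors(colors, visible):
    """Stage 3 (toy stand-in): average each vertex's color over the frames that saw it."""
    colors = np.stack(colors).astype(float)                # (frames, vertices, 3)
    weights = np.stack(visible).astype(float)[..., None]   # (frames, vertices, 1)
    seen = weights.sum(axis=0)                              # frames that saw each vertex
    return (colors * weights).sum(axis=0) / np.maximum(seen, 1.0)
```

Averaging across frames is the simplest way to show why many views help: an estimate that is noisy in any one frame becomes steadier once all frames are brought into the same pose and combined.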
The researchers tested the method with a variety of body shapes, clothing, and backgrounds and found that it was accurate to within 5 millimeters on average, they will report in June at the Computer Vision and Pattern Recognition conference in Salt Lake City. The system can also reproduce the folds and wrinkles of fabric, but it struggles with skirts and long hair. With a model of you, the researchers can change your weight, clothing, and pose, and even make you perform a perfect pirouette. No practice necessary.
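An average accuracy figure like the 5 millimeters quoted above is usually some form of mean distance between the reconstructed surface and a reference. The article does not say exactly how the researchers measured it, so the sketch below is only one plausible reading: it assumes the two models share corresponding points and that coordinates are in millimeters. The function name is hypothetical.

```python
import numpy as np

def mean_surface_error(reconstructed, reference):
    """Mean straight-line distance between corresponding 3D points.

    The result is in the same units as the inputs (assumed here to be millimeters).
    """
    reconstructed = np.asarray(reconstructed, dtype=float)
    reference = np.asarray(reference, dtype=float)
    distances = np.linalg.norm(reconstructed - reference, axis=1)
    return float(distances.mean())
```

Under this measure, a model could be off by more than 5 millimeters at some points, such as a loose sleeve, and still average under that figure overall.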