
Stand-In
A Lightweight and Plug-and-Play Identity Control for Video Generation
Bowen Xue*, Qixin Yan*, Wenjing Wang, Hao Liu, Chen Li
WeChat Vision, Tencent Inc.
Identity-Preserving Text-to-Video Generation

Prompt: In a corridor where the walls ripple like water, a woman reaches out to touch the flowing surface, causing circles of ripples to spread. The camera moves from a medium shot to a close-up, capturing her curious expression as she sees her distorted reflection.

Prompt: A man sits comfortably at his desk, facing the camera as if engaged in a conversation with loved ones on the other side of the screen. His gaze is focused and gentle, with a natural smile on his lips. The background reveals a thoughtfully arranged personal space, with photos and a world map on the wall, conveying a sense of intimate and modern communication.

Prompt: A woman crouches in a vibrant vegetable garden, having just picked a ripe tomato. At this moment, she looks up and gazes directly into the camera, her face glowing with the joy and satisfaction of the harvest. The soft morning light gently falls on her and the surrounding green leaves, filling the scene with a sense of natural, healthy living.

Prompt: A man in a white lab coat stands in front of a laboratory bench. Having just completed a critical step, he turns toward the camera, his eyes shining with the excitement of discovery. The background is filled with precise instruments and an array of beakers and test tubes, capturing the rigor and allure of scientific exploration.

Prompt: A graceful young woman sits in front of an easel, holding a paintbrush. She alternates between thoughtfully examining her work and leaning in to add delicate details.

Prompt: A young man dressed in traditional attire draws the long sword from his waist and begins to wield it. The blade flashes with light as he moves—his eyes sharp, his actions swift and powerful, with his flowing robes dancing in the wind.

Prompt: The video features a man with dark hair, wearing a blue tank top and holding a pink tank top on a hanger. He appears to be in a clothing store or a similar retail environment, as there are racks of clothes visible in the background. The man is speaking to the camera, possibly providing a review or discussing the tank top he is holding. He has colorful bracelets on his wrist and is wearing a necklace with multiple beads. His expression suggests he is engaged in a conversation or presentation. The setting seems to be indoors, with artificial lighting illuminating the scene.

Prompt: The video features a man standing at an easel, focused intently as his brush dances across the canvas. His expression is one of deep concentration, with a hint of satisfaction as each brushstroke adds color and form. He wears a paint-splattered apron, and his hands move with confident precision. The setting, filled with scattered art supplies, open paint tubes, and unfinished sketches pinned to the wall, suggests an artist's studio. A large window on one side allows sunlight to stream in, casting a soft glow across the room and illuminating the colors on his canvas. The atmosphere is creative and inspired, with the man's intense focus and the lively colors on the canvas indicating a moment of artistic passion and expression.

Prompt: The video features a news reporter who is walking down a city street at night while holding a microphone and speaking to the camera. The reporter is wearing a white coat and a blue tie, and he appears to be reporting on a story related to the economy. The background shows a brightly lit cityscape with tall buildings and streetlights, creating a vibrant and dynamic atmosphere. The reporter's speech is accompanied by various text overlays that provide additional information about the story, such as "Global Economic Forum" and "Economy is Growing Fast, Sure Global Rebound." These text elements suggest that the reporter is discussing economic trends and forecasts. Overall, the video captures a moment of news reporting in an urban setting, with the reporter providing insights into the state of the global economy.

Prompt: The camera glides back from behind the green flowers in the foreground, gradually bringing the woman’s face into focus. What was once a soft blur sharpens into view, carrying a sense of alluring mystery, until her features are fully revealed—graceful and captivating.

Prompt: The video shows a man sitting on a park bench under a large oak tree, reading a book. He has a beard and is wearing a casual sweater and jeans. The park is quiet and green, with sunlight filtering through the tree branches. The man seems completely absorbed in his book, occasionally glancing up to enjoy the peaceful surroundings.

Prompt: The video features a woman sitting in a cozy armchair in a library. She is wearing glasses and a knitted sweater, with her legs tucked up under her. She is reading a book, and the warm light from a nearby lamp casts a soft glow on her face. The shelves around her are filled with books, and the atmosphere is calm and intellectual, with the occasional sound of pages turning.

Prompt: The video features a woman standing in front of a large screen displaying the words "Tech Minute" and the logo for CNET. She is wearing a purple top and appears to be presenting or speaking about technology-related topics. The background includes a cityscape with tall buildings, suggesting an urban setting. The woman seems to be engaged in a discussion or providing information on technology news or trends. The overall atmosphere is professional and informative, likely aimed at educating viewers about the latest developments in the tech industry.

Prompt: A young man, a streamer, is wearing a green sleeveless top and red headphones. The background is illuminated by vibrant neon lights. The setting is a well-lit room with a curtain and a lamp visible in the background. His expression and body language suggest that he is speaking passionately into the microphone.

Prompt: A man gently clutching a bouquet of vibrant flowers, his eyes radiating a serene contentment as he glances at the camera. His slightly upturned lips convey a sense of calm joy, accompanied by a faint twinkle in his eye. The scene is set in a lush garden, brimming with colorful blooms and verdant foliage, creating a tranquil haven. The shot captures him from the waist up, emphasizing his relaxed stance and the natural harmony of his surroundings.

Prompt: A woman is sitting in front of a pottery wheel, her hands covered in wet clay. She pauses her work and looks up at the camera, her face beaming with a proud smile as she displays the pottery she has just shaped. In the background, shelves are filled with ceramic works and tools.
Non-Human Subjects-Preserving Video Generation

Prompt: A chibi-style boy speeding on a skateboard, holding a detective novel in one hand. The background features city streets, with trees, streetlights, and billboards along the roads.

Prompt: The video features an anime girl standing on a busy street, surrounded by a hurried crowd. The buildings and shops in the background create a classic cityscape. The girl smiles as she puts her headphones on, her movements smooth and natural. Her expression is playful and relaxed, as if she's about to immerse herself in her favorite music. The camera focuses on her face, capturing her joyful expression and vibrant energy. The background is slightly blurred, emphasizing the contrast between her and her surroundings, creating a sense of relaxed urban living.

Prompt: Shot in a medium shot of a brightly lit room, a girl, approximately seven or eight years old, stands in the center. She has long black hair and wears a light blue dress, her expression focused and gentle. Holding a doll in both hands, she presents her beloved toy to the camera. As the camera slowly zooms in, the details of her face are clearly visible: the soft fabric, the delicate stitching, and the slightly upturned corners of her mouth are all captured. The entire scene is filled with childlike innocence and warmth.

Prompt: In a snow-covered forest on a winter's day, snow is falling. An anime girl stands quietly. With a gentle smile on her face, she looks towards the viewer. Then, she scoops up the snow and smiles.
Identity-Preserving Stylized Video Generation

Prompt: A woman sits on a boat, gazing at the camera with a gentle smile. Behind her is the endless sea, waves crashing against the side of the boat, and a lighthouse in the distance stands tall under the bright sky.