SOPHY: Learning to Generate Simulation-Ready Objects with PHYsical Materials
Technical University of Crete
Abstract
We present SOPHY, a generative model for 3D physics-aware shape synthesis. Unlike existing 3D generative models that focus solely on static geometry, or 4D models that produce physics-agnostic animations, our method jointly synthesizes shape, texture, and material properties related to physics-grounded dynamics, making the generated objects ready for simulation and interactive, dynamic environments. To train our model, we introduce a dataset of 3D objects annotated with detailed physical material attributes, along with an efficient pipeline for material annotation. Our method enables applications such as text-driven generation of interactive, physics-aware 3D objects and single-image reconstruction of physically plausible shapes. Furthermore, our experiments show that jointly modeling shape and material properties enhances the realism and fidelity of the generated shapes, improving both geometric quality and physical plausibility.
Model Architecture
Results
(Best viewed on a PC rather than a mobile device)
$\triangleright$ Image-to-4D Results $\triangleleft$
Too soft headband.
Too stiff headband.
Expected outcome: Throw a headband.
Poor alignment.
Unrealistic shape.
Expected outcome: Throw a chair.
Too stiff / Poor alignment.
Expected outcome: Drop a bag.
$\triangleright$ Text-to-4D Results $\triangleleft$
Too stiff plant.
Too soft container.
Input Condition: Drag a planter made of a plant, a soil, a wood vase neck, and a metal container.
Unrealistic flat shape.
Too stiff body.
Input Condition: Throw a teddy bear made of a polyester eye, a cotton ear, a denim mouth, a denim leg, a wool body, a cotton arm, a cotton head, and a wool design.
Poor alignment.
Unrealistic dynamics.
Input Condition: Throw a pillow made of a dark navy blue pillow fabric, and a dark gray fabric piping.
$\triangleright$ Generalization Results $\triangleleft$
(Input images are from Objaverse)
Dataset Examples
$\triangleright$ Material Annotations $\triangleleft$
3D Shape: Teddy Bear
3D Shape: Loveseat
$\triangleright$ Simulation Sequences $\triangleleft$
Hat: Drop
Sofa: Drop
Bag: Throw
Teddy Bear: Throw
Chair: Tilt
Sofa: Tilt
Planter: Drag
Planter: Drag
Planter: Wind
Vase: Wind
BibTeX
@article{Cao_2025_SOPHY,
author = {Cao, Junyi and Kalogerakis, Evangelos},
title = {{SOPHY}: Learning to Generate Simulation-Ready Objects with Physical Materials},
journal = {arXiv:2504.12684},
year = {2025}
}
Remark
$\dagger$: "B. Dec." is a baseline method considered in our experiments. This baseline excludes color and material properties from the generation process, i.e., it first generates a 3D shape, then predicts color conditioned on the shape through a decoder, and finally predicts the material through another decoder. This baseline tests whether incorporating physical materials directly into the generation process offers any benefit. Please refer to our paper for more details.
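The sequential structure of the "B. Dec." baseline can be sketched as follows. This is an illustrative stand-in under assumed interfaces, not the actual implementation: all functions and the returned fields (e.g., `youngs_modulus`, `density`) are hypothetical placeholders for the real networks.

```python
# Hypothetical sketch of the "B. Dec." baseline: shape is generated first,
# then color and material are predicted by separate decoders conditioned
# on earlier outputs. Every function here is a mock stand-in.

def generate_shape(prompt):
    # Stand-in for the shape generator: prompt -> shape representation.
    return {"prompt": prompt, "geometry": "latent_shape_code"}

def color_decoder(shape):
    # Stand-in decoder: predicts color conditioned on the generated shape.
    return {"albedo": "per-point RGB"}

def material_decoder(shape, color):
    # Stand-in decoder: predicts physical material parameters
    # (illustrative names) conditioned on the shape and color.
    return {"youngs_modulus": 1e5, "poissons_ratio": 0.3, "density": 400.0}

def baseline_dec(prompt):
    # Sequential pipeline: shape -> color -> material, in contrast to
    # SOPHY's joint modeling of all three.
    shape = generate_shape(prompt)
    color = color_decoder(shape)
    material = material_decoder(shape, color)
    return shape, color, material
```

The point of the sketch is the dependency order: each decoder only sees outputs of earlier stages, so material prediction cannot influence the generated geometry.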
$\ddagger$: "B. Perc." is a baseline method considered in our experiments. This baseline uses perceptual models to estimate material properties on top of an off-the-shelf 3D generation model. Specifically, we adopt TRELLIS, a state-of-the-art 3D generation model, to generate textured 3D shapes given image or text conditions. To obtain the material properties of each generated shape, we then leverage an open-vocabulary 3D part segmentation model, Find3D, to obtain part labels for sampled surface points. Note that the query part names we provide to Find3D are derived from the set of part labels in our dataset, which comes from 3DCoMPaT200. Finally, we query ChatGPT-4o with two renderings of the generated 3D object, asking it to estimate the material properties of each part retrieved by Find3D. Please refer to our paper appendix for more details.
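The orchestration of the "B. Perc." pipeline can be sketched as below. All callables are mock stand-ins for the real systems (TRELLIS, Find3D, and the VLM query); their signatures and return formats are assumptions for illustration only.

```python
# Hypothetical sketch of the "B. Perc." pipeline: TRELLIS generates a
# textured shape, Find3D assigns part labels to sampled surface points,
# and a vision-language model estimates material properties per part
# from two renderings. Every function here is a mock stand-in.

def mock_trellis(condition):
    # Stand-in for TRELLIS: image/text condition -> textured 3D shape.
    return {"condition": condition, "points": ["p0", "p1", "p2", "p3"]}

def mock_find3d(shape, part_names):
    # Stand-in for Find3D: assigns one query part name per surface point.
    return {p: part_names[i % len(part_names)]
            for i, p in enumerate(shape["points"])}

def mock_render(shape, n_views=2):
    # Stand-in renderer producing n_views images of the shape.
    return [f"view_{i}" for i in range(n_views)]

def mock_vlm(renderings, point_labels):
    # Stand-in for the ChatGPT-4o query: per-part material estimates.
    return {part: {"material": "estimated"}
            for part in set(point_labels.values())}

def baseline_perc(condition, part_vocabulary):
    # Generate -> segment -> render -> query, matching the order in the text.
    shape = mock_trellis(condition)
    point_labels = mock_find3d(shape, part_vocabulary)
    renderings = mock_render(shape, n_views=2)
    materials = mock_vlm(renderings, point_labels)
    return shape, point_labels, materials
```

The sketch highlights that material estimation in this baseline is purely post hoc: the perceptual models only observe the finished shape and cannot feed back into generation.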
This webpage is modified from the template provided here.