SOPHY: Learning to Generate Simulation-Ready Objects
with PHYsical Materials
Technical University of Crete
[arXiv]   [Code]   [Data]  


Abstract

We present SOPHY, a generative model for 3D physics-aware shape synthesis. Unlike existing 3D generative models that focus solely on static geometry, or 4D models that produce physics-agnostic animations, our method jointly synthesizes shape, texture, and material properties related to physics-grounded dynamics, making the generated objects ready for simulation and interactive, dynamic environments. To train our model, we introduce a dataset of 3D objects annotated with detailed physical material attributes, along with an efficient pipeline for material annotation. Our method enables applications such as text-driven generation of interactive, physics-aware 3D objects and single-image reconstruction of physically plausible shapes. Furthermore, our experiments show that jointly modeling shape and material properties enhances the realism and fidelity of the generated shapes, improving performance in terms of both geometric quality and physical plausibility.
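For concreteness, a simulation-ready object in this sense bundles geometry and appearance with per-part physical material parameters that a simulator can consume directly. The sketch below is a minimal illustration of such a bundle; the class and field names are our own assumptions, not SOPHY's actual data format.

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class PartMaterial:
    """Physical parameters a simulator typically needs for one part.
    Illustrative only; the attribute set used by SOPHY may differ."""
    label: str              # e.g. "cushion" or "frame"
    density: float          # kg/m^3
    youngs_modulus: float   # Pa (stiffness)
    poisson_ratio: float    # dimensionless, in (0, 0.5)

@dataclass
class SimulationReadyObject:
    """Geometry + texture + per-part materials, ready for simulation."""
    vertices: List[Tuple[float, float, float]]
    faces: List[Tuple[int, int, int]]
    vertex_colors: List[Tuple[float, float, float]]
    part_materials: List[PartMaterial] = field(default_factory=list)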



Model Architecture
Results
(Best viewed on a PC rather than on a mobile device)
$\triangleright$ Image-to-4D Results $\triangleleft$
[Videos: each example shows the input image (Condition) and two rendered views (View 1, View 2), comparing B. Dec.$\dagger$, B. Perc.$\ddagger$, and SOPHY.]

Expected outcome: Throw a headband. (B. Dec.: headband too soft; B. Perc.: headband too stiff.)

Expected outcome: Throw a chair. (B. Dec.: poor alignment; B. Perc.: unrealistic shape.)

Expected outcome: Drop a bag. (B. Dec. and B. Perc.: too stiff, poor alignment.)

$\triangleright$ Text-to-4D Results $\triangleleft$
[Videos: each example shows two rendered views (View 1, View 2), comparing B. Dec.$\dagger$, B. Perc.$\ddagger$, and SOPHY.]

Input Condition: Drag a planter made of a plant, a soil, a wood vase neck, and a metal container. (B. Dec.: plant too stiff; B. Perc.: container too soft.)

Input Condition: Throw a teddy bear made of a polyester eye, a cotton ear, a denim mouth, a denim leg, a wool body, a cotton arm, a cotton head, and a wool design. (B. Dec.: unrealistic flat shape; B. Perc.: body too stiff.)

Input Condition: Throw a pillow made of a dark navy blue pillow fabric, and a dark gray fabric piping. (B. Dec.: poor alignment; B. Perc.: unrealistic dynamics.)

$\triangleright$ Generalization Results $\triangleleft$
(Input images are from Objaverse)
[Videos: each example shows the input image and the generated object from two views (View 1, View 2).]
Dataset Examples
$\triangleright$ Material Annotations $\triangleleft$
[Figures: 3D shapes with part-level material annotations. Examples: Teddy Bear, Loveseat.]
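To illustrate what such part-level annotations capture, here is a hypothetical record for the Teddy Bear example (the part/material pairings follow the teddy bear prompt above; the schema and numeric values are our own guesses, not the dataset's actual format):

# Hypothetical part-level annotation; schema and values are illustrative only.
teddy_bear_annotation = {
    "object": "Teddy Bear",
    "parts": [
        {"name": "body", "material": "wool",
         "density": 120.0, "youngs_modulus": 5.0e3, "poisson_ratio": 0.35},
        {"name": "ear", "material": "cotton",
         "density": 100.0, "youngs_modulus": 3.0e3, "poisson_ratio": 0.30},
    ],
}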
$\triangleright$ Simulation Sequences $\triangleleft$
[Videos: for each 3D shape, simulation sequences rendered from two views (View 1, View 2).]

Examples: Hat (Drop), Sofa (Drop), Bag (Throw), Teddy Bear (Throw), Chair (Tilt), Sofa (Tilt), Planter (Drag), Planter (Drag), Planter (Wind), Vase (Wind).
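The interaction types above (Drop, Throw, Tilt, Drag, Wind) could plausibly be parameterized as initial conditions and external forces in a simulator. The mapping below is purely our assumption for illustration; the paper's actual simulation setup may differ.

# Hypothetical mapping from interaction names to simple simulator settings.
# Structure and values are assumptions, not SOPHY's actual configuration.
INTERACTIONS = {
    "Drop":  {"initial_velocity": (0.0, 0.0, 0.0), "external_force": "gravity"},
    "Throw": {"initial_velocity": (2.0, 1.0, 0.0), "external_force": "gravity"},
    "Tilt":  {"ground_rotation_deg": 15.0, "external_force": "gravity"},
    "Drag":  {"pulled_part_velocity": (1.0, 0.0, 0.0)},
    "Wind":  {"force_field": (3.0, 0.0, 0.0)},  # uniform wind force
}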
BibTeX
@article{Cao_2025_SOPHY,
    author    = {Cao, Junyi and Kalogerakis, Evangelos},
    title     = {{SOPHY}: Learning to Generate Simulation-Ready Objects with Physical Materials},
    journal   = {arXiv:2504.12684},
    year      = {2025}
}
Remark

$\dagger$: "B. Dec." is a baseline method considered in our experiments. This baseline excludes color and material properties from the generation process, i.e., it generates a 3D shape, then predicts color conditioned on the shape through a decoder, and then the material through another decoder. The choice of this baseline attempts to answer the question of whether there is any benefit of incorporating the physical materials in the generation process. Please refer to our paper for more details.
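A minimal sketch of this decoupled pipeline follows; every function is a hypothetical stand-in, not a released API.

def generate_shape(condition):
    """Stage 1: generate geometry only (stub stand-in)."""
    return {"vertices": [(0.0, 0.0, 0.0)], "faces": []}

def decode_color(shape):
    """Stage 2: predict per-vertex color conditioned on the shape (stub)."""
    return [(0.5, 0.5, 0.5) for _ in shape["vertices"]]

def decode_material(shape):
    """Stage 3: predict material parameters from the shape alone (stub)."""
    return {"density": 500.0, "youngs_modulus": 1.0e5, "poisson_ratio": 0.3}

# Color and material are predicted after, and conditioned only on, geometry,
# so physical properties cannot influence the generated shape itself.
shape = generate_shape("an image or text prompt")
color = decode_color(shape)
material = decode_material(shape)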

$\ddagger$: "B. Perc." is a baseline method considered in our experiments. This baseline uses perceptual models to estimate material properties based on an off-the-shelf 3D generation model. Specifically, we adopt TRELLIS, a state-of-the-art 3D generation model, to generate textured 3D shapes given image or text conditions. To obtain the material properties of each generated shape, we then leverage an open-vocabulary 3D part segmentation model, Find3D, to get the part labels for the sampled surface points. Note that the query part names we provide to Find3D are derived from the set of part labels in our dataset, which comes from 3DCoMPaT200. Finally, we leverage ChatGPT-4o by providing it with two renderings of the generated 3D object and asking it to estimate the material properties for each part of the object retrieved by Find3D. Please refer to our paper appendix for more details.
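A hedged sketch of this perception-based pipeline follows; every function is a stub standing in for TRELLIS, Find3D, or ChatGPT-4o, not a real API call.

PART_QUERIES = ["body", "ear", "leg"]  # example names in the style of 3DCoMPaT200

def trellis_generate(condition):
    """Stand-in for TRELLIS: textured 3D shape from an image or text prompt."""
    return {"vertices": [(0.0, 0.0, 0.0)], "faces": [], "texture": None}

def find3d_segment(points, queries):
    """Stand-in for Find3D: open-vocabulary part label per surface point."""
    return [queries[0] for _ in points]

def gpt4o_estimate(renderings, part_names):
    """Stand-in for ChatGPT-4o: per-part material estimates from renderings."""
    return {p: {"density": 100.0, "youngs_modulus": 1.0e4} for p in part_names}

mesh = trellis_generate("an image or text prompt")
labels = find3d_segment(mesh["vertices"], PART_QUERIES)
renderings = ["view_1.png", "view_2.png"]  # two renderings of the object
materials = gpt4o_estimate(renderings, set(labels))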

This webpage is modified from the template provided here.