Combining CNN Feature Extraction and Kolmogorov-Arnold Networks Regression for Procedural 3D Shape Generation

Gilda Manfredi; Nicola Capece; Ugo Erra
2025-01-01

Abstract

Generating 3D objects with complex, nonlinear shapes directly from images is still an open research area. To address this problem, several state-of-the-art methods use Deep Learning (DL) to predict a set of parameters from images, which are then used to generate the 3D geometry, leveraging the characteristics of procedural modeling. Recently, Kolmogorov-Arnold Networks (KANs) have emerged as an alternative to traditional Multilayer Perceptrons (MLPs) in DL, and have been successfully integrated into architectures such as Convolutional Neural Networks (CNNs), Graph Neural Networks, and Transformers. In this work, we propose a DL architecture consisting of a hybrid CNN-KAN network for parametric 3D model generation from images. The model combines the ability of KANs to capture complex nonlinear functions with the strong visual feature extraction capabilities of CNNs. The method is evaluated using both quantitative error metrics and qualitative visualizations comparing predicted shapes with ground truth, and its performance is compared against a more standard CNN-MLP architecture.
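To make the idea of a KAN regression head concrete, the following is a minimal, dependency-free sketch, not the authors' implementation. It assumes a feature vector has already been produced by a CNN (here replaced by a hand-written placeholder list) and maps it to a small set of procedural parameters. Each edge function is approximated with piecewise-linear basis functions on a uniform grid, a simplification of the higher-order B-splines commonly used in KANs. All names (`hat_basis`, `KANLayer`, `predict_parameters`), the layer sizes, and the grid range are hypothetical choices made for illustration.

```python
import random


def hat_basis(x, grid):
    """Evaluate piecewise-linear (hat) basis functions on a uniform grid at x."""
    h = grid[1] - grid[0]                    # uniform knot spacing
    x = min(max(x, grid[0]), grid[-1])       # clamp x to the grid range
    return [max(0.0, 1.0 - abs(x - g) / h) for g in grid]


class KANLayer:
    """One KAN-style layer: out_j = sum_i phi_ji(x_i), each phi_ji a learnable 1-D function."""

    def __init__(self, n_in, n_out, n_knots=5, lo=-1.0, hi=1.0, seed=0):
        rng = random.Random(seed)
        step = (hi - lo) / (n_knots - 1)
        self.grid = [lo + k * step for k in range(n_knots)]
        # coef[j][i][k]: weight of basis k on the edge from input i to output j
        self.coef = [[[rng.uniform(-0.1, 0.1) for _ in range(n_knots)]
                      for _ in range(n_in)]
                     for _ in range(n_out)]

    def forward(self, x):
        # Basis values depend only on the input, so compute them once per input.
        bases = [hat_basis(xi, self.grid) for xi in x]
        return [sum(c * b
                    for edge_coef, edge_basis in zip(row, bases)
                    for c, b in zip(edge_coef, edge_basis))
                for row in self.coef]


def predict_parameters(features, layers):
    """Map a feature vector (standing in for CNN output) to procedural parameters."""
    out = features
    for layer in layers:
        out = layer.forward(out)
    return out


# Placeholder for features extracted by a CNN backbone (assumed, not real data).
features = [0.2, -0.5, 0.9, 0.0]
head = [KANLayer(4, 8, seed=1), KANLayer(8, 3, seed=2)]
params = predict_parameters(features, head)
print(len(params))
```

In a trained model the coefficients would be fitted by gradient descent against ground-truth parameters. Unlike an MLP, which applies fixed activations after learned linear maps, each input-output edge here carries its own learnable univariate function.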
Files in this record:
19_Combining_CNN_Feature_Extraction_and_Kolmogorov-Arnold_Networks.pdf
Access: authorized users only
License: Creative Commons
Size: 4.89 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11563/210517