Abstract
Efficient compression of sparse point cloud geometry remains a critical challenge in 3D content processing, particularly for low-rate scenarios where conventional codecs struggle to maintain efficiency. This work proposes a system-level framework for low-rate sparse geometry coding, leveraging three variants of autoencoder-based deep learning architectures: folding-based, graph convolution-based, and transformer-based. The proposed framework operates at the object level and utilizes semantic segmentation to structure scenes into interpretable intermediate representations. Two novel autoencoder architectures—a Graph Autoencoder and a Transformer Graph Autoencoder—are proposed and evaluated alongside a FoldingNet baseline. These models are benchmarked on both synthetic and real-world datasets and compared against standard codecs (G-PCC and Draco) as well as a state-of-the-art learning-based codec (CRCIR). Experimental results indicate that the proposed architectures consistently outperform both traditional and recent learning-based methods, particularly at low bitrates. High robustness to latent noise is also observed, enabling applicability in various transmission scenarios. The proposed Transformer Graph Autoencoder architecture achieves the best overall rate-distortion performance and exhibits strong generalization to unseen data.
| Original language | English (US) |
|---|---|
| Pages (from-to) | 214122-214140 |
| Number of pages | 19 |
| Journal | IEEE Access |
| Volume | 13 |
| State | Published - 2025 |
| Externally published | Yes |
All Science Journal Classification (ASJC) codes
- General Computer Science
- General Materials Science
- General Engineering
Keywords
- Autoencoder
- coding
- compression
- machine learning
- point cloud
- source coding
Title
- Autoencoder Architectures for Low-Rate Sparse Point Cloud Geometry Coding