Light of Normals: Unified Feature Representation for Universal Photometric Stereo

Hong Li¹,²* Houyuan Chen¹,³* Chongjie Ye¹,⁴ Zhaoxi Chen⁵ Bohan Li¹ Shaocong Xu¹ Xianda Guo¹ Xuhui Liu² Yikai Wang⁶ Baochang Zhang² Satoshi Ikehata⁷ Boxin Shi⁹ Anyi Rao⁸ Hao Zhao¹,¹⁰

¹BAAI, ²BUAA, ³NJU, ⁴FNii, CUHKSZ, ⁵NTU, ⁶BNU, ⁷NII, ⁸HKUST, ⁹PKU, ¹⁰AIR, THU
*Equal Contribution †Corresponding Author

Abstract

Universal photometric stereo (PS) aims to recover high-quality surface normals from objects under arbitrary lighting conditions without relying on specific illumination models. Despite recent advances such as SDM-UniPS and Uni MS-PS, two fundamental challenges persist: 1) the deep coupling between varying illumination and surface normal features, where ambiguity in observed intensity makes it difficult to determine whether brightness variations stem from lighting changes or surface orientation; and 2) the preservation of high-frequency geometric details in complex surfaces, where intricate geometries create self-shadowing, inter-reflections, and subtle normal variations that conventional feature processing operations struggle to capture accurately.
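The first challenge can be illustrated with a small numeric sketch. It assumes a simple Lambertian shading model (intensity = albedo × light strength × max(0, n·l)), which is a textbook simplification chosen here for illustration and not the model used by LINO-UniPS; universal PS explicitly avoids committing to a specific illumination model. The function name `lambertian_intensity` and its parameters are hypothetical.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def lambertian_intensity(normal, light_dir, albedo=1.0, light_strength=1.0):
    """Observed intensity under an assumed Lambertian model (illustration only)."""
    n = normalize(normal)
    l = normalize(light_dir)
    cos_theta = sum(a * b for a, b in zip(n, l))
    # Surfaces facing away from the light receive no direct illumination.
    return albedo * light_strength * max(0.0, cos_theta)


# A direction tilted 60 degrees away from the viewing axis (z).
tilted = (math.sin(math.radians(60)), 0.0, math.cos(math.radians(60)))
frontal = (0.0, 0.0, 1.0)

# Case A: frontal surface, light tilted by 60 degrees.
case_a = lambertian_intensity(frontal, tilted)
# Case B: surface tilted by 60 degrees, frontal light.
case_b = lambertian_intensity(tilted, frontal)
# Case C: frontal surface, frontal light at half strength.
case_c = lambertian_intensity(frontal, frontal, light_strength=0.5)

print(case_a, case_b, case_c)
```

All three configurations produce essentially the same pixel intensity, so a single observation cannot tell whether a change in brightness came from the lighting or from the surface orientation. Multiple images under varying light help, but only if the learned features separate the two factors.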

Method

Overview of the LINO-UniPS architecture, featuring a Light-Normal Contextual Encoder, Decoder, and loss computation.

LINO-UniPS performs significantly better on inputs rich in high-frequency detail.

Attention maps of the light register tokens at the encoder's final layer. Different tokens specialize in attending to lighting information from different directions.

The features extracted by the LINO-UniPS encoder effectively disentangle lighting from surface normal information and are more consistent across lighting conditions.

Ablation Study

The ablation study first shows that training on PS-Verse yields substantially better results than training on PS-Mix. It further shows that incrementally adding key components (light register tokens, global attention, light alignment, the wavelet transform, and the gradient perception loss) steadily improves both accuracy (MAE) and feature consistency (CSIM, SSIM) over the baseline.
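For readers unfamiliar with the accuracy metric, MAE in photometric stereo conventionally denotes the mean angular error, in degrees, between predicted and ground-truth normals. The following is a minimal sketch of that standard definition, not the evaluation code used for LINO-UniPS; the function names and the optional `mask` argument are hypothetical.

```python
import math


def angular_error_deg(n_pred, n_true):
    """Angle in degrees between two 3D normals (need not be unit length)."""
    dot = sum(p * t for p, t in zip(n_pred, n_true))
    norm_pred = math.sqrt(sum(p * p for p in n_pred))
    norm_true = math.sqrt(sum(t * t for t in n_true))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_pred * norm_true)))
    return math.degrees(math.acos(cos_angle))


def mean_angular_error(pred_normals, true_normals, mask=None):
    """Average angular error over pixels; `mask` selects valid object pixels."""
    if mask is None:
        mask = [True] * len(pred_normals)
    errors = [
        angular_error_deg(p, t)
        for p, t, keep in zip(pred_normals, true_normals, mask)
        if keep
    ]
    if not errors:
        raise ValueError("no valid pixels to evaluate")
    return sum(errors) / len(errors)


# Two pixels: one predicted exactly, one off by 90 degrees.
pred = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
true = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
mae = mean_angular_error(pred, true)
print(mae)
```

Lower values are better. A mask is typically needed in practice because background pixels have no meaningful normal.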

Feature consistency

LINO-UniPS enhances the decoupling of lighting and normals, resulting in features with greater consistency.
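One way to make "consistency" concrete is sketched below. It assumes that CSIM refers to the cosine similarity between encoder features of the same surface point observed under different lighting; the page does not define the metric, so this is an interpretation rather than the authors' exact procedure. The function names are hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


def lighting_consistency(features_per_light):
    """Mean pairwise cosine similarity of one point's features across lightings.

    `features_per_light` holds one feature vector per input image (assumed layout).
    """
    pairs = list(combinations(features_per_light, 2))
    if not pairs:
        raise ValueError("need features from at least two lighting conditions")
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Features that differ only in scale are treated as fully consistent.
consistent = lighting_consistency([(1.0, 2.0, 3.0), (2.0, 4.0, 6.0)])
# Orthogonal features share nothing.
inconsistent = lighting_consistency([(1.0, 0.0), (0.0, 1.0)])
print(consistent, inconsistent)
```

Under this reading, a value close to 1 means the feature of a surface point barely changes when the lighting changes, which is the behavior expected when lighting and normal information are well decoupled.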

Some Visual Results

Hover to view an example from the multi-light input images and the corresponding surface normals reconstructed by LINO-UniPS.

PS-Verse Dataset

Level 1

Level 2

Level 3

Level 4

Level 5

Downstream Application

The detailed surface normals reconstructed by LINO-UniPS can be directly applied to downstream tasks such as 3D generation. Hi👋3DGen:

LINO-UniPS excels as a normal estimation module and can replace the second stage of the Neural LightRig pipeline. The figure below compares the multi-light inputs from Stage 1 of the Neural LightRig pipeline (left) with the final normal maps (right). LINO-UniPS (bottom right) produces noticeably more detailed and precise normals than the original Neural LightRig Stage 2 (top right). A future release will add support for PBR material prediction.

Neural LightRig

LINO-UniPS

BibTeX

@article{li2025lightnormalsunifiedfeature,
      title={Light of Normals: Unified Feature Representation for Universal Photometric Stereo}, 
      author={Hong Li and Houyuan Chen and Chongjie Ye and Zhaoxi Chen and Bohan Li and Shaocong Xu and Xianda Guo and Xuhui Liu and Yikai Wang and Baochang Zhang and Satoshi Ikehata and Boxin Shi and Anyi Rao and Hao Zhao},
      journal={arXiv preprint arXiv:2506.18882},
      year={2025}
}