
Unsupervised monocular depth estimation with aggregating image features and wavelet SSIM (Structural SIMilarity) loss

Figure 4. The proposed depth network architecture. The width and height of each cube indicate the output channels, and the size is halved at each successive stage. The first yellow cube is a convolution block; the remaining yellow cubes are ResNeXt blocks. The orange blocks represent the five-scale feature map, F. In the decoder network, convolution layers are shown in blue, and upsample-and-convolution operations are shown in red. D is the four-scale depth map.
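
The halving described in the caption can be traced with a small shape calculation. This is only a sketch under stated assumptions: it assumes each of the five encoder stages halves the spatial resolution, the 192x640 input size is a hypothetical value chosen for illustration, and the function name `pyramid_shapes` and the labels F1 to F5 are not taken from the paper.

```python
def pyramid_shapes(height, width, levels=5):
    """Return the (height, width) of each feature map, assuming every
    encoder stage halves the spatial size (an assumption, not a detail
    confirmed by the caption)."""
    shapes = []
    for _ in range(levels):
        height, width = height // 2, width // 2
        shapes.append((height, width))
    return shapes


# Hypothetical 192x640 input, chosen only for illustration.
features = pyramid_shapes(192, 640)
for i, (h, w) in enumerate(features, start=1):
    print(f"F{i}: {h}x{w}")
```

Under these assumptions the five scales come out as 96x320, 48x160, 24x80, 12x40 and 6x20.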

Intelligence & Robotics
ISSN 2770-3541 (Online)


All published articles are preserved here permanently:

