Image dehazing is one of the most challenging inverse problems in imaging. Although deep learning methods produce compelling results, non-homogeneous haze remains one of the most crucial practical challenges and is still an open problem. To address it, we propose three models inspired by ensemble techniques. First, we propose a DenseNet-based single-encoder, four-decoder structure denoted EDN-3J, in which three of the four decoders output estimates of the dehazed image (J1, J2, J3) that are then weighted and combined via weight maps learned by the fourth decoder. Our second model, EDN-AT, keeps the single-encoder, four-decoder structure, but the three decoders are repurposed to jointly estimate two physical inverse haze models that share a common transmission map t and have two distinct ambient light maps (A1, A2). The outputs of the two inverse haze models are then weighted and combined to form the final dehazed image. To give the two sub-models flexibility and enable them to model non-homogeneous haze, we apply attention masks to the ambient lights. Both the weight maps and the attention maps are generated by the fourth decoder. Finally, in contrast to these two ensemble models, we propose an encoder-decoder-U-net structure called EDN-EDU, a sequential hierarchical ensemble of two different dehazing networks with different modeling capacities. Experiments on the challenging NTIRE'19 and NTIRE'20 benchmark image datasets demonstrate that the proposed models outperform many state-of-the-art methods. This is particularly evident in the NTIRE-2020 contest, where the EDN-AT model achieves the best result in terms of the perceptual quality metric LPIPS.
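The two fusion ideas described above can be illustrated with a minimal per-pixel sketch. It is not the authors' implementation: it assumes the standard atmospheric scattering model I = J·t + A·(1 − t), assumes the weight maps are normalized with a softmax, and assumes the attention masks multiply the ambient light values. All function and parameter names (`fuse_estimates`, `invert_haze`, `edn_at_pixel`, `t_min`) are hypothetical, and the decoder outputs are passed in as plain numbers rather than produced by a network.

```python
import math


def softmax(scores):
    """Normalize raw scores into non-negative weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_estimates(estimates, weight_logits):
    """EDN-3J-style fusion at one pixel: weighted sum of candidate values.

    Assumption: weights come from a softmax over the fourth decoder's
    outputs; the abstract only says the estimates are "weighted and combined".
    """
    weights = softmax(weight_logits)
    return sum(w * j for w, j in zip(weights, estimates))


def invert_haze(i, t, a, t_min=0.05):
    """Solve I = J*t + A*(1 - t) for J at one pixel.

    t is clamped from below (t_min is an assumed value) to avoid
    dividing by a transmission close to zero.
    """
    t = max(t, t_min)
    return (i - a) / t + a


def edn_at_pixel(i, t, a1, a2, m1, m2, weight_logits):
    """EDN-AT-style fusion at one pixel: two inverse haze models sharing t.

    Assumption: the attention masks m1, m2 scale the ambient lights a1, a2.
    """
    j1 = invert_haze(i, t, m1 * a1)
    j2 = invert_haze(i, t, m2 * a2)
    return fuse_estimates([j1, j2], weight_logits)


# Usage: a clean pixel J = 0.5 seen through t = 0.5 with A = 1.0 gives I = 0.75.
hazy = 0.5 * 0.5 + 1.0 * (1 - 0.5)
restored = edn_at_pixel(hazy, t=0.5, a1=1.0, a2=1.0, m1=1.0, m2=1.0,
                        weight_logits=[0.0, 0.0])
```

In a real network these operations would be applied to whole tensors, with t, A1, A2, the masks, and the weight maps all predicted per pixel; the scalar form here only shows how the pieces combine.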