Policymakers and governments rely heavily on survey data when making policy-related decisions. Collecting survey data is labour-intensive, costly and time-consuming, so it cannot be done frequently or extensively. The main aim of this research is to demonstrate how deep learning in computer vision, coupled with statistical regression modelling, can be used to estimate poverty from aerial images supplemented with national household survey data. This is executed in two phases: an aerial classification and detection phase, and a poverty modelling phase. The poverty measure estimated in this paper is the Sen-Shorrocks-Thon (SST) index. The models in phase one performed relatively well: the classification model had an accuracy of 90.85% and a log loss of 0.5783, while the instance segmentation model had a log loss of 0.839. The ridge model in phase two also performed well, with an R² of 0.708, a root mean square error (RMSE) of 0.081, and a strong positive correlation of 0.838 between the estimated and actual SST, which indicates that the model can estimate the SST reliably from the geo-type and dwelling type of an area.
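The three fit statistics reported for the phase-two model (R², RMSE and the correlation between estimated and actual SST) can be illustrated with a minimal sketch. It assumes paired lists of actual and estimated SST values; the function name `evaluate` and the sample values are hypothetical and chosen only for illustration, and they are not taken from the study.

```python
import math


def evaluate(actual, estimated):
    """Return (R^2, RMSE, Pearson r) for paired actual and estimated values."""
    n = len(actual)
    if n != len(estimated) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    mean_a = sum(actual) / n
    mean_e = sum(estimated) / n

    # Residual and total sums of squares around the mean of the actual values.
    ss_res = sum((a - e) ** 2 for a, e in zip(actual, estimated))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)

    r2 = 1.0 - ss_res / ss_tot      # coefficient of determination
    rmse = math.sqrt(ss_res / n)    # root mean square error

    # Pearson correlation between actual and estimated values.
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimated))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    r = cov / math.sqrt(ss_tot * var_e)

    return r2, rmse, r


# Hypothetical SST values for five areas (illustrative only, not study data).
actual_sst = [0.10, 0.25, 0.40, 0.55, 0.70]
estimated_sst = [0.15, 0.20, 0.45, 0.50, 0.65]

r2, rmse, r = evaluate(actual_sst, estimated_sst)
print(f"R2={r2:.3f}  RMSE={rmse:.3f}  r={r:.3f}")
```

An R² close to the squared correlation, as with the reported 0.708 and 0.838, suggests the estimates carry little systematic bias relative to the actual values.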