The aim of this project is to develop and test machine learning methods for nowcasting and forecasting population levels and migratory flows. Diverse data sources that are not linked to statistical surveys of the population may prove to be valuable proxies. Nevertheless, their performance and generalizability need to be established empirically across geographical scales and time frames.
Whereas a rich literature exists on the use of alternative data (night-time and daytime light intensity) in estimating various demographic and economic processes, most of these studies focus on peacetime contexts. During a war, the connections between electricity use or nightlights and population activities no longer conform to models estimated in peacetime. For example, night masking was employed in Ukraine in the early days of the 2022 Russian invasion, yet many economic activities continued; the lack of light was thus not a good proxy for economic activity or human presence in a given area. As war tactics changed in 2022, damage to electricity production and distribution networks meant that lack of light would again be a correlate of economic activity. Similarly, social media signals may flip correlation signs during wartime: whereas in normal times the share of tweets or Facebook posts containing a picture was a useful proxy of income, during wartime such pictures may depict infrastructure destruction or war-related human misery. The same challenges also emerge during times of social unrest or natural disasters (earthquakes or flooding). These are but a few examples of the peculiarities of using alternative data in nowcasting or forecasting population presence.
Machine learning models can alleviate some of the issues raised above by testing the usefulness of various data sources in different contexts. With improved computing power and the development of non-linear techniques, highly specific models can be developed; ensembles may then provide sharper estimates of aggregate target variables, hopefully at more disaggregated geographical levels (oblast, large metro areas). Data on geo-located military violence, social media activity, and internet latency may act as additional ML features.
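To make the idea of context-specific models concrete, here is a minimal sketch on purely synthetic data. It assumes, for illustration only, that a light-intensity feature relates positively to a population proxy in peacetime and negatively in wartime; the slopes, noise level, regime labels, and function names are all hypothetical. It compares a single pooled linear fit with separate fits per regime on held-out data.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b  # (intercept, slope)

def mae(model, xs, ys):
    """Mean absolute error of a fitted line on (xs, ys)."""
    a, b = model
    return mean(abs(a + b * x - y) for x, y in zip(xs, ys))

def simulate(regime, n, rng):
    # Synthetic assumption: slope is +2 in peacetime and -1 in wartime.
    slope = 2.0 if regime == "peace" else -1.0
    xs = [rng.uniform(0, 10) for _ in range(n)]   # light-intensity feature
    ys = [50 + slope * x + rng.gauss(0, 1) for x in xs]  # population proxy
    return xs, ys

rng = random.Random(0)
regimes = ("peace", "war")
train = {r: simulate(r, 200, rng) for r in regimes}
test = {r: simulate(r, 200, rng) for r in regimes}

# One model fitted on all training data, ignoring the regime.
pooled = fit_line(
    train["peace"][0] + train["war"][0],
    train["peace"][1] + train["war"][1],
)
# One model per regime.
by_regime = {r: fit_line(*train[r]) for r in regimes}

for r in regimes:
    print(
        r,
        "slope:", round(by_regime[r][1], 2),
        "pooled MAE:", round(mae(pooled, *test[r]), 2),
        "regime MAE:", round(mae(by_regime[r], *test[r]), 2),
    )
```

In this toy setting the per-regime slopes have opposite signs and the pooled model performs worse in both regimes. With real data the regime would not be known in advance and the relationships would be far noisier, which is part of what the project sets out to test.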
The team we're looking for:
Space Technology Experts: Individuals with expertise in space technology, satellite data acquisition, and remote sensing platforms who can guide the integration of night-time and daytime data.
Social Media Data Experts: Individuals with expertise in social media data and/or internet latency data.
ML Experts: Individuals with expertise in the use of causal ML and/or spatio-temporal ML models.
Here are some useful resources: