AI-driven urban planning models mislead public health interventions, study shows

News-Medical

AI models relying on Google Street View images may misinterpret environmental features, leading to misguided public health efforts to reduce obesity and diabetes, a new study warns.

Study: Utilizing big data without domain knowledge impacts public health decision-making.

How is AI used in urban planning?

Recent advances in AI have accelerated its adoption in crucial fields such as public health and urban planning, where decisions can affect large numbers of people at the community level. For example, Google Street View (GSV) images have been combined with deep learning-based object detection to evaluate the health outcomes associated with neighborhood properties, defined at the census tract level.
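As a rough illustration of how per-image detections might be rolled up to the census tract level, the following is a minimal Python sketch. The record layout, the tract identifiers, and the function name feature_density_by_tract are hypothetical and are not taken from the study; the detector output is assumed to be already available as a set of labels per image.

```python
from collections import defaultdict


def feature_density_by_tract(detections, feature):
    """Return the share of images in each tract where `feature` was detected.

    `detections` is an iterable of (tract_id, labels) pairs, where `labels`
    is the set of object classes a detector reported for one image.
    """
    totals = defaultdict(int)  # images seen per tract
    hits = defaultdict(int)    # images per tract that contain the feature
    for tract_id, labels in detections:
        totals[tract_id] += 1
        if feature in labels:
            hits[tract_id] += 1
    return {tract: hits[tract] / totals[tract] for tract in totals}


# Made-up detections for two hypothetical tracts (illustrative only).
sample = [
    ("tract_A", {"crosswalk", "sidewalk"}),
    ("tract_A", {"sidewalk"}),
    ("tract_B", {"crosswalk"}),
    ("tract_B", {"crosswalk", "sidewalk"}),
]
crosswalk_density = feature_density_by_tract(sample, "crosswalk")
```

A tract-level share like this is only as reliable as the underlying labels, which is the concern the study raises.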

GSV data provide information about the environment, including the types of vegetation, as well as urban development, such as road networks and building structures. These data have been mined using deep learning to devise local interventions targeting mental and cardiometabolic disease and the prevalence of the coronavirus disease 2019 (COVID-19).

The study analyzed over 2 million Google Street View images in New York City to assess built environment features like sidewalks and crosswalks and their relationship with obesity and diabetes rates.

However, AI-based predictive models face certain challenges, including an inability to identify spurious and biased data and a tendency to pick up spurious correlations that then inform their predictions. These challenges are exacerbated when other factors mediate the association between exposure and health outcomes.

What did the study show?

The current study examined how GSV-derived features of the environment relate to the mean prevalence of obesity and diabetes across census tracts in New York City. It also assessed the relationship between these health conditions and physical inactivity, which is a significant contributor to this association.

GSV-derived data indicated that higher crosswalk density correlates with lower disease prevalence. The impact of physical activity on obesity was greater than that on diabetes, which was expected based on previous GSV-based crosswalk estimates. However, unlike previous studies, the current study found no association between GSV estimates of sidewalk density and health outcomes.

Physical inactivity intervention vs. GSV feature

The effect of the prevalence of crosswalks and sidewalks on health outcomes was due to the prevalence of physical inactivity in the census tract. Thus, rather than the built environment itself, physical activity levels in that census tract accounted for health outcome changes.
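The idea of mediation described above can be illustrated with a minimal Python sketch. It uses a simple product-of-coefficients decomposition on synthetic data, not the study's own causal framework, and every number in simulate_tracts (sample size, slopes, noise levels) is an assumption chosen for illustration. In the simulated data, crosswalk density affects obesity only through physical inactivity, so the estimated direct effect should be close to zero.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def _cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = _mean(a), _mean(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)


def mediation_effects(x, m, y):
    """Split the total effect of x on y into direct and indirect parts.

    Closed-form ordinary least squares:
      total     = slope of y regressed on x alone
      a         = slope of m regressed on x
      b, direct = coefficients of m and x when y is regressed on both
      indirect  = a * b (the part of the effect that runs through m)
    """
    var_x, var_m = _cov(x, x), _cov(m, m)
    cov_xm, cov_xy, cov_my = _cov(x, m), _cov(x, y), _cov(m, y)
    det = var_x * var_m - cov_xm ** 2
    total = cov_xy / var_x
    a = cov_xm / var_x
    b = (cov_my * var_x - cov_xm * cov_xy) / det
    direct = (cov_xy * var_m - cov_xm * cov_my) / det
    return {"total": total, "direct": direct, "indirect": a * b}


def simulate_tracts(n=2000, seed=0):
    """Synthetic tracts: x = crosswalk density, m = inactivity, y = obesity.

    All coefficients are invented; y depends on x only through m.
    """
    rng = random.Random(seed)
    x = [rng.uniform(0, 10) for _ in range(n)]
    m = [30 - 2 * xi + rng.gauss(0, 1) for xi in x]
    y = [5 + 0.8 * mi + rng.gauss(0, 1) for mi in m]
    return x, m, y


effects = mediation_effects(*simulate_tracts())
```

With ordinary least squares, the total effect equals the direct effect plus the indirect effect, so a near-zero direct effect means the association is carried almost entirely by the mediator.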

Each unit reduction in physical inactivity was associated with a decline in the prevalence of obesity and diabetes that was 4.17 and 17.2 times larger, respectively, than that associated with a single-unit change in crosswalk prevalence.

Built environment out of sync with GSV features

The GSV labels used to make inferences about the built environment within the city do not always match reality. For example, sidewalks may be labeled near bridges or highways where none exist, whereas a blocked sidewalk may be reported as absent.
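One simple way to quantify this kind of mismatch is to compare detector labels with ground-level observations for the same locations. The sketch below is a minimal Python illustration under assumed inputs: the function name label_agreement and the five example locations are hypothetical, and the study's own validation procedure may differ.

```python
def label_agreement(predicted, actual):
    """Compare detector labels with ground-truth observations for one feature.

    Both arguments are equal-length lists of booleans, one entry per location.
    Returns error counts plus precision and recall (None when undefined).
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    # False positive: e.g. a sidewalk labeled near a highway where none exists.
    fp = sum(1 for p, a in pairs if p and not a)
    # False negative: e.g. a blocked sidewalk reported as absent.
    fn = sum(1 for p, a in pairs if a and not p)
    precision = tp / (tp + fp) if tp + fp else None
    recall = tp / (tp + fn) if tp + fn else None
    return {
        "true_positives": tp,
        "false_positives": fp,
        "false_negatives": fn,
        "precision": precision,
        "recall": recall,
    }


# Made-up sidewalk labels for five hypothetical locations.
gsv_says_sidewalk = [True, True, True, False, False]
sidewalk_on_ground = [True, False, True, True, False]
agreement = label_agreement(gsv_says_sidewalk, sidewalk_on_ground)
```

Low precision would point to features being labeled where they do not exist, while low recall would point to real features being missed.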

The findings indicate that physical inactivity significantly mediates the relationship between environmental features and health outcomes, making behavior more influential than infrastructure.

Conclusions

Unlike previous studies, which relied on qualitative reviews to compare areas, the current study, for the first time, compares GSV features with ground-level reality.

The researchers used a causal framework to account for mediating factors like physical activity. This revealed that improving physical inactivity in 10% of the samples in the two lowest tertiles would reduce the prevalence of obesity and diabetes mellitus 4.17 and 17.2 times more, respectively, than a corresponding change in crosswalk prevalence.

Nevertheless, data limitations, as well as the changing status of the built environment, individual behavior, and consequent health outcomes, must be carefully specified when leveraging this type of data for public health interventions.

Journal reference:

  • Zhang, M., Rahman, S., Mhasawade, V., et al. (2024). Utilizing big data without domain knowledge impacts public health decision-making. PNAS Environmental Sciences. doi:10.1073/pnas.2402387121.