StreetLens: Enabling Human-Centered AI Agents for
Neighborhood Assessment from Street View Imagery

Jina Kim^†

Leeje Jang^†

Yao-Yi Chiang^†

Guanyu Wang^‡

Michelle Pasco^‡

^† Department of Computer Science and Engineering, University of Minnesota

^‡ Family Social Science, College of Education and Human Development, University of Minnesota

Introduction

StreetLens is a human-centered, researcher-configurable workflow that embeds relevant social science expertise in a vision language model (VLM) for scalable neighborhood environmental assessments. StreetLens mimics the process of trained human coders: it grounds the analysis in questions derived from established interview protocols, retrieves relevant street view imagery (SVI), and generates a wide spectrum of semantic annotations, ranging from objective features (e.g., the number of cars) to subjective perceptions (e.g., the sense of disorder in an image).
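The three steps described above (protocol-derived questions, image retrieval, annotation) can be illustrated with a minimal Python sketch. This is not the StreetLens implementation: every name here (Question, StreetImage, retrieve_images, assess, stub_vlm) is hypothetical, the retrieval is a simple latitude/longitude box filter, and the VLM is replaced by a stub that returns canned answers. In practice the stub would be swapped for a call to whatever VLM the researcher configures.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Question:
    """One protocol-derived question (hypothetical structure)."""
    key: str   # short identifier, e.g. "cars"
    text: str  # the question put to the model
    kind: str  # "objective" or "subjective"


@dataclass
class StreetImage:
    """A street view image reference with its location (hypothetical structure)."""
    image_id: str
    lat: float
    lon: float


def retrieve_images(images: List[StreetImage], lat: float, lon: float,
                    radius_deg: float) -> List[StreetImage]:
    """Keep images whose coordinates fall inside a lat/lon box around a point."""
    return [
        im for im in images
        if abs(im.lat - lat) <= radius_deg and abs(im.lon - lon) <= radius_deg
    ]


def assess(questions: List[Question], images: List[StreetImage],
           ask_vlm: Callable[[StreetImage, Question], str]) -> Dict[str, Dict[str, str]]:
    """Ask every question about every image; return answers keyed by image id."""
    return {
        im.image_id: {q.key: ask_vlm(im, q) for q in questions}
        for im in images
    }


def stub_vlm(image: StreetImage, question: Question) -> str:
    """Placeholder for a real VLM call (assumption): returns canned answers."""
    return "0" if question.kind == "objective" else "unknown"
```

Passing the model as a callable keeps the question set and the retrieval step independent of any particular VLM, which is one way to keep such a workflow researcher-configurable.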

Jupyter Notebooks

1_data_exploration.ipynb
2_assess_neighborhood_environment.ipynb

Try Out StreetLens

This demo integrates data from the original case study (see the paper for details) and connects to Cloudflare through the University of Minnesota server. Use the toggle buttons one at a time to explore StreetLens. If you encounter any issues with this demo, please contact Jina Kim <kim01479@umn.edu>.


BibTeX

@misc{kim2025streetlensenablinghumancenteredai,
  title={StreetLens: Enabling Human-Centered AI Agents for Neighborhood Assessment from Street View Imagery},
  author={Jina Kim and Leeje Jang and Yao-Yi Chiang and Guanyu Wang and Michelle Pasco},
  year={2025},
  eprint={2506.14670},
  archivePrefix={arXiv},
  primaryClass={cs.HC},
  url={https://arxiv.org/abs/2506.14670},
}