Leveraging free, code-first tools to iterate toward advanced analytics
Alex Zajichek
Research Data Scientist, Cleveland Clinic
February 27, 2025
A Little Background
Who is QHS?
Department of 120+ biostatisticians, data scientists, programmers, and others who collaborate on and provide quantitative support for research activities at Cleveland Clinic
From clinical trials and study design to precision medicine, population health, AI in medicine, and more, across many disease areas
My area focuses on clinical prediction modeling and observational statistical analysis, primarily using EHR and/or registry data
# Load packages
library(tidyverse)
library(tidycensus)
library(mapgl)

# Import WI tracts
wi_tracts <-
  arcgislayers::arc_read(
    url = "https://tigerweb.geo.census.gov/arcgis/rest/services/Generalized_ACS2023/Tracts_Blocks/MapServer/4",
    where = "STATE = '55'"
  )

# Extract median income by tract
dat <-
  get_acs(
    geography = "tract",
    variables = "B19013_001", # Median income
    state = "WI",
    year = 2022,
    progress_bar = FALSE
  ) |>

  # Join to get boundaries
  inner_join(
    y = wi_tracts |> select(GEOID, geometry),
    by = "GEOID"
  ) |>

  # Make an information column
  mutate(
    Info = paste0(str_remove(NAME, ";.+$"), "<br>Median Income ($): ", round(estimate))
  ) |>

  # Convert to spatial data frame
  sf::st_as_sf()

# Make the map
maplibre() |>

  # Focus the mapping area
  fit_bounds(dat) |>

  # Fill with the data values
  add_fill_layer(
    id = "mc_acs",
    source = dat,
    fill_outline_color = "black",
    fill_color =
      interpolate(
        column = "estimate",
        values = range(dat$estimate, na.rm = TRUE),
        stops = c("#f2d37c", "#08519c"),
        na_color = "gray"
      ),
    fill_opacity = 0.50,
    popup = "Info"
  ) |>

  # Add a legend matching the fill color scale
  add_legend(
    legend_title = "Median income ($)",
    values = range(dat$estimate, na.rm = TRUE),
    colors = c("#f2d37c", "#08519c")
  )