Visualize Your Geospatial Data Like A Pro with Geomapviz 🗺️👨‍🔬

Thomas Bury
5 min read · May 13, 2023

Are you tired of looking at boring and static maps? Do you want to spice up your geospatial data visualization game? Look no further, my friend! Introducing Geomapviz 🎉🎉🎉 (web-documentation).

Geomapviz is a Python library that will help you create beautiful and interactive maps of your geospatial data. With Geomapviz, you can easily aggregate your tabular data at the geoid level and merge it with a shapefile. The library provides a simple API to plot the average for single or multiple columns.
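
To make the aggregation step concrete, here is a minimal sketch in plain pandas of what "aggregate at the geoid level and merge with a shapefile" amounts to. It is not Geomapviz's internal code: the column names are invented, and the small `shapes` table only stands in for the GeoDataFrame that a real shapefile would provide.

```python
import pandas as pd

# Hypothetical tabular data: several observations per geographic id
obs = pd.DataFrame(
    {
        "geoid": ["A", "A", "B", "B", "B", "C"],
        "value": [1.0, 3.0, 2.0, 4.0, 6.0, 5.0],
    }
)

# Stand-in for the shapefile table: one row per geographic entity.
# In a real workflow this would be a GeoDataFrame with a geometry column.
shapes = pd.DataFrame(
    {"geoid": ["A", "B", "C", "D"], "name": ["a", "b", "c", "d"]}
)

# 1. aggregate the observations at the geoid level
avg = obs.groupby("geoid", as_index=False)["value"].mean()

# 2. merge the averages onto the shapes; entities without data get NaN
merged = shapes.merge(avg, on="geoid", how="left")
print(merged)
```

The left merge keeps every entity of the shape table, so regions without observations stay on the map instead of silently disappearing.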

But wait, there’s more! Geomapviz can also produce a single map or a panel of maps, making it useful for comparing how different models capture geographical patterns. The package even supports returning average values either raw or automatically binned.

And did I mention that you can customize the background color? You can switch from a boring old white background burning your eyes to a dark one!

Let me show you some examples of what Geomapviz can do.

pip install -U geomapviz

Artificial data

First up, let’s create a dummy dataset to test out the library’s features. For instance, let’s take the greatest country in the world, Belgium, which is obviously the first military and economic power in the universe. The load_shp function loads the country’s shapefile, and from it we generate an artificial dataset: ten rows per geographic entity, with features built from the geographic coordinates, random noise, and a weight column that you can play around with. This lets you experiment with the library without having to bring your own geospatial data.

import numpy as np
import pandas as pd

# load_shp ships with Geomapviz; check the documentation for the exact import path

# load the shapefile
shp_file = load_shp(country="BE")
geom_df = shp_file.copy()

# a feature correlated with the entity id (INS code), rescaled to [0, 1]
feat_1 = np.repeat(np.log10(geom_df.INS.astype(int).values), 10)
feat_1 = (feat_1 - feat_1.min()) / (feat_1.max() - feat_1.min())

# standardized coordinates, repeated 10 times per entity
X = (np.repeat(geom_df.long.values, 10) - (geom_df.long.mean())) / geom_df.long.std()
Y = (np.repeat(geom_df.lat.values, 10) - (geom_df.lat.mean())) / geom_df.lat.std()

# dummy data: 10 synthetic observations per geographic entity
bel_df = pd.DataFrame(
    {
        "geoid": np.repeat(geom_df.INS.values, 10),
        "truth": (1 - Y + X + Y**3) * np.exp(-(X**2 + Y**2)),
        "feat_2": (1 - Y**3 + X**3 + Y**5) * np.exp(-(X**2 + Y**2))
        + np.random.beta(0.5, 0.5, size=len(feat_1)),
        "feat_3": (1 + Y * X + Y**3) * np.exp(-(X**2 + Y**2))
        + np.random.beta(0.5, 0.5, size=len(feat_1)),
        "feat_4": feat_1 + np.random.beta(5, 2, size=len(feat_1)),
        "weight": np.random.random(size=len(feat_1))
        * (1 - Y + X + Y**3)
        * np.exp(-(X**2 + Y**2)),
    }
)

# attach the names of the larger administrative units to each row
bel_df = bel_df.merge(
    geom_df[["INS", "borough", "district"]], left_on="geoid", right_on="INS"
)

bel_df.head()
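
One practical note on the dummy data: the np.random calls are unseeded, so the noise (and therefore the maps) changes at every run. A minimal sketch of how to make the noise reproducible with NumPy's Generator API is shown here. The seed value and variable names are arbitrary illustrative choices, not something Geomapviz requires.

```python
import numpy as np

# A seeded generator yields the same draws on every run
rng = np.random.default_rng(42)
noise_a = rng.beta(0.5, 0.5, size=5)

# Re-creating the generator with the same seed reproduces the draws
rng = np.random.default_rng(42)
noise_b = rng.beta(0.5, 0.5, size=5)

print(np.array_equal(noise_a, noise_b))
```

Swapping `np.random.beta(...)` for `rng.beta(...)` in the dataset code would be enough to get identical figures from one run to the next.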

Setting the style

To set up the style using the PlotOptions in Geomapviz, you can define the colors and other visual features of the maps such as the figure size, font size, and title. This allows for a great deal of customization to ensure that the visualization is tailored to your needs. The PlotOptions also allow you to control the legend, the colorbar, and other aspects of the plot, giving you complete control over the final product.

options = PlotOptions(
    # data arguments
    df=bel_df,
    target="truth",
    other_cols_avg=["feat_2", "feat_3", "feat_4"],
    # weights arguments
    weight=None,
    plot_weight=False,
    # geospatial arguments
    dissolve_on=None,
    geoid="INS",
    shp_file=shp_file,
    # uncertainty arguments
    distr="gaussian",
    plot_uncertainty=False,
    # style arguments
    background=None,
    figsize=(5, 5),
    ncols=2,
    cmap=None,
    facecolor="#2b303b",
    nbr_of_dec=None,
    # binning arguments
    autobin=False,
    normalize=False,
    n_bins=7,
)

# plot the map
f = spatial_average_plot(options=options)

And just like that, we have a beautiful map of Belgium 🇧🇪, showing the average of the target for each geographical entity. With just a few lines of code, we were able to create this stunning visualization that is sure to impress.

Illustration of the mean value for each geographical entity. Image by the author.

Dissolving per province

Dissolving per province is a useful technique to gain a broader understanding of geospatial data. By combining smaller geographic entities into larger ones, we get a more general view of a region, which can reveal patterns that would otherwise be missed. This is particularly effective when analyzing data at the national or regional level, and in Geomapviz it only takes setting the dissolve_on option to the column that holds the larger unit.

options.dissolve_on = "borough"
f = spatial_average_plot(options=options)

Aggregate by province for a less detailed view. Image by the author.
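
For readers who want to see what dissolving means for the numbers, the sketch below mimics the attribute side of a dissolve with plain pandas: rows are grouped by a coarser key and averaged, optionally with weights. The geometry side (merging the polygons, which GeoPandas handles in `GeoDataFrame.dissolve`) is left out. The `weighted_mean_by` helper and the toy columns are hypothetical and not part of Geomapviz, whose exact aggregation may differ.

```python
import pandas as pd


def weighted_mean_by(df, by, value, weight=None):
    """Average `value` per group in `by`, optionally weighted by `weight`."""
    if weight is None:
        return df.groupby(by)[value].mean()
    # weighted mean = sum(w * x) / sum(w), computed per group
    w_sum = df.groupby(by)[weight].sum()
    wx_sum = (df[value] * df[weight]).groupby(df[by]).sum()
    return wx_sum / w_sum


# Toy data: four small entities nested in two larger ones
toy = pd.DataFrame(
    {
        "geoid": ["1", "2", "3", "4"],
        "province": ["North", "North", "South", "South"],
        "truth": [1.0, 3.0, 10.0, 20.0],
        "weight": [1.0, 3.0, 1.0, 1.0],
    }
)

plain = weighted_mean_by(toy, "province", "truth")
weighted = weighted_mean_by(toy, "province", "truth", weight="weight")
print(plain)
print(weighted)
```

In this toy example the plain and weighted averages differ for "North" (2.0 versus 2.5) because the second entity carries three times the weight of the first.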

Adding a background

With Geomapviz, it’s easy to add a background map: set the background option to a tile provider, such as OpenStreetMap or Stamen, or to the path of your own raster file. This takes just a few lines of code, allowing you to quickly customize the look of your map.

import contextily as cx

# Downloading the background
options.background = cx.providers.Stamen.TonerLite
f = spatial_average_plot(options=options)

# Using your own background
from pathlib import Path

p = Path().absolute()
be_tif = p.parents[0].joinpath("src/geomapviz/bckgd/belgium_sd.tif")

options.background = be_tif
f = spatial_average_plot(options=options)

Adding a background. Image by the author.

Facet plot and auto-binning

The facet plot in Geomapviz displays multiple maps side by side, making it easy to compare them. Additionally, Geomapviz offers an auto-binning feature that can automatically bin the data and even normalize it w.r.t. a given column, which makes differences between columns easier to read. Together, these features provide a powerful tool for visualizing and comparing data across different regions.

# A bigger figure
options.figsize = (16, 12)
# Autobinning
options.autobin = True
# Normalize w.r.t. the target column
options.normalize = True

# Facet plot function
f = spatial_average_facetplot(options=options)

The predictors and target are normalized and binned at the province level. Image by the author.
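
The idea behind normalizing and binning can be sketched with plain pandas and NumPy. This is an illustration under assumptions, not the library's actual algorithm: "normalize" is taken here to mean dividing every column by the mean of the target column, and the bins are equal-width intervals shared by all columns. The column names and values are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical per-entity averages for a target and one predictor
avg = pd.DataFrame(
    {"truth": [1.0, 2.0, 3.0, 4.0], "feat_2": [2.0, 2.0, 4.0, 8.0]},
    index=["A", "B", "C", "D"],
)

# Assumed normalization: express every column relative to the target's mean
normalized = avg / avg["truth"].mean()

# Equal-width bin edges shared by all columns, so the panels stay comparable
n_bins = 3
lo = normalized.min().min()
hi = normalized.max().max()
edges = np.linspace(lo, hi, n_bins + 1)

# Replace each value by the integer index of the bin it falls in
binned = normalized.apply(
    lambda col: pd.cut(col, bins=edges, labels=False, include_lowest=True)
)
print(binned)
```

Sharing the bin edges across columns is a design choice for this sketch: it means the same color stands for the same value range in every panel of a facet plot.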

Netherlands

# load the Dutch shapefile (4-digit postal code areas)
shp_file = load_shp(country="NL")
geom_df = shp_file.copy()
geom_df["PC4CODE"] = geom_df["PC4CODE"].astype(str)

# a feature correlated with the entity id (postal code), rescaled to [0, 1]
feat_1 = np.repeat(np.log10(geom_df["PC4CODE"].astype(int).values), 10)
feat_1 = (feat_1 - feat_1.min()) / (feat_1.max() - feat_1.min())

# standardized coordinates, repeated 10 times per entity
X = (np.repeat(geom_df.XCOORD.values, 10) - (geom_df.XCOORD.mean())) / geom_df.XCOORD.std()
Y = (np.repeat(geom_df.YCOORD.values, 10) - (geom_df.YCOORD.mean())) / geom_df.YCOORD.std()

# dummy data: 10 synthetic observations per geographic entity
nl_df = pd.DataFrame(
    {
        "geoid": np.repeat(geom_df["PC4CODE"].values, 10),
        "truth": (1 - Y + X + X * Y**3) * np.exp(-(X**2 + Y**2)),
        "feat_2": (1 - Y**3 + X**3 + Y**5) * np.exp(-(X**2 + Y**2))
        + np.random.beta(0.5, 0.5, size=len(feat_1)),
        "feat_3": (1 + Y * X + Y**3) * np.exp(-(X**2 + Y**2))
        + np.random.beta(0.5, 0.5, size=len(feat_1)),
        "feat_4": feat_1 + np.random.beta(5, 2, size=len(feat_1)),
        "weight": np.random.random(size=len(feat_1))
        * (1 - Y + X + Y**3)
        * np.exp(-(X**2 + Y**2)),
    }
)

# attach the province and municipality names to each row
nl_df = nl_df.merge(
    geom_df[["PC4CODE", "PROVC_NM", "GEMNAAM"]], left_on="geoid", right_on="PC4CODE"
)

nl_df.head()

Set the style using the dedicated data class:

options = PlotOptions(
    # data arguments
    df=nl_df,
    target="truth",
    other_cols_avg=["feat_2", "feat_3", "feat_4"],
    # weights arguments
    weight=None,
    plot_weight=False,
    # geospatial arguments
    dissolve_on=None,
    geoid="PC4CODE",
    shp_file=shp_file,
    # uncertainty arguments
    distr="gaussian",
    plot_uncertainty=False,
    # style arguments
    background=cx.providers.OpenStreetMap.Mapnik,  # nl_tif,
    figsize=(5, 5),
    ncols=2,
    cmap=None,
    facecolor="#2b303b",
    nbr_of_dec=None,
    # binning arguments
    autobin=False,
    normalize=False,
    n_bins=7,
)

Plot the map:

f = spatial_average_plot(options=options)

Adding an OpenStreetMap background. Image by the author.

And more

Explore the additional functionalities such as interactive maps and confidence intervals available in the documentation notebook.

Thomas Bury

Physicist by passion and training, Data Scientist and MLE for a living (it's fun too), interdisciplinary by conviction.