August 18, 2024

Geospatial Querying & Map Interactions


Intro & Motivation#

Suppose we wanted to build a map that renders a marker for each location in a dataset. The problem? The dataset contains thousands and thousands of locations. If our application wanted to retrieve this data via a network request, we'd have to request thousands of records at once when the map first loads. Since the round trip of the request will scale based on the number of locations, we can expect this solution to be slow when the map first loads.

Alright, so maybe we can avoid the network round trip by reading from a file on disk? However, this makes the dataset harder to update: the file becomes the single source of truth, and every change has to be safely written to it.

Even if we don't care about the data being stale, it may simply be too expensive to render all the data on a map (consider a data set with millions of locations). And while there are ways to circumvent rendering tons of map data at once (e.g. clustering), we still face the issue of having millions of data points in memory.

At this point, the logical answer is to paginate the data - what if we only loaded the locations near the user's current map viewport? This way, we avoid loading all the data at once and limit the amount of data in memory at any given time. This is geospatial querying. Chances are you've interacted with something like this before, such as on a "store locator", a common feature for businesses with brick-and-mortar locations that renders a map with markers for nearby stores.

Starting Simple - Rendering a Map with Everything#

We're going to explore this problem through the motivating example of building a map that renders markers for airports in the U.S., using this data set as our ground truth. From this dataset, there are approximately 20,000 airports in the U.S.

There are two key problems we'll have to solve when building something like this:

When to query for new data (based on map interactions)
How to query for new data (implementing geospatial queries)

Let's start with Problem 1.

Since the dataset we're using in this example is relatively small (a few MB), we can afford to load all the data in memory. Thus, we can start our example by loading all the data at once and rendering it on a map. Here's what we'll be using:

Leaflet: A popular open-source JavaScript library for interactive maps.
OpenStreetMap: Map tiles that Leaflet will render to display a map.
Dataset: A dataset of all airports in the U.S.

Massaging the Dataset#

First, we'll need to convert the data to a format that our map can easily render. GeoJSON is the standard for describing geometry geospatially, and is directly supported by Leaflet. I wrote a simple Python script to ingest the CSV and output a GeoJSON file:

airport_csv_to_json.py

import csv
import json
 
def csv_to_json(csv_file, json_file):
    features = []
 
    with open(csv_file, newline='') as file:
        reader = csv.DictReader(file)
        next(reader)  # Skip secondary CSV header
        for row in reader:
            features.append({
                "type": "Feature",
                "properties": {
                    "name": row['name'],
                    "id": row['ident'],
                },
                "geometry": {
                    "type": "Point",
                    "coordinates": [float(row['longitude_deg']), float(row['latitude_deg'])]
                }
            })
 
    with open(json_file, 'w') as file:
        file.write(json.dumps({
            "type": "FeatureCollection",
            "features": features
        }, indent=2))
 
 
if __name__ == "__main__":
    csv_to_json("airports.csv", "airports.json")

The resulting GeoJSON file looks like this, and is ~7MB large:

airports.json

{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "name": "Los Angeles International Airport",
        "id": "KLAX"
      },
      "geometry": {
        "type": "Point",
        "coordinates": [
          -118.4079971,
          33.94250107
        ]
      }
    },
    ...
  ]
}

Rendering the Map#

Now that we have our data in a format that Leaflet can render, let's render the map with Leaflet. I'll be working in a fresh TypeScript project with an HTML file that imports the GeoJSON output and renders each location with a map marker. We're not going to dive into implementation details here, since this step is pretty basic Leaflet usage. Note that we're simulating loading the data over the network by adding in a delay, and we're using a loading overlay to indicate that the data is being "fetched":

index.ts

import * as L from "leaflet";
import "leaflet/dist/leaflet.css";
import * as data from "./airports.json";
 
type AirportFeature = {
  type: "Feature";
  properties: {
    name: string;
    id: string;
  };
  geometry: {
    type: "Point";
    coordinates: [number, number];
  };
};
 
type AirportCollection = {
  type: "FeatureCollection";
  features: AirportFeature[];
};
 
const MAP_ROOT_ID = "map";
const LOADING_OVERLAY_ID = "loading-overlay";
 
const DEFAULT_ZOOM = 10;
const DEFAULT_CENTER: L.LatLngTuple = [38.90051858534433, -77.05347061157227];
const MAX_ZOOM = 19;
const MIN_ZOOM = 8;
 
function initializeMap(): L.Map {
  const map = L.map(MAP_ROOT_ID)
    .setView(DEFAULT_CENTER, DEFAULT_ZOOM)
    .setMinZoom(MIN_ZOOM)
    .setMaxZoom(MAX_ZOOM);
  L.tileLayer("https://tile.openstreetmap.org/{z}/{x}/{y}.png").addTo(map);
 
  return map;
}
 
async function fetchData(): Promise<AirportCollection> {
  const originalData = data as AirportCollection;
  const artificialDelayMs = 500;
  return new Promise((resolve) => {
    setTimeout(() => resolve(originalData), artificialDelayMs);
  });
}
 
function featureToPopup(feature: AirportFeature) {
  const { name, id } = feature.properties;
  const [lon, lat] = feature.geometry.coordinates;
  return `<div>
 
<div>
  <b>
    ${name} - ${id}
  </b>
</div>
<div>
  (Lon, Lat): (${lon.toFixed(5)}, ${lat.toFixed(5)})
</div>
 
  </div>`;
}
 
function featureToMarker(_feature: AirportFeature, coords: L.LatLng): L.Layer {
  return L.marker(coords, {
    icon: L.icon({
      iconUrl: "./map-pin.svg",
      iconSize: [32, 32],
      iconAnchor: [16, 16],
    }),
  });
}
 
async function loadDataToMap(
  map: L.Map,
  layerGroup: L.LayerGroup
): Promise<void> {
  const data = await fetchData();
 
  layerGroup.clearLayers();
 
  L.geoJson(data, {
    pointToLayer: (feature: AirportFeature, coords: L.LatLng) =>
      featureToMarker(feature, coords).addTo(layerGroup),
    onEachFeature: (feature: AirportFeature, layer: L.Layer) =>
      layer.bindPopup(featureToPopup(feature)),
  }).addTo(layerGroup);
 
  layerGroup.addTo(map);
}
 
function displayLoadingOverlay(show: boolean): void {
  const overlay = document.getElementById(LOADING_OVERLAY_ID);
  if (!overlay) return; // Nothing to toggle if the overlay element is missing
  overlay.style.display = show ? "block" : "none";
}
 
function main() {
  const map = initializeMap();
  const markerLayerGroup = L.layerGroup().addTo(map);
 
  async function queryMap(): Promise<void> {
    try {
      displayLoadingOverlay(true);
      await loadDataToMap(map, markerLayerGroup);
    } finally {
      displayLoadingOverlay(false);
    }
  }
 
  queryMap();
}
 
main();

Even from a brief interaction with this first map, we immediately observe performance issues with rendering. When dragging the map, the FPS drops to a whopping 5-10 FPS. This is because our map is rendering 20,000 DOM nodes at once:

It's worth noting that Leaflet and its community have answers for this DOM performance bottleneck - rendering all markers on a single canvas layer (reducing DOM nodes), various clustering options, etc. Even with these options, we're facing the problem of getting this data to our map in the first place, along with keeping this data in memory.

Let's see how we can do better.

Only Rendering Data in View#

With our boilerplate set up, we now have to tackle the first phase of the problem - rendering only the data that would appear in the user's current view of the map. This is where geospatial querying comes in, which allows us to query data based on where it's located. There are two common geospatial queries:

Bounding Box Query: Retrieves data that falls within a given bounding box region. A bounding box can be defined by its bottom-left (south-west) and top-right (north-east) corners.
Radial Query: Retrieves data that falls within a given radius of a point.

Query Types
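The radial query isn't used in the rest of this post, but for comparison, here's a minimal in-memory sketch of one. It assumes a spherical Earth with a mean radius of 6,371 km and uses the haversine formula. The `Point` type and the function names are made up for illustration.

```typescript
// Hypothetical point type; not part of Leaflet or the airport dataset.
type Point = { lat: number; lon: number };

const EARTH_RADIUS_KM = 6371; // Mean Earth radius, assuming a sphere

// Great-circle distance between two points using the haversine formula.
function haversineKm(a: Point, b: Point): number {
  const toRad = (deg: number) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Naive radial query: scan every point and keep those within the radius.
function withinRadius(points: Point[], center: Point, radiusKm: number): Point[] {
  return points.filter((p) => haversineKm(center, p) <= radiusKm);
}
```

Like the naive bounding box filter, this scans every point, so it only illustrates what the query means, not how a database would execute it.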

The viewport of our map itself can be expressed as a bounding box, so let's use a bounding box query. We'll be discussing how these are implemented in a database in a later section. For now, we're just going to simulate a naïve bounding box query in memory by iterating through all the data and checking which ones are within the bounds of a given box. Let's modify the fetchData function to accept a bounding box filter.

async function fetchDataWithinBounds(
  boundingBox: L.LatLngBounds
): Promise<AirportCollection> {
  const originalData = data as AirportCollection;
  const artificialDelayMs = 500;
  return new Promise((resolve) => {
    setTimeout(() => {
      resolve({
        type: originalData.type,
        features: originalData.features.filter((feature) => {
          const [lon, lat] = feature.geometry.coordinates;
          return boundingBox.contains([lat, lon]);
        }),
      });
    }, artificialDelayMs);
  });
}

Something tricky: the GeoJSON format stores coordinates in [longitude, latitude] order, but Leaflet's LatLng class expects [latitude, longitude] order. That's why we're flipping the coordinates when checking if a point is contained within the bounding box.
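One way to keep this flip from being scattered around the codebase is to funnel it through a single helper. The sketch below is only an illustration: `GeoJsonPosition`, `LatLngTuple`, and `toLatLngTuple` are hypothetical names defined here, not Leaflet exports.

```typescript
// GeoJSON positions are ordered [longitude, latitude].
type GeoJsonPosition = [number, number];
// Local stand-in for a Leaflet-style tuple, ordered [latitude, longitude].
type LatLngTuple = [number, number];

// Hypothetical helper that makes the axis swap explicit in one place.
function toLatLngTuple([lon, lat]: GeoJsonPosition): LatLngTuple {
  return [lat, lon];
}
```

With something like this in place, the filter above could call `boundingBox.contains(toLatLngTuple(feature.geometry.coordinates))` instead of destructuring and swapping inline.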

With the bounding box query implemented, let's update our loadDataToMap function to accept a bounding box:

async function loadDataToMap(
  map: L.Map,
  layerGroup: L.LayerGroup,
  queryBounds: L.LatLngBounds
): Promise<void> {
  const data = await fetchDataWithinBounds(queryBounds);
 
  layerGroup.clearLayers();
 
  L.geoJson(data, {
    pointToLayer: (feature: AirportFeature, coords: L.LatLng) =>
      featureToMarker(feature, coords).addTo(layerGroup),
    onEachFeature: (feature: AirportFeature, layer: L.Layer) =>
      layer.bindPopup(featureToPopup(feature)),
  }).addTo(layerGroup);
 
  layerGroup.addTo(map);
}

And lastly, we need to update our main function to pass the map viewport bounding box to loadDataToMap. First, let's create a function to get the bounding box of a map:

function getCurrentQueryRegion(map: L.Map): L.LatLngBounds {
  return map.getBounds();
}

Then, in main:

function main() {
  const map = initializeMap();
  const markerLayerGroup = L.layerGroup().addTo(map);
 
  async function queryMap(): Promise<void> {
    try {
      displayLoadingOverlay(true);
      await loadDataToMap(map, markerLayerGroup, getCurrentQueryRegion(map));
    } finally {
      displayLoadingOverlay(false);
    }
  }
 
  queryMap();
}

Here's what we get:

The good news: the map interactions are significantly smoother now thanks to lower DOM overhead. The bad news: when the user interacts with the map, the data doesn't update. We need to re-query the data whenever the user moves the map.

Triggering a Query#

We need to listen to actions that would change the map viewport, and "re-query" the data based on the new viewport. Typically, these actions are dragging and zooming. In our main function, let's register event listeners that retrieve new data based on the new viewport:

map.on("zoomend", queryMap);
map.on("dragend", queryMap);

And now, the data updates on interaction!

This seems great, but we've got a new problem. As seen in the video, when the user drags/zooms even just a little, we're requerying the data. This is unnecessary if the viewport hasn't significantly changed.

Pitfall: Drag/Zoom Thresholds#

At this point, you may be tempted to make a mistake that I once did: defining custom thresholds for the size of a user's drag/zoom to trigger a new query. This introduces quite a bit of complexity.

Let's walk through this approach: we'll define thresholds for when the user drags by a certain amount, or zooms by a certain amount. When the drag/zoom exceeds the threshold, we trigger a new query. We'll need two pieces of state, one for the last center point where the map was queried, and one for the last zoom level where the map was queried. We'll then compare the new center point and zoom level to the old ones, and trigger a new query if the difference exceeds the threshold. This is our first piece of complexity - two pieces of state that are of different units and have their own meanings, which means they have to be updated separately.

How do we decide what the thresholds are? We can define them as a fixed distance, such as a drag of 500 meters, but the on-screen size of that distance changes with the zoom level. We'll have to define the drag threshold as a percentage of the viewport size instead, and do something similar for the zoom. However, zooms are different from drags. For one, a zoom level is an arbitrary unit. Moreover, a zoom in will always show a subset of the data that was previously shown, but the same cannot be promised for a drag. Our event handler for a zoom will have to account for this.

We also have to be careful to track the state at the point of the last query, rather than saving the state at the point of the last drag/zoom. This is because the user can make a series of small drags that add up to a large distance, and we only care about the final distance.

Solution: Viewport Overlap#

A simpler solution is to track the ends rather than the means - we only care about the viewport after the user drags or zooms, not the actual dragging or zooming itself. Let's track the last bounding box that was queried, and compare it to the current viewport of the map. If they are sufficiently different, we can trigger a new query. This works identically for dragging and zooming, and the only state we need to store is the last bounding box queried.

Of course, we now need to define what "sufficiently different" means. We can define this as the percentage of overlap between the last queried bounding box and the current map viewport. If the percentage is below a certain threshold, we can trigger a new query.

Let $\textit{before}$ be the last queried bounding box, defined by its bottom-left corner $(bx_{1}, by_{1})$ and top-right corner $(bx_{2}, by_{2})$ . Let $\textit{after}$ be the current viewport, defined by its bottom-left corner $(ax_{1}, ay_{1})$ and top-right corner $(ax_{2}, ay_{2})$ .

If $\textit{after}$ is contained within $\textit{before}$ , we treat the percentage as 100%, since everything in the new viewport has already been queried. This covers the case of a user zooming in (the viewport after zooming in is always a subset of the previous viewport). This also covers the (unlikely) case of the two viewports being exactly the same. In either case, we would not want to trigger a new query.
If $\textit{before}$ and $\textit{after}$ are completely disjoint, the percentage is 0%. This covers the case of a user dragging to a completely different region of the map. In this case, we would always want to trigger a new query.
When the viewports partially overlap, let $\textit{intersect}$ be the bounding box formed by the intersection of the two boxes. The percentage of overlap is the area of the intersection divided by the area of the union of the two boxes:

$\% \textit{overlap} = \textit{area}(\textit{intersect}) / (\textit{area}(\textit{before}) + \textit{area}(\textit{after}) - \textit{area}(\textit{intersect}))$

Now, the only question is how to find $\textit{intersect}$ . We can take a look at some cases of overlap, with $\textit{before}$ in blue and $\textit{after}$ in red. The corners of the bounding box are shown with black dots.

Bounding box overlap

It becomes apparent that the bottom-left corner of $\textit{intersect}$ is a point that maximizes the $x$ and $y$ components of the two bottom-left corners, and the top-right corner is a point that minimizes the $x$ and $y$ components of the two top-right corners. In other words, $\textit{intersect}$ is defined by a bottom-left corner $(\max(bx_{1}, ax_{1}), \max(by_{1}, ay_{1}))$ and top-right corner $(\min(bx_{2}, ax_{2}), \min(by_{2}, ay_{2}))$ .
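As a quick sanity check with made-up coordinates, take $\textit{before}$ with corners $(0, 0)$ and $(4, 4)$ , and $\textit{after}$ with corners $(2, 2)$ and $(6, 6)$ :

$\textit{intersect} = ((\max(0, 2), \max(0, 2)), (\min(4, 6), \min(4, 6))) = ((2, 2), (4, 4))$

$\% \textit{overlap} = 4 / (16 + 16 - 4) = 4 / 28 \approx 14\%$

So a diagonal drag of half the viewport leaves only about 14% overlap, which we'd want to count as a substantial change.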

Let's implement this in code. We'll define a separate utils.ts file for handling these calculations:

utils.ts

import * as L from "leaflet";
 
export function areaOfBounds(bounds: L.LatLngBounds): number {
  const { lat: north, lng: east } = bounds.getNorthEast();
  const { lat: south, lng: west } = bounds.getSouthWest();
  return Math.abs(north - south) * Math.abs(east - west);
}
 
export function boundsOverlap(
  before: L.LatLngBounds,
  after: L.LatLngBounds
): number {
  if (!after.overlaps(before)) return 0;
  if (before.contains(after)) return 1;
 
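  // Naming note: "x" holds latitude and "y" holds longitude here. The overlap
  // math is symmetric in the two axes, so the result is unaffected.
  const { lat: bx1, lng: by1 } = before.getSouthWest();
  const { lat: bx2, lng: by2 } = before.getNorthEast();
  const { lat: ax1, lng: ay1 } = after.getSouthWest();
  const { lat: ax2, lng: ay2 } = after.getNorthEast();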
 
  const intersect: L.LatLngBounds = L.latLngBounds(
    [Math.max(ax1, bx1), Math.max(ay1, by1)],
    [Math.min(ax2, bx2), Math.min(ay2, by2)]
  );
 
  const intersectArea = areaOfBounds(intersect);
 
  return (
    intersectArea / (areaOfBounds(before) + areaOfBounds(after) - intersectArea)
  );
}

A few notes:

The "area" computed by areaOfBounds is not a true surface area: latitude and longitude are angular measurements, so multiplying degrees by degrees is not meaningful on its own. It works here only because we use it to compute a ratio between overlapping boxes.
We have the luxury of using Leaflet's contains and overlaps methods. If your mapping library doesn't have these, they can easily be implemented with simple comparisons of the bounding boxes' corners.
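For a mapping library without these helpers, the corner comparisons could look roughly like the sketch below. The `Box` type is a hypothetical stand-in for a bounds object, and the sketch assumes boxes that don't wrap around the antimeridian. It mirrors my understanding of Leaflet's behavior: containment allows shared edges, while overlap requires some shared area.

```typescript
// Hypothetical plain bounding box: edges in degrees.
type Box = { south: number; west: number; north: number; east: number };

// True when `inner` lies entirely inside `outer` (edges may touch).
function contains(outer: Box, inner: Box): boolean {
  return (
    inner.south >= outer.south &&
    inner.north <= outer.north &&
    inner.west >= outer.west &&
    inner.east <= outer.east
  );
}

// True when the two boxes share some area (touching edges don't count).
function overlaps(a: Box, b: Box): boolean {
  return (
    a.south < b.north &&
    a.north > b.south &&
    a.west < b.east &&
    a.east > b.west
  );
}
```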

Let's use our boundsOverlap function to selectively trigger a query. First, we can get rid of event listeners for drag and zoom, and instead register a single listener for when the map view changes, moveend. Then, let's update main to store the last queried bounds, and only trigger a new query if the percentage of overlap is at or below 70% (i.e., the area shared by the old and new bounding boxes is at most 70% of their combined area). The higher the threshold, the less change is required to trigger a new query.

import { boundsOverlap } from "./utils";
 
const MAX_QUERY_OVERLAP_THRESHOLD = 0.7;
 
...
 
function main() {
  const map = initializeMap();
  const markerLayerGroup = L.layerGroup().addTo(map);
 
  let lastQueriedBounds: L.LatLngBounds = getCurrentQueryRegion(map);
 
  async function queryMap(): Promise<void> {
    const queryRegion = getCurrentQueryRegion(map);
    try {
      displayLoadingOverlay(true);
      await loadDataToMap(map, markerLayerGroup, queryRegion);
      lastQueriedBounds = queryRegion;
    } finally {
      displayLoadingOverlay(false);
    }
  }
 
  queryMap();
 
  map.on("moveend", () => {
    if (
      boundsOverlap(lastQueriedBounds, getCurrentQueryRegion(map)) <=
      MAX_QUERY_OVERLAP_THRESHOLD
    ) {
      queryMap();
    }
  });
}

Let's make another UX improvement by making our query a bit more eager. Rather than querying for points that are exactly within the map bounding box, we can instead query for a bounding box that's a slight scale factor larger than the viewport. This creates a buffer zone, allowing the user to drag/zoom around a bit without triggering a new query:

Query Region Padding

We can modify our getCurrentQueryRegion function to use Leaflet's pad method, which extends the bounding box in each direction by a given ratio of its width and height (this can also be done manually by offsetting each corner by that fraction of the box's dimensions):

const QUERY_BUFFER_RATIO = 0.25;
 
...
 
function getCurrentQueryRegion(map: L.Map): L.LatLngBounds {
  return map.getBounds().pad(QUERY_BUFFER_RATIO);
}

Nice, we can make small drags/zooms around the map without triggering a new query until we've moved a significant distance. Also note that zooming in never triggers a new query.
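For a mapping library without a pad helper, the same padding can be sketched by hand. The `Box` type and `padBox` function below are hypothetical, and the arithmetic reflects my understanding of how Leaflet's pad behaves: each side moves outward by the ratio times the box's height or width, so a ratio of 0.25 grows each dimension by 50% in total.

```typescript
// Hypothetical plain bounding box: edges in degrees.
type Box = { south: number; west: number; north: number; east: number };

// Extend a box on every side by `ratio` times its height/width.
function padBox(box: Box, ratio: number): Box {
  const latBuffer = (box.north - box.south) * ratio;
  const lngBuffer = (box.east - box.west) * ratio;
  return {
    south: box.south - latBuffer,
    west: box.west - lngBuffer,
    north: box.north + latBuffer,
    east: box.east + lngBuffer,
  };
}

// A 1° x 2° box padded by 0.25 becomes 1.5° x 3°.
const padded = padBox({ south: 38, west: -78, north: 39, east: -76 }, 0.25);
console.log(padded); // { south: 37.75, west: -78.5, north: 39.25, east: -75.5 }
```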

Moving the Query to a Database#

Until now, we've been simulating a geospatial bounding box query using a function that filters locations on whether the point is contained within a bounding box. Let's see how we can implement this with a SQL-based query.

SELECT *
FROM airports
WHERE latitude_deg BETWEEN :min_lat AND :max_lat
  AND longitude_deg BETWEEN :min_lon AND :max_lon;

This query works, but without an index on these columns it must filter through every record in our table. Effectively, it's doing the same thing as our in-memory client-side query.

Enter geospatial databases, which introduce data types and queries for handling geospatial data. They also provide a spatial index, which allows for more efficient querying of a record's location.

In my previous experience with geospatial databases, I was using AWS DynamoDB with their Geo Library, which uses a geohash as a spatial index. Unfortunately, this tool no longer appears to be actively updated, with their examples now being archived.
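To give a feel for how a geohash can act as a spatial index, here's a minimal sketch of the generic geohash encoding. This is not the DynamoDB Geo Library's implementation, and `encodeGeohash` is a name I made up. The idea is to repeatedly halve the longitude and latitude ranges, record one bit per halving, and emit a base-32 character for every five bits.

```typescript
const BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

// Encode a latitude/longitude pair as a geohash of `precision` characters.
function encodeGeohash(lat: number, lon: number, precision: number): string {
  const latRange: [number, number] = [-90, 90];
  const lonRange: [number, number] = [-180, 180];
  let hash = "";
  let bits = 0;
  let bitCount = 0;
  let useLon = true; // Bits alternate between axes, starting with longitude

  while (hash.length < precision) {
    const range = useLon ? lonRange : latRange;
    const value = useLon ? lon : lat;
    const mid = (range[0] + range[1]) / 2;
    if (value >= mid) {
      bits = (bits << 1) | 1; // Upper half: record a 1 and raise the lower bound
      range[0] = mid;
    } else {
      bits = bits << 1; // Lower half: record a 0 and lower the upper bound
      range[1] = mid;
    }
    useLon = !useLon;
    bitCount++;
    if (bitCount === 5) {
      hash += BASE32.charAt(bits);
      bits = 0;
      bitCount = 0;
    }
  }
  return hash;
}
```

Points that fall in the same cell share a prefix, so looking up a region can become a prefix or range lookup on an ordinary string index. Nearby points that sit on opposite sides of a cell boundary won't share a prefix, which is why real implementations typically also check neighboring cells.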

Perhaps the most mature of these tools is PostGIS, a PostgreSQL extension that brings geospatial capabilities to PostgreSQL.

Setting Up PostGIS#

Let's get set up with PostGIS (this assumes you're already set up with PostgreSQL). On Mac, I installed PostGIS with Homebrew:

brew install postgis

Then, let's enable PostGIS on our database, create our table, and initialize a spatial index on the location column:

CREATE EXTENSION postgis;
 
CREATE TABLE IF NOT EXISTS public.airports (
 id int generated by default as identity primary key,
 code text not null,
 name text not null,
 location GEOGRAPHY(POINT) not null
);
 
CREATE INDEX airports_geo_index
  on public.airports
  using GIST (location);

Notice that our location column is of type GEOGRAPHY(POINT). The GEOGRAPHY type is used to model features on the Earth's surface. I'll use a Python script similar to the one above to populate the table.

The index we created is a GiST (Generalized Search Tree) index, a generic index structure. The inner workings are fascinating and beyond the scope of this article, but I recommend reading the PostGIS docs on the topic. Essentially, we are creating an index on the bounding box of each feature.

Populating/Querying the Table#

Let's use another Python script to fill our table with contents from the original CSV:

populate_airports.py

import csv
import psycopg2
 
def populate_db(csv_file):
    conn = psycopg2.connect("dbname='geospatial' host='127.0.0.1' port='5432'")
 
    features = []
 
    with open(csv_file, newline='') as file:
        reader = csv.DictReader(file)
        next(reader)  # Skip secondary CSV header
        for row in reader:
            features.append({
                "name": row['name'].replace("'", ""),
                "code": row['ident'],
                "lat": float(row['latitude_deg']),
                "lng": float(row['longitude_deg'])
            })
 
    values_string = ",\n".join(
        [f"(\'{feature['code']}\', \'{feature['name']}\', ST_POINT({feature['lng']}, {feature['lat']}, 4326))" for feature in features])
    query_string = f"INSERT INTO airports (code, name, location) VALUES {values_string};"
 
    cur = conn.cursor()
    cur.execute(query_string)
    conn.commit()
 
if __name__ == "__main__":
    populate_db("airports.csv")

Notice that we're inserting data into the location column using ST_POINT, which creates a point geometry that is cast to GEOGRAPHY(POINT) when stored in the location column. The 4326 argument is the SRID (Spatial Reference Identifier) for the WGS 84 coordinate system, the standard for GPS coordinates. Strictly speaking, we don't have to pass the SRID here, since WGS 84 is the default for the GEOGRAPHY type.

Now, we can try querying for a bounding box in the Washington D.C. area (just like our previous examples):

-- When selecting the lat and lng, we cast the stored geography to geometry
-- and use the ST_Y and ST_X functions to extract the latitude and longitude.
SELECT id, name, st_y(location::geometry) AS lat, st_x(location::geometry) AS lng
 FROM public.airports
 -- The && operator is the intersection operator.
 -- The ST_SetSRID function sets the coordinate system of the bounding box
 -- to the WGS84 coordinate system.
 WHERE location && ST_SetSRID(ST_MakeBox2D(ST_Point(-77.1198, 38.7916), ST_Point(-76.9094, 38.9955)), 4326)
 LIMIT 5;

"id"	"name"	"lat"	"lng"
20	"Ronald Reagan Washington National Airport"	38.8521	-77.037697
1411	"College Park Airport"	38.9805984497	-76.9223022461
14623	"Inova Alexandria Hospital Heliport"	38.8226013184	-77.10410308840001
14727	"Natl Hosp For Orthopaedics/Rehabilitation Heliport"	38.8480987549	-77.07689666750001
15480	"Prince Georges Hospital Center Heliport"	38.930301666259766	-76.9207992553711

We now have geospatial querying! This opens up the door for new user experiences, such as querying/sorting based on proximity to the center point of the map, containment within more complex geometries, and more.

Sidebar: Database Contents#

Let's take a look in our DB to see what's happening. We notice that our DB has another table, spatial_ref_sys. When we created the PostGIS extension, it created a table of spatial reference systems. There are 8,500 rows, one of which is the WGS 84 coordinate system we're using (at srid 4326)!

SELECT * FROM spatial_ref_sys WHERE srid = 4326;

In our airports table, we can see that the location column is of type geography:

PostGIS Table

The data in this column isn't human readable - it stores an encoded version of the geography feature. We can use the ST_AsText function to convert it to a human-readable format:

SELECT ST_AsText(location) FROM airports LIMIT 2;

PostGIS Data

This is why we had to use the ST_Y and ST_X functions to extract the lat/lng from our points.

Something fun: in pgAdmin, clicking the map icon on the location column will display all the results on a map!

Replacing the In-Memory Query#

All we have to do is surface this query via an API, and we can replace our in-memory query with a network request. For our toy example, let's make a simple Express server that will query the database for airports within a bounding box, and map the results to a GeoJSON response:

server.js

const express = require("express");
const bodyParser = require("body-parser");
 
const app = express();
const port = 4000;
const cors = require("cors");
 
const Pool = require("pg").Pool;
const pool = new Pool({
  host: "127.0.0.1",
  database: "geospatial",
  port: 5432,
});
 
app.use(cors());
app.use(bodyParser.json());
app.use(bodyParser.urlencoded({ extended: true }));
 
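app.get("/airports", async (request, response) => {
  // x1/x2 are latitudes and y1/y2 are longitudes (south-west, then north-east corner).
  const [x1, y1, x2, y2] = ["x1", "y1", "x2", "y2"].map((key) =>
    parseFloat(request.query[key])
  );
  if ([x1, y1, x2, y2].some(Number.isNaN)) {
    response.status(400).send("Missing or invalid query parameters");
    return;
  }
  // Pass the values as bind parameters rather than interpolating them into the SQL.
  const records = await pool.query(
    `
    SELECT id, name, st_y(location::geometry) AS lat, st_x(location::geometry) AS lng
        FROM public.airports
        WHERE location && ST_SetSRID(ST_MakeBox2D(ST_Point($1::float8, $2::float8), ST_Point($3::float8, $4::float8)), 4326)
    `,
    [y1, x1, y2, x2]
  );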
 
  const result = {
    type: "FeatureCollection",
    features: records.rows.map((row) => ({
      type: "Feature",
      geometry: { type: "Point", coordinates: [row.lng, row.lat] },
      properties: { id: row.id, name: row.name },
    })),
  };
 
  response.status(200).json(result);
});
 
app.listen(port, () => {
  console.log(`API running on port ${port}.`);
});

In our client map, we'll update the fetchDataWithinBounds function to make a fetch request to our API:

index.ts

const BASE_URL = "http://localhost:4000";
 
...
 
async function fetchDataWithinBounds(
  boundingBox: L.LatLngBounds
): Promise<AirportCollection> {
  const lowerLeft = boundingBox.getSouthWest();
  const upperRight = boundingBox.getNorthEast();
 
  const url = new URL(`${BASE_URL}/airports`);
  url.searchParams.append("x1", lowerLeft.lat.toString());
  url.searchParams.append("y1", lowerLeft.lng.toString());
  url.searchParams.append("x2", upperRight.lat.toString());
  url.searchParams.append("y2", upperRight.lng.toString());
 
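  const res = await fetch(url.href);
  // Surface HTTP errors instead of trying to parse an error body as GeoJSON.
  if (!res.ok) {
    throw new Error(`Airport query failed with status ${res.status}`);
  }
 
  return (await res.json()) as AirportCollection;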
}

And for the moment of truth:

And that's it! We're successfully querying our database for airports based on the map viewport. Our client no longer relies on in-memory data and filtering, and our method of persistence is scalable and reliable (compared to a local file in the client).

Conclusion#

We've explored the problem of rendering large amounts of geospatial data on a map, and accomplished our initial goals:

Rendering Data in View: Improved map rendering performance and kept client memory clean by only rendering data in the user's map viewport.
Triggering a Query: Triggered new queries based on the user's map interactions, but only when merited to provide a smoother user experience.
Geospatial Querying in a Database: We've persisted our data in a database, a reliable storage method that allows for more complex queries and scales better than in-memory solutions.

There are a few improvements we could make in a more complete implementation:

Unbounded Zooming: Right now, we've limited the zoom level so that a user can't zoom too far out, which would trigger a potentially large query and cause a render bottleneck. We could lift this restriction by updating our query. If our dataset included a column of popularity/size, we could use this to filter our query so only results above a certain popularity are shown. The popularity filter could scale with the zoom level, such that only the most popular airports are queried when zoomed out, and smaller airports appear when zoomed in.
Seamless Loading: In our toy example, we query when the user's interaction with the map ends, resulting in a loading spinner briefly flashing on screen. What if we wanted the experience to be completely seamless? We could implement event listeners for drag and zoom that anticipate the user's next bounding box based on the velocity of their drag and query in advance. Of course, this needs to be carefully implemented/debounced so that the API is not inundated with requests.
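
The debouncing mentioned in the last item could be sketched as below. This is a generic debounce helper, not something from the project above. `debounce`, `requestRegion`, and the 250 ms wait are all assumptions for illustration. The wrapper delays the call until the user has stopped triggering it for a short while, and only the most recent arguments are used.

```typescript
// Returns a wrapper that delays calling `fn` until `waitMs` have passed
// without another call; only the arguments of the last call are used.
function debounce<Args extends unknown[]>(
  fn: (...args: Args) => void,
  waitMs: number
): (...args: Args) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: Args) => {
    if (timer !== undefined) clearTimeout(timer); // Cancel the pending call
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Hypothetical usage: `requestRegion` stands in for the real query function.
const requested: string[] = [];
const requestRegion = debounce((region: string) => {
  requested.push(region);
}, 250);
```

A wrapper like this could be attached to Leaflet's move event so that a burst of events during a drag results in a single request once the movement settles.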