Why Map Buildings That Don't Officially Exist?
Nairobi's informal settlements house over 2 million people. On official government maps, many of these areas appear as blank spaces. No building footprints. No road networks. No service infrastructure. This isn't an oversight — it's a policy consequence. Structures built without formal permits don't get formal maps.
But you can't plan water infrastructure for a community you can't see. You can't route emergency services through streets that don't exist on any map. You can't estimate population density without building counts.
At Spatial Collective, our mission was to make these communities visible on the map — literally.
The Mapping Pipeline
Data Collection: Street-Level Imagery
Before you can digitize buildings, you need to see them. In satellite imagery of informal settlements, corrugated iron roofs often run together, making individual structures hard to tell apart. Our solution was to capture street-level 360° imagery with Mapillary-compatible cameras:
- GoPro MAX cameras mounted on bicycle handlebars
- Community mappers cycling through settlement paths (many too narrow for cars)
- Mapillary upload pipeline — images geotagged and uploaded for the global community
- Quality checkpoints — coverage completeness verified against settlement boundaries
Digitization: JOSM and the Art of Tracing
JOSM (Java OpenStreetMap Editor) was our primary tool for converting imagery into map data. Each building footprint required:
- Identifying structure boundaries from combined satellite + street-level imagery
- Tracing the polygon with enough vertices to capture the building shape
- Tagging attributes — building material (concrete, iron sheet, wood), levels, roof type
- Quality validation — checking topology (no overlapping buildings), completeness, and tag accuracy
A skilled mapper could digitize 80-120 buildings per hour. At 47,000+ buildings across three settlements, this was a scale problem that required an organized workflow.
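To make the scale concrete, here is a back-of-the-envelope estimate built only from the figures above. It is a rough sketch: it assumes a constant tracing rate and ignores validation, tagging, and field time.

```python
# Rough first-pass tracing effort, using only the figures quoted above.
# Assumption: validation, tagging, and field time are ignored here.
TOTAL_BUILDINGS = 47_000          # lower bound from the text ("47,000+")
RATE_SLOW, RATE_FAST = 80, 120    # buildings per hour for a skilled mapper


def tracing_hours(buildings: int, rate_per_hour: int) -> float:
    """Mapper-hours needed to trace `buildings` at a constant hourly rate."""
    return buildings / rate_per_hour


best_case = tracing_hours(TOTAL_BUILDINGS, RATE_FAST)
worst_case = tracing_hours(TOTAL_BUILDINGS, RATE_SLOW)
print(f"{best_case:.0f} to {worst_case:.0f} skilled mapper-hours")
```

That works out to roughly 390 to 590 hours of skilled tracing before any review, which is why the work had to be split across many people.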
The Microtasking Approach
We couldn't rely on a handful of expert mappers. Instead, we built a microtasking pipeline:
- Settlement area divided into grid cells (~100m × 100m each)
- Each cell assigned to a mapper via the HOT Tasking Manager
- First pass: trace buildings (quantity over perfection)
- Second pass: validate and correct (experienced mappers review)
- Third pass: attribute enrichment (add building tags from street imagery)
This three-pass approach meant that even novice mappers contributed useful data in pass one, while experienced mappers caught and corrected errors in pass two before attributes were added in pass three.
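The grid-and-passes idea can be sketched in a few lines. This is an illustration only, not the HOT Tasking Manager's API: the `Cell` and `Pass` names, the `make_grid` helper, and the example extent are all hypothetical, and coordinates are assumed to be in a projected system measured in metres.

```python
import math
from dataclasses import dataclass
from enum import Enum


class Pass(Enum):
    TRACE = 1       # first pass: trace buildings
    VALIDATE = 2    # second pass: experienced mappers review
    ENRICH = 3      # third pass: add tags from street imagery
    DONE = 4


@dataclass
class Cell:
    col: int
    row: int
    min_x: float
    min_y: float
    size_m: float
    stage: Pass = Pass.TRACE

    def advance(self) -> None:
        """Move the cell to the next pass; DONE is terminal."""
        if self.stage is not Pass.DONE:
            self.stage = Pass(self.stage.value + 1)


def make_grid(min_x, min_y, max_x, max_y, size_m=100.0):
    """Cover a projected bounding box (metres) with square cells."""
    cols = math.ceil((max_x - min_x) / size_m)
    rows = math.ceil((max_y - min_y) / size_m)
    return [
        Cell(c, r, min_x + c * size_m, min_y + r * size_m, size_m)
        for r in range(rows)
        for c in range(cols)
    ]


# Hypothetical 450 m x 250 m extent in projected coordinates.
grid = make_grid(0.0, 0.0, 450.0, 250.0)
grid[0].advance()  # first cell has been traced and now awaits validation
print(len(grid), grid[0].stage.name)
```

Tracking the stage per cell, rather than per mapper, is what lets novices and validators work on different parts of the settlement at the same time.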
Data Quality at Scale
The Topology Problem
When 30+ mappers trace buildings independently across adjacent grid cells, boundary artifacts appear. Building A traced by Mapper 1 overlaps with Building B traced by Mapper 2. Roads traced in one cell don't connect to roads in the adjacent cell.
We ran automated quality checks using JOSM's validator and custom scripts:
- Overlap detection — flagged any buildings with >5% polygon overlap
- Gap detection — identified building clusters where imagery clearly showed structures but no footprints existed
- Road connectivity — verified that road segments connected at cell boundaries
- Tag consistency — ensured building material tags used standard OSM vocabulary
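As an illustration of the overlap check, the sketch below flags pairs of footprints whose shared area exceeds a threshold. It is not our production script. It assumes convex footprints in projected metres, uses Sutherland-Hodgman clipping in place of a GIS library, and measures the 5% relative to the smaller of the two footprints, which is an assumption since the rule above does not say which area is the reference.

```python
from itertools import combinations


def signed_area(poly):
    """Shoelace formula; positive when vertices run counter-clockwise."""
    ring = zip(poly, poly[1:] + poly[:1])
    return sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in ring) / 2.0


def clip(subject, clipper):
    """Sutherland-Hodgman clipping of `subject` against a convex `clipper`."""
    if signed_area(clipper) < 0:          # the edge test below expects CCW order
        clipper = clipper[::-1]

    def inside(p, a, b):
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def crossing(p, q, a, b):
        # Point where segment p-q meets the infinite line through a and b.
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for a, b in zip(clipper, clipper[1:] + clipper[:1]):
        if not output:
            break
        points, output = output, []
        prev = points[-1]
        for cur in points:
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    output.append(crossing(prev, cur, a, b))
                output.append(cur)
            elif inside(prev, a, b):
                output.append(crossing(prev, cur, a, b))
            prev = cur
    return output


def find_overlaps(footprints, threshold=0.05):
    """Return (id_a, id_b, ratio) for pairs overlapping by more than `threshold`."""
    flagged = []
    for (id_a, poly_a), (id_b, poly_b) in combinations(footprints.items(), 2):
        shared = abs(signed_area(clip(poly_a, poly_b)))
        smaller = min(abs(signed_area(poly_a)), abs(signed_area(poly_b)))
        if smaller > 0 and shared / smaller > threshold:
            flagged.append((id_a, id_b, shared / smaller))
    return flagged


# Hypothetical footprints in projected metres.
footprints = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(9, 0), (19, 0), (19, 10), (9, 10)],            # overlaps A by 10%
    "C": [(18.8, 0), (28.8, 0), (28.8, 10), (18.8, 10)],  # overlaps B by 2%
}
for id_a, id_b, ratio in find_overlaps(footprints):
    print(f"{id_a} and {id_b} overlap by {ratio:.0%} of the smaller footprint")
```

Comparing every pair is quadratic in the number of buildings, so at tens of thousands of footprints a real check would first narrow candidates with a spatial index.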
The Attribution Problem
OpenStreetMap has conventions. A building's roof material should be tagged roof:material=metal, not roof=iron_sheet or material=mabati (the Swahili word for corrugated iron). With mappers from diverse backgrounds, tag consistency required:
- Custom JOSM presets — dropdown menus with only valid tag values
- Automated tag correction scripts run post-upload
- Weekly quality reviews with the mapping team
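A tag correction script can be as simple as a lookup table. The sketch below is a minimal, hypothetical version: the `CORRECTIONS` table contains only the two non-standard variants mentioned above, and the `normalize_tags` function name is an assumption for illustration rather than the script we ran.

```python
# Hypothetical lookup from non-standard (key, value) pairs to the convention
# described above. Only the two variants named in the text are included.
CORRECTIONS = {
    ("roof", "iron_sheet"): ("roof:material", "metal"),
    ("material", "mabati"): ("roof:material", "metal"),
}


def _target(key, value):
    """Return the corrected (key, value) pair, or None if no fix is known."""
    return CORRECTIONS.get((key.strip().lower(), value.strip().lower()))


def normalize_tags(tags):
    """Return a copy of `tags` with known non-standard pairs rewritten."""
    # Keep every tag that needs no correction.
    fixed = {k: v for k, v in tags.items() if _target(k, v) is None}
    # Then add corrected tags, without overwriting values already present.
    for k, v in tags.items():
        target = _target(k, v)
        if target is not None:
            fixed.setdefault(*target)
    return fixed


print(normalize_tags({"building": "yes", "roof": "iron_sheet"}))
```

The function never overwrites a standard tag a mapper already set, so a disagreement between a stray tag and a valid one is left for the weekly human review instead of being resolved silently.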
PostGIS: From Footprints to Intelligence
Raw building polygons in OSM are useful. Loading them into PostgreSQL with PostGIS turns them into intelligence:
-- Buildings per settlement, with total footprint area in square metres
SELECT settlement_name,
       COUNT(*) AS building_count,
       -- EPSG:32737 is WGS 84 / UTM zone 37S, the zone that covers Nairobi
       SUM(ST_Area(ST_Transform(buildings.way, 32737))) AS total_area_sqm
FROM buildings
JOIN settlement_boundaries
  ON ST_Within(buildings.way, settlement_boundaries.boundary)
GROUP BY settlement_name;

-- Population estimation (avg 4.2 persons per structure), rounded to whole people
SELECT settlement_name,
       ROUND(COUNT(*) * 4.2) AS estimated_population
FROM buildings
JOIN settlement_boundaries
  ON ST_Within(buildings.way, settlement_boundaries.boundary)
GROUP BY settlement_name;
These queries powered dashboards used by urban planners, NGOs, and government agencies. For the first time, decision-makers had data-driven answers to "how many people live in Kibera?" that went beyond census estimates.
From Buildings to Infrastructure Planning
The building map was never the end goal — it was the foundation. Once you know where structures are, you can analyze:
- Service coverage — what percentage of buildings are within 200m of a water point?
- Access — which buildings are more than 50m from the nearest mapped path?
- Density hotspots — where are building-to-land ratios highest (indicating overcrowding)?
- Change detection — comparing quarterly imagery to identify new construction or demolition
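The service coverage question is the easiest of these to sketch. The example below is a simplified illustration with hypothetical coordinates: it assumes building centroids and water points are already in projected metres, and it uses straight-line distance, which understates real walking distance along settlement paths. In PostGIS the same test is usually written with ST_DWithin.

```python
import math


def share_within(buildings, water_points, max_dist_m=200.0):
    """Fraction of building centroids within max_dist_m of any water point."""
    if not buildings:
        return 0.0
    covered = sum(
        1
        for bx, by in buildings
        if any(math.hypot(bx - wx, by - wy) <= max_dist_m for wx, wy in water_points)
    )
    return covered / len(buildings)


# Hypothetical centroids and a single water point, in projected metres.
buildings = [(0, 0), (150, 0), (400, 0), (0, 300)]
water_points = [(50, 0)]
print(f"{share_within(buildings, water_points):.0%} of buildings are within 200 m")
```

The access question (buildings more than 50m from a mapped path) follows the same pattern, with point-to-line distance in place of point-to-point distance.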
The Impact
After 18 months of systematic mapping:
- 47,000+ building footprints digitized and uploaded to OpenStreetMap
- 3 major settlements fully mapped — Kibera, Mathare, Mukuru
- 120+ community mappers trained in JOSM and field surveying
- 15 organizations using the data for infrastructure planning
- Open data — every footprint freely available on OpenStreetMap for anyone to use
The most rewarding outcome wasn't the count. It was hearing that a water infrastructure project used our building density data to optimize pipe routing through Mukuru — serving 12,000 more households than the original plan because they could see, for the first time, where people actually lived.
What I Learned
Mapping is political. Making informal settlements visible on maps challenges power structures that benefit from their invisibility. We navigated this carefully, working with community leaders and local government rather than around them.
Quality scales with process, not people. The three-pass microtasking approach produced better results than having a small team of experts. Process design mattered more than individual skill.
Open data multiplies impact. By contributing everything to OpenStreetMap rather than a proprietary database, we enabled uses we never anticipated — from academic research to ride-hailing route optimization. The data continues to be used and maintained by the global OSM community years after our focused mapping campaigns ended.