
Polygon Optimizer

Python, GeoPandas, Shapely, Vector Tiles, GIS, FastAPI

The Challenge

Carbon credit projects from multiple registries (ACR, VCS, VERRA, ISO, CAR, PURO) provide geospatial data in various formats and quality levels. These polygon datasets often contain:

  • Excessive coordinate density - Files with thousands of unnecessary vertices, resulting in bloated file sizes
  • Invalid geometries - Self-intersections, topology errors, and mixed coordinate dimensions
  • Inconsistent projections - Different coordinate reference systems across registries
  • Poor web performance - Raw GeoJSON files too large for efficient web mapping applications

These issues make it difficult to visualize and interact with carbon credit project data on web maps, especially when dealing with thousands of projects simultaneously.

The Solution

I built a seven-step geospatial processing pipeline that transforms raw polygon data into optimized vector tiles for high-performance web mapping:

01

Simplify & Unify Polygons

Uses the Douglas-Peucker algorithm via Shapely to reduce coordinate density while preserving geometric accuracy. Includes CRS normalization to WGS84, geometry validation and repair, 3D to 2D conversion, and adaptive tolerance based on coordinate density.
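
As a rough illustration of this step, the sketch below drops the Z dimension, repairs invalid geometry, and simplifies with a tolerance that grows with vertex density. The `adaptive_tolerance` heuristic and its constants are assumptions made for the example, not the pipeline's actual values, and the input is assumed to already be in WGS84 (in GeoPandas, `GeoDataFrame.to_crs(epsg=4326)` would handle reprojection).

```python
import math

from shapely.geometry import Polygon
from shapely.ops import transform
from shapely.validation import make_valid


def polygon_parts(geom):
    """Return the Polygon members of a Polygon, MultiPolygon, or collection."""
    if geom.geom_type == "Polygon":
        return [geom]
    return [g for g in getattr(geom, "geoms", []) if g.geom_type == "Polygon"]


def count_vertices(geom):
    """Count exterior and interior ring vertices across all polygon parts."""
    return sum(
        len(p.exterior.coords) + sum(len(r.coords) for r in p.interiors)
        for p in polygon_parts(geom)
    )


def adaptive_tolerance(geom, base=0.0001, max_tol=0.001):
    """Hypothetical heuristic: denser outlines get a larger tolerance (degrees)."""
    if geom.length == 0:
        return base
    density = count_vertices(geom) / geom.length  # vertices per degree of outline
    return min(max_tol, base * max(1.0, density / 100.0))


def clean_and_simplify(geom):
    """Flatten to 2D, repair if invalid, then simplify."""
    flat = transform(lambda x, y, z=None: (x, y), geom)  # drop the Z dimension
    if not flat.is_valid:
        flat = make_valid(flat)  # repair self-intersections and similar errors
    # preserve_topology=True selects Shapely's topology-preserving simplifier
    return flat.simplify(adaptive_tolerance(flat), preserve_topology=True)


# Made-up input: a dense 3D ring standing in for a registry polygon.
ring = [
    (math.cos(2 * math.pi * i / 4096), math.sin(2 * math.pi * i / 4096), 5.0)
    for i in range(4096)
]
raw = Polygon(ring)
cleaned = clean_and_simplify(raw)
print(count_vertices(raw), "->", count_vertices(cleaned))
```

Passing `preserve_topology=False` to `simplify` would select the plain Douglas-Peucker behaviour, which is faster but can produce invalid output.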

02

Compute Bounding Boxes

Generates spatial indices and bounding boxes for all projects, enabling efficient spatial queries and tile generation optimization.
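
A minimal sketch of the bounding-box idea, assuming projects are held as a dictionary of Shapely geometries keyed by a hypothetical project ID. The linear scan stands in for a real spatial index (for example GeoPandas' `sindex`), which would be the better choice with thousands of projects.

```python
from shapely.geometry import Polygon


def compute_bboxes(projects):
    """Map project ID -> (min_lon, min_lat, max_lon, max_lat)."""
    return {project_id: geom.bounds for project_id, geom in projects.items()}


def bboxes_intersecting(bboxes, query):
    """Return sorted IDs of projects whose bounding box overlaps the query box."""
    q_minx, q_miny, q_maxx, q_maxy = query
    return sorted(
        project_id
        for project_id, (minx, miny, maxx, maxy) in bboxes.items()
        if minx <= q_maxx and maxx >= q_minx and miny <= q_maxy and maxy >= q_miny
    )


# Made-up sample projects.
projects = {
    "A": Polygon([(0, 0), (2, 0), (2, 1), (0, 1)]),
    "B": Polygon([(10, 10), (11, 10), (11, 12), (10, 12)]),
}
bboxes = compute_bboxes(projects)
hits = bboxes_intersecting(bboxes, (1.5, 0.5, 5.0, 5.0))
```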

03

Filter & Enrich

Removes invalid projects, filters out geometry collections not supported by vector tiles, and enriches missing project data from multiple registry sources.
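
The filtering and enrichment could look roughly like the sketch below, which works on plain GeoJSON feature dictionaries. The `project_id` property name and the shape of `registry_lookup` are assumptions made for illustration; the real pipeline's field names are not shown in this write-up.

```python
SUPPORTED_TYPES = {"Polygon", "MultiPolygon"}


def filter_and_enrich(features, registry_lookup):
    """Keep polygonal features and fill empty properties from a lookup table."""
    kept = []
    for feature in features:
        geometry = feature.get("geometry")
        # Drop features with no geometry or an unsupported geometry type.
        if not geometry or geometry.get("type") not in SUPPORTED_TYPES:
            continue
        props = dict(feature.get("properties") or {})  # copy, do not mutate input
        extra = registry_lookup.get(props.get("project_id"), {})
        for key, value in extra.items():
            if props.get(key) in (None, ""):  # only fill missing or empty values
                props[key] = value
        kept.append({**feature, "properties": props})
    return kept


# Made-up sample data.
features = [
    {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]]},
        "properties": {"project_id": "P1", "name": ""},
    },
    {
        "type": "Feature",
        "geometry": {"type": "GeometryCollection", "geometries": []},
        "properties": {"project_id": "P2"},
    },
    {"type": "Feature", "geometry": None, "properties": {"project_id": "P3"}},
]
lookup = {"P1": {"name": "Example Forest Project", "registry": "ACR"}}
enriched = filter_and_enrich(features, lookup)
```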

04

Merge Projects

Consolidates filtered projects from all registries into a single unified GeoJSON file, maintaining project IDs and metadata for traceability.
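
One way to sketch the merge, assuming each registry's output is already a GeoJSON FeatureCollection held in memory. Tagging every feature with a `registry` property is an assumption for the example; it is one simple way to keep features traceable to their source.

```python
import json


def merge_registries(collections):
    """Combine per-registry FeatureCollections into one, tagging the source."""
    merged = []
    for registry, collection in collections.items():
        for feature in collection["features"]:
            props = dict(feature.get("properties") or {})
            props.setdefault("registry", registry)  # keep an existing value if set
            merged.append({**feature, "properties": props})
    return {"type": "FeatureCollection", "features": merged}


# Made-up sample input with geometry omitted for brevity.
collections = {
    "ACR": {
        "type": "FeatureCollection",
        "features": [
            {"type": "Feature", "geometry": None, "properties": {"project_id": "ACR-1"}}
        ],
    },
    "CAR": {
        "type": "FeatureCollection",
        "features": [
            {"type": "Feature", "geometry": None, "properties": {"project_id": "CAR-7"}}
        ],
    },
}
merged = merge_registries(collections)
merged_text = json.dumps(merged)  # ready to write to a single .geojson file
```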

05

Generate Vector Tiles

Uses Tippecanoe to create vector tiles in MBTiles format (zoom levels 0-12), with stronger simplification at lower zooms, while preserving all project properties, including IDs, for data linking.
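
For illustration, a Tippecanoe run for this zoom range might be driven from Python as sketched below. The file names and the `projects` layer name are hypothetical, and the exact flags the pipeline uses are not stated here; only the basic output, zoom, layer, and overwrite options are shown.

```python
import shutil
import subprocess


def tippecanoe_command(geojson_path, mbtiles_path, layer="projects",
                       min_zoom=0, max_zoom=12):
    """Build the argument list for a basic Tippecanoe run."""
    return [
        "tippecanoe",
        "-o", mbtiles_path,    # output MBTiles file
        "-Z", str(min_zoom),   # minimum zoom
        "-z", str(max_zoom),   # maximum zoom
        "-l", layer,           # layer name inside the tiles
        "--force",             # overwrite an existing output file
        geojson_path,
    ]


def build_tiles(geojson_path, mbtiles_path):
    """Run Tippecanoe, failing early if the binary is not available."""
    if shutil.which("tippecanoe") is None:
        raise RuntimeError("tippecanoe is not installed or not on PATH")
    subprocess.run(tippecanoe_command(geojson_path, mbtiles_path), check=True)


command = tippecanoe_command("merged.geojson", "projects.mbtiles")
```

Passing the arguments as a list rather than a shell string avoids quoting problems with unusual file names.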

06

Serve Tiles

FastAPI-based tile server that serves vector tiles with CORS support, enabling integration with web mapping libraries such as Mapbox GL JS and, through a vector tile plugin, Leaflet.

07

Export to S3

Automated deployment pipeline that exports optimized tiles to AWS S3 for global CDN distribution and high-availability access.
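
The export could be sketched as below, under the assumption that tiles are uploaded as individual `z/x/y.pbf` objects rather than as a single MBTiles file. The `client` argument is expected to behave like `boto3.client("s3")`; the prefix layout and cache lifetime are made up for the example.

```python
def tile_key(prefix, z, x, y):
    """Build the object key for one tile, e.g. 'tiles/v1/3/4/2.pbf'."""
    return f"{prefix}/{z}/{x}/{y}.pbf"


def upload_tiles(client, bucket, prefix, tiles):
    """Upload (z, x, y, data) tuples and return the keys that were written."""
    keys = []
    for z, x, y, data in tiles:
        key = tile_key(prefix, z, x, y)
        client.put_object(
            Bucket=bucket,
            Key=key,
            Body=data,
            ContentType="application/x-protobuf",
            ContentEncoding="gzip",  # assumes gzip-compressed tiles
            CacheControl="public, max-age=86400",  # hypothetical cache lifetime
        )
        keys.append(key)
    return keys
```

With real credentials configured, a call might look like `upload_tiles(boto3.client("s3"), "example-bucket", "tiles/v1", tiles)`, where the bucket name is a placeholder. Taking the client as a parameter also makes the function easy to exercise with a stub.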

Technical Stack

Core Libraries

  • GeoPandas - Geospatial data manipulation
  • Shapely - Geometric operations & validation
  • Tippecanoe - Vector tile generation
  • FastAPI - High-performance tile server

Infrastructure

  • Boto3 - AWS S3 integration
  • Mercantile - Tile coordinate systems
  • PostgreSQL/PostGIS - Spatial database support
  • MBTiles - Efficient tile storage format

Results & Impact

90%+ File Size Reduction
6+ Registries Supported
1000s of Projects Processed

The pipeline enables smooth, interactive web mapping of thousands of carbon credit projects. Vector tiles load quickly, panning and zooming stay fluid, and the optimized data structure supports efficient property lookups and filtering, while project IDs and metadata remain traceable back to the source registries.

Key Learnings

  • Adaptive simplification is crucial - different geometries require different tolerance levels based on their coordinate density and complexity
  • Topology healing through smart buffering can fix invalid geometries while preserving visual accuracy
  • Vector tiles are far superior to serving raw GeoJSON for web mapping, especially at scale
  • Preserving metadata through the pipeline enables linking visualizations back to source data
  • Multi-step processing with intermediate outputs allows for debugging and quality assurance at each stage
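
The topology-healing point can be illustrated with a zero-width buffer, a common Shapely repair trick. The sketch below is one possible reading of "smart buffering", not the pipeline's actual logic: it accepts the buffered result only if its area stays close to what `make_valid` produces, since `buffer(0)` can silently drop part of a self-crossing polygon. The 1% threshold is an assumption.

```python
from shapely.geometry import Polygon
from shapely.validation import make_valid


def heal(geom, max_area_loss=0.01):
    """Repair an invalid geometry, preferring buffer(0) when it keeps the area."""
    if geom.is_valid:
        return geom
    repaired = make_valid(geom)  # reference repair that keeps all areas
    buffered = geom.buffer(0)    # zero-width buffer rebuilds the polygon
    if (
        buffered.is_valid
        and repaired.area > 0
        and abs(buffered.area - repaired.area) / repaired.area <= max_area_loss
    ):
        return buffered
    return repaired


# A self-intersecting "bow-tie" whose two lobes each cover one square unit.
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])
healed = heal(bowtie)
print(bowtie.is_valid, healed.is_valid, healed.geom_type)
```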