Part 1

Accessing and Downloading Data

Downloading the Fire-Climate Classification Data

We use the fire-climate classification (FC) dataset developed by Senande-Rivera et al. (2022). This dataset divides the globe into 13 classes based on climatic zone (tropical, arid, boreal, or temperate) and fire risk occurrence. Fire risk is classified into three levels: recurrent (more than 7 fire-prone years per decade), occasional (3-7 fire-prone years per decade), and infrequent (0-3 fire-prone years per decade).
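The three fire risk levels can be sketched as a simple threshold rule. This is only an illustration: the function name is hypothetical, and because the value 3 appears in both the occasional and infrequent ranges described above, assigning exactly 3 years to "occasional" is an assumption rather than the dataset's definition.

```python
def fire_risk_level(fire_prone_years_per_decade):
    """Return the fire risk level for a count of fire-prone years per decade.

    Assumption: exactly 3 years counts as 'occasional', since the text lists
    3 in both the occasional and infrequent ranges.
    """
    if not 0 <= fire_prone_years_per_decade <= 10:
        raise ValueError("expected a value between 0 and 10")
    if fire_prone_years_per_decade > 7:
        return "recurrent"
    if fire_prone_years_per_decade >= 3:
        return "occasional"
    return "infrequent"


# Print the level for a few example counts
for years in (9, 5, 1):
    print(years, fire_risk_level(years))
```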

To work with the FC dataset, follow these steps:

  • Download the dataset
    • Visit the Dataverse page.
    • Click the ‘Access Dataset’ button at the top right of the page and, in the pop-up, select the ‘Download Zip’ option. This will begin downloading the dataset.

Additionally, browse through the README document to better understand how the data is organized.

  • Unzip and locate the file
    • After downloading, unzip the ‘dataverse_files’ folder to a location of your choice.
    • Navigate to ‘dataverse_files > Classification > 2-Classification’.

We will use the FC.nc file to extract the FC data. A ‘.nc’ file is a netCDF file; you can read more about the format here.

  • The classes in the data are coded as follows:
    • 11: Tropical-dry season-recurrent
    • 12: Tropical-dry season-occasional
    • 13: Tropical-dry season-infrequent
    • 21: Arid-flora-recurrent
    • 22: Arid-flora-occasional
    • 23: Arid-flora-infrequent
    • 31: Temperate-dry season-recurrent
    • 32: Temperate-dry season-occasional
    • 33: Temperate-dry season-infrequent
    • 41: Boreal-hot season-recurrent
    • 42: Boreal-hot season-occasional
    • 43: Boreal-hot season-infrequent
    • 0: Non fire-prone
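For later steps it can be convenient to keep these codes in a lookup table. The sketch below copies the list above into a dictionary; the names `FC_CLASS_LABELS` and `describe_fc_class` are hypothetical helpers, not part of the dataset or of any library.

```python
# Lookup table copied from the class list above
FC_CLASS_LABELS = {
    11: "Tropical-dry season-recurrent",
    12: "Tropical-dry season-occasional",
    13: "Tropical-dry season-infrequent",
    21: "Arid-flora-recurrent",
    22: "Arid-flora-occasional",
    23: "Arid-flora-infrequent",
    31: "Temperate-dry season-recurrent",
    32: "Temperate-dry season-occasional",
    33: "Temperate-dry season-infrequent",
    41: "Boreal-hot season-recurrent",
    42: "Boreal-hot season-occasional",
    43: "Boreal-hot season-infrequent",
    0: "Non fire-prone",
}


def describe_fc_class(code):
    """Return the label for an FC class code, or raise if the code is unknown."""
    try:
        return FC_CLASS_LABELS[int(code)]
    except KeyError:
        raise ValueError(f"Unknown FC class code: {code!r}") from None


# Print the labels for two example codes
print(describe_fc_class(21))
print(describe_fc_class(0))
```

Once the data is loaded into a GeoDataFrame (as `fire_vec` later in this section), the same dictionary could be applied to the code column with pandas' `Series.map`, for example `fire_vec['FC'].map(FC_CLASS_LABELS)`.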

We extract the relevant variable from the file and convert it into a vector format for subsequent data manipulation.

Converting the file (FC.nc) from netCDF format to vector (geoJSON) format

#Import packages
import netCDF4 as nc
import numpy as np
import geopandas as gpd
from shapely.geometry import shape
import xarray as xr
import matplotlib.pyplot as plt
from osgeo import gdal, osr, ogr
#Loading the fire netCDF file
fire_data = xr.open_dataset('path/to/your/file/FC.nc')

fire_data

In this case, FC is the variable of interest that we aim to convert into a vector format.

To convert the netCDF file to a vector format (GeoJSON), we define a function that opens the file with GDAL, retrieves the first raster band (FC.nc contains a single band, which stores the fire classification data), and creates a GeoJSON output with the same spatial reference as the raster. The function then adds an integer field to the output layer, polygonizes the raster band to convert the data into vector polygons, and finally closes both the output and raster files to release resources.

def netcdf_to_vector(netcdf_path, output_path, layer_name, field_name):
    # Open the raster file
    raster = gdal.Open(netcdf_path)
    if raster is None:
        raise FileNotFoundError(f"GDAL could not open {netcdf_path}")

    # Get the first raster band
    band = raster.GetRasterBand(1)

    # Get the raster projection
    proj = raster.GetProjection()
    shp_proj = osr.SpatialReference()
    shp_proj.ImportFromWkt(proj)

    # Create the output GeoJSON data source
    driver = ogr.GetDriverByName('GeoJSON')
    create_shp = driver.CreateDataSource(output_path)

    # Create the output layer
    shp_layer = create_shp.CreateLayer(layer_name, srs=shp_proj)
 
    # Create the new field
    new_field = ogr.FieldDefn(field_name, ogr.OFTInteger)
    
    shp_layer.CreateField(new_field)

    # Get the index of the new field
    dst_field = shp_layer.GetLayerDefn().GetFieldIndex(field_name)

    # Polygonize the raster band
    gdal.Polygonize(band, None, shp_layer, dst_field, [], callback=None)

    # Close and clean up (dereferencing flushes the output to disk)
    shp_layer = None
    create_shp = None
    raster = None

# Set the paths to the netCDF file and the output vector file, and define the layer_name and field_name variables
netcdf_path = 'path/to/your/file/FC.nc'
output_path = 'path/to/your/file/fire_classification_vector.geojson'
layer_name = 'fire_classification'
field_name = 'FC' #Variable of interest

#Running the function
netcdf_to_vector(netcdf_path, output_path, layer_name, field_name)
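
To build intuition for what the polygonize step does, the toy sketch below groups neighbouring cells that share a value into regions, each of which would become one polygon. It is a pure-Python illustration on a hand-made grid, not GDAL's implementation; the function name `connected_regions` is hypothetical, and the use of 4-connected neighbours (up, down, left, right) reflects our understanding of GDAL's default behaviour.

```python
from collections import deque


def connected_regions(grid):
    """Group 4-connected cells that share a value.

    Returns a list of (value, cells) tuples, where cells is a sorted list of
    (row, col) positions belonging to one region.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    regions = []
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c]:
                continue
            value = grid[r][c]
            cells = []
            queue = deque([(r, c)])
            seen[r][c] = True
            # Breadth-first search over neighbours with the same value
            while queue:
                cr, cc = queue.popleft()
                cells.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    inside = 0 <= nr < n_rows and 0 <= nc < n_cols
                    if inside and not seen[nr][nc] and grid[nr][nc] == value:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append((value, sorted(cells)))
    return regions


# Toy raster using a few FC codes
toy_grid = [
    [11, 11, 0],
    [11, 22, 0],
    [22, 22, 0],
]

# Print each region's value and its number of cells
for value, cells in connected_regions(toy_grid):
    print(value, len(cells))
```

Each region carries the raster value of its cells, which corresponds to the integer field that `netcdf_to_vector` writes for every polygon.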

Loading the fire classification vector to check that the conversion worked

# Read the GeoJSON file
fire_vec = gpd.read_file('path/to/your/file/fire_classification_vector.geojson')
fire_vec.dtypes

Since the fire classification data is coded numerically, it is currently stored as an integer variable. We will create a new categorical variable so that the classes are treated as discrete categories rather than as numbers.

fire_vec['FC_cat'] = fire_vec['FC'].astype("category")
fire_vec.dtypes
# Plot to make sure the FC vector file works
fire_vec.plot(column='FC_cat', legend=True)

We have converted our fire-climate classification netCDF file to GeoJSON format. Next, we will download and process the geospatial database of carbon offset projects.

Download the geospatial database of carbon offset projects

Download the geospatial database from this Zenodo repository.

The database includes six geopackages, each representing a different continent. For a comprehensive global analysis, we will use all of them. However, if you encounter performance issues due to the large size of the geopackages, you may choose to work with a subset instead.
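
If you do decide to work with a subset, one option is to select the geopackages by file name before loading them. The sketch below is a minimal illustration using only the standard library; the helper name `list_geopackages` and the file names in the demonstration are hypothetical, so check the actual names in your download.

```python
import tempfile
from pathlib import Path


def list_geopackages(folder, keep=None):
    """Return sorted .gpkg paths in folder, optionally filtered by name.

    keep: optional iterable of substrings; a file is kept if its name
    contains any of them (case-insensitive).
    """
    paths = sorted(Path(folder).glob("*.gpkg"))
    if keep is not None:
        wanted = [k.lower() for k in keep]
        paths = [p for p in paths if any(k in p.name.lower() for k in wanted)]
    return paths


# Demonstration with empty placeholder files (hypothetical file names)
with tempfile.TemporaryDirectory() as folder:
    for name in ("africa.gpkg", "asia.gpkg", "europe.gpkg", "notes.txt"):
        (Path(folder) / name).touch()
    subset = list_geopackages(folder, keep=["africa", "asia"])
    print([p.name for p in subset])
```

The returned paths can then be passed one by one to `gpd.read_file`, in the same way as the full set of geopackages.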

We will combine these six geopackages into a single unified geopackage.

Since this analysis focuses solely on Avoided Deforestation (AD) and Improved Forest Management (IFM) projects, we will create a new geopackage containing only these project types.

import pandas as pd
from glob import glob
import geopandas as gpd
# Load the geopackages, stack them, and filter for AD and IFM projects.
gpkg_path = 'path/to/your/geopackages'

# Collect one GeoDataFrame per geopackage
all_gdfs = []

for geopackage_path in glob(f"{gpkg_path}/*.gpkg"):
    
    # Load in the geopackage dataframe
    gdf = gpd.read_file(geopackage_path)
    all_gdfs.append(gdf)
        
# Stack the continental geodataframes (ignore_index avoids duplicate row labels)
all_gdfs = pd.concat(all_gdfs, axis=0, ignore_index=True)

# Filter for AD and IFM projects only
ad_ifm_gdf = all_gdfs[all_gdfs['Project Type'].isin(["AD", "IFM"])]

# Save the filtered dataset
ad_ifm_gdf.to_file('path/to/save/stacked/filtered/geopackage/data_ad_ifm.gpkg', driver='GPKG')

References

  1. Senande-Rivera, M., Insua-Costa, D. & Miguez-Macho, G. (2022). Spatial and temporal expansion of global wildland fire activity in response to climate change. Nat Commun 13, 1208. https://doi.org/10.1038/s41467-022-28835-2