Part 1

Downloading the data

Downloading Global Forest Watch: Tree Cover Loss Data

To access the data, follow these steps:

  • Visit the Global Forest Watch data page.
  • Scroll down to the “Download Instructions” section.
  • Click on the desired tile on the world map.
  • Below the map, select the third link to download the loss year tile.

For this analysis, we utilized all available tiles. However, running this analysis is computationally expensive and you can download tiles required for your area of interest.

Converting the individual tiles to a VRT

Once you’ve saved the tiles to a folder, we are going to stitch them into a VRT. A VRT is a virtual raster file that acts as a mosaic of multiple raster datasets without physically merging them into a single file. We will create a function to do this and then run it on the files.

#Loading required packages
import os
import subprocess
def create_vrt_from_tifs(tif_directory, output_vrt):
    """Iterates through .tif files in the specified directory and stitches them into a single VRT
    
    Args: 
    tif_directory: path to the directory where the files are stored
    output_vrt: path to the folder where you want the generated vrt to be saved
    
    Returns:
    Generates a VRT file from the input tifs and saves it to the specified output directory
    """
    
    # Collect all .tif files in the specified directory
    tif_files = [os.path.join(tif_directory, f) for f in os.listdir(tif_directory) if f.endswith('.tif')]

    if not tif_files:
        print("No .tif files found in the specified directory.")
        return

    # Create the VRT using gdalbuildvrt
    command = ['gdalbuildvrt', output_vrt] + tif_files
    subprocess.run(command)

    print(f"Created VRT file: {output_vrt}")
 # Define the directory containing the .tif files and the output VRT file
tif_directory = 'path/to/your/directory'  # Change this to your .tif files directory
output_vrt = 'path/to/save/your/output/output.vrt'  # Change this to your desired VRT file path

# Create the VRT from the .tif files
create_vrt_from_tifs(tif_directory, output_vrt)

Next we will download and prep the geospatial database of carbon offset projects

Download the geospatial database from this link.

The database contains six geopackages, one for each continent. Since this analysis is global in scope, we will use all six geopackages.

We need to combine these geopackages into a single dataset, setting the project accounting area as the geometry. For projects without an accounting area, we will use the project area as the geometry.

#Importing packages
from glob import glob
import pandas as pd
import geopandas as gpd
from shapely.wkt import loads
def set_geometry(row):
    '''
    Process the geometry for a given feature. 
    
    If the Project Accounting Area column is defined, use that geometry. 
    Else, use the Project Area is the same as the Project Accounting Area. 
    '''
    if pd.notna(row['Project Accounting Area']):
        return loads(row['Project Accounting Area'])
    else:
        return loads(row['Project Area'])
# Define a path to the project database's geopackage folder
db_geopackage_folder = "path/to/your/geopackages"
    
# Define the output path
output_path = "path/to/save/your/geopackages/project_accounting_areas.gpkg"
    
# Load in the GeoJSON files
all_gdfs = []
for geopackage_path in glob(f"{db_geopackage_folder}/*.gpkg"):
    
    # Load in the geopackage dataframe
    gdf = gpd.read_file(geopackage_path)
    all_gdfs.append(gdf)
        
# Stack the continental geodataframes
all_gdfs = pd.concat(all_gdfs, axis=0)
    
# Set the accounting area geometry
all_gdfs['geometry'] = all_gdfs.apply(set_geometry, axis=1)
all_gdfs.set_geometry('geometry', inplace=True)
    
# Save the project accounting areas
all_gdfs.to_file(output_path, driver='GPKG')

Similar to the GFW tiles, you can choose to work with just the geopackage that corresponds to your area of interest rather than combining all of them.

To load a specific geopackage, use the following code. For example, to load projects from North America, use:

north_america = gpd.read_file('path/to/your/file/north_america.gpkg')

You can further filter the data by country. Use the following code to identify the countries included in the geopackage:

north_america['Country'].unique()

Then, filter the data for a specific country, such as the United States of America, with:

usa = north_america[north_america['Country'] == 'United States of America']

Both the databases are now ready to be used. In the next tutorial, we will learn how to extract data from the VRT for each project and combine it with our geospatial database