Python FME API: working with rasters and GDAL

FME Workbench is a fantastic ETL but sometimes it is necessary to use coding in order to compensate for missing transformers. For instance, there is no easy way to generate proximity rasters in FME, while it is piece of cake with command line. Given an input raster with non-zero target pixels (roads in the example below), generates a raster whose pixels represent minimal distances to these targets.

Proximity raster

The question is: how to obtain the same result with FME Workbench? Using a PythonCaller transformer and importing GDAL is an option. I provide a FME template here.

First things first, TransportationRoads reader opens up a geopackage file containing the road layer. FeatureColorSetter changes the color of the roads in a color different from black. This is an important detail because otherwise the pixels of the rasterized roads would be 0, and ComputeProximity function in GDAL Python API requires non-zero target pixels (I hope this tutorial can be a help to use any of the functions of GDAL and not only ComputeProximity). Then, ImageRasterizer rasterizes the road layer and sets the background color to black.

Now comes the fun with the PythonCaller.

import fmeobjects
import numpy as np
from osgeo import gdal

FME documentation explains how to install Python packages, but osgeo and numpy should already be in the scope of your FME Workbench installation.

class MyBandTilePopulator(fmeobjects.FMEBandTilePopulator):
    This is a subclass of the FMEBandTilePopulator superclass.
    It will be used when data is requested to create a new tile and
    and populate it to a new FMEBand.
    def __init__(self, rasterData):
        self._rasterData = rasterData

    # required method
    def clone(self):
        This method is used to create a copy of the data
        multiple times while creating a new band
        return MyBandTilePopulator(self._rasterData)

    # required method
    def getTile(self, startRow, startCol, tile):
       Creates a new tile that's sized based on the input tile.
       Populates that tile using this populator's raster data beginning
       at the startRow and startCol.
       numRows, numCols = tile.getNumRows(), tile.getNumCols()
       newTile = fmeobjects.FMEUInt16Tile(numRows, numCols)
       data = newTile.getData()
       for row in range(startRow, startRow+numRows):
           for col in range(startCol, startCol+numCols):
               if row < len(self._rasterData) and col < len(self._rasterData[0]):
                   data[row - startRow][col - startCol] = self._rasterData[row][col]
       return newTile

This class code is mainly a copy/paste of the Python FME API documentation. I just changed the tile type to FMEUInt16Tile so as to be compatible with the raster interpretation in ImageRasterizer (Gray16). In all, 24 different interpretations are possible. Regarding the class itself, the documentation is a bit laconic. It is used to store raster band data as well as to characterize how these data can be reached with getTile method.

class FeatureProcessor(object):
    def __init__(self):
        self.rasterData = []

    def input(self, feature):
        self.sysRef = feature.getCoordSys()
        raster = feature.getGeometry()
        rp = raster.getProperties()
        band = raster.getBand(0)
        bp = band.getProperties()
        print("=== FME Input Raster ===")
        print("Coordinate System = " + self.sysRef)
        self.pixelWidth = rp.getSpacingX()
        print("pixelWidth = " + str(self.pixelWidth))
        self.pixelHeight = rp.getSpacingY()
        print("pixelHeight = " + str(self.pixelHeight))
        self.originX = rp.getOriginX()
        print("originX = " + str(self.originX))
        self.originY = rp.getOriginY()
        print("originY = " + str(self.originY))
        numRows = rp.getNumRows()
        print("numRows = " + str(numRows))
        numCols = rp.getNumCols()
        print("numCols = " + str(numCols))
        tile = fmeobjects.FMEGray16Tile(numRows, numCols)
        bandData = band.getTile(0, 0, tile).getData()
        array = np.array(bandData)
        driver = gdal.GetDriverByName('GTiff')
        num_of_bands = 1
        print("Creating file raster_transportation_roads.tiff")
        srcRaster = driver.Create('raster_transportation_roads.tiff', numCols, numRows, num_of_bands, gdal.GDT_UInt16)
        srcRaster.SetGeoTransform((self.originX, self.pixelWidth, 0, self.originY, 0, self.pixelHeight))
        srcBand = srcRaster.GetRasterBand(1)
        srcBand = None
        srcRaster = None
        src_ds = gdal.Open('raster_transportation_roads.tiff')
        geotransform = src_ds.GetGeoTransform()
        srcBand = src_ds.GetRasterBand(1)
        print("Origin = ({}, {})".format(geotransform[0], geotransform[3]))
        print("Pixel Size = ({}, {})".format(geotransform[1], geotransform[5]))
        print("Creating file proximity_transportation_roads.tiff with GDAL")
        drv = gdal.GetDriverByName('GTiff')
        dst_ds = drv.Create(dst_filename, numCols, numRows, num_of_bands, gdal.GetDataTypeByName('UInt16'))
        dstBand = dst_ds.GetRasterBand(1)
        gdal.ComputeProximity(srcBand, dstBand, ["DISTUNITS=PIXEL"])
        self.rasterData = dstBand.ReadAsArray().tolist()

        dstBand = None
        srcBand = None

The input of PythonCaller is the one-band raster encapsulated in feature. This band has many properties such as the pixel size (SpacingX and SpacingY) in meters in the chosen projected coordinate system, the origin coordinates of the band (originX and originY), and its size (numRows and numCols).

Line 29 - 30 are very important and, to my knowledge, not documented in FME API. They allow to get the band data by setting a tile having the same size. The rest is in accordance with GDAL documentation. The band data is put into a Numpy array and two files, raster_transportation_roads.tiff and proximity_transportation_roads.tiff are generated in the FME template folder. This is no mandatory but handy to verify data transformation. To compute the distance raster, I use ComputeProximity function, and put the corresponding data to rasterData Python list (ReadAsArray returns a Numpy array).

def close(self):

        # creating the raster data and specifying the formatting of the new raster
        rasterData = self.rasterData

        # specifying all of the properties for the new FMERaster
        numRows, numCols = len(rasterData), len(rasterData[0])
        xCellOrigin, yCellOrigin = 0.5, 0.5
        xSpacing, ySpacing = self.pixelWidth, self.pixelHeight
        xOrigin, yOrigin = self.originX, self.originY
        xRotation, yRotation = 0.0, 0.0

        # creating the new FMERaster
        rasterProperties = fmeobjects.FMERasterProperties(numRows, numCols,
                                                          xSpacing, ySpacing,
                                                          xCellOrigin, yCellOrigin,
                                                          xOrigin, yOrigin,
                                                          xRotation, yRotation)
        raster = fmeobjects.FMERaster(rasterProperties)

        # Populating the contents of the band and appending it to the raster
        bandTilePopulator = MyBandTilePopulator(rasterData)
        bandName = ''
        bandProperties = fmeobjects.FMEBandProperties(bandName,
                                                      numRows, numCols)
        band = fmeobjects.FMEBand(bandTilePopulator, rasterProperties,

        # creating a new feature with the FMERaster geometry to be output
        feature = fmeobjects.FMEFeature()

    def process_group(self):
        """When 'Group By' attribute(s) are specified, this method is called 
        once all the FME Features in a current group have been sent to input().

        FME Features sent to input() should generally be cached for group-by 
        processing in this method when knowledge of all Features is required. 
        The resulting Feature(s) from the group-by processing should be emitted 
        through self.pyoutput().

        FME will continue calling input() a number of times followed
        by process_group() for each 'Group By' attribute, so this 
        implementation should reset any class members for the next group.

In this last part, I create new raster with the same properties as the initial raster. I populate a new band with the proximity data and attach it to the raster. The output of PythonCaller is then generated by pyoutput().

Related Posts

Corine Land Cover: climate change in French Alps

These days, climate change warnings are everywhere. But I have to admit, the first time I had the opportunity to verify it for myself in data, it…

This Post Has 2 Comments

  1. Nice one! I guess I would have used FeatureWriter with a temp raster followed by a SystemCaller to GDAL but you offer a really good example of the API. Thanks for the article!

Leave a Reply

Your email address will not be published.