
Data

The SPAI Library has several functions defined in the data module to explore and download geospatial data.

This is the structure of the data module:

/data
|- /satellite
    |- download_stac.py

Before starting, we should know what collections and what data are available. To see detailed information about each collection, go to Data collections.

from spai.data.satellite import AVAILABLE_COLLECTIONS

AVAILABLE_COLLECTIONS

# Output
['sentinel-2-l2a', 'sentinel-1-grd', 'cop-dem-glo-30', 'cop-dem-glo-90']

explore_satellite_imagery

This function is designed to explore and collect satellite imagery within a specified area of interest (AOI) for a given time interval and sensor type.

def explore_satellite_imagery(
    aoi: Any,
    date: Optional[List[Union[str, datetime]]] = None,
    collection: str = "sentinel-2-l2a",
    crs: Optional[str] = "epsg:4326",
    **kwargs,
) -> List[Dict[str, Any]]:

Parameters:

  • aoi : Any. Area of interest. It can be a GeoDataFrame, a list of coordinates, a bounding box, etc.
  • date : Optional[List[Union[str, datetime]]], optional. Date of the image, by default None. If None, the available images of the last month will be loaded.
  • collection : str, optional. Satellite collection to download, by default “sentinel-2-l2a”
  • crs : Optional[str], optional. Coordinate Reference System, by default “epsg:4326”
  • kwargs : dict. Extra parameters to pass to the downloader, such as bands, cloud_cover, vegetation_percentage and so on.

Returns: A list of dictionaries, one per satellite image that matches the provided criteria. If no image is found, it returns None. Any extra parameter given is added to the output.
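The date parameter described above takes a start and an end value. As a minimal sketch, an explicit one-month interval could be built with the standard library as shown here. The helper name last_days_interval is hypothetical and is not part of SPAI; the sketch only prepares the strings and does not call the library.

```python
from datetime import datetime, timedelta


def last_days_interval(end: datetime, days: int = 30) -> tuple:
    """Return (start, end) as ISO date strings covering the `days` before `end`.

    Hypothetical helper for illustration; it is not part of SPAI.
    """
    start = end - timedelta(days=days)
    return (start.strftime("%Y-%m-%d"), end.strftime("%Y-%m-%d"))


# A fixed reference date keeps the result reproducible.
interval = last_days_interval(datetime(2024, 2, 1), days=31)
print(interval)
```

The resulting tuple has the same shape as the date intervals passed in the examples on this page.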

Example

Let’s see an example that explores the available images given:

  • The area of interest as a location name, Barcelona
  • A time interval of 1 month (from January 2024 to February 2024)
  • A cloud cover of 0.2%

from spai.data.satellite import explore_satellite_imagery

images = explore_satellite_imagery("Barcelona", ("2024-01-01", "2024-02-01"), cloud_cover=0.2)

Now let’s see another example that explores the available images given:

  • The area of interest as the coordinates of a bounding box, which could come from drawing a polygon on a map or from reading a GeoJSON file, for example
  • A cloud cover of 10%

bbox = (2.02846789, 41.27036258,  2.27905924, 41.45938995)
images = explore_satellite_imagery(bbox, ("2022-10-24", "2023-10-24"), cloud_cover=10)
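As a rough illustration of where such a bounding box might come from, the sketch below derives one from a GeoJSON polygon using only the standard library. The polygon is a made-up rectangle built from the coordinates above, and polygon_bbox is a hypothetical helper, not a SPAI function. It assumes the (min_lon, min_lat, max_lon, max_lat) ordering that the example bbox appears to follow.

```python
import json

# Made-up GeoJSON feature for illustration: a rectangle around Barcelona.
geojson_text = """
{
  "type": "Feature",
  "properties": {},
  "geometry": {
    "type": "Polygon",
    "coordinates": [[
      [2.02846789, 41.27036258],
      [2.27905924, 41.27036258],
      [2.27905924, 41.45938995],
      [2.02846789, 41.45938995],
      [2.02846789, 41.27036258]
    ]]
  }
}
"""


def polygon_bbox(feature: dict) -> tuple:
    """Return (min_lon, min_lat, max_lon, max_lat) of a Polygon's exterior ring.

    Hypothetical helper for illustration; it is not part of SPAI.
    """
    ring = feature["geometry"]["coordinates"][0]
    lons = [point[0] for point in ring]
    lats = [point[1] for point in ring]
    return (min(lons), min(lats), max(lons), max(lats))


bbox = polygon_bbox(json.loads(geojson_text))
print(bbox)
```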

Now, let’s list the available images with their date and cloud cover:

for image in images:
    print(image['datetime'].split('T')[0], image['cloud_cover'])

From this list, we could select which image we want to download.
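One way to make that selection is to pick the entry with the lowest cloud cover. The sketch below uses mock results with the same two keys as the loop above; the values are invented for illustration and are not real search output.

```python
# Mock search results with the keys used in the loop above.
# The values are made up; they are not real search output.
images = [
    {"datetime": "2023-01-05T10:46:21Z", "cloud_cover": 8.4},
    {"datetime": "2023-03-11T10:47:09Z", "cloud_cover": 0.7},
    {"datetime": "2023-06-19T10:46:29Z", "cloud_cover": 3.2},
]

# Pick the entry with the lowest cloud cover.
best = min(images, key=lambda image: image["cloud_cover"])

# Keep only the date part of the timestamp.
best_date = best["datetime"].split("T")[0]
print(best_date, best["cloud_cover"])
```

The resulting date string has the same form as the dates passed to the download examples.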

download_satellite_imagery

This function downloads a satellite image for a specified area of interest (AoI).

def download_satellite_imagery(
    storage: Storage,
    aoi: Any,
    date: Optional[List[Union[str, datetime]]] = None,
    collection: str = "sentinel-2-l2a",
    name: Optional[str] = None,
    clip: Optional[bool] = False,
    crs: Optional[str] = "epsg:4326",
    **kwargs,
) -> List[str]:

Parameters:

  • storage : Storage. Storage object to save the data.
  • aoi : Any. Area of interest. It can be a GeoDataFrame, a list of coordinates, a bounding box, etc.
  • date : Optional[List[Union[str, datetime]]], optional. Date of the image, by default None. If None, the last available image will be loaded.
  • collection : str, optional. Satellite collection to download, by default “sentinel-2-l2a”
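  • name : Optional[str], optional. Name of the downloaded image, by default None. If None, the collection name and date are used.
  • clip : Optional[bool], optional. Clip the data to the area of interest, by default False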
  • crs : Optional[str], optional. Coordinate Reference System, by default “epsg:4326”
  • kwargs : dict. Extra parameters to pass to the downloader, such as bands, cloud_cover, vegetation_percentage and so on.

Returns: It returns the storage path where the image has been downloaded to.

Example

Let’s see an example that downloads a sentinel-2-l2a image locally for the date 2023-07-16. To do so, we need to declare an instance of the Storage object, which specifies the path where the images are downloaded.

from spai.storage import Storage
from spai.data.satellite import download_satellite_imagery

storage = Storage()['data']

path = download_satellite_imagery(storage, 'Barcelona', '2023-07-16')

By default, images will be downloaded with the collection name and date (for example, sentinel-2-l2a_2023-07-16) but this can be changed by adding the name parameter.

path = download_satellite_imagery(storage, 'Barcelona', '2023-07-16', name='barcelona_image')

Furthermore, if you want only the data that falls within the geometries of your AoI, you can use the clip parameter.

path = download_satellite_imagery(storage, 'Barcelona', '2023-07-16', name='barcelona_image', clip=True)

The image can be easily displayed with the thumbnail function, available in the image module.

from spai.image import thumbnail

thumbnail(path)

load_satellite_imagery

This function loads satellite imagery into memory for a given area of interest (aoi) and date, and returns it as an xarray Dataset. It is especially useful for performing operations and applying mathematics on the data without having to download it first.

This function is for the bravest 😜

def load_satellite_imagery(
    aoi: Any,
    date: Optional[List[Union[str, datetime]]] = None,
    collection: str = "sentinel-2-l2a",
    clip: Optional[bool] = False,
    crs: Optional[str] = "epsg:4326",
    **kwargs,
) -> xr.Dataset:

Parameters:

  • aoi : Any. Area of interest. It can be a GeoDataFrame, a list of coordinates, a bounding box, etc.
  • date : Optional[List[Union[str, datetime]]], optional. Date of the image, by default None. If None, the last available image will be loaded.
  • collection : str, optional. Satellite collection to download, by default “sentinel-2-l2a”
  • clip : Optional[bool], optional. Clip the data to the area of interest, by default False
  • crs : Optional[str], optional. Coordinate Reference System, by default “epsg:4326”
  • kwargs : dict. Extra parameters to pass to the downloader, such as bands, cloud_cover, vegetation_percentage and so on.

Returns: It returns an xr.Dataset with satellite imagery data in memory.

Example

As seen, data can be loaded directly into memory. This is very powerful, because it allows us to execute operations and download only the data we need.

from spai.data.satellite import load_satellite_imagery

bbox = (2.02846789, 41.27036258,  2.27905924, 41.45938995)
images = load_satellite_imagery(bbox, ("2022-10-24", "2023-10-24"))

Now, for example, we can calculate the median.

median = images.median('time')
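To make the reduction concrete, here is a library-free sketch of what a median over the time dimension computes for each pixel. It uses neither xarray nor SPAI; the tiny 2x2 "images" are made-up values, and the nested-list layout is only an assumption chosen to keep the example self-contained.

```python
from statistics import median

# Three tiny 2x2 single-band "images" of the same area on different dates.
# The values are made up; the third one has two bright outliers,
# as a cloud might produce.
stack = [
    [[0.10, 0.20], [0.30, 0.40]],
    [[0.12, 0.22], [0.28, 0.42]],
    [[0.90, 0.21], [0.31, 0.95]],
]

rows, cols = len(stack[0]), len(stack[0][0])

# For every pixel, take the median of its values across the time axis.
composite = [
    [median(image[r][c] for image in stack) for c in range(cols)]
    for r in range(rows)
]
print(composite)
```

Because the median ignores extreme values, the two bright outliers in the third image do not appear in the composite, which is one reason temporal medians are a common way to build cleaner mosaics.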

If we now want to save it, we should simply use our Storage.

from spai.storage import Storage

storage = Storage()['data']

storage.create(median, name='image.tif')

Extra filter parameters

Of course, you can apply a multitude of filters by passing extra parameters to narrow down the data you need. These filters can be applied to any of the functions shown above.

For example, we can filter the data by bands. The bands available depend on each collection. By default, the standard bands of each collection are returned (for example, vv and vh for sentinel-1-grd, and the 12 bands for sentinel-2-l2a).

bands_s1 = ['vv']
bands_s2_rgb = ['red', 'green', 'blue']
bbox = (2.02846789, 41.27036258,  2.27905924, 41.45938995)

s1 = load_satellite_imagery(bbox, bands=bands_s1, collection='sentinel-1-grd')
s2 = load_satellite_imagery(bbox, bands=bands_s2_rgb, collection='sentinel-2-l2a')

💫 Pro tip! You can download the SCL layer by filtering by band!

bands_s2 = ['scl']
s2 = load_satellite_imagery(bbox, bands=bands_s2, collection='sentinel-2-l2a')

Another example is to filter by cloud_cover. This parameter is a float that represents the percentage of cloud cover that the image can have. Of course, it only applies to optical images such as sentinel-2-l2a.

bbox = (2.02846789, 41.27036258,  2.27905924, 41.45938995)

images = explore_satellite_imagery(bbox, cloud_cover=10)

Of course, the filters will also change depending on each collection. For example, sentinel-2-l2a has a filter of cloud_cover or vegetation_percentage, and sentinel-1-grd has a filter of sar:orbit_state.

bbox = (2.02846789, 41.27036258,  2.27905924, 41.45938995)

images = explore_satellite_imagery(bbox, cloud_cover=10, vegetation_percentage=80)

On the other hand, you can change both the projection and the resolution of the data.

Caution! The resolution must be in the units of the projection!

bbox = (2.02846789, 41.27036258,  2.27905924, 41.45938995)
crs = 'epsg:3857'
resolution = 10

images = explore_satellite_imagery(bbox, crs=crs, resolution=resolution)
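The example above uses epsg:3857, whose units are metres, so a resolution of 10 means 10 metres. With epsg:4326 the units are degrees, so the same ground distance has to be expressed as a much smaller number. The sketch below shows an approximate conversion; meters_to_degrees is a hypothetical helper, not a SPAI function, and it assumes the common approximation of about 111,320 metres per degree of latitude.

```python
import math

# Approximate length of one degree of latitude, in metres.
METERS_PER_DEGREE = 111_320.0


def meters_to_degrees(meters: float, latitude: float) -> tuple:
    """Approximate a ground distance in metres as (lon_degrees, lat_degrees).

    Hypothetical helper for illustration; it is not part of SPAI.
    """
    lat_degrees = meters / METERS_PER_DEGREE
    # Meridians converge towards the poles, so a degree of longitude is shorter.
    lon_degrees = meters / (METERS_PER_DEGREE * math.cos(math.radians(latitude)))
    return (lon_degrees, lat_degrees)


# 10 m at roughly the latitude of the Barcelona bounding box used above.
lon_res, lat_res = meters_to_degrees(10, latitude=41.36)
print(lon_res, lat_res)
```

Both values are on the order of 0.0001 degrees, which shows why passing resolution=10 together with a degree-based projection would request far coarser data than intended.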

Troubleshooting

If you encounter any issues during the installation of SPAI, please get in touch with us through our Discord server.
