Data
The SPAI Library has several functions defined in the data module to explore and download geospatial data.
This is the structure of the data module:
/data
|- /satellite
|- download_stac.py
Before starting, we should know what collections and what data are available. To see detailed information about each collection, go to Data collections.
from spai.data.satellite import AVAILABLE_COLLECTIONS
AVAILABLE_COLLECTIONS
# Output
['sentinel-2-l2a', 'sentinel-1-grd', 'cop-dem-glo-30', 'cop-dem-glo-90']
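If the collection name comes from user input or a config file, it can be handy to check it before launching a search. The snippet below is only a sketch: `check_collection` is a hypothetical helper, not part of SPAI, and the list is a local copy of the output shown above so the example runs on its own.

```python
# Local copy of the list printed above. In real code you would import it instead:
# from spai.data.satellite import AVAILABLE_COLLECTIONS
AVAILABLE_COLLECTIONS = ['sentinel-2-l2a', 'sentinel-1-grd', 'cop-dem-glo-30', 'cop-dem-glo-90']


def check_collection(name: str) -> str:
    """Hypothetical helper: return `name` if it is a known collection, else raise."""
    if name not in AVAILABLE_COLLECTIONS:
        options = ", ".join(AVAILABLE_COLLECTIONS)
        raise ValueError(f"Unknown collection {name!r}. Choose one of: {options}")
    return name


collection = check_collection("sentinel-1-grd")
```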
explore_satellite_imagery
This function is designed to explore and collect satellite imagery within a specified area of interest (AOI) for a given time interval and sensor type.
def explore_satellite_imagery(
aoi: Any,
date: Optional[List[Union[str, datetime]]] = None,
collection: str = "sentinel-2-l2a",
crs: Optional[str] = "epsg:4326",
**kwargs,
) -> List[Dict[str, Any]]:
Parameters:
- aoi : Any. Area of interest. It can be a GeoDataFrame, a list of coordinates, a bounding box, etc.
- date : Optional[List[Union[str, datetime]]], optional. Date of the image, by default None. If None, the available images of the last month will be loaded.
- collection : str, optional. Satellite collection to download, by default “sentinel-2-l2a”
- crs : Optional[str], optional. Coordinate Reference System, by default “epsg:4326”
- kwargs : dict. Extra parameters to pass to the downloader, such as bands, cloud_cover, vegetation_percentage and so on.
Returns: The search results, as a list of dictionaries with one entry per satellite image that matches the provided criteria. If no image is found, it returns None. Any extra parameter given is added to the output.
Example
Let’s see an example of exploring the available images, given:
- The area of interest as a location name, Barcelona
- A time interval of 1 month (from January 2024 to February 2024)
- A cloud cover of 0.2%.
from spai.data.satellite import explore_satellite_imagery
images = explore_satellite_imagery("Barcelona", ("2024-01-01", "2024-02-01"), cloud_cover=0.2)
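The interval does not have to be typed by hand. As a small sketch using only the standard library, the hypothetical helper `last_days_interval` (not part of SPAI) builds a pair of ISO date strings like the one passed above:

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def last_days_interval(days: int = 30, end: Optional[date] = None) -> Tuple[str, str]:
    """Hypothetical helper: (start, end) ISO date strings covering the last `days` days."""
    end = end or date.today()
    start = end - timedelta(days=days)
    return start.isoformat(), end.isoformat()


# 31 days back from 1 February 2024 lands on 1 January 2024
interval = last_days_interval(31, end=date(2024, 2, 1))
print(interval)  # ('2024-01-01', '2024-02-01')
```

The resulting tuple could then be passed as the date argument in place of the literal strings.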
Now let’s see another example of exploring the available images, given:
- The area of interest as the coordinates of a bounding box, which could come from drawing a polygon on a map or from a GeoJSON file reader, for example.
- A cloud cover of 10%
bbox = (2.02846789, 41.27036258, 2.27905924, 41.45938995)
images = explore_satellite_imagery(bbox, ("2022-10-24", "2023-10-24"), cloud_cover=10)
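If the area comes from a polygon drawn on a map, the bounding box can be derived from its coordinates. This is a minimal sketch, assuming a GeoJSON Polygon in longitude/latitude order and the (min_lon, min_lat, max_lon, max_lat) ordering that the tuple above appears to follow. `bbox_from_polygon` is a hypothetical helper, not part of SPAI.

```python
from typing import Any, Dict, Tuple


def bbox_from_polygon(geometry: Dict[str, Any]) -> Tuple[float, float, float, float]:
    """Hypothetical helper: (min_lon, min_lat, max_lon, max_lat) of a GeoJSON Polygon."""
    if geometry.get("type") != "Polygon":
        raise ValueError("only GeoJSON Polygon geometries are handled in this sketch")
    exterior = geometry["coordinates"][0]  # the first ring is the exterior ring
    lons = [point[0] for point in exterior]
    lats = [point[1] for point in exterior]
    return min(lons), min(lats), max(lons), max(lats)


# A rectangle around Barcelona, as it might come out of a map drawing tool
polygon = {
    "type": "Polygon",
    "coordinates": [[
        [2.02846789, 41.27036258],
        [2.27905924, 41.27036258],
        [2.27905924, 41.45938995],
        [2.02846789, 41.45938995],
        [2.02846789, 41.27036258],
    ]],
}
polygon_bbox = bbox_from_polygon(polygon)
```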
Now, let’s list the images that are available, with their date and cloud cover:
for image in images:
    print(image['datetime'].split('T')[0], image['cloud_cover'])
From this list, we could select which image we want to download.
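One way to make that selection is to sort the results by cloud cover and keep the clearest one. The records below are made-up sample data that only assume the two keys used in the loop above (`datetime` and `cloud_cover`); real results may carry more fields.

```python
# Made-up sample records with the same two keys used in the loop above
images = [
    {"datetime": "2023-01-12T10:46:19Z", "cloud_cover": 7.4},
    {"datetime": "2023-03-03T10:48:31Z", "cloud_cover": 0.3},
    {"datetime": "2023-06-21T10:46:29Z", "cloud_cover": 2.1},
]

# Sort from clearest to cloudiest and keep the date of the best candidate
by_cloud_cover = sorted(images, key=lambda image: image["cloud_cover"])
best_date = by_cloud_cover[0]["datetime"].split("T")[0]
print(best_date)  # 2023-03-03
```

The selected date could then be used as the date argument of a download call.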
download_satellite_imagery
The download_satellite_imagery function is designed to download a satellite image for a specified area of interest (AoI).
def download_satellite_imagery(
storage: Storage,
aoi: Any,
date: Optional[List[Union[str, datetime]]] = None,
collection: str = "sentinel-2-l2a",
name: Optional[str] = None,
clip: Optional[bool] = False,
crs: Optional[str] = "epsg:4326",
**kwargs,
) -> List[str]:
Parameters:
- storage : Storage. Storage object to save the data.
- aoi : Any. Area of interest. It can be a GeoDataFrame, a list of coordinates, a bounding box, etc.
- date : Optional[List[Union[str, datetime]]], optional. Date of the image, by default None. If None, the last available image will be loaded.
- collection : str, optional. Satellite collection to download, by default “sentinel-2-l2a”
- name : Optional[str], optional. Name of the downloaded image, by default None. If None, the collection name and date are used.
- clip : Optional[bool], optional. Clip the data to the area of interest, by default False
- crs : Optional[str], optional. Coordinate Reference System, by default “epsg:4326”
- kwargs : dict. Extra parameters to pass to the downloader, such as bands, cloud_cover, vegetation_percentage and so on.
Returns: It returns the storage path where the image has been downloaded to.
Example
Let’s see an example of downloading a sentinel-2-l2a image locally for date = 2023-07-16. For that, we will need to declare an instance of the Storage object to specify the path to download the images to.
from spai.storage import Storage
from spai.data.satellite import download_satellite_imagery
storage = Storage()['data']
path = download_satellite_imagery(storage, 'Barcelona', '2023-07-16')
By default, images will be downloaded with the collection name and date (for example, sentinel-2-l2a_2023-07-16) but this can be changed by adding the name parameter.
path = download_satellite_imagery(storage, 'Barcelona', '2023-07-16', name='barcelona_image')
Furthermore, if you only want the data that falls within the geometries of your AoI, you can use the clip parameter to do so.
path = download_satellite_imagery(storage, 'Barcelona', '2023-07-16', name='barcelona_image', clip=True)
The image can be easily shown by using the thumbnail function, available in the image module.
from spai.image import thumbnail
thumbnail(path)
load_satellite_imagery
This function loads satellite imagery into memory for a given area of interest (aoi) and date, and returns it as an xarray Dataset. It is very useful for performing operations and applying mathematics on the data without having to download it.
This function is for the bravest 😜
def load_satellite_imagery(
aoi: Any,
date: Optional[List[Union[str, datetime]]] = None,
collection: str = "sentinel-2-l2a",
clip: Optional[bool] = False,
crs: Optional[str] = "epsg:4326",
**kwargs,
) -> xr.Dataset:
Parameters:
- aoi : Any. Area of interest. It can be a GeoDataFrame, a list of coordinates, a bounding box, etc.
- date : Optional[List[Union[str, datetime]]], optional. Date of the image, by default None. If None, the last available image will be loaded.
- collection : str, optional. Satellite collection to download, by default “sentinel-2-l2a”
- clip : Optional[bool], optional. Clip the data to the area of interest, by default False
- crs : Optional[str], optional. Coordinate Reference System, by default “epsg:4326”
- kwargs : dict. Extra parameters to pass to the downloader, such as bands, cloud_cover, vegetation_percentage and so on.
Returns: It returns an xr.Dataset with satellite imagery data in memory.
Example
As seen, data can be loaded directly into memory. This is very powerful, because it allows us to execute operations and download only the data we need.
from spai.data.satellite import load_satellite_imagery
bbox = (2.02846789, 41.27036258, 2.27905924, 41.45938995)
images = load_satellite_imagery(bbox, ("2022-10-24", "2023-10-24"))
Now, for example, we can calculate the median.
median = images.median('time')
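To see what a median along the time dimension does, here is a tiny pure-Python illustration with made-up pixel values. It is not how xarray computes the result internally; it only shows the per-pixel idea, and why a temporal median tends to discard occasional outliers such as clouds.

```python
from statistics import median

# Three made-up 2x2 single-band "images" of the same area on different dates.
# The value 9.0 stands in for a bright outlier such as a cloud.
time_series = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.1, 9.0], [0.3, 0.5]],
    [[0.2, 0.2], [9.0, 0.4]],
]

rows, cols = len(time_series[0]), len(time_series[0][0])

# For each pixel position, take the median of its values across all dates
composite = [
    [median(image[r][c] for image in time_series) for c in range(cols)]
    for r in range(rows)
]
print(composite)  # [[0.1, 0.2], [0.3, 0.4]]
```

Each outlier appears on only one of the three dates, so it never ends up as the middle value.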
If we now want to save it, we should simply use our Storage.
from spai.storage import Storage
storage = Storage()['data']
storage.create(median, name='image.tif')
Extra filter parameters
Of course, you can apply a multitude of filters and extra parameters to narrow the results down to the data you need. These filters can be applied to any of the functions shown above.
For example, we can filter the data by bands. This will depend on the bands of each collection. By default, the bands returned are the standard bands of each collection (for example, for sentinel-1-grd, vv and vh are returned, and for sentinel-2-l2a the 12 bands are returned).
bands_s1 = ['vv']
bands_s2_rgb = ['red', 'green', 'blue']
bbox = (2.02846789, 41.27036258, 2.27905924, 41.45938995)
s1 = load_satellite_imagery(bbox, bands=bands_s1, collection='sentinel-1-grd')
s2 = load_satellite_imagery(bbox, bands=bands_s2_rgb, collection='sentinel-2-l2a')
💫 Pro tip! You can download the SCL band by filtering by band!
bands_s2 = ['scl']
s2 = load_satellite_imagery(bbox, bands=bands_s2, collection='sentinel-2-l2a')
Another example is to filter by cloud_cover. This parameter is a float that represents the percentage of cloud cover that the image can have. Of course, it only applies to optical images such as sentinel-2-l2a.
bbox = (2.02846789, 41.27036258, 2.27905924, 41.45938995)
images = explore_satellite_imagery(bbox, cloud_cover=10)
Of course, the filters will also change depending on each collection. For example, sentinel-2-l2a has a filter of cloud_cover or vegetation_percentage, and sentinel-1-grd has a filter of sar:orbit_state.
bbox = (2.02846789, 41.27036258, 2.27905924, 41.45938995)
images = explore_satellite_imagery(bbox, cloud_cover=10, vegetation_percentage=80)
On the other hand, you can change both the projection and the resolution of the data.
Caution! The resolution must be in the units of the projection!
bbox = (2.02846789, 41.27036258, 2.27905924, 41.45938995)
crs = 'epsg:3857'
resolution = 10
images = explore_satellite_imagery(bbox, crs=crs, resolution=resolution)
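To make the caution concrete: epsg:3857 is expressed in metres, so a resolution of 10 means roughly 10 m, whereas epsg:4326 is expressed in degrees. The sketch below uses a common approximation (about 111,320 m per degree of latitude, scaled by the cosine of the latitude for longitude) to estimate what 10 m looks like in degrees. `meters_to_degrees` is a hypothetical helper, not part of SPAI.

```python
import math
from typing import Tuple

METERS_PER_DEGREE = 111_320.0  # approximate length of one degree of latitude


def meters_to_degrees(meters: float, latitude: float) -> Tuple[float, float]:
    """Hypothetical helper: approximate (lon_degrees, lat_degrees) spanned by `meters`."""
    lat_degrees = meters / METERS_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude
    lon_degrees = meters / (METERS_PER_DEGREE * math.cos(math.radians(latitude)))
    return lon_degrees, lat_degrees


# Around Barcelona (latitude ~41.36), 10 m is on the order of 0.0001 degrees
lon_res, lat_res = meters_to_degrees(10, latitude=41.36)
print(f"{lon_res:.6f} {lat_res:.6f}")
```

So with crs='epsg:4326', a resolution of 10 would mean 10 degrees per pixel, which is almost certainly not what you want.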
Troubleshooting
If you encounter any issues during the installation of SPAI, please get in touch with us through our Discord server.