{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# **Multiprocessing**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[Open in Colab](https://colab.research.google.com/github/kenoz/SITS_utils/blob/main/docs/source/tutorials/colab_sits_ex04.ipynb)" ] }, { "cell_type": "markdown", "metadata": { "id": "K2FIkbDYrq9l" }, "source": [ "---\n", "\n", "We aim to retrieve satellite time series for a set of points randomly located in Europe. Rather than processing the points sequentially, we use the `sits.Multiproc()` class to distribute the computations and thus reduce the processing time.\n", "\n", "

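As a rough illustration of the idea (not the `sits.Multiproc()` implementation), the sketch below uses Python's standard `concurrent.futures` module to run a placeholder task for several points concurrently. The function `fetch_time_series` and the coordinates are hypothetical stand-ins, and threads are used here only because the simulated task is I/O-bound.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample points (longitude, latitude); not taken from the dataset.
points = [(8.49, 49.85), (8.41, 53.15), (11.18, 50.01), (23.80, 40.07)]


def fetch_time_series(point):
    '''Placeholder for a slow, I/O-bound request (e.g. a remote catalog query).'''
    time.sleep(0.2)  # simulate network latency
    lon, lat = point
    return {'lon': lon, 'lat': lat}


# Sequential: each request waits for the previous one to finish.
start = time.perf_counter()
sequential_results = [fetch_time_series(p) for p in points]
t_seq = time.perf_counter() - start

# Concurrent: the requests overlap, so the total time approaches that of one request.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_results = list(pool.map(fetch_time_series, points))
t_par = time.perf_counter() - start

print(f'sequential: {t_seq:.2f} s, concurrent: {t_par:.2f} s')
```

Both runs return the same results in the same order; only the elapsed time differs.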
\n", "\n", "\n", "> _The `sits.Multiproc()` class needs a multi-core CPU to work efficiently._\n", "\n", "---\n", "\n", "## 1. Installation of the SITS package and its dependencies\n", "\n", "First, install the `sits` package with [pip](https://pypi.org/project/SITS/). We also need a few other packages to display the data." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 33951, "status": "ok", "timestamp": 1738164860034, "user": { "displayName": "kose tetistraining", "userId": "06823399031118728700" }, "user_tz": -60 }, "id": "xoL6NstiVcp9", "outputId": "07ba85a1-4718-4771-db40-2b371dd55b9b" }, "outputs": [], "source": [ "# SITS package\n", "!pip install -q --upgrade sits\n", "\n", "# other packages\n", "!pip install -q \"dask[dataframe]\"\n", "!pip install -q mapclassify\n", "#!pip install -q netCDF4\n", "#!pip install -q folium\n", "#!pip install -q matplotlib" ] }, { "cell_type": "markdown", "metadata": { "id": "OYAb0l91zqj4" }, "source": [ "Now we can import `sits` and some other libraries." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "executionInfo": { "elapsed": 11651, "status": "ok", "timestamp": 1738164875497, "user": { "displayName": "kose tetistraining", "userId": "06823399031118728700" }, "user_tz": -60 }, "id": "neqafgGHWIkj" }, "outputs": [], "source": [ "import os\n", "# sits lib\n", "from sits import sits\n", "# geospatial libs\n", "import geopandas as gpd\n", "import pandas as pd\n", "# date format\n", "from datetime import datetime\n", "# ignore warning messages\n", "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": { "id": "o4n__qxr0Kqi" }, "source": [ "## 2. Handling the input vector file\n", "\n", "### 2.1. Data loading\n", "\n", "The GeoJSON vector file, stored in the [GitHub repository](https://github.com/kenoz/SITS_utils), includes 24 points over Europe. 
We download it into our current workspace." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 732, "status": "ok", "timestamp": 1738164882882, "user": { "displayName": "kose tetistraining", "userId": "06823399031118728700" }, "user_tz": -60 }, "id": "soIORtc8bGDd", "outputId": "6cb289c1-0ecd-4958-ff6b-c461c70f67d2" }, "outputs": [], "source": [ "!mkdir -p test_data\n", "![ ! -f test_data/rand_pts.geojson ] && wget https://raw.githubusercontent.com/kenoz/SITS_utils/refs/heads/main/sits/data/rand_pts.geojson -P test_data" ] }, { "cell_type": "markdown", "metadata": { "id": "n0jbqcwQ1ppL" }, "source": [ "We load the vector file, named `rand_pts.geojson`, as a GeoDataFrame object with `sits.Vec2gdf()`." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 206 }, "executionInfo": { "elapsed": 715, "status": "ok", "timestamp": 1738164886246, "user": { "displayName": "kose tetistraining", "userId": "06823399031118728700" }, "user_tz": -60 }, "id": "7MWDWpr7nxB3", "outputId": "ccf75a57-6edf-4bd1-ba96-5499a2a0f34c" }, "outputs": [ { "data": { "text/html": [ "
| | id | pt_id | geometry |
|---|---|---|---|
| 0 | 1 | 1 | POINT (8.49138 49.85437) |
| 1 | 3 | 2 | POINT (8.41277 53.14555) |
| 2 | 7 | 3 | POINT (11.17678 50.01380) |
| 3 | 9 | 4 | POINT (23.79724 40.06894) |
| 4 | 10 | 5 | POINT (16.80020 48.98809) |
| | bbox_4326 | bbox_3035 | bbox_tuple |
|---|---|---|---|
| 0 | [8.481383388733258, 49.84437017219194, 8.50138... | [4211764.670013768, 2971302.3517032275, 421324... | ([8.481383388733258, 49.84437017219194, 8.5013... |
| 1 | [8.402773153496353, 53.135550217099215, 8.4227... | [4214105.200943511, 3337504.924185555, 4215492... | ([8.402773153496353, 53.135550217099215, 8.422... |
| 2 | [11.166778399392497, 50.00380142653234, 11.186... | [4404616.995128865, 2988598.266294405, 4406085... | ([11.166778399392497, 50.00380142653234, 11.18... |
| 3 | [23.78724480625752, 40.05894472525313, 23.8072... | [5495790.682699846, 1992060.233281068, 5497840... | ([23.78724480625752, 40.05894472525313, 23.807... |
| 4 | [16.790197637816156, 48.978094737864616, 16.81... | [4817203.221624014, 2896784.6045746827, 481886... | ([16.790197637816156, 48.978094737864616, 16.8... |

" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "executionInfo": { "elapsed": 409682, "status": "ok", "timestamp": 1738168270835, "user": { "displayName": "kose tetistraining", "userId": "06823399031118728700" }, "user_tz": -60 }, "id": "2wXfY_z_jjyN", "outputId": "98fad65a-c984-4563-e267-5d75baa2dca3", "tags": [] }, "outputs": [], "source": [ "%%time\n", "\n", "multi = sits.Multiproc('image', 'nc', data_dir)\n", "\n", "multi.addParams_stacAttack(bands=['B03', 'B04', 'B08', 'SCL'])\n", "multi.addParams_searchItems(date_start=datetime(2024, 1, 1),\n", " date_end=datetime(2025, 1, 1),\n", " query={\"eo:cloud_cover\": {\"lt\": 10}})\n", "multi.addParams_loadCube(resolution=20)\n", "multi.addParams_mask(mask_values=[0, 1, 3, 8, 9, 10])\n", "\n", "for gid, i in enumerate(test_process['bbox_tuple'][:2]): # here we process only the two first images, remove or modify the slicing\n", " multi.fetch_func(i[0], i[1], gid, mask=True, gapfill=True)\n", "multi.dask_compute();" ] }, { "cell_type": "markdown", "metadata": { "id": "ww_Zjvx9391P" }, "source": [ "### 3.3. Producing patches from the vector layer\n", "\n", "It is also possible to specify the output image size. The option, called \"patch\", refers to a small, localized region or segment of an input image. These patches need to be of the same size in deep leaning models to ensure consistent processing, especially in architectures like convolutional neural networks (CNNs) or vision transformers (ViTs).\n", "\n", "

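To make the size constraint concrete, here is a minimal sketch of how a fixed-size patch footprint could be derived around a point. It assumes the patch is centred on the point and expressed in a metric CRS such as EPSG:3035. The helper `patch_bbox` and the sample coordinates are hypothetical and not part of `sits`, which may compute patch extents differently.

```python
def patch_bbox(x, y, dimx, dimy, resolution):
    '''Return (minx, miny, maxx, maxy) of a dimx-by-dimy pixel patch centred on (x, y).

    Assumes x, y and resolution share the same metric units (e.g. metres).
    '''
    half_w = dimx * resolution / 2
    half_h = dimy * resolution / 2
    return (x - half_w, y - half_h, x + half_w, y + half_h)


# Hypothetical point in projected coordinates (metres).
bbox = patch_bbox(4212500.0, 2972000.0, dimx=10, dimy=10, resolution=20)
width = bbox[2] - bbox[0]
height = bbox[3] - bbox[1]
print(bbox)
print(width, height)  # 200.0 200.0
```

With `dimx=10`, `dimy=10` and `resolution=20`, every patch in this sketch covers 200 m by 200 m, so all outputs share the same 10 by 10 pixel grid.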
" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "-YudS0E8jjyN", "tags": [] }, "outputs": [], "source": [ "%%time\n", "\n", "multi = sits.Multiproc('patch', 'nc', data_dir)\n", "\n", "multi.addParams_stacAttack(bands=['B03', 'B04', 'B08', 'SCL'])\n", "multi.addParams_searchItems(date_start=datetime(2024, 1, 1),\n", " date_end=datetime(2025, 2, 1),\n", " query={\"eo:cloud_cover\": {\"lt\": 10}})\n", "multi.addParams_loadCube(dimx=10, dimy=10, resolution=20)\n", "multi.addParams_mask()\n", "\n", "for gid, i in enumerate(test_process['bbox_tuple'][:2]): # here we process only the two first patches, remove or modify the slicing\n", " multi.fetch_func(i[0], i[1], gid, mask=True, gapfill=True)\n", "multi.dask_compute();" ] } ], "metadata": { "colab": { "provenance": [ { "file_id": "https://github.com/kenoz/SITS_utils/blob/main/examples/colab_sits_ex02.ipynb", "timestamp": 1738147041012 } ] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.9" } }, "nbformat": 4, "nbformat_minor": 4 }