SiDroForest: Synthetic Siberian Larch Tree Crown Dataset of 10.000 instances in the Microsoft's Common Objects in Context dataset (coco) format

This synthetic Siberian larch tree crown dataset was created for upscaling and machine learning purposes as part of the SiDroForest (Siberia Drone Forest Inventory) project. The SiDroForest data collection (https://www.pangaea.de/?q=keyword%3A%22SiDroForest%22) consists of vegetation plots surveyed in Siberia during a two-month fieldwork expedition in 2018 by the Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research in Germany. During fieldwork, fifty-six 50 m × 50 m vegetation plots were covered by Unmanned Aerial Vehicle (UAV) flights, and Red Green Blue (RGB) and Red Green Near Infrared (RGNIR) photographs were taken with a consumer-grade DJI Phantom 4 quadcopter. The synthetic dataset provided here contains larch (Larix gmelinii (Rupr.) Rupr. and Larix cajanderi Mayr.) tree crowns extracted from the onboard-camera RGB UAV images of five selected vegetation plots from this expedition, placed on top of full, resized images from the same RGB flights. The extracted tree crowns were rotated, rescaled and repositioned across the images, resulting in a diverse synthetic dataset of 10,000 images for training and 2,000 images for validation of complex machine learning neural networks. The data are saved in Microsoft's Common Objects in Context (COCO) format (Lin et al., 2014) and can be loaded as a dataset for networks such as Mask R-CNN, U-Net or Faster R-CNN. Such networks, which perform instance segmentation, semantic segmentation and object detection, have become more frequently used over the years for forest monitoring. The images included in this dataset are from the field plots EN18062 (62.17° N 127.81° E), EN18068 (63.07° N 117.98° E), EN18074 (62.22° N 117.02° E), EN18078 (61.57° N 114.29° E) and EN18083 (59.97° N 113° E), located in Central Yakutia, Siberia.
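As a rough illustration of what reading COCO-format annotations can look like, the following minimal Python sketch groups annotations by image using only the standard library. The top-level keys `images`, `annotations` and `categories` follow the COCO format; the miniature record, its file name, the `larch` label and the `index_annotations` helper are hypothetical stand-ins and are not taken from the dataset or from any library.

```python
import json
from collections import defaultdict


def index_annotations(coco):
    """Group COCO annotations by the file name of the image they belong to."""
    file_by_id = {img["id"]: img["file_name"] for img in coco["images"]}
    name_by_cat = {cat["id"]: cat["name"] for cat in coco["categories"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        grouped[file_by_id[ann["image_id"]]].append(
            {"label": name_by_cat[ann["category_id"]], "bbox": ann["bbox"]}
        )
    return dict(grouped)


# Hypothetical miniature record; real files hold thousands of entries.
example = {
    "images": [
        {"id": 1, "file_name": "00000001.jpg", "width": 448, "height": 448}
    ],
    "categories": [{"id": 1, "name": "larch", "supercategory": "tree"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [10, 20, 60, 80], "area": 3100, "iscrowd": 0},
        {"id": 2, "image_id": 1, "category_id": 1,
         "bbox": [200, 150, 90, 110], "area": 6400, "iscrowd": 0},
    ],
}

# Round-trip through JSON to mimic reading an annotation file from disk.
coco = json.loads(json.dumps(example))
index = index_annotations(coco)
```

Here `index` maps each image file name to its crown labels and `[x, y, width, height]` boxes. In practice a dedicated COCO loader would usually replace this helper.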
These sites were selected for their vegetation content, their spectral differences in color, the UAV flight angles, and the clarity of the UAV images, which were taken with automatic shutter and white balancing (Brieger et al. 2019). From each site, 35 images were selected in order of acquisition, starting at the fifteenth image in the flight, to make up the backgrounds for the dataset. The first fifteen images were excluded because they often show the research team. The 117 tree crowns were cut out manually in GIMP to ensure that they were all Larix trees. Of these tree crowns, 15% lie at the margin of the image, so that the algorithm does not rely on a full tree crown in order to detect a tree. The raw UAV images used as backgrounds were cropped to 640 by 480 pixels at a resolution of 72 dpi and later rescaled to 448 by 448 pixels during dataset creation. In total there were 175 cropped backgrounds (35 for each of the five sites). The synthetic images and their corresponding annotations and masks were created using the cocosynth Python software provided by Adam Kelly (2019). The software is open source and available on GitHub: https://github.com/akTwelve/cocosynth. It rescales and transforms the tree crowns before placing up to three of them on the provided backgrounds. It also creates matching masks, which instance segmentation and object detection algorithms use to learn the shapes and locations of the synthetic crowns.
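The background selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it follows the reading that selection starts at the fifteenth image of each flight, and the flight length of 60 images, the file-name pattern and the `select_backgrounds` helper are hypothetical.

```python
def select_backgrounds(images_in_flight_order, first=15, count=35):
    """Return `count` consecutive images, starting at the `first` image (1-based)."""
    start = first - 1
    selected = images_in_flight_order[start:start + count]
    if len(selected) < count:
        raise ValueError("flight contains too few images")
    return selected


# The five field plots named in the description.
sites = ["EN18062", "EN18068", "EN18074", "EN18078", "EN18083"]

# Hypothetical flights of 60 images each, listed in order of acquisition.
flights = {
    site: [f"{site}_{i:03d}.jpg" for i in range(1, 61)] for site in sites
}

backgrounds = [
    name for site in sites for name in select_backgrounds(flights[site])
]
```

With five sites and 35 images per site, `backgrounds` holds the 175 entries stated in the description.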
COCO annotation files with information about each crown's name and label are also generated. This format can be loaded into a variety of neural networks for training purposes.
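To make the relationship between a placed crown, its mask and its annotation concrete, here is a small pure-Python sketch. It is not cocosynth's implementation: the `place_mask` and `bbox_and_area` helpers, the 3 by 3 toy crown and the chosen offsets are assumptions for illustration. Only the 448-pixel canvas size and the COCO `[x, y, width, height]` box convention come from the description and the COCO format.

```python
def place_mask(crown_mask, canvas_size, top, left):
    """Paste a binary crown mask onto an empty square canvas, clipping at the edges."""
    canvas = [[0] * canvas_size for _ in range(canvas_size)]
    for r, row in enumerate(crown_mask):
        for c, value in enumerate(row):
            y, x = top + r, left + c
            if value and 0 <= y < canvas_size and 0 <= x < canvas_size:
                canvas[y][x] = 1
    return canvas


def bbox_and_area(mask):
    """Return a COCO-style [x, y, width, height] box and the pixel area of a mask."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    if not ys:
        return None, 0
    x0, y0 = min(xs), min(ys)
    bbox = [x0, y0, max(xs) - x0 + 1, max(ys) - y0 + 1]
    area = sum(sum(row) for row in mask)
    return bbox, area


# Toy crown shape; real crowns are cut-outs of many more pixels.
crown = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]

# Place the crown so that it overlaps the right image margin and gets clipped.
mask = place_mask(crown, 448, top=100, left=446)
bbox, area = bbox_and_area(mask)
```

The crown is deliberately placed across the right edge, mirroring the crowns at the image margin mentioned in the description: the clipped mask keeps only the visible pixels, and the box and area are computed from that visible part.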

Data and Resources

This dataset has no data

Cite this as

van Geffen, Femke, Brieger, Frederic, Pestryakova, Luidmila A, Zakharov, Evgenii S, Herzschuh, Ulrike, Kruse, Stefan (2021). Dataset: SiDroForest: Synthetic Siberian Larch Tree Crown Dataset of 10.000 instances in the Microsoft's Common Objects in Context dataset (coco) format. https://doi.org/10.1594/PANGAEA.932795

DOI retrieved: 2021

Additional Info

Imported on: November 30, 2024
Last update: November 30, 2024
License: CC-BY-4.0
Source: https://doi.org/10.1594/PANGAEA.932795
Author: van Geffen, Femke
Given Name: Femke
Family Name: van Geffen
More Authors: Brieger, Frederic; Pestryakova, Luidmila A; Zakharov, Evgenii S; Herzschuh, Ulrike; Kruse, Stefan
Source Creation: 2021
Publication Year: 2021
Subject Areas: Lithosphere

Related Identifiers
Title: SiDroForest: a comprehensive forest inventory of Siberian boreal forest investigations including drone-based point clouds, individually labeled trees, synthetically generated tree crowns, and Sentinel-2 labeled image patches
Identifier: https://doi.org/10.5194/essd-14-4967-2022
Type: DOI
Relation: References
Year: 2022
Source: Earth System Science Data
Authors: van Geffen Femke, Heim Birgit, Brieger Frederic, Geng Rongwei, Shevtsova Iuliia, Schulte Luise, Stuenzi Simone Maria, Bernhardt Nadine, Troeva Elena I, Pestryakova Luidmila A, Zakharov Evgenii S, Pflug Bringfried, Herzschuh Ulrike, Kruse Stefan

Title: Advances in the Derivation of Northeast Siberian Forest Metrics Using High-Resolution UAV-Based Photogrammetric Point Clouds
Identifier: https://doi.org/10.3390/rs11121447
Type: DOI
Relation: References
Year: 2019
Source: Remote Sensing
Authors: Brieger Frederic, Herzschuh Ulrike, Pestryakova Luidmila A, Bookhagen Bodo, Zakharov Evgenii S, Kruse Stefan

Title: Complete Guide to Creating COCO Datasets
Identifier: https://github.com/akTwelve/cocosynth
Type: URL
Relation: References
Year: 2019
Source: GitHub repository
Authors: Kelley A

Title: Russian-German Cooperation: Expeditions to Siberia in 2018
Identifier: https://doi.org/10.2312/BzPM_0734_2019
Type: DOI
Relation: References
Year: 2019
Source: Berichte zur Polar- und Meeresforschung = Reports on Polar and Marine Research
Authors: Kruse Stefan, Bolshiyanov Dimitry Yu, Grigoriev Mikhail N, Morgenstern Anne, Pestryakova Luidmila A, Tsibizov Leonid, Udke Annegret

Title: Microsoft COCO: Common Objects in Context
Identifier: https://doi.org/10.1007/978-3-319-10602-1_48
Type: DOI
Relation: References
Year: 2014
Source: In: Fleet, D, Pajdla, T, Schiele, B, Tuytelaars, T (eds.), Computer Vision – ECCV 2014, Lecture Notes in Computer Science, 8693, Springer International Publishing, Cham
Authors: Lin Tsung-Yi, et al.

Title: SiDroForest Synthetic Tree Crowns Dataset - README
Type: DOI
Relation: References