Data pipeline for near real-time analysis of plankton images – Available for the end-users

Finnish Environment Institute (SYKE), one of the PHIDIAS consortium partners involved in the Ocean use case, has created a data pipeline for near real-time analysis of plankton images using CSC Allas object storage and CSC cloud computing services. A neural network classifier analyses the image data, and the results, e.g., the species composition of nuisance cyanobacterial blooms, are available to end-users with a delay of about 2 hours.

The marine environment is evolving continuously, and because marine observation is still expensive, observation data are unique and must be well preserved and easy to be retrieved. – PHIDIAS Ocean Use case team

Cyanobacteria blooms are an annual nuisance in the Baltic Sea for recreation, fisheries, and other uses, and they also have cascading effects on ecosystem functioning. Emerging plankton imaging technologies can be used to track the development of phytoplankton biomass and to provide information on the phytoplankton community composition, e.g., which cyanobacteria species dominate the blooms. This information is valuable for remote sensing validation, ecosystem modeling, management of the seas, and for the public, especially as some of the species are toxic.
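As a rough illustration of what "community composition" can mean in practice, the sketch below turns a list of per-image class labels into each class's share of the classified images. The function name and the example labels are assumptions made for illustration, not SYKE's code or data, and shares by image count are not the same as shares of biomass.

```python
from collections import Counter


def community_composition(labels):
    """Return (class name, share of images) pairs, largest share first."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(name, n / total) for name, n in counts.most_common()]


# Illustrative labels only; these are not measurements from the instrument.
example_labels = [
    "Aphanizomenon flosaquae",
    "Nodularia spumigena",
    "Aphanizomenon flosaquae",
    "Dolichospermum sp.",
    "Aphanizomenon flosaquae",
]

composition = community_composition(example_labels)
dominant = composition[0][0]  # class with the largest share of images
```

Counting images is the simplest possible summary; a biomass-based product would additionally need a per-image size or biovolume estimate, which this sketch leaves out.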

Latest achievement: PHIDIAS Ocean use case

SYKE operates an autonomous imaging device, the Imaging FlowCytobot, installed at Utö Marine Research Station in the Baltic Sea. The instrument produces thousands of plankton images per hour (see image below), which creates an urgent need to resolve data management issues.

CSC Allas cloud object storage service

In PHIDIAS, the team created a near real-time data pipeline from the instrument to CSC Allas cloud object storage, which is based on Ceph object storage technology. From Allas, the data is shared with other services within CSC's computing platform, and the subsequent neural network analysis runs on a Linux virtual machine with 6 vCPUs and 16 GB of memory, also provided by CSC cloud computing services. The neural network is based on a pre-trained ResNet-18 and fine-tuned with a labeled Baltic Sea phytoplankton image data set.
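One way to picture the step between object storage and the classifier is a repeated pass that classifies only the images that have no result yet. The sketch below is a minimal, assumed design and not the PHIDIAS implementation: the function name, the in-memory stand-in for the object store, and the toy classifier are all hypothetical, and a real deployment would replace them with calls to the storage service and the fine-tuned network.

```python
from typing import Callable, Dict, Iterable, List


def classify_new_objects(
    stored_keys: Iterable[str],
    results: Dict[str, str],
    fetch: Callable[[str], bytes],
    classify: Callable[[bytes], str],
) -> List[str]:
    """Classify every stored image that has no result yet.

    Updates `results` in place and returns the keys processed in this
    pass, in sorted order.
    """
    pending = sorted(key for key in stored_keys if key not in results)
    for key in pending:
        results[key] = classify(fetch(key))
    return pending


# Hypothetical stand-ins for the object store and the neural network.
fake_store = {"img_001.png": b"\x01", "img_002.png": b"\x02\x02"}


def fake_classifier(data: bytes) -> str:
    # Toy rule so the example runs without a trained model.
    return "class_a" if len(data) == 1 else "class_b"


results: Dict[str, str] = {}
first_pass = classify_new_objects(fake_store, results, fake_store.__getitem__, fake_classifier)
second_pass = classify_new_objects(fake_store, results, fake_store.__getitem__, fake_classifier)
```

Passing `fetch` and `classify` in as functions keeps the bookkeeping testable without network access or a GPU; the second pass finds nothing new, which is the behaviour a periodically scheduled job would rely on.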

Data transfer and analysis together result in a delay of about two hours from image capture to the point when the image has been classified and the data are available to users.
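The delay described here is simply the difference between two timestamps per image. A minimal sketch, assuming each image carries a capture time and the results record an availability time (the function name and both timestamps below are made up for illustration):

```python
from datetime import datetime, timedelta, timezone


def pipeline_delay(captured_at: datetime, available_at: datetime) -> timedelta:
    """Return the time from image capture to availability of the result."""
    if available_at < captured_at:
        raise ValueError("result cannot be available before capture")
    return available_at - captured_at


# Hypothetical timestamps, not measurements from the pipeline.
captured = datetime(2021, 7, 15, 10, 0, tzinfo=timezone.utc)
available = datetime(2021, 7, 15, 11, 55, tzinfo=timezone.utc)

delay = pipeline_delay(captured, available)
within_two_hours = delay <= timedelta(hours=2)
```

Using timezone-aware timestamps avoids ambiguity when the instrument and the virtual machine run with different local clock settings.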

The system was tested in 2021, and the near real-time data was used in SYKE's weekly national algae reviews targeted to the public. Further development of data use and visualisation will be done within JERICO-S3 (EU H2020 INFRAIA project). To our knowledge, this is the first near real-time application of such an image data set; normally the classifications are done in delayed mode.

Although the availability of such near real-time results would be useful for the modeling and earth observation communities, two bottlenecks remain to be solved. First, there is no generally agreed storage (short- or long-term) for plankton images coming from such systems, as the amount of data is much too large to be handled by existing systems like EcoTaxa. Second, no data aggregator takes up the AI-classified results in its databases, which prevents wider use of the results. These topics are not for PHIDIAS to solve, but the project may promote solutions.

Scheme of phytoplankton image data flows and processing created to obtain NRT products for biomass and species structure during cyanobacteria blooms (top). Examples of images of bloom-forming cyanobacteria (Kraft et al. 2021) (bottom).

Click here to learn more about the latest updates on the PHIDIAS Ocean use case.

