Cross-platform training data aggregation service

Service description:

A service that lets users create a training set for a particular group of living organisms on demand, i.e. request specific data such as images of specific species and/or from specific platforms, or images whose identification has reached a sufficient level of expert validation.

Example: A data scientist or developer who wants to train an AI model on a particular group of species with PyTorch will be able to do so easily.
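
As an illustration of the example above, a delivered training set could be wrapped in a map-style dataset. This is only a sketch under assumptions, not the service's actual interface: the record fields (`image_path`, `species`) and the names `ImageRecord` and `SpeciesImageDataset` are hypothetical. The class implements just `__len__` and `__getitem__`, the protocol PyTorch expects of a map-style dataset, so it runs without PyTorch installed.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ImageRecord:
    """One labelled image in a delivered training set (hypothetical schema)."""
    image_path: str
    species: str


class SpeciesImageDataset:
    """Map-style dataset: integer index -> (image, class index)."""

    def __init__(self, records, loader: Optional[Callable] = None):
        self.records = list(records)
        # Stable class indices: species names sorted alphabetically.
        self.classes = sorted({r.species for r in self.records})
        self.class_to_idx = {name: i for i, name in enumerate(self.classes)}
        # `loader` turns a path into an image or tensor; by default the
        # path itself is returned so the sketch has no dependencies.
        self.loader = loader or (lambda path: path)

    def __len__(self):
        return len(self.records)

    def __getitem__(self, index):
        record = self.records[index]
        return self.loader(record.image_path), self.class_to_idx[record.species]


# Made-up records standing in for a delivered training set.
records = [
    ImageRecord("img/0001.jpg", "Vanessa atalanta"),
    ImageRecord("img/0002.jpg", "Aglais io"),
    ImageRecord("img/0003.jpg", "Vanessa atalanta"),
]
dataset = SpeciesImageDataset(records)
```

With PyTorch installed and a `loader` that returns tensors, such an object could be passed to `torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)`.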

Development & functioning:

Delivering this service requires the following:

  • Ensure a common species dictionary across projects. This will take a global backbone dictionary, such as that provided by GBIF or the Catalogue of Life, and ensure that regional dictionaries such as the UK Species Inventory work with it to form a single dictionary that can be used by all participants and global repositories.
  • Data access services with quality assessment. This will enable training on different taxonomic domains. The AI needs not only the basic observation data, supplied as a web service, but also information on the confidence of each identification.
  • On-demand training set creation service. To train AI models on potentially any group or set of living species, verified data from several platforms must be aggregated.
  • AI performance indicators. These measure which species the AI is becoming better at identifying and which it still struggles with. They are also vital for assessing user behaviour, e.g. whether users simply choose the most likely species offered by the AI even when other users later show this to be the wrong ID. It is equally important to know the effect of the AI on the learning cycle and on the platform's reputation system.
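
The performance indicators in the last point could be computed along the following lines. This is a minimal sketch under assumptions: the function names, the choice of per-species recall as the indicator, and the event layout are all hypothetical, and the sample data is made up for illustration.

```python
from collections import defaultdict


def per_species_recall(pairs):
    """Return {species: share of its verified records the AI got right}.

    `pairs` holds (verified_species, predicted_species) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for verified, predicted in pairs:
        total[verified] += 1
        if predicted == verified:
            correct[verified] += 1
    return {species: correct[species] / total[species] for species in total}


def recall_change(before, after):
    """Per-species recall difference between two model versions (after - before)."""
    return {s: after[s] - before[s] for s in before if s in after}


def overturned_acceptance_rate(events):
    """Share of accepted top suggestions that were later shown to be wrong.

    Each event is (user_accepted_top_suggestion, top_suggestion, verified_species).
    """
    accepted = [(top, verified) for took, top, verified in events if took]
    if not accepted:
        return 0.0
    wrong = sum(1 for top, verified in accepted if top != verified)
    return wrong / len(accepted)


# Made-up evaluation data for two model versions.
old_pairs = [
    ("Aglais io", "Aglais io"),
    ("Aglais io", "Vanessa atalanta"),
    ("Vanessa atalanta", "Vanessa atalanta"),
    ("Vanessa atalanta", "Vanessa atalanta"),
]
new_pairs = [
    ("Aglais io", "Aglais io"),
    ("Aglais io", "Aglais io"),
    ("Vanessa atalanta", "Vanessa atalanta"),
    ("Vanessa atalanta", "Aglais io"),
]
before = per_species_recall(old_pairs)
after = per_species_recall(new_pairs)
change = recall_change(before, after)

# Made-up user events: did the user accept the AI's top suggestion?
events = [
    (True, "Aglais io", "Aglais io"),
    (True, "Aglais io", "Vanessa atalanta"),
    (False, "Aglais io", "Vanessa atalanta"),
    (True, "Vanessa atalanta", "Vanessa atalanta"),
]
rate = overturned_acceptance_rate(events)
```

A positive value in `change` marks a species the model has improved on, a negative value one it now handles worse. `rate` gives a rough signal of how often users follow the AI into an identification that is later corrected.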

Innovation for citizen observatories:

The service will allow massive sets of image data covering large groups of species to be aggregated through the APIs of several platforms and citizen observatories, so that efficient AI models for automated identification can be trained.
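
The aggregation step might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Observation` fields, the confidence scale of 0 to 1, the small `BACKBONE_NAMES` mapping, and the two stand-in source functions, which return fixed records where a real implementation would call each platform's API. The URLs are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    platform: str
    image_url: str
    reported_name: str  # name as used by the source platform
    confidence: float   # identification confidence in [0, 1] (assumed scale)


# Hypothetical mapping from platform or regional names to one backbone name.
BACKBONE_NAMES = {
    "Peacock": "Aglais io",
    "Inachis io": "Aglais io",
    "Aglais io": "Aglais io",
    "Red Admiral": "Vanessa atalanta",
    "Vanessa atalanta": "Vanessa atalanta",
}


def build_training_set(sources, wanted_species, min_confidence):
    """Merge observations from several platforms into (image_url, name) pairs."""
    wanted = set(wanted_species)
    seen_urls = set()
    training_set = []
    for fetch in sources:
        for obs in fetch():
            name = BACKBONE_NAMES.get(obs.reported_name)
            if name is None or name not in wanted:
                continue  # name not in the dictionary, or species not requested
            if obs.confidence < min_confidence:
                continue  # identification not validated well enough
            if obs.image_url in seen_urls:
                continue  # same image already delivered by another platform
            seen_urls.add(obs.image_url)
            training_set.append((obs.image_url, name))
    return training_set


def platform_a():
    """Stand-in for a call to one platform's API."""
    return [
        Observation("A", "https://example.org/a/1.jpg", "Peacock", 0.95),
        Observation("A", "https://example.org/a/2.jpg", "Red Admiral", 0.40),
    ]


def platform_b():
    """Stand-in for a call to a second platform's API."""
    return [
        Observation("B", "https://example.org/b/7.jpg", "Inachis io", 0.80),
        Observation("B", "https://example.org/a/1.jpg", "Aglais io", 0.90),
        Observation("B", "https://example.org/b/8.jpg", "Unknown moth", 0.99),
    ]


training_set = build_training_set(
    [platform_a, platform_b],
    wanted_species=["Aglais io", "Vanessa atalanta"],
    min_confidence=0.75,
)
```

Mapping every reported name to a backbone name before filtering means that records labelled with a common name, a synonym, or the accepted name all end up under the same class label.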

No similar service is currently available.

We are working on this service; it will be available soon!
Development progress: 35%

Keywords:

TRAINING DATA, ARTIFICIAL INTELLIGENCE, CROSS PLATFORM DATA LOADER, DATA QUALITY, APIs, AUTOMATED SPECIES IDENTIFICATION

Coordinator: