Development & functioning:
In ecology, a species' distribution is usually predicted from a representation of each location in environmental space, typically a feature vector composed of climatic variables (average temperature at that location, precipitation, etc.) and other variables such as soil type, ground cover, distance from water, etc. Our methodology will be to revisit such approaches using big data and deep-learning techniques.
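As an illustration of the environmental representation described above, the following minimal sketch builds a numeric feature vector for one location. The variable names, units, and soil categories are assumptions chosen for the example; they are not the variables the project will actually use.

```python
# Hypothetical soil categories, used only to illustrate one-hot encoding.
SOIL_TYPES = ("clay", "sand", "loam")


def feature_vector(mean_temp_c, precip_mm, dist_water_km, soil_type):
    """Return the environmental feature vector of one location.

    Continuous variables are kept as-is; the categorical soil type is
    one-hot encoded so that a model can consume it as numbers.
    """
    if soil_type not in SOIL_TYPES:
        raise ValueError(f"unknown soil type: {soil_type!r}")
    one_hot = [1.0 if soil_type == s else 0.0 for s in SOIL_TYPES]
    return [float(mean_temp_c), float(precip_mm), float(dist_water_km)] + one_hot


# Illustrative values only, not real measurements.
example = feature_vector(12.5, 800.0, 1.2, "sand")
```

A predictive model would take such a vector as input and output one occurrence probability per species.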
This will require developing a complete information system that aggregates data from several sources (e.g. GBIF species occurrences, CHELSA or KNMI climate data, CORINE Land Cover data, satellite imagery, etc.) in order to train a large-scale predictive model and deploy a web service on top of it. In terms of quality control, the service will provide indicators of the occurrence probability of the reported species and allow users to execute probability range queries (e.g. to focus on common or rare species at a given location). It will also allow for the integration of warning messages conditioned on the probability of observing particular species (e.g. to signal the presence of invasive species).
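The probability range queries and conditional warnings described above could look roughly like the sketch below. It assumes the model's output for one location is already available as a list of per-species probabilities. The names `SpeciesPrediction`, `range_query`, and `invasive_warnings`, the 0.5 threshold, and the sample values are all hypothetical and do not describe the service's actual interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeciesPrediction:
    species: str
    probability: float  # predicted occurrence probability in [0, 1]
    invasive: bool = False


def range_query(predictions, p_min=0.0, p_max=1.0):
    """Return predictions with p_min <= probability <= p_max, most likely first."""
    if not 0.0 <= p_min <= p_max <= 1.0:
        raise ValueError("expected 0 <= p_min <= p_max <= 1")
    hits = [p for p in predictions if p_min <= p.probability <= p_max]
    return sorted(hits, key=lambda p: p.probability, reverse=True)


def invasive_warnings(predictions, threshold=0.5):
    """Return one warning message per invasive species at or above the threshold."""
    return [
        f"Warning: invasive species {p.species} likely present "
        f"(p={p.probability:.2f})"
        for p in predictions
        if p.invasive and p.probability >= threshold
    ]


# Made-up predictions for a single location, for illustration only.
predictions = [
    SpeciesPrediction("species_a", 0.90),
    SpeciesPrediction("species_b", 0.05),
    SpeciesPrediction("species_c", 0.70, invasive=True),
]
rare = range_query(predictions, 0.0, 0.1)    # focus on rare species
common = range_query(predictions, 0.5, 1.0)  # focus on common species
alerts = invasive_warnings(predictions)
```

Keeping the query and warning logic separate from the model means thresholds can be tuned per application without retraining.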
This service will be integrated into the DEEP-Hybrid-DataCloud marketplace.
Example: A start-up developing a gamified citizen science app will be able to suggest the species to be observed at a given location.
Innovation for citizen observatories:
Automatically predicting the list of species most likely to be observed at a given location is a key technology for many research domains, as well as for practical uses in sustainable landscape management, environmental education, ecotourism, etc.
This is challenging because species distribution models (SDMs) usually cannot be learned directly from the available spatial positions: the number of occurrences is limited and the sampling is biased. To our knowledge, no comparable service is currently available.