How ZSL uses ML to classify gunshots to protect wildlife
November 24, 2020
International conservation charity ZSL
(Zoological Society of London) has made
another leap forward in its battle to
protect animals, using AI and machine
learning (ML) from Google Cloud.
We’ve been
privileged to
partner with ZSL
for three years, co-developing
custom ML models
to identify and better track endangered
species around the world. The next
dataset in ZSL’s arsenal to tackle
animal conservation is
sound—specifically gunshots captured by
recording devices.
WWF
estimates the illegal wildlife trade is
worth about $20bn a year and has
contributed to a catastrophic decline in
some species. Technology, particularly
machine learning, is at the forefront of
conservation efforts, but standing up
these systems in wildlife reserves is no
walk in the park.
The
analysis of acoustic (sound) data to
support wildlife conservation is one of
the major lines of work in ZSL’s
monitoring and technology programme.
Compared to camera traps that are
limited to detection at close range,
acoustic sensors can detect events up to
1 kilometre (about 0.6 miles) away.
This has the potential to enable
conservationists to track wildlife
behaviour and threats over much greater
areas.
In early
2018, ZSL deployed 69 acoustic recording
devices in the northern sector of the
Dja Faunal Reserve, in Cameroon, central
Africa. The objectives of the project
were twofold: to collect acoustic data
that could be analyzed for monitoring
key endangered species, and to see if
the acoustic data could be used to
investigate illegal hunting activity.
Over the course of a month, ZSL’s
acoustic devices captured 267 days'
worth of continuous audio totalling
350GB. Even one month’s worth of data
would be too labor-intensive for a human
to listen to and analyze manually, so
ZSL’s research team worked in
collaboration with Google Cloud to find
a quicker solution.
ZSL was
particularly interested in identifying
and analysing instances of gunshots. For
each audio file in the dataset, the team
needed to answer two questions: does the
recording contain a gunshot sound, and
if so, when did it occur?
The team
leveraged a pre-trained machine learning
model called
YAMNet,
originally developed and open-sourced by
Google. YAMNet is a deep net that
predicts 521 audio event classes, and
was trained using the soundtracks of
millions of YouTube videos. YAMNet was
used to recognize sound events in ZSL’s
dataset, stored in Google Cloud Storage.
The initial classification of 350GB
worth of data took less than 15 minutes
to complete and identified 1,746
instances with a high confidence of
being gunshots.
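As a rough illustration of this step, the sketch below runs the publicly available YAMNet model from TensorFlow Hub over a single clip and flags frames whose "Gunshot, gunfire" score clears a confidence threshold. The file name and threshold are placeholders for illustration, not values from ZSL's actual pipeline.

```python
import csv

import numpy as np
import soundfile as sf
import tensorflow as tf
import tensorflow_hub as hub

# Load the open-source YAMNet model from TensorFlow Hub.
model = hub.load('https://tfhub.dev/google/yamnet/1')

# Map score columns to AudioSet class names via the CSV bundled with the model.
class_map_path = model.class_map_csv_path().numpy()
with tf.io.gfile.GFile(class_map_path) as f:
    class_names = [row['display_name'] for row in csv.DictReader(f)]
gunshot_idx = class_names.index('Gunshot, gunfire')  # AudioSet class name

# YAMNet expects a mono float32 waveform sampled at 16 kHz in [-1.0, 1.0].
# 'clip.wav' is a placeholder; resample first if the rate differs.
waveform, sample_rate = sf.read('clip.wav', dtype='float32')
assert sample_rate == 16000, 'resample to 16 kHz before inference'

# scores has shape [num_frames, 521]; each frame is a 0.96 s window
# hopped every 0.48 s, so frame_index * 0.48 gives an approximate offset.
scores, embeddings, spectrogram = model(waveform)
gunshot_scores = scores.numpy()[:, gunshot_idx]

THRESHOLD = 0.5  # illustrative cutoff; ZSL's actual value isn't published here
for frame in np.where(gunshot_scores >= THRESHOLD)[0]:
    print(f'possible gunshot at ~{frame * 0.48:.2f}s '
          f'(score {gunshot_scores[frame]:.2f})')
```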
The
output from the classifier was pushed
into a BigQuery table. Each
classification represented a row in the
table, including details of the acoustic
recording device, its location, the
time at which the sound occurred, the
confidence level that it contained a
gunshot sound, and a reference to the
originating audio file. This allowed
ZSL to quickly narrow thousands of hours
of recordings down to only the audio
files with the highest probability of
containing a gunshot sound for further
analysis.
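A query along these lines, sketched below with the google-cloud-bigquery client library, shows how such a table can be filtered to the highest-confidence detections. The project, dataset, table, and column names are hypothetical, since the post doesn't publish ZSL's schema.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical project, dataset, table, and column names; ZSL's
# actual schema is not published in this post.
query = """
    SELECT device_id, location, event_time, confidence, audio_uri
    FROM `my-project.acoustics.gunshot_detections`
    WHERE confidence >= 0.9
    ORDER BY confidence DESC
"""

# Each row references the originating clip in Cloud Storage.
for row in client.query(query).result():
    print(row.event_time, row.confidence, row.audio_uri)
```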
These
instances would need to be manually
listened to and visually inspected as
spectrograms to be confirmed as
gunshots. ZSL needed an easy way to
listen to those audio clips identified
as containing gunshots. So the next step
was to build a Jupyter notebook using
AI Platform
to load, visualise, and listen to a
sample of audio files to validate the
model's findings, as shown in Figure 2.
The team
used the BigQuery API to return the
Cloud Storage URL of each file
corresponding to a gunshot instance
identified with high confidence. Each
audio file was then visualized as a
spectrogram (to speed up validation),
with a button for the researchers to
enable playback of the sound, without
needing to leave the notebook
environment.
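A notebook cell along the following lines reproduces that workflow: it pulls a clip from Cloud Storage, renders a spectrogram for visual inspection, and embeds an inline audio player. The bucket and object path are invented for illustration.

```python
import io

import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np
from google.cloud import storage
from IPython.display import Audio, display

# Hypothetical bucket and object path, as would be returned by the
# BigQuery step above.
client = storage.Client()
blob = client.bucket('zsl-acoustic-data').blob('dja/device_12/clip_0042.wav')
audio_bytes = blob.download_as_bytes()

# Decode the clip and render a spectrogram for quick visual inspection.
y, sr = librosa.load(io.BytesIO(audio_bytes), sr=None)
S_db = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
librosa.display.specshow(S_db, sr=sr, x_axis='time', y_axis='hz')
plt.title('Candidate gunshot')
plt.colorbar(format='%+2.0f dB')
plt.show()

# Inline playback widget, so validation never leaves the notebook.
display(Audio(y, rate=sr))
```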
ZSL
confirmed three unique gunshots that
took place during the study at three
different locations, dates, and times.
Manual validation is tedious work:
checking the instances returned by the
Google classifier took about 2.5 hours,
whereas ordinarily this task could have
taken a team of researchers many months
of effort.
In this
short one-month study that only covered
a portion of the reserve, the research
team were able to contribute new
insights into the human threats to species
in the Dja reserve. Past data suggests
gunshots are more likely to take place
at night to evade ranger detection, but
using ecoacoustics alone, ZSL provided
evidence of illegal hunting occurring
during the day.
Down the
road, ZSL’s findings will also inform
development of on-device threat
classification to enable longer, cheaper
monitoring and, ultimately, real-time
alerts. With animal populations under
enormous pressure, technology, and in
particular machine learning, has huge
potential for enabling conservation
groups like ZSL to deploy their
resources more efficiently in the battle
against the illegal wildlife trade.