Spoofing and Anti-Spoofing (SAS) corpus

The objective of the Spoofing and Anti-Spoofing (SAS) corpus is to provide a standard database for both spoofing and anti-spoofing (also known as countermeasure) research. The SAS corpus will include three forms of spoofing: speech synthesis, voice conversion and replay. For each form of spoofing, multiple state-of-the-art techniques will be employed to generate the spoofing materials. For example, in speech synthesis, statistical parametric and unit selection techniques will be used.

SAS will have two subsets: SAS-VCTK and SAS-RSR. SAS-VCTK is based on the Voice Cloning Toolkit (VCTK) database from the Centre for Speech Technology Research (CSTR) at the University of Edinburgh in the United Kingdom. VCTK is an English-language database and is freely available. SAS-VCTK is aimed at text-independent speaker verification research.

SAS-RSR is based on the RSR2015 database developed by the Human Language Technology (HLT) department at the Institute for Infocomm Research (I2R) in Singapore. SAS-RSR is aimed at text-dependent (fixed pass-phrase) speaker verification research.

Currently, SAS-VCTK has two speech synthesis and seven voice conversion spoofing sets, while SAS-RSR has one replay spoofing set. We are implementing more spoofing techniques on both databases.

SAS is available for free, and is released under a Creative Commons licence.
If you are interested in the corpus, please contact Zhizheng Wu (email: zhizheng.wu {at}

