Nicholas Boucher on adversarial examples that can be used to attack text-based models, and the state of homomorphic encryption for machine learning.
Subscribe: Apple • Android • Spotify • Stitcher • Google • AntennaPod • RSS.
Nicholas Boucher is a PhD candidate at the University of Cambridge, where he focuses on security topics including homomorphic encryption, voting systems, and adversarial machine learning. He is the lead author of a fascinating new paper – “Bad Characters: Imperceptible NLP Attacks” – which provides a taxonomy of attacks against text-based NLP models based on Unicode and other encoding systems. We discussed the key findings of the paper, and we also briefly talked about the state of homomorphic encryption for machine learning and analytics.
Nicholas Boucher:
It started with a conversation I was having with another of the eventual authors on the paper, and we were talking about how challenging it can be for multilingual speakers to type their language if those languages, for example, don’t use the same set of characters. These different characters are encoded differently. Sometimes you may just choose to type the thing that looks the most similar in English to the language that you’re ultimately writing in, because you don’t want to switch modes on your keyboard or switch over to your other keyboard. We started thinking: “Wait a minute, natural language processing and other machine learning models that take text as input really assume that text is going to look a certain way.” Moments like this are when security people start to get really excited.
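To make that premise concrete, here is a minimal Python sketch (our own illustration, not code from the paper) of how strings that render identically can differ at the byte level, which is exactly the gap these attacks exploit:

```python
# Illustrative sketch: visually identical strings with different encodings.
# (Our example, not code from the "Bad Characters" paper.)

clean = "paypal"

# Homoglyph: swap the Latin "a" (U+0061) for the Cyrillic "а" (U+0430).
homoglyph = "p\u0430ypal"

# Invisible character: insert a zero-width space (U+200B) inside the word.
invisible = "pay\u200bpal"

for label, text in [("clean", clean), ("homoglyph", homoglyph), ("invisible", invisible)]:
    # repr() and the UTF-8 bytes reveal differences the eye cannot see.
    print(f"{label:10} {text!r:22} {text.encode('utf-8')}")

# All three typically render the same on screen, but their byte sequences
# differ, so a tokenizer or NLP model may treat them as unrelated inputs.
```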
Highlights in the video version:
- Introduction to Nicholas Boucher
- General types of attacks and spam detection
- ML security and attacks on models
- Responsible AI and the growth of ML as a discipline
- What led you to NLP models in particular?
- Real-world examples and specific tasks
- What class of application are we talking about?
- Four types of attacks: invisible characters, homoglyphs, reorderings, and deletions (see the sketch after this list)
- Invisible Character Attacks
- Invisible characters: do you find characters not seen in training?
- Did you create testing tools that pointed to an NLP application?
- Is it conceivable that we will have tools like this 2-3 years from now?
- What are some of the other NLP attacks in your paper that scare you?
- Homoglyph Attacks
- Reordering Attacks
- Deletion Attacks
- To what extent should NLP teams be worried about these attacks?
- NLP and verticals where text is important
- Security practitioners and how they think about breaking into a system
- The supply chain problem in all of software
- Homomorphic Encryption and Fully Homomorphic Encryption (FHE)
- Security domain, security of a system, and social engineering
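As a rough illustration of the four attack classes named above, here is a short Python sketch (our own toy examples, not the authors’ attack tooling) that applies one perturbation of each type to a simple input:

```python
# Toy examples of the four perturbation classes discussed in the episode.
# These are our own illustrations, not the paper's attack generation code.

base = "send money to alice"

perturbed = {
    # Invisible characters: a zero-width space (U+200B) hidden inside a word.
    "invisible": base.replace("alice", "al\u200bice"),
    # Homoglyphs: the first Latin "o" replaced by Cyrillic "о" (U+043E).
    "homoglyph": base.replace("o", "\u043e", 1),
    # Reorderings: Bidi override controls (U+202E ... U+202C) store the
    # characters reversed while displaying them in the expected order.
    "reordering": base.replace("alice", "\u202eecila\u202c"),
    # Deletions: a stray character followed by a backspace (U+0008); some
    # renderers hide the pair, but a model still receives both characters.
    "deletion": base.replace("alice", "alX\u0008ice"),
}

for name, text in perturbed.items():
    # The rendered text may look unchanged, while the underlying code points
    # (shown by repr) differ from the clean input the model was trained on.
    print(f"{name:10} {text!r}")
```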
Related content:
- A video version of this conversation is available on our YouTube channel.
- “Resurgence of Conversational AI”
- Ram Shankar: “Securing machine learning applications”
- Jack Morris: “Improving the robustness of natural language applications”
- Marco Ribeiro: “Testing Natural Language Models”
- Yoav Shoham: “Making Large Language Models Smarter”
- Alan Nichol: “Best practices for building conversational AI applications”
- Lauren Kunze: “How to build state-of-the-art chatbots”
[Image: Wubi keyboard layout from Wikimedia.]