NSA warns US adversaries free to mine personal data may have an AI edge


Electrical engineer Gilbert Herrera was appointed research director of the US National Security Agency in late 2021, just as the AI revolution was brewing inside the US tech industry.

The NSA, sometimes jokingly called "No Such Agency," has long hired top mathematics and computer science talent. Its technology leaders have been early and avid users of advanced computing and AI. And yet when Herrera spoke to me on the phone from NSA headquarters in Fort Meade, Maryland, about the implications of the latest AI boom, it seemed that, like many others, the agency was stunned by the recent success of the large language model behind ChatGPT and other hit AI products. The conversation has been lightly edited for clarity and length.

[Photo: A man in a suit smiling in front of the American and National Security Agency flags. Gilbert Herrera. Courtesy of the National Security Agency]

How big a surprise was the ChatGPT moment for the NSA?

Oh, I thought your first question would be "What did the NSA learn from the Ark of the Covenant?" That's been a recurring question since about 1939. I would love to tell you, but I can't.

I think what everyone learned from the ChatGPT moment is that if you throw enough data and enough computing resources at AI, these emergent properties appear.

The NSA actually sees artificial intelligence as the latest step in a long history of using automation and computing to accomplish our missions. AI has long been seen as one of the ways we can work smarter, faster, and at scale. And so we've been involved in AI research for over 20 years to get to this moment.

Large language models existed long before generative pretrained transformer (GPT) models. But it's the "ChatGPT moment"—the point at which you could ask it to write a joke, or the point where you could engage it in a conversation—that really sets it apart from other things we and others have done.

The NSA and its counterparts among US allies have sometimes developed critical technologies before anyone else but kept them secret, like public key cryptography in the 1970s. Could the same perhaps have happened with large language models?

At the NSA we couldn't build these big transformer models, because we couldn't use the data. We cannot use the data of American citizens. The second thing is budget. I listened to a podcast where someone shared a Microsoft earnings call, and they said they were spending $10 billion a quarter on platform costs. [The total US intelligence budget in 2023 was $100 billion.]

It really has to be people who have enough money to make a capital investment in the tens of billions and who have access to the kind of data that can produce these emergent properties. And so it's really the hyperscalers [the largest cloud companies] and potentially governments that don't care about individual privacy, don't have to follow privacy laws, and have no problem stealing data. And I'll leave it to your imagination who that could be.

Doesn't this harm the NSA—and the United States—in intelligence gathering and processing?

Let me digress a little: it doesn't cause us any great harm. We need to work around it, and I'll get to that.

It's no great loss to our core responsibility, which is dealing with nation-state targets. If you look at other applications, it may make things more difficult for some of our domestic intelligence partners. But the intelligence community will need to find a way to use commercial language models while respecting privacy and individual liberties. [The NSA is prohibited from collecting domestic intelligence, although multiple whistleblowers have warned that it does scoop up US data.]