Stop it with the DeepSeek panic, America. It’s time to get small (language models).
Image by Lars Eriksson from Pixabay
Monday’s extraordinary reaction to DeepSeek’s release of high-performance small language models (SLMs) revealed many of the worst character flaws in our tech industry and financial system.
What happened was this: The Chinese company DeepSeek released an AI system that outperformed the leading American-made models at a fraction of the cost.
For those who prefer to get their tech knowledge from the comedic stylings of T.J. Miller, think of HBO’s Silicon Valley. What DeepSeek did is roughly akin to the Pied Piper breakthrough—a clever whiteboard moment that led to a significant leap forward.
Billionaire tech executives and VC investors immediately declared that the sky had fallen. Investors flooded the markets with sell orders for Nvidia, Microsoft, Oracle, and Google. Marc Andreessen, the tech venture capitalist, called it “AI’s Sputnik moment.”
Andreessen’s comment reinforced the false frame the tech industry has been trying to foist upon the American people, and especially upon our elected leaders. The billionaires tell us this is the modern-day space race. They tell us the United States is in an existential competition with China for AI supremacy. Some argue that America must establish “AI dominance,” as if anything less than an American AI monopoly would diminish us to the status of serfs under the all-powerful lords of China.
That is nonsense.
Don’t fall for the billionaire panic
We’ve seen this playbook before. In the early 1980s American political rhetoric was full of pumped-up fear over Japan’s rising economic power. Japanese companies were creating better automobiles and inventing nifty electronics like the Sony Walkman. Serious people spoke in grave tones of “The Japaning of America” and an “economic Pearl Harbor.”
What actually happened? Japan’s success forced American industry to innovate and improve. Detroit stopped producing junk like the Chrysler K-Car (look it up and shudder). Its designs came into alignment with consumer needs—the minivan, for instance, which saved Chrysler—and the Big Three automakers revamped their factories to become more efficient. American tech innovators did the same. Steve Jobs took inspiration from Sony and the Japanese principles of simplicity and beauty. And so were born the Mac, the iPod, and the iPhone.
All of which is to say: Take a deep breath. Don’t fall into the trap set by Andreessen and his tech billionaire chums. They're using panic rhetoric to scare our elected officials into giving tech corporations whatever they want—regulatory free rein, federal subsidies, carte blanche to steal data, control over our energy grid—in exchange for American “AI dominance.”
DeepSeek shows the folly of chasing after the ‘biggest’ LLM
The significance of DeepSeek’s achievement this week lies in the next-level efficiencies they gained. The American obsession with bigger-is-better thinking works well with some things. Like entertainment spectacles. Las Vegas and the Super Bowl: Nobody does razzle-dazzle better.
Where it doesn’t work is in the world of artificial intelligence. For the past couple of years AI developers have released new models with button-busting claims about ever-more compute power and ever-greater parameter counts. As if AI models were pickup trucks in a neighborhood competition to see who could park the biggest rig in the driveway.
This power ramp-up drove up costs everywhere. Building and training new AI models ran into the billions of dollars. The energy required to power the massive data centers that drive AI systems is so great that tech companies are now investing in nuclear power plants.
What DeepSeek did was park a shiny new Toyota among those gas-guzzling Ford F-350s. And then beat the pickups in fuel efficiency and zero-to-sixty acceleration.
Well-known AI models like OpenAI’s ChatGPT and Meta’s Llama are large language models (LLMs) requiring massive computing power and enormous datasets. DeepSeek is a small language model (SLM). It requires a much smaller volume of data and far fewer parameters. Compared to LLMs, DeepSeek’s SLM is next-level efficient. Company officials claim it performs as well as the biggest LLMs for a fraction of the cost. DeepSeek says it needed only $6 million in raw computing power to train its model. Cost estimates for the biggest American LLMs run into the hundreds of millions of dollars.
How were DeepSeek’s developers able to do this? They didn’t cry for billions of dollars in funding to go bigger. They went smaller and got creative. In 2022, the United States severely restricted the sale of Nvidia’s top-of-the-line AI chips to China. Forced to do more with less, DeepSeek’s tech team spread its model’s data analysis across several specialized models, realizing enormous efficiencies.
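DeepSeek hasn’t published a simple recipe for this, but the approach described above—spreading the work across several specialized sub-models, a technique generally known as mixture-of-experts—can be illustrated with a toy sketch. Everything here (the experts, the routing rule) is invented for illustration; the point is only that a cheap “gate” decides which specialist runs, so most of the system sits idle on any given input:

```python
# Toy mixture-of-experts routing sketch (illustrative only; this is NOT
# DeepSeek's actual architecture). A small "gate" picks one specialist
# per input, so only that expert's computation is paid for.

def make_expert(weight):
    # Each "expert" is a trivial stand-in for a specialized sub-model.
    return lambda x: x * weight

experts = [make_expert(w) for w in (2, 10, 100)]

def gate(x):
    # Hypothetical routing rule: choose an expert by input magnitude.
    if x < 10:
        return 0
    elif x < 100:
        return 1
    return 2

def predict(x):
    # Route the input to exactly one expert and run only that one.
    return experts[gate(x)](x)

print(predict(5))    # handled by expert 0 → 10
print(predict(50))   # handled by expert 1 → 500
```

The efficiency claim falls out of the structure: a model with many experts can hold a large total number of parameters while activating only a small fraction of them per query.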
They may also have infringed on OpenAI’s intellectual property, about which Sam Altman is now crying foul. Which is a topic for another day, OpenAI having built its own models by (allegedly) infringing the intellectual property of tens of thousands of authors, musicians, and other creators.
One AI model will not rule them all
Can DeepSeek’s SLM do everything the big American LLMs do? Probably not. The Chinese company released data that pitted DeepSeek against LLMs in very specific functions. We can reasonably assume they picked the metrics that showed DeepSeek in the best possible light. SLMs are much more purpose-built than LLMs. They may not be able to answer open-ended questions like an LLM would. And that’s okay.
Here’s where the dominant thinking about AI is wrong. Too many tech executives and policymakers assume the evolution of AI will mimic the search engine competition of the late 1990s and early 2000s: One Google will do all and rule all, while the also-ran AltaVistas will fail.
That’s not necessarily how this will play out. All-encompassing LLMs like ChatGPT are popular with the general public today, but they require massive amounts of legally problematic data, very expensive computing power, and endless megawatts of energy. They also have a nasty tendency to hallucinate, which can be literally deadly when it happens within, say, a transcription tool used in hospitals.
In many cases the best use of AI is within purpose-built SLMs, which are trained using legal, licensed, carefully curated high-quality datasets. As the British like to say: Fit for purpose.
We already build these in America
Here’s a helpful note to quell this week’s DeepSeek panic: Many of the best SLMs have been, and are being, built here in America. In fact, Microsoft released a hugely promising but little-hyped small language model, Phi-4, just last month.
Phi-4 is said to outperform GPT-4o and Gemini 1.5 Pro in mathematics, using a fraction of those LLMs’ parameters. Microsoft gave Phi-4 little marketing support, releasing it on the cusp of the holiday season with near-zero fanfare. The tidy little high-performance model was noted only here and there in the specialty tech press.
Which brings us back to DeepSeek and its big splash. No small part of the Chinese company’s genius here lies in its timing: Releasing two high-quality AI products just as the American tech industry is reaching peak hype in its bid to create a false “AI race” that the U.S. is now supposedly losing.
Calm down. Don’t let the panicking billionaires fool you. DeepSeek’s breakthrough is an admirable advance. But it’s just that. Other companies—based in the U.S., China, the E.U., and elsewhere—will build on their discoveries. American innovation will continue.
Pivot to purpose. Embrace efficiency. Innovate.
If this were 1980, the Marc Andreessens of the world would have us ban Toyotas, subsidize the K-Car, and poison Americans with leaded fuel. That’s just foolish.
Instead, we should be giving far more resources and attention to purpose-built small language models. Stop throwing more scale and data at the problem. Quit trying to build One Big Machine To Rule Them All. Create AI models for specific industries and focused applications. Use carefully curated, legally licensed training data. Stop wasting billions of dollars trying to create one auto design to fulfill every purpose. Design sports cars. Build rugged off-road SUVs. Sell functional pickup trucks.
The answer to China's DeepSeek success isn't going to come out of a wholesale regulatory surrender to America's tech corporations. It's going to require American innovation, creativity, ingenuity, and out-of-the-box thinking. Those are things we've always been pretty good at.