DeepSeek – an assessment

The semester is over, the lectures are over and yet the world keeps turning. And fast. So here’s a brief introduction…

What has happened is probably well known by now: DeepSeek (China) is catching up with the big language models at a fraction of the cost. Share prices plummet, thousands of commentaries, a “new era”, new hype, and so on…

But how should all this be assessed?

1. DeepSeek has shown that AI does not have to be a pure battle of raw compute. Thanks to clever load distribution in the model (a mixture-of-experts architecture), only the experts relevant to the current input are activated while the rest stay idle, which leads to much lower resource consumption with comparable results (see the toy sketch after this list).
2. DeepSeek is open source (well, sort of, anyway). Anyone can download the model, but you get no (!) information about the training data, etc. It is therefore more accurate to speak of “open weights”.
3. If you want to install DeepSeek on “your own hardware”, you should still pay attention to the system requirements… For the large V3 model with 671B parameters, a cluster of 16 NVIDIA H100 cards with 80 GB of memory each (around €30,000 per card) is recommended; a back-of-the-envelope calculation of why you need that much memory follows after this list. The online version of DeepSeek is currently still free, but will probably switch to a subscription model soon.
4. Training data and censorship play a major role. DeepSeek, for example, simply does not provide information on certain topics. DeepSeek is also a good example of how models are in fact “colored” and shape the values of their users through slight (or major) misinformation or omission. (To be honest, the other large language models are no different; we just notice it less because we share their cultural background and because it is hidden a little more cleverly.)
5. In my opinion, a detailed feature comparison is not worthwhile yet; things are still moving too fast. What you can already see, however, is that DeepSeek keeps up with the big language models.
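
To illustrate point 1: the mixture-of-experts idea is that a small router sends each token to only a few of many expert sub-networks, so most of the model’s parameters sit idle on any given forward pass. Here is a minimal, self-contained toy sketch in PyTorch; all sizes and names are made up for illustration, and this is of course not DeepSeek’s actual implementation.

```python
# Toy mixture-of-experts (MoE) layer: only the top-k experts per token run.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoE(nn.Module):
    def __init__(self, dim=64, num_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # The router scores each token against every expert.
        self.router = nn.Linear(dim, num_experts)
        # Each expert is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x):                                # x: (tokens, dim)
        scores = self.router(x)                          # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # pick the k best experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts are evaluated; all others stay idle.
        for k in range(self.top_k):
            for e in range(len(self.experts)):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * self.experts[e](x[mask])
        return out

x = torch.randn(10, 64)      # 10 tokens
print(ToyMoE()(x).shape)     # torch.Size([10, 64])
```

With 8 experts and top-2 routing, roughly three quarters of the expert parameters are untouched for any given token – that is the whole trick behind “comparable results with less compute”.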
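And to illustrate point 3: a rough, weights-only estimate shows why the recommendation is a 16-GPU cluster. I am assuming 2 bytes per parameter for FP16 and 1 byte for FP8 (DeepSeek-V3 reportedly uses FP8); KV cache, activations, and framework overhead are deliberately ignored, so in practice you need extra headroom on top.

```python
# Back-of-the-envelope check: do the weights of a 671B-parameter model
# fit on 16 x 80 GB GPUs? Weights only – real deployments need more.
PARAMS = 671e9
GPUS, GB_PER_GPU = 16, 80

for name, bytes_per_param in [("FP16", 2), ("FP8", 1)]:
    need_gb = PARAMS * bytes_per_param / 1e9
    have_gb = GPUS * GB_PER_GPU
    print(f"{name}: need ~{need_gb:.0f} GB, have {have_gb} GB "
          f"-> {'fits' if need_gb < have_gb else 'does not fit'}")

# FP16: need ~1342 GB, have 1280 GB -> does not fit
# FP8:  need ~671 GB,  have 1280 GB -> fits
```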

DeepSeek is a “game changer” in the sense that someone has finally shown that a large language model can be built with significantly less hardware.

President Trump has just announced $500 billion in AI investments, and NVIDIA is probably delighted – DeepSeek has cast a light-to-medium shadow over all of that. In my opinion, however, this is not dramatic: AI remains incredibly computationally intensive, and all major providers are trying to optimize their models, algorithms, and architectures anyway. This time the Chinese were faster – so what.

DeepSeek is exciting because – similar to Meta’s Llama models – it can be installed on your own hardware and put to meaningful use. This could be an interesting alternative for many companies. However, there is still little practical experience with Chinese models… A minimal local-inference sketch follows below.
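
For anyone who wants to try this themselves, here is a minimal sketch using the Hugging Face transformers library with one of the small distilled DeepSeek checkpoints. The checkpoint name is an assumption on my part – check availability and license yourself, and note that the full 671B model will not fit on ordinary hardware.

```python
# Minimal local-inference sketch (requires: pip install torch transformers accelerate).
# The checkpoint below is one of the small distilled R1 variants; adjust to your GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Why is the sky blue?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```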