6 Comments
praxis22

There is a certain sense of psyop theatre to the rush of coverage, though I doubt many people or journalists outside the industry understood the details. Mostly I think it sold well as a story because, on its face, it proved that you could do more with less. Regardless of whether you believe the numbers, it's obvious that DeepSeek has innovated in a space and place that few expected. The very nature of the Chinese tech giants seems to be low-paid drudge work and rigid structures, compared to the collegiate atmosphere spoken about at DeepSeek. On top of that, they gave away the details of how they did it in the paper they published. You don't need the code, as the real innovation was in the process, at least from my understanding of it. V3 came out to almost no fanfare in December; it was R1 that caused the ruckus. It's a decent model as these things go. I have a version distilled into Llama 3.1 70B at Q5_K_M running on a 3090; it's very verbose and a sea change compared to what came before.

https://thesequence.substack.com/p/the-sequence-opinion-489-crazy-how (paid)

Hand-coding PTX and NCCL on the GPU allowed DeepSeek to efficiently train its massive R1 model on a cluster of 2,048 H800 GPUs in just two months.

Riccardo Vocca

Thank you so much for adding this technical detail to your opinion; it is of great value! I find the ongoing contrast between the technical side of technological development and the emotions, sensations and behaviors these technologies arouse in users very interesting. What you describe is especially interesting when it comes to the atmosphere around DeepSeek. Thank you very much.

Michael Spencer

DeepSeek in China is the aha moment some of us experienced with ChatGPT. Young people using it for therapy, for example, is not uncommon. It more or less disrupted other apps there, not just globally.

I'm not sure Western people can fully understand what DeepSeek means, tbh. DeepSeek is a rallying cry for Chinese AI, open source and many other things, including China's homegrown semi independence and national strategy.

Riccardo Vocca

In fact, Michael, one of the really interesting things, as you also noted in various articles, is how its impact potentially cuts across many areas. This concerns not only the strategic and technical aspects, but also the perception of people, of users: how they feel and how they evaluate certain elements. Thank you very much for this reflection.

Michael Spencer

Since the world doesn't have a legitimate media, what happens is we are simply amplifying a lot of American-centric and Eurocentric viewpoints. For me the key point about DeepSeek is that it could mark a change in how builders and developers in China dominate the application layer via open-source accessibility.

Riccardo Vocca

Thanks for pointing this out, Michael. I think the point is really what people perceive, how the media are slowly "shaping" the perception of tools and technologies, and how this could influence their use. Looking forward to reading analyses on this!
