Pretty rich coming from Proton, who shoved an LLM into their mail client mere months ago.
wait, what? How did I miss that? I use protonmail, and I didn't see anything about an LLM in the mail client. Nor have I noticed it when I check my mail. Where/how do I find and disable that shit?
Thank you. I've saved the link and will be disabling it next time I log in. Can't fucking escape this AI/LLM bullshit anywhere.
The combination of AI, a crypto wallet and the CEO's pro-MAGA comments (all within six months or so!) is why I quit Proton. They've completely lost the plot. I just want a reliable email service and file storage.
I'm considering leaving proton too. The two things I really care about are simplelogin and the VPN with port forwarding. As far as I understand it, proton is about the last VPN option you can trust with port forwarding
Happily using AirVPN for port forwarding.
I'm strongly considering switching to them! How do you like it?
The interface - GUI and website - is straight out of 2008 and documentation could be better, but otherwise it works just fine for torrenting and browsing. No complaints there.
DeepSeek is open source, meaning you can modify code on your own app to create an independent — and more secure — version. This has led some to hope that a more privacy-friendly version of DeepSeek could be developed. However, using DeepSeek in its current form — as it exists today, hosted in China — comes with serious risks for anyone concerned about their most sensitive, private information.
Any model trained or operated on DeepSeek’s servers is still subject to Chinese data laws, meaning that the Chinese government can demand access at any time.
What???? Whoever wrote this sounds like he has 0 understanding of how it works. There is no "more privacy-friendly version" that could be developed, the models are already out and you can run the entire model 100% locally. That's as privacy-friendly as it gets.
"Any model trained or operated on DeepSeek's servers are still subject to Chinese data laws"
Operated, yes. Trained, no. The model is MIT licensed, China has nothing on you when you run it yourself. I expect better from a company whose whole business is on privacy.
To be fair, most people can't actually self-host Deepseek, but there already are other providers offering API access to it.
What???? Whoever wrote this sounds like he has 0 understanding of how it works. There is no "more privacy-friendly version" that could be developed, the models are already out and you can run the entire model 100% locally. That's as privacy-friendly as it gets.
Unfortunately it is you who has 0 understanding of it. Read my comment below. Tldr: good luck having the hardware
I understand it well. It's still relevant to mention that you can run the distilled models on consumer hardware if you really care about privacy. 8GB+ VRAM isn't crazy, especially if you have a ton of unified memory on macbooks or some Windows laptops releasing this year that have 64+GB unified memory. There are also websites re-hosting various versions of Deepseek like Huggingface hosting the 32B model which is good enough for most people.
Instead, the article is written like there is literally no way to use Deepseek privately, which is literally wrong.
Is it Open Source? I cannot find the source code. The official repository https://github.com/deepseek-ai/DeepSeek-R1 only contains images, a PDF file, and links to download the model. But I don't see any code. What exactly is Open Source here?
I don't see the source either. Fair cop.
Thanks for confirmation. I made a top level comment too, because this important information gets lost in the comment hierarchy here.
How apt, just yesterday I put together an evidenced summary of the CEO's recent absurd comments. Why are Proton so keen to throw away so much good will people had invested in them?!
This is what the CEO posting as u/Proton_Team stated in a response on r/ProtonMail:
Here is our official response, also available on the Mastodon post in the screenshot:
Corporate capture of Dems is real. In 2022, we campaigned extensively in the US for anti-trust legislation.
Two bills were ready, with bipartisan support. Chuck Schumer (who coincidently has two daughters working as big tech lobbyists) refused to bring the bills for a vote.
At a 2024 event covering antitrust remedies, out of all the invited senators, just a single one showed up - JD Vance.
By working on the front lines of many policy issues, we have seen the shift between Dems and Republicans over the past decade first hand.
Dems had a choice between the progressive wing (Bernie Sanders, etc), versus corporate Dems, but in the end money won and constituents lost.
Until corporate Dems are thrown out, the reality is that Republicans remain more likely to tackle Big Tech abuses.
Source: https://archive.ph/quYyb
To call out the important bits:
- He refers to it as the "official response"
- Indicates that JD Vance is on their side just because he attended an event that other invited senators didn't
- Rattles on about "corporate Dems" with incredible bias
- States "Republicans remain more likely to tackle Big Tech abuses" which is immediately refuted by every response
That was posted in the r/ProtonMail sub where the majority of the event took place: https://old.reddit.com/r/ProtonMail/comments/1i1zjgn/so_that_happened/m7ahrlm/
However be aware that the CEO posting as u/Proton_Team kept editing his comments so I wouldn't trust the current state of it. Plus the proton team/subreddit mods deleted a ton of discussion they didn't like. Therefore this archive link captured the day after might show more but not all: https://web.archive.org/web/20250116060727/https://old.reddit.com/r/ProtonMail/comments/1i1zjgn/so_that_happened/m7ahrlm/
Some statements were made on Mastodon but were subsequently deleted; they're captured by an archive link: https://web.archive.org/web/20250115165213/https://mastodon.social/@protonprivacy/113833073219145503
I learned about it from an r/privacy thread but true to their reputation the mods there also went on a deletion spree and removed the entire post: https://www.reddit.com/r/privacy/comments/1i210jg/protonmail_supporting_the_party_that_killed/
This archive link might show more but I've not checked: https://web.archive.org/web/20250115193443/https://old.reddit.com/r/privacy/comments/1i210jg/protonmail_supporting_the_party_that_killed/
There's also this lemmy discussion from the day after but by that point the Proton team had fully kicked in their censorship so I don't know how much people were aware of (apologies I don't know how to make a generic lemmy link) https://feddit.uk/post/22741653
Until corporate Dems are thrown out, the reality is that Republicans remain more likely to tackle Big Tech abuses.
What a fucking dumbass. Yes, dems suck. But at least Lina Khan was head of the FTC and starting to change how antitrust laws are enforced. Did he delete this post after Trump was inaugurated with 3 of the richest tech billionaires?
People got flak for saying Proton is the CIA, Proton is NSA, Proton is a joint five-eyes country intelligence operation despite the convenient timing of their formation and lots of other things.
Maybe they're not, maybe their CEO is just acting this way.
But consider for a moment if they were. IF they were then all of this would make more sense. The CIA/NSA/etc have a vested interest in discrediting and attacking Chinese technology they have no ability to spy or gather data through. The CIA/NSA could also for example see a point to throwing in publicly with Trump as part of a larger agreed upon push with the tech companies towards reactionary politics, towards what many call fascism or fascism-ish.
My mind is not made up. It's kind of unknowable. I think they're suspicious enough to be wary of trusting them but there's no smoking gun, yet there wasn't a smoking gun that CryptoAG was a CIA cut-out until some unauthorized leaks nearly a half century after they gained control and use of it. We know they have an interest in subverting encryption, in going fishing among "interesting" targets who might seek to use privacy-conscious services and among dissidents outside the west they may wish to vet and recruit.
True privacy advocates should not be throwing in with the agenda of any regime or bloc, especially those who so trample human and privacy rights as that of the US and co. They should be roundly suspicious of all power.
OpenAI, Google, and Meta, for example, can push back against most excessive government demands.
Sure they "can" but do they?
“Pushing back against the government” doesn’t even make sense. These people are oligarchs. They largely are the government. Who attended Trump’s inauguration? Who hosted Trump’s inauguration party? These US tech oligarchs.
Why do that when you can just score a deal with the government to give them whatever information they want for sweet perks like foreign competitors getting banned?
They cannot. When big daddy FBI knocks on the door and you get that forced NDA, you will build in backdoors and comply with anything the US government tells you.
Even then the US might want you to shut down because they want to control your company.
TikTok.
1978 US Automotive Companies: If we make a product that locks our customers in, they'll be our customers forever!
1978 Japanese Automotive Companies: The US gave us their required parameters. If we make a product that works then customers will keep buying our stuff.
2025 US Tech Companies: If we make our products contingent on proprietary software and hardware, we'll lock them in.
2025 Chinese Tech Companies: The US gave us their required parameters. If we make a product that works and they can utilize freely, they'll keep buying our stuff.
Not our first rodeo.
How is this Open Source? The official repository https://github.com/deepseek-ai/DeepSeek-R1 contains images only, a PDF file, and links to download the model. I don't see any code. What exactly is Open Source here? And if so, where to get the source code?
Open-source AI models are usually posted to HuggingFace instead of GitHub: https://huggingface.co/deepseek-ai/DeepSeek-R1
In deep learning generally open source doesn't include actual training or inference code. Rather it means they publish the model weights and parameters (necessary to run it locally/on your own hardware) and publish academic papers explaining how the model was trained. I'm sure Stallman disagrees but from the standpoint of deep learning research DeepSeek definitely qualifies as an "open source model"
Just because they call it Open Source does not make it so. DeepSeek is not Open Source; it only provides model weights and parameters, not any source code or training data. I still don't know what's in the model and we only get "binary" data, not any source code. This is not Libre software.
There is a nice (even if by now already a bit outdated) analysis about the openness of different "open source" generative AI projects in the following article: Liesenfeld, Andreas, and Mark Dingemanse. "Rethinking open source generative AI: open washing and the EU AI Act." The 2024 ACM Conference on Fairness, Accountability, and Transparency. 2024.
So "Open Source" to AI is just releasing a .psd file used to export a jpeg, and you need some other proprietary software like Photoshop in order to use it.
What other proprietary software is necessary to use model weights?
I'm not an expert at criticism, but I think it's fair on their part.
I mean, can you remind me what are the hardware requirements to run deepseek locally?
oh, you need a high-end graphics card with at least 8 GB VRAM for that*? for the highly distilled variants! for more complete ones you need multiple such graphics cards interconnected! how do you even do that with more than 2 cards on a consumer motherboard??
how many do you think have access to such a system, I mean even 1 high-end gpu with just 8 GB VRAM, considering that more and more people only have a smartphone nowadays, but also that these are very expensive even for gamers?
and as you will read in the 2nd referenced article below, memory size is not the only factor: the distill requiring only 1 GB VRAM still requires a high-end gpu for the model to be usable.
so my point is that when talking about deepseek, you can't ignore how they operate their online service, as most people will only be able to try that.
I understand that recently it's very trendy, and cool to shit on Proton, but they have a very strong point here.
Just because the average consumer doesn’t have the hardware to use it in a private manner does not mean it’s not achievable. The article straight up pretends self hosting doesn’t exist.
The 1.5B version can be run on basically anything. My friend runs it on his shitty laptop with a 512MB iGPU and 8GB of RAM (inference takes 30 seconds)
You don't even need a GPU with good VRAM, as you can offload it to RAM (slower inference, though)
I've run the 14B version on my AMD 6700XT GPU and it only takes ~9GB of VRAM (inference over 1k tokens takes 20 seconds). The 8B version takes around 5-6GB of VRAM (inference over 1k tokens takes 5 seconds)
The numbers in your second link are waaaaaay off.
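For a rough sanity check on VRAM figures like the ones above, here is a back-of-the-envelope sketch in Python. It assumes the weights dominate memory use and that the models are quantized to about 4 bits per weight; that quantization level is an assumption, not something stated in this thread, and the function name is made up for illustration. It ignores the KV cache and runtime overhead, which add more on top.

```python
def estimate_weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GB (10^9 bytes).

    Hypothetical helper: billions of parameters times bits per weight,
    divided by 8 bits per byte. Does not model KV cache or runtime overhead.
    """
    return params_billions * bits_per_weight / 8


# Distilled model sizes mentioned in this thread (parameter counts in billions).
for label, size in [("1.5B", 1.5), ("8B", 8), ("14B", 14), ("32B", 32)]:
    four_bit = estimate_weight_memory_gb(size, 4)    # assumed 4-bit quantization
    sixteen_bit = estimate_weight_memory_gb(size, 16)  # unquantized 16-bit weights
    print(f"{label}: ~{four_bit:.1f} GB at 4-bit, ~{sixteen_bit:.1f} GB at 16-bit")
```

Under that 4-bit assumption the 14B weights come to about 7 GB and the 8B weights to about 4 GB, which is in the same ballpark as the ~9GB and 5-6GB reported above once overhead is added.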
this is obviously talking about their web app, which most people will be using. In this special instance, it was clearly not the LLM itself censoring the Tiananmen Square, but a layer on top.
i have not bothered downloading and asking deepseek about Tiananmen Square. so i cannot know what the model would have generated. however, it is possible that certain biases are trained into any model.
i am pretty sure, this blog is aimed at the average user. while i wouldn't trust any LLM company with my data, i certainly wouldn't want the chinese government to have them. anyone that knows how to use ollama (https://github.com/ollama/ollama) should know these telemetry data don't apply to running locally. but for sure, pointing it out in the blog would help.
@ToxicWaste @JOMusic the censorship is trained into the ollama models too. But of course the self-hosted model cannot send anything to China, so at least the whole tracking issue is avoided.
Jesus fuckin Christ, just marry Trump at this point, Mister proton CEO.
There are many llms you can use offline
Including DeepSeek: https://huggingface.co/deepseek-ai
Deepseek works reasonably well, even at cpu only in ollama. I ran the 7b and 1.5b models and it wasn't awful. 7b slowed down as the convo went on, but the 1.5b model felt pretty passable while I was playing with it
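For anyone who wants to see what "running it locally" looks like in practice, here is a minimal Python sketch that talks to an Ollama server on your own machine. It assumes Ollama is installed and listening on its default local port, and that a distilled model has already been pulled under the tag `deepseek-r1:1.5b`; the tag and both function names are assumptions for illustration. The point is that the request only ever goes to localhost.

```python
import json
import urllib.request

# Default address of a locally running Ollama server (assumed setup).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "deepseek-r1:1.5b") -> urllib.request.Request:
    """Build a POST request for the local Ollama generate endpoint.

    The model tag is an assumption; use whatever tag you pulled.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt: str, model: str = "deepseek-r1:1.5b") -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

With the server running, `ask_local_model("Summarize the MIT license in one sentence.")` should return the model's reply as a string. On CPU only, expect it to be slow with the larger distills, as described above.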
I want to preface this question by saying that I'm not trolling and I'm not defending Proton. I'm genuinely confused at the reaction to this article.
I'm also upset with Proton's recent comments, specifically the December tweet and subsequent responses, and I'm evaluating my use of Proton.
Near as I can tell, this article (which I did read) lays out the facts about Deepseek as an LLM originating in China and the implications of that.
Why is this article a reason to pile on proton?
It might be that they're equating the name with the app and company, not the open source model, based on one of the first lines:
AI chat apps like ChatGPT collect user data, filter responses, and make content moderation decisions that are not always transparent.
Emphasis mine. The rest of the article reads the same way.
Most people aren't privacy-conscious enough to care who gets what data and who's building the binaries and web apps, so sounding the alarm is appropriate for people who barely know the difference between AI and AGI.
I get that people are mad at Proton right now (anyone have a link? I'm behind on the recent stuff), but we should ensure we get mad at things that are real, not invent imaginary ones based on contrived contexts.
Here is a general write up about the CEO showing their maga colors.
More happened in the reddit thread though that added some more elements, like the ceo opting for a new user name with "88" in it (a common right wing reference), his unprompted use of the phrase "didnt mean to trigger you," him evasively refusing to clarify what his stance actually was because "that would be more politics," on and on. You can read through that thread here, although proton corporate are mods, so i have no idea what they may have deleted at this point.
The thread was full of "mask on" behavior that is pretty transparent to anyone experienced with the alt right on the internet.
it is certainly that. but recently it's become very trendy to hate Proton, so it's just easier to do that instead of thinking. I'm really disappointed in this community
Guys I know OpenAI is not clear, it's as bad as DeepSeek and even worse, BUT you have to realize that most people don't give a fuck about running DeepSeek locally, they just download the DeepSeek app and use it, which is more privacy intrusive even than ClosedAI. Giving information to China when you live in the West is like giving Russians information when you live in Ukraine. We are in a constant war with China, because we are democratic and they are communist, and we cannot just give them our data for free, therefore I have to admit PROTON IS RIGHT about deepseek being "deepsneak"
As a queer person I don't really care at this point if China or Russia is tracking me. They aren't the ones who are currently stripping me and others of rights and so many other things.
I don't trust any governments on this front, but the government I live under is way more of a concern.
Is China in the room with you right now?
Propaganda got ya good bud. Sure. It's important but Jesus. Lol. ChatGPT does the same shit but doesn't let me run it locally. Fuck ChatGPT