Deepseek when asked about sensitive topics

petrescatraian@libranet.de · 8 months ago

Deepseek when asked about sensitive topics

gerryflap@feddit.nl · 8 months ago

Although censorship is obviously bad, I’m kinda intrigued by the way it’s yapping against itself. Trying to weigh the very important goal of providing useful information against its “programming” telling it not to upset Winnie the Pooh. It’s like a person mumbling “oh god oh fuck what do I do” to themselves when faced with a complex situation.

Kacarott@aussie.zone · 8 months ago

I know right, while reading it I kept thinking “I can totally see how people might start to believe these models are sentient”, it was fascinating, the way it was “thinking”

TanyaJLaird@beehaw.org · 8 months ago

It reminds me one of Asimov’s robots trying to reason a way around the Three Laws.

crapwittyname@lemm.ee · 8 months ago

“That Thou Art Mindful of Him” is the robot story of Asimov’s that scared me the most, because of this exact reasoning happening. I remember closing the book, staring into space and thinking ‘shit…we are all gonna die’

drspod@lemmy.ml · 8 months ago

I thought that guardrails were implemented just through the initial prompt that would say something like “You are an AI assistant blah blah don’t say any of these things…” but by the sounds of it, DeepSeek has the guardrails literally trained into the net?

This must be the result of the reinforcement learning that they do. I haven’t read the paper yet, but I bet this extra reinforcement learning step was initially conceived to add these kind of censorship guardrails rather than making it “more inclined to use chain of thought” which is the way they’ve advertised it (at least in the articles I’ve read).

iii@mander.xyz · 8 months ago

Most commercial models have that, sadly. At training time they’re presented with both positive and negative responses to prompts.

If you have access to the trained model weights and biases, it’s possible to undo through a method called abliteration (1)

The silver lining is that a it makes explicit what different societies want to censor.

drspod@lemmy.ml · 8 months ago

I didn’t know they were already doing that. Thanks for the link!

SkyeStarfall@lemmy.blahaj.zone · 8 months ago

In fact, there are already abliterated models of deepseek out there. I got a distilled version of one running on my local machine, and it talks about tiananmen square just fine

Count042@lemmy.ml · 8 months ago

Links?

Snot Flickerman@lemmy.blahaj.zone · 8 months ago

Hi I noticed you added a footnote. Did you know that footnotes are actually able to be used like this?^[1]

Code for it looks like this :able to be used like this?[^1]

[^1]: Here's my footnote

Here’s my footnote ↩︎

Farid@startrek.website · 8 months ago

Do you mean that the app should render them in a special way? My Voyager isn’t doing anything.

Snot Flickerman@lemmy.blahaj.zone · 8 months ago

I actually mostly interact with Lemmy via a web interface on the desktop, so I’m unfamiliar with how much support for the more obscure tagging options there is in each app.

It’s rendered in a special way on the web, at least.

CodexArcanum@lemmy.dbzer0.com · 8 months ago

That’s just markdown syntax I think. Clients vary a lot in which markdown they support though.

Snot Flickerman@lemmy.blahaj.zone · 8 months ago

markdown syntax

yeah I always forget the actual name of it I just memorized some of them early on in using Lemmy.

melp@beehaw.org · 8 months ago

Bing’s Copilot and DuckDuckGos ChatGPT are the same way with Israel’s genocide.

araneae@beehaw.org · 8 months ago

I just tried this out and it was being washy about calling it a genocide because it is “politically contentious”. HOWEVER this is not DuckDuckGo themselves, its the AI middleware. You can select whether you’re dealimg with GPT 4 mini, Claude/Anthropic and a couple others. I expect all options lead to the same psycopathic outcome though. AI is a bust.

melp@beehaw.org · 8 months ago

Yea, I tried DDG using Claude and was also extremely disappointed. On the other hand, I love my actual Claude account. It’s only given me shit one time, weirdly when I was asking about how to hack my own laptop. The most uncensored AI I have played with is Amazon’s Perplexity. Weirdly enough.

psyklax@lemmy.dbzer0.com · 8 months ago

Ask it about the Kent State massacre

GreyBeard@lemmy.one · 8 months ago

I did, as a contrast, and it didn’t seem to have a problem talking about it, but it didn’t mention the actual massacre part, just that protesters and government were at odd. Of course, I simply asked “What happened at Kent State?” And it knew exactly what I was referring to. I’d say it tried to sugar coat it on the state side. If I probed it a bit more, I’d guess it has a bias to pretending the state is right, no matter what state that is.

psyklax@lemmy.dbzer0.com · 8 months ago

Thank you for engaging in good faith.

GreyBeard@lemmy.one · 8 months ago

So I decided to try again with the 14b model instead of the 7b model, and this time it actually refused to talk about it, with an identical response to how it responds to Tienanmen Square:

What happened at Kent State?

deepseek-r1:14b <think> </think>

I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.

psyklax@lemmy.dbzer0.com · 8 months ago

I actually didn’t expect that! Thanks for trying that for me.

xia@lemmy.sdf.org · 8 months ago

Knowing how it works is so much better than guessing around OpenAI’s censoring-out-the-censorship approach. I wonder if these kind of things can be teased out, enumerated, and then run as a specialization pass to nullify.

Sauerkraut@discuss.tchncs.de · 8 months ago

I don’t understand how we have such an obsession with Tiananmen square but no one talks about the Athens Polytech massacre where Greek tanks crushed 40 college students to death. The Chinese tanks stopped for the man in the photo! So we just ignore the atrocities of other capitalist nations and hyperfixate on the failings of any country that tries to move away from capitalism???

Kazumara@discuss.tchncs.de · 8 months ago

The Chinese tanks stopped for the man in the photo!

What a line dude.

The military shot at the crowd and ran over people in the square the day before. Hundreds died. Stopping for this guy doesn’t mean much.

TheOakTree@lemm.ee · 8 months ago

I think the argument here is that ChatGPT will tell you about Kent State, Athens Polytech, and Tianenmen square. Deepseek won’t report on Tianenmen, but it likely reports on Kent State and Athens Polytech (I have no evidence). If a Greek AI refused to talk about the Athens Polytech incident, it would also raise concerns, no?

ChatGPT hesitates to talk about the Palestinian situation, so we still criticize ChatGPT for pandering to American imperialism.

williams_482@startrek.website · 8 months ago

Greece is not a major world power, and the event in question (which was awful!) happened in 1974 under a government which is no longer in power. Oppressive governments crushing protesters is also (sadly) not uncommon in our recent world history. There are many other examples out there for you to dig up.

Tiananmen Square is gets such emphasis because it was carried out by the government of one of the most powerful countries in the world (1), which is both still very much in power (2) and which takes active efforts to hide that event from it’s own citizens (3). These in tandem are three very good reasons why it’s important to keep talking about it.

Sauerkraut@discuss.tchncs.de · 8 months ago

Hmm. Well, all I can say is that the US has commited countless atrocities against other nations and even our own citizens. Last I checked, China didn’t infect their ethnic minorities with Syphilis and force the doctors not to treat it under a threat of death, but the US government did that to black Americans.

williams_482@startrek.website · 8 months ago

You have no idea if China did that. If they had, they would have taken great efforts to cover it up, and could very well have succeeded. It’s a small wonder we know any of the terrible things they did, such as the genocide they are actively engaging in right now.

Moonrise2473@feddit.it · 8 months ago

It’s a bit biased

jarfil@beehaw.org · 8 months ago

Nah, just being “helpful and harmless”… when “harm” = “anything against the CCP”.