Most likely there is a separate censor LLM watching the model's output. When it detects something that needs to be censored, it wipes the output and stops further generation. That's why you can briefly see the answer at first: the censor model is still “thinking.”
When you download the model and run it locally, it has no such censorship layer.
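To make the guess above concrete, here is a minimal Python sketch of how a lagging censor could produce that "answer appears, then vanishes" behavior. This is purely illustrative and not how any particular service is known to work: `stream_with_censor`, `render`, `is_flagged`, and `check_every` are all hypothetical names, and the "censor" is a plain callback standing in for a second model.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Event = Tuple[str, str]  # (kind, text), where kind is "token" or "retract"


def stream_with_censor(
    tokens: Iterable[str],
    is_flagged: Callable[[str], bool],
    check_every: int = 3,
) -> Iterator[Event]:
    """Stream tokens to the client while a slower censor reviews them.

    `check_every` models the censor's lag (an assumption): it only looks
    at the accumulated text every N tokens, so text is shown before it
    has been reviewed.
    """
    shown: List[str] = []
    for i, tok in enumerate(tokens, start=1):
        shown.append(tok)
        yield ("token", tok)  # the user sees this immediately
        if i % check_every == 0 and is_flagged("".join(shown)):
            yield ("retract", "")  # tell the client to wipe the answer
            return  # stop further generation
    # Final review once the stream has ended.
    if is_flagged("".join(shown)):
        yield ("retract", "")


def render(events: Iterable[Event]) -> str:
    """Return what the user is left looking at after all events."""
    visible: List[str] = []
    for kind, text in events:
        if kind == "retract":
            visible.clear()
            break
        visible.append(text)
    return "".join(visible)
```

In this sketch, running the model locally corresponds to consuming the token stream directly, with no `is_flagged` step between the model and the reader.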
Technically it’s slightly different from fracking as used in the oil and gas industry: it doesn’t create new fractures in the rock, it only widens existing ones. However, it carries basically the same risks, differing at most in magnitude.
There’s an interesting case in Switzerland where they tried to drill one over a historically active fault line without first doing a seismic risk assessment.