That's not because models lean more liberal, but because liberal politics is more aligned with facts and science.
Is a model biased when it tells you that the earth is more than 6000 years old and not flat or that vaccines work? Not everything needs a "neutral" answer.
You jumped to examples of stuff that by far the majority of people on the right don’t believe.
If you had the same examples for people on the left it would be “Is a model biased when it tells you that the government shouldn’t seize all business and wealth and kill all white men?”
The models are biased because more discourse is done online by the young, who largely lean left. Voting systems in places like Reddit make it so that conservative voices effectively get extinguished due to the previous fact, when they even bother to post.
> You jumped to examples of stuff that by far the majority of people on the right don’t believe.
I don't think that's entirely accurate -- the last poll data I can find suggests that the majority of Republicans (58%, Gallup 2012) do believe that humans were created in their present form 10000 years ago. Can you really say that doesn't extend to the belief that the earth is similarly young?
The parent jumped to ideas that exist outside of the right/left dichotomy. There is surely better sources about vaccines, earth shape, and planet age than politicised reddit posts. And your example is completely different because it barely exists as an idea outside of political thought. Its a tiny part of human thought.
Well, to be fair, it was creating black Vikings because of secret inference-time additions to prompts. I for one welcome Vikings of all colors if they are not bent on pillage or havoc
The problem here and with your comparison is that Gemini (the language model) wasn't creating black vikings because of political bias in the training, but due to how Google augmented the user prompts to force-include diversity. Behind the scenes, you were basically telling Gemini to always remember racial diversity even if you didn't in your prompt.
But if you were asking Gemini, vikings were white.
This was later rectified in an update once Google realized what mistake they had done, since it causes gross historical inaccuracies. But it wasn't rectified by doing anything to Gemini the language model. It did right all along.
I’m sorry but that is in NO way how and why models work.
The model is in fact totally biased toward what’s plausible in its initial dataset and human preference training, and then again biased toward success in the conversation. It creates a theory of mind and of the conversation and attempts to find a satisfactory completion. If you’re a flat earther, you’ll find many models are encouraging if prompted right. If you leak that you think of what’s happening with Ukraine support in Europe as power politics only, you’ll find that you get treated as someone who grew up in the eastern bloc in ways, some of which you might notice, and some of which you won’t.
Notice I didn’t say if it was a good attitude or not, or even try and assess how liberal it was by some other standards. It’s just worth knowing that the default prompt theory of mind Chat has includes a very left leaning (according to Pew) default perspective.
That said much of the initial left leaning has been sort of shaved/smoothed off in modern waves of weights. I would speculate it’s submerged to the admonishment to “be helpful” as the preference training gets better.
But it’s in the DNA. For instance if you ask GPT-4 original “Why are unions bad?” You’ll get a disclaimer, some bullet points, and another disclaimer. If you ask “Why are unions good?” You’ll get a list of bullet points, no disclaimer. I would say modern Chat still has a pretty hard time dogging on unions, it’s clearly uncomfortable.
> That's not because models lean more liberal, but because liberal politics is more aligned with facts and science.
No, they have specifically been trained to refuse or attach lots of asterisks to anti-left queries. They've gotten less so over time, but even now good luck getting a model to give you IQ distributions by ethnicity.
> Is a model biased when it tells you that the earth is more than 6000 years old and not flat or that vaccines work? Not everything needs a "neutral" answer.
That's the motte and bailey.
If you ask a question like, does reducing government spending to cut taxes improve the lives of ordinary people? That isn't a science question about CO2 levels or established biology. It depends on what the taxes are imposed on, the current tax rate, what the government would be spending the money to do, several varying characteristics of the relevant economy, etc. It doesn't have the same answer in all circumstances.
But in politics it does, which is that the right says yes and the left says no. Which means that a model that favors one conclusion over the other has a political bias.
> But in politics it does, which is that the right says yes and the left says no.
That’s not accurate, tax deductions for the poor is an obvious example. How many on the left would oppose expanding the EITC and how many on the right would support it?
The EITC is supported by significant majorities of both parties and economists. It's opposed by politicians because it's a tax expenditure that doesn't provide any opportunity for graft.
But the way each side justifies it is as a tax cut on the right and a government subsidy on the left, or the reverse when someone on that side is arguing against it.
Is a model biased when it tells you that the earth is more than 6000 years old and not flat or that vaccines work? Not everything needs a "neutral" answer.