Are you implying it "ain't" ground truth because it's not perfect? Ground truth is simply a term used in machine learning to denote a dataset's labels. A quote extracted from the link that you sent acknowledges that ground truth may not be perfect: "inaccuracies in the ground truth will correlate to inaccuracies in the resulting spam/non-spam verdicts".
What they have is not ground truth, it's bad data. Why is it bad data? Because any model that uses it, or any metric based on it, will be worse. That's in opposition to the definition and purpose of ground truth data: it's not supposed to make things worse.
You're both right. Perfection isn't possible or practical. But their "ground truth" (in that example) is obviously shite, and nobody should be using it for training or for any sort of metric, since it will make both worse. You're also right that you can name a dataset "ground truth", but names don't mean much when they're in opposition to the intent.
Tell me with a straight face that the car labeling is okay. It’s clearly been made by a dodgy automated system, with no human confirmation of correctness. That ain’t ground truth.
You're conflating "truthiness" with "correctness". I realize this sounds like an oxymoron when talking about something called ground "truth", but when we're building ground truth to measure how good our model outputs are, it does not matter what is "true", only what is "correct".
Our ground truth should reflect the "correct" output expected of the model with regard to its training. So while in many cases "truth" and "correct" should align, there are many, many cases where "truth" is subjective, and so we must settle for "correct".
Case in point: we've trained a model to parse addresses out of a wide array of forms. Here is an example address as it would appear on the form:
Address: J Smith 123 Example St
City: LA State: CA Zip: 85001
Our ground truth says it should be rendered as such:
Address Line 1: J Smith
Address Line 2: 123 Example St
City: LA
State: CA
ZipCode: 85001
However, our model outputs it thusly:
Address Line 1: J Smith 123 Example St
Address Line 2:
City: LA
State: CA
ZipCode: 85001
That may be true, as there is only one address line and we have a field for "Address Line 1", but it is not correct. Sure, there may be a problem with our taxonomy, training data, or any number of other things, but as far as ground truth goes, it is not correct.
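To make that concrete, here's a rough sketch in Python of how that example would be scored; the dict keys are just my own stand-ins for the field names above, and exact string match stands in for whatever comparison the real evaluation uses:

    # "Correct" here means matching the ground truth field by field,
    # not matching some independent notion of truth.
    ground_truth = {
        "address_line_1": "J Smith",
        "address_line_2": "123 Example St",
        "city": "LA",
        "state": "CA",
        "zip_code": "85001",
    }
    model_output = {
        "address_line_1": "J Smith 123 Example St",
        "address_line_2": "",
        "city": "LA",
        "state": "CA",
        "zip_code": "85001",
    }
    # A field scores as correct only if it matches the ground truth exactly.
    field_correct = {k: model_output.get(k, "") == v for k, v in ground_truth.items()}
    print(field_correct)                                      # both address lines come out False
    print(sum(field_correct.values()) / len(field_correct))   # 0.6 field accuracy

The model's rendering might arguably be "true", but against that ground truth it only scores 3 of the 5 fields.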
I'm trying to help you understand what "ground truth" means.
If, as it seems in the article, they are using COCO to establish ground truth, i.e. what COCO says is correct, then whatever COCO comes up with is, by definition, "correct". It is, in effect, the answer, the measuring stick, the scoring card. Now what you're hinting at is that, in this instance, that's a really bad way to establish ground truth. I agree. But that doesn't change what ground truth is and how we use it.
Think of it another way:
- Your job is to pass a test.
- To pass a test you must answer a question correctly.
- The answer to that question has already been written down somewhere.
To pass the test does your answer need to be true, or does it need to match what is already written down?
When we do model evaluation, the answer needs to match what is already written down.
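If you want it in code, a toy version of that scoring looks like the sketch below; the question IDs and labels are invented, and exact match stands in for whatever comparison a real benchmark uses:

    # The answer key is "correct" by definition, whether or not it is true.
    def score(predictions, answer_key):
        """Fraction of answers that match what is already written down."""
        matches = sum(predictions.get(qid) == ans for qid, ans in answer_key.items())
        return matches / len(answer_key)

    answer_key = {"q1": "car", "q2": "truck", "q3": "car"}   # what is written down
    predictions = {"q1": "car", "q2": "car", "q3": "car"}    # what the model answered

    print(score(predictions, answer_key))   # ~0.67: the model loses q2 even if q2 really is a car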
So, it sounds like you’re saying that the ML field has hijacked the well-defined and well-understood term “ground truth” to mean something that should be similar, but which is fundamentally unrelated, and in cases like this is in no way similar. Even what it is to be “correct” is damaged.
I am willing to accept that this is how they are using the terms; but it distresses me. They should choose appropriate terms rather than misappropriating existing terms.
(Your address example I still don’t get, because I expect your model to do some massaging to match custom, so I wouldn’t consider an Address Line 1 of “J Smith 123 Example St” with empty Address Line 2 to be true or correct.)
MS-COCO labels were all created by humans. But it is common to have 2-3% mislabeled examples, especially because they use cheap overseas labor to label images.
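A 2-3% error rate also puts a ceiling on what you can measure against the dataset. Here is a rough back-of-the-envelope sketch, assuming label errors and model errors are independent and ignoring the small chance that a wrong prediction coincidentally matches a wrong label:

    # Roughly, a model only gets credit when it is right AND the label is right.
    def measured_accuracy(true_accuracy, label_error_rate):
        return true_accuracy * (1 - label_error_rate)

    for err in (0.02, 0.03):
        print(err, measured_accuracy(1.00, err))   # a perfect model measures 0.98 / 0.97
        print(err, measured_accuracy(0.95, err))   # a 95% model measures 0.931 / 0.9215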
I refuse to believe that an unaided human labelled that parking lot. It’s just not plausible. The part where the entire lower half of the image, the entire parking lot, is labelled “car”, sure. That’s the right sort of stupid for humans. But the 13 actual cars and 2 trucks? No way.
> Sometimes Gemini is better than the ground truth
That ain’t ground truth, that’s just what MS-COCO has.
See also https://en.wikipedia.org/wiki/Ground_truth.