FGS it's nuts!
Now, to some extent, used correctly, the precise form of the questions doesn't matter, because those tests would have been intended for use as comparators. It's a standardised test and you would compare two individuals' responses on the same test.
If the test has a "well, everyone female would say yes to that" question, it doesn't really matter so much, if used as intended. It's the different responses that matter.
But to actually compare scores on the two different tests is indeed nuts.
It would already be nuts if you concluded "girls have more or less gender dysphoria than boys" based on comparing male and female population scores on two completely different tests. The two tests are not comparable, and clearly were never intended to be.
To then top that misuse by comparing the results of one person on the correct-sex test with their results on the incorrect-sex test is double-triple super nuts.
IIRC, the Dutch researchers muttered something in the Wider Lens podcast about them using this tool because it's the standard tool and it's all they had.
But this is Jack Monroe opening-a-tin-with-a-knife-and-a-mallet level of incompetence.
Now, they should have known better, because these weren't just "dysphoria" measures in general, they were specifically to assess people for treatment. In this source, you can see them marked as "female-to-male version" and "male-to-female version". They're specifically assessing people for treatment for "transition" from their actual sex.
It appears they forgot somewhere along the way that they were "-to-female" and "-to-male" tests, and thought they could be used as a general "dysphoria" measure, so the "transman" ends up doing the "male-to-female" test.
(The article above references a replacement scale that gets rid of the test's sex dependence, to be "nonbinary-inclusive". That one probably would be comparable pre- and post-"transition", as it locks down the actual and desired sex.)