MSOCS - The high level of classification is part of the problem from both sides. The skeptics say that it prevents open discussion (and indeed prevents any discussion at lower classification levels). The proponents complain that the skeptics are ill-informed (but insist that they be kept that way).
What we do all know is that the effectiveness of RCS reduction and LPI-LPD are parameters that have a colossal influence on force-on-force simulations. Turn those up to 11 and, yes, you'll be 400 or 600 per cent better in air-to-air than a Su-35 and eight times more effective than most things at ground attack.
Dial them back to 5 (for instance, assume some VHF detect/track and that you won't be able to achieve a high-Pk AMRAAM launch in conditions of total surprise) and the results will be different. If detection is mutual before weapon launch, then the old-skool metrics return to play.
With the same more modest assumptions, and dealing with an IADS (Busdriver), the comparison to a "legacy jet" (actually a modern aircraft using a combo of modest RCS reduction, high-end EW, SEAD/DEAD and standoff) is also different.
As for F-16 comparisons - it might be worth remembering that the F-16's IOC was around 40 years before the F-35's, and the Hurricane's was about 40 years before the F-16's. So even if you get your numbers half right (which those cited above are not), the comparison has little validity.