The procedure for all assessment is supposed to be that results are analysed with reference to function, e.g. reports from people around the child, observation in settings, etc.
Mostly, the tests are not that bad, to be honest. There is a wide range of subtests on the CELF-4, for example, that are linked to research on which domains are most impaired in different disorders. It's a much more involved test than the Preschool CELF, actually, which really suffers from the lack of an extended speaking task (though this is not particularly fantastic on the CELF-4 either). There are adjusted norms for ASD and HI (you might want to check that the relevant stats were used).
In terms of the CELF-4 not picking up on conversational difficulties, there is no real reason it would. It is a language assessment; its purpose is not to analyse conversation. For a great many children, structural language will be impaired alongside conversation, but in spectrum disorders this is not always the case (and isn't necessary for a spectrum diagnosis etc.). If the CELF-4 assessment protocol was followed as per the manual, and combined with a full observation, consultation with other professionals and additional testing using other measures where appropriate, there shouldn't be an issue.
With reference to HI, it would make a lot of sense if there was a gap between performance on an assessment in a 1:1 with structure/visual support and "real life". I'm not a specialist in this area, but you might want to discuss pursuing assessment for auditory processing difficulties/disorder if results are "normal" in a quiet/structured environment. On the other hand, the score on the RAPT is not that great if done at the age of 5! Don't accept age equivalents, only standard scores. A score of 5/6 on the subtest you mentioned is below the lower end of normal (range 7-12), so it does indicate difficulties.
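If it helps to see why I push for standard scores, here is a rough sketch of the arithmetic. It assumes the usual convention that subtest scaled scores are normed to a mean of 10 with a standard deviation of 3; that is my assumption here, so check the manual for the test actually used. The function name is just made up for illustration.

```python
from statistics import NormalDist

# Assumption: subtest scaled scores follow a normal distribution
# with mean 10 and SD 3. Check the test manual before relying on this.
SCALED_SCORES = NormalDist(mu=10, sigma=3)

def scaled_to_percentile(score: float) -> float:
    """Approximate percentile rank for a subtest scaled score (hypothetical helper)."""
    return round(SCALED_SCORES.cdf(score) * 100, 1)

# Compare the scores discussed above with the average score of 10.
for score in (5, 6, 7, 10):
    print(f"scaled score {score}: about percentile {scaled_to_percentile(score)}")
```

Under that assumption a 5 or 6 comes out somewhere around the 5th to 9th percentile, which is why it counts as a difficulty whatever the age equivalent looks like.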
The other question you have to ask is what are school doing now about this? If the results are showing that she can perform quite well in 1:1 structured situations and/or with visuals, what visuals are they putting in place? How are they planning and differentiating schemes of work with reference to quality first teaching for all/targeted support for identified children? Have they identified what support they need from SLT with reference to this (the ToD could certainly do this)? Sometimes there can be a "wait and see" approach with reference to funding - but actually, that can be a non-issue. Say your daughter has a statement tomorrow, with SLT twice a week... she will still spend most of her time in class, which is where a lot of the support needs to be, with appropriate adjustments made for full inclusion. A private SALT, or a second opinion from another NHS SALT, looking solely at her performance in class and at the issues identified by other professionals may help pick up areas where support could be more targeted or training could be offered, but even intensive 1:1 isn't necessarily going to make the difference you need it to. Not saying it wouldn't help, but if I were you, I would like to know what is being done NOW by those who identify her issues in class and how THEY are going to chase it up (vs leaving it up to you).
Unfortunately, many SLTs today do not have hands-on experience of this type with children and young people which can skew their understanding of how results "look" in real life. Many SLTs are expected to be educational consultants without having worked in schools on an ongoing/intensive basis and without compulsory training on literacy and the demands of the curriculum (though many do pursue this themselves). In a certain sense, many SLTs are not fit for purpose. God knows I wasn't when I qualified. I don't think I had even had adequate training on interpreting psychometric (standardised) assessments and only truly grasped the full implications at MSc level (and I got a first in my undergrad!).
In this country, the typical package of support for children with speech, language and communication needs doesn't mesh well with the potential of the profession. There is nowhere NEAR enough time to do an adequate assessment in most posts/roles - even doing a full CELF-4 is pushing it, let alone also observing and undertaking further measures as described above. We have weak follow-through on this by the professional bodies, who lay down guidelines on things like training/clinical supervision/caseload size but won't comment if you alert them to the fact that instead of a caseload of 60, you have one of 220! The NHS has all this stuff about "patient care" and "patient safety" and "improving patient experience", but commitment to actually meeting the evidence base is poor in terms of resourcing. There is a massive gap between research-level evidence in SLT and what happens on the ground, and it is nearly all resource-based.
In reality, in many clinical/school consultative posts, you get used to working a certain way and, actually, you can't offer intensive interventions even if you want to, so you have to dilute. This affects your skill development. Again, behind the scenes, there are many campaigning for better... but we are not seen as a priority in NHS organisations because - to put it bluntly - our clients won't die if we don't offer quality care.
There are so many issues... but if your daughter needs help, persist.
I would also press for quite an in-depth narrative assessment, something like the Strong Narrative Assessment Procedure (SNAP), to look at language structure in extended speaking tasks.
I will stop now, though I'm sure I could say more!