@MRex
I think I get what you are saying (or not, we'll see).
Chain and cluster are different. Cluster = group where infection is contained. E.g. someone gets tested on day 1 of symptoms but has been almost nowhere in the previous 7 days.
Chain = what it says on the tin
If you want to do prediction on that level then you need, at minimum, a combination of the following data:
- UID for the infected person
- Person data (age, gender, race, occupation, comorbidities at least)
- how infectious the person was (e.g. if it is the day before testing and viral load = very high, then infectious score = 6)
- geo tags of movement
- movement type
- vehicles used for movement
- Vehicle score should factor in all vehicle details: no. of people on the vehicle, ventilation (assign a number on a scale based on how things work in real life, e.g. double deckers don't have openable windows = 9, small bus with fully opened windows = 2, etc.), driver confinement (if plexiglass, then infection possibility = low; if proper mask, then infection possibility = low, etc.)
- events (this is basically all the places that that person went to)
- event score
this is made up with similar logic to the vehicle score. School type A = 19; School type B = 3; wedding = 55, ...
- number of ppl interacted with filtered by certain criteria
- number of items touched (eg a store assistant will leave more viruses by just doing their job than me going in and touching almost only the things I actually buy. A checkout person touches everything, but briefly, so passing big viral load is unlikely, but passing to hundreds=likely)
- self assigned score on introvert/extrovert scale
- mask wearing by patient
- mask wearing by others
- time (as in what happened/happens when), to be used when creating chains
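To make the list above a bit more concrete, here is a rough Python sketch of how one patient record and the vehicle score could be laid out. All the names (`Vehicle`, `Movement`, `PatientRecord`, `vehicle_score`) and all the weights (the "50 people = full" cap, the 0.8 discount for a screened driver) are my own assumptions for illustration, not validated numbers. Only the ventilation values 9 and 2 come from the examples above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Vehicle:
    passengers: int        # no. of people on the vehicle
    ventilation: int       # e.g. 9 = windows don't open, 2 = small bus, windows fully open
    driver_screened: bool  # plexiglass screen or proper mask


@dataclass
class Movement:
    day: int                       # when it happened, used later for chains
    geo_tag: str                   # placeholder for a real geo tag
    movement_type: str             # e.g. "commute", "shopping"
    vehicle: Optional[Vehicle] = None


@dataclass
class PatientRecord:
    uid: str
    age: int
    occupation: str
    comorbidities: List[str] = field(default_factory=list)
    infectious_score: int = 0      # e.g. 6 = very high viral load the day before testing
    movements: List[Movement] = field(default_factory=list)


def vehicle_score(v: Vehicle) -> float:
    # Assumed formula: ventilation scale, weighted by how full the vehicle is,
    # reduced a little when the driver is screened or masked.
    crowding = min(v.passengers / 50, 1.0)   # 50 = assumed "full" vehicle
    score = v.ventilation * (0.5 + 0.5 * crowding)
    if v.driver_screened:
        score *= 0.8                         # assumed discount
    return round(score, 2)


double_decker = Vehicle(passengers=50, ventilation=9, driver_screened=True)
small_bus = Vehicle(passengers=10, ventilation=2, driver_screened=False)
patient = PatientRecord(uid="P001", age=42, occupation="store assistant",
                        movements=[Movement(1, "geo:placeholder", "commute", double_decker)])
print(vehicle_score(double_decker), vehicle_score(small_bus))
```

The event score could follow the same pattern (a base number per event type, adjusted by crowding and mask wearing), but the actual scales would have to come from real data.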
Now thinking through what it would show and how, and how that would be useful:
- some of this data should be available through TTR, Google, schools, tfl, venues, NHS
- Feed in multiple known patients that are actually known to be super spreaders as well
- Feed in x months of data
- create a visual chain of infection with potential infection points (if unknown) and, where unknown, a % likelihood.
- Train the model: if super spreaders and known chains have been fed to the model, then check that it confirms those. Then run the model again.
Results: it should come up with clusters and potential chains.
These can then be analysed to see spreader patterns.
This can then be used to run scenarios: if the parameters change, the cluster patterns and chains would change as well.
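As a very simplified sketch of the "potential infection points with % likelihood" idea, here is one way it could look in Python. This is not a trained model, just a plain heuristic: for every person who tested positive, look at their contacts with other known positives before their own positive test, and split 100% between those sources in proportion to a risk score. The `Contact` class, the `likely_infectors` function, the `risk` field and the example numbers are all made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    source: str   # UID of the possibly infectious person
    target: str   # UID of the person who may have been infected
    day: int      # day the contact happened
    risk: float   # assumed combined score (event, vehicle, masks, ...)


def likely_infectors(contacts, positive_day):
    """Return {target: [(source, % likelihood), ...]}, most likely source first."""
    risk_by_target = defaultdict(lambda: defaultdict(float))
    for c in contacts:
        known = c.source in positive_day and c.target in positive_day
        # Only contacts before the target's own positive test can explain it.
        if known and c.day < positive_day[c.target]:
            risk_by_target[c.target][c.source] += c.risk

    chains = {}
    for target, sources in risk_by_target.items():
        total = sum(sources.values())
        if total <= 0:
            continue
        shares = [(s, round(100 * r / total, 1)) for s, r in sources.items()]
        chains[target] = sorted(shares, key=lambda pair: -pair[1])
    return chains


positive_day = {"A": 1, "B": 6, "C": 2, "D": 8}
contacts = [
    Contact("A", "B", day=2, risk=6.0),
    Contact("C", "B", day=3, risk=2.0),
    Contact("B", "D", day=5, risk=4.0),
    Contact("D", "B", day=9, risk=5.0),  # after B tested positive, so ignored
]
chains = likely_infectors(contacts, positive_day)
print(chains)
```

Changing the risk numbers (say, everyone wears masks, so every `risk` drops) and rerunning is the simplest version of the scenario idea: the likely chains shift with the parameters.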