In the [[RareDiseaseQBNRequirements|previous post]], we outlined the data types needed to build a Quantum Bayesian Network for diagnosing rare diseases:
- Genetic data to root the model in biology
- Phenotypes to connect the seen to the unseen
- Clinical findings to trace the path from mutation to symptom
- Association graphs to define structure
- Expert knowledge to guide inference when data runs short
It’s a clear list. But getting this data is anything but clear.
There is no single download button that gives us all the data we need. Instead, there are many, many buttons. They lead to massive files, complex formats, and subtle differences between datasets. Choosing the wrong one could derail the entire effort before we even start.
It almost feels like we’d need a quantum computer just to decide which dataset to download before we can even use one for diagnosis.
So where do we begin?