Inferring Genotype from Clinical Phenotype through a Knowledge Based Algorithm | Harvard Kennedy School

HKS Authors

See citation below for complete author information.

Daniel Paul Professor of the Practice of Government and Technology, HKS and FAS


Genomic information is becoming increasingly useful for studying the origins of disease. Recent studies have focused on discovering new genetic loci and the influence of these loci upon disease. However, it is equally desirable to go in the opposite direction – that is, to infer genotype from the clinical phenotype for increased efficiency of treatment. This paper proposes a methodology for such inference. Our method constructs a simple knowledge-based model without the need of a domain expert and is useful in situations that have very little data and/or no training data. The model relates a disease’s symptoms to particular clinical states of the disease. Clinical information is processed using the model, where appropriate weighting of the symptoms is learned from observed diagnoses to subsequently identify the state of the disease presented in hospital visits. This approach applies to any simple genetic disorder that has defined clinical phenotypes. We demonstrate the use of our methods by inferring age of onset and DNA mutations for Huntington’s disease patients.


Malin, Bradley and Latanya Sweeney. "Inferring Genotype from Clinical Phenotype through a Knowledge Based Algorithm." Pacific Symposium on Biocomputing 2002 (2002).