Complex human diseases such as diabetes, neurodegenerative diseases and cancers remain major challenges: they affect many people yet are difficult to decipher. They are caused by a combination of genetic, environmental and lifestyle factors, and describing their etiology is not an easy undertaking (Noble & Erlich, 2012). To delve into the intricacies of a complex disease, its genomics must first be discovered and understood. We suggest that, once genetic loci that confer susceptibility to a particular trait have been identified, it is necessary to build a clear picture of the genomic makeup of these susceptibility loci and the features that characterize them.
Using Type 1 Diabetes (T1D) as a model, we collected and analysed data on structural and functional genomic features in order to characterize the susceptibility regions associated with this disease. The aim was to find out whether, and how, these regions differ strikingly in genomic content, and if so whether the differences are reflected in the other autoimmune diseases with which the regions are associated. T1D region names and genomic coordinates were collected from T1Dbase.org (an online database of T1D-specific genomic information). Feature data for each susceptibility region were obtained from the Ensembl genome browser; these included coordinates of gene transcripts, exons, introns, UTRs and intergenic sequences. Feature lengths were computed, normalized relative to total susceptibility region size, and then used for hierarchical cluster analysis.
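The normalization and clustering step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the region names and feature lengths are invented, dividing by each region's own size is one reading of the normalization, and the average-linkage method with Euclidean distance is an assumption, since the abstract does not state which linkage or metric was used.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Feature columns used in this sketch (a subset of those named in the text).
FEATURES = ["exon", "intron", "utr", "intergenic"]

# Hypothetical regions: every name and length (bp) below is made up.
REGIONS = {
    "region_A": {"size": 100_000, "exon": 2_000, "intron": 70_000, "utr": 1_000, "intergenic": 27_000},
    "region_B": {"size": 200_000, "exon": 5_000, "intron": 138_000, "utr": 3_000, "intergenic": 54_000},
    "region_C": {"size": 50_000, "exon": 10_000, "intron": 20_000, "utr": 5_000, "intergenic": 15_000},
    "region_D": {"size": 80_000, "exon": 15_000, "intron": 33_000, "utr": 8_000, "intergenic": 24_000},
    "region_E": {"size": 120_000, "exon": 1_200, "intron": 12_000, "utr": 1_200, "intergenic": 105_600},
    "region_F": {"size": 60_000, "exon": 900, "intron": 5_400, "utr": 600, "intergenic": 53_100},
}


def normalise(regions):
    """Return sorted region names and a matrix of feature length / region size."""
    names = sorted(regions)
    matrix = np.array(
        [[regions[name][feat] / regions[name]["size"] for feat in FEATURES] for name in names]
    )
    return names, matrix


def cluster_regions(matrix, n_clusters):
    """Agglomerative clustering; linkage method and metric are assumptions."""
    tree = linkage(matrix, method="average", metric="euclidean")
    # Cut the tree so that at most n_clusters flat clusters remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")


if __name__ == "__main__":
    names, matrix = normalise(REGIONS)
    labels = cluster_regions(matrix, n_clusters=3)
    for name, label in zip(names, labels):
        print(name, label)
```

Dividing by region size makes regions of very different lengths comparable, so the clustering reflects composition rather than absolute size.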
Our results revealed three main clusters of T1D regions. The first cluster consisted of regions with significantly large amounts (in bp) of intronic DNA and non-coding RNA sequence. These regions are also loci for other autoimmune diseases, with higher relative percentage counts for Multiple Sclerosis, Crohn’s Disease and Primary Biliary Cirrhosis than regions in the other clusters. The second cluster, which included the Human Leukocyte Antigen locus, comprised gene-dense regions with large Single Nucleotide Polymorphism (SNP) counts. We also recorded a large number of SNPs in experimentally verified transcription factor binding sites in these regions. The regions in this cluster, which are also loci for other autoimmune diseases, had higher relative counts for Rheumatoid Arthritis and Celiac Disease. The third cluster had no outstanding attributes but also showed a relatively high count for Celiac Disease.
Recent studies by Burton et al. (2007) and Ward & Kellis (2012) suggest that the genetic determinants of complex diseases should be sought in problems associated with gene regulation rather than with gene coding. Our results suggest focal points for studying the effects of regulatory variation in T1D susceptibility regions. Mutations in non-coding RNA molecules involved in regulation (cluster 1), as well as in vital binding motifs (cluster 2), may lead to problems such as the formation of faulty proteins or the obstruction of important biological networks, thereby causing disease.
Noble, J. & Erlich, H., 2012. Genetics of Type 1 Diabetes. Cold Spring Harbor Perspectives in Medicine, 2(1), p. a007732.
Burton, P., Clayton, D. & Cardon, L., 2007. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature, 447(7145), pp. 661-678.
Ward, L.D. & Kellis, M., 2012. Interpreting noncoding genetic variation in complex traits and human disease. Nature Biotechnology, 30(11), pp. 1095-1106.