T1 - Do rare variant genotypes predict common variant genotypes?

T1 - Simulations provide support for the common disease-common variant hypothesis

The ‘Common Disease-Common Variant’ Hypothesis …

N2 - The investigation of heritable susceptibility to disease is an effort to associate disease phenotype with underlying genotype. Such genotype-phenotype associations have been demonstrated for a large number of monogenic disorders. The usual strategy has been to use linkage mapping in affected families to identify chromosomal loci from which candidate genes and genotypes can be tested for association with disease. This strategy has not been similarly successful for common heritable disease susceptibilities, including hypertension, that involve multiple genes and gene-environment interactions. Development of extensive collections of single nucleotide polymorphisms (SNPs) raises the possibility that these SNPs can be used as markers in genome-wide association mapping studies to identify hypertension susceptibility loci. In this approach, large numbers of markers are typed in cases and controls with the expectation that markers interrogating SNPs that are involved in inheritance of disease susceptibility will emerge through their association with this trait in the affected population. Essential hypertension is a common disorder. The term "common" has two implications: first, that the disease is prevalent; and, second, that it is widespread. Such frequency and distribution characteristics could arise if the susceptibility alleles for hypertension were prevalent in the founding population of contemporary human beings and became distributed with human global dispersal. This common disease-common variant concept is attractive because it suggests that the genetic heterogeneity underlying hypertension susceptibility could be relatively small. It also allows the possibility that nonrandom association of alleles (linkage disequilibrium, LD) can be used to reduce the number of SNP markers required to identify disease susceptibility alleles, because a single marker can act as a surrogate for variation flanking it.
The influence of a number of important factors on the detectability of hypertension susceptibility alleles by SNP mapping approaches is not yet fully defined. These factors include the locus and allelic diversity of hypertension, the weaker relationship (compared with Mendelian traits) between genotype and phenotype, the accuracy of high-throughput genotyping techniques, the extensive role of nongenetic factors, and the extent and heterogeneous nature of LD across the genome.
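
How well a single marker can stand in for flanking variation is commonly summarized by the squared correlation r^2 between two loci. The sketch below is illustrative only and is not taken from the abstract: it assumes phased haplotype counts are available for two biallelic loci, and the function name and the counts are hypothetical.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Return r^2 between two biallelic loci from phased haplotype counts.

    n_AB, n_Ab, n_aB, n_ab are counts of the four possible haplotypes
    (hypothetical inputs; real data would come from phased genotypes).
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")

    p_A = (n_AB + n_Ab) / total   # frequency of allele A at locus 1
    p_B = (n_AB + n_aB) / total   # frequency of allele B at locus 2
    p_AB = n_AB / total           # frequency of the A-B haplotype

    d = p_AB - p_A * p_B          # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Made-up counts: a marker that mostly, but not always, travels with allele B.
print(ld_r_squared(45, 5, 5, 45))
```

An r^2 near 1 means the marker is close to a perfect surrogate for the second locus, while a value near 0 means typing the marker says little about it. Because LD is heterogeneous across the genome, the number of markers needed would be expected to vary from region to region.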

T1 - Hypertension genetics, single nucleotide polymorphisms, and the common disease

The common disease common variant concept (Genetics)

Unraveling the genetic basis of human diseases represents a major challenge in human genetics. This is especially the case for complex diseases, which comprise the bulk of the disease burden in industrialized societies. Complex genetic disease traits arise as a consequence of genetic and environmental contributions to disease susceptibility (see Article 58, Concept of complex trait genetics, Volume 2). The genetic component is split across many loci, each contributing a small effect to the overall susceptibility. Thus, the identification of the causal genetic variants in these diseases presents a major challenge to human geneticists. The problems of achieving this goal are further magnified by the complexities that result from the gene-gene and gene-environment interactions.

Cardon LR and Bell JI (2001) Association study designs for complex diseases. Nature Reviews. Genetics 2: 91–99.

Another important factor in formulating the CDCV hypothesis is the strength of the selection pressure on allele diversification. For alleles to attain a high equilibrium frequency in a population, that is, to retain low diversity and be useful markers in LD mapping of complex diseases, they must be selectively neutral or under low selective pressure. This leads to a lower turnover in the population and decreases the effect of new mutations. However, selectively neutral alleles are likely to make a small contribution to the overall disease risk, giving weak associations with disease, which is another reason large sample sizes are required. This situation might be the case for alleles implicated in late-onset disorders such as type 2 diabetes (T2D) or hypertension (Wright and Hastie, 2001). Furthermore, it is necessary to take stratification into account when interpreting association study results. An allele may be at a high frequency in a subpopulation, due to urbanization or geographical isolation, so a positive or negative association may not be applicable to the population as a whole. The influence of the rapid expansion of human populations may also be seen in relatively isolated populations. In such circumstances, the chromosomal composition of the founders of that population may have a marked influence. Indeed, as can be seen in Figure 1, if this “founding” effect occurred relatively recently, then even low-frequency disease-susceptibility alleles may be disproportionately represented in the contemporary population.

…makes me wonder about the utility of stark verbal hypotheses in the (coming) era of fine grained data.

The 'common disease-common variant' hypothesis …

Not all monogenic diseases demonstrate marked allelic diversity. Several factors may explain this situation. The simplest is the case of heterozygous advantage. In this scenario, alleles whose pathologic effect is biased toward recessivity can confer a selective advantage when present as a single copy. There are several well-characterized examples of polymorphisms that confer some degree of protection from malaria, for example, G6PD deficiency at the G6PD locus, and HbC and HbS at the HBB locus in West African populations; the estimated allele frequencies are 0.20, 0.09, and 0.10, respectively. Selection may operate during the phase of population expansion, and it may increase the frequency of certain alleles in the population in the preexpansion phase. For example, the mutations that give rise to cystic fibrosis (CF) do not immediately appear to comply with the arguments set out above. More than 900 alleles of the CF transmembrane conductance regulator (CFTR) gene have been associated with CF; however, approximately 70% of cases are due to a single deletion, ΔF508. It has been postulated that heterozygotes carrying CFTR mutations may have some selective resistance to Salmonella typhi (Pier et al., 1998). Irrespective of the mechanism, it can be shown that for an allele as frequent as ΔF508 in the preexpanded population, this simple ancestral spectrum will persist following expansion with a half-life of 39 000 to 390 000 years, depending on the assumed mutation rate (Reich and Lander, 2001).
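
The heterozygous advantage argument can be made concrete with the textbook one-locus overdominance model, in which the two homozygotes have relative fitness 1 - s and 1 - t and the heterozygote has fitness 1. The sketch below is an illustration only: the selection coefficients are hypothetical and are not estimates for G6PD, HbC, HbS, or CFTR, and the function names are made up for this example.

```python
def next_frequency(q, s, t):
    """One generation of selection at a biallelic locus.

    q is the frequency of the recessive-disease allele a.
    Fitness: AA = 1 - s, Aa = 1, aa = 1 - t (assumed model).
    """
    p = 1.0 - q
    mean_fitness = p * p * (1.0 - s) + 2.0 * p * q + q * q * (1.0 - t)
    # Allele a is carried by half of the Aa gametes and all of the aa gametes.
    return (p * q + q * q * (1.0 - t)) / mean_fitness


def equilibrium_frequency(s, t):
    """Stable equilibrium frequency of allele a under overdominance."""
    return s / (s + t)


# Hypothetical coefficients: normal homozygotes lose 10% fitness (for
# example to infection), disease homozygotes are effectively lethal.
s, t = 0.1, 1.0
q = 0.001
for _ in range(2000):
    q = next_frequency(q, s, t)

print(q, equilibrium_frequency(s, t))
```

Under these assumed values the allele settles at a frequency of s / (s + t) even though homozygotes for it are severely affected, which is the sense in which a single disease allele can remain common at a locus.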

The rise and fall of the common disease-common variant …

This argument would seem to favor the CDRV hypothesis. Not so. The key concept for explaining why is one borrowed from epidemiology called the population attributable risk: essentially, the number of cases in a population that can be attributed to a given risk factor. An example: imagine smoking cigarettes gives you a 5% chance of developing lung cancer, while working in an asbestos factory gives you a 70% chance. You might argue that working in an asbestos factory is a more important risk factor than cigarette smoking, and you would be correct, on an individual level. On a population level, though, you have to take into account the fact that millions more people smoke than work in asbestos factories. If everyone stopped smoking tomorrow, the number of lung cancer cases would drop precipitously. But if all asbestos factory workers quit tomorrow, the effect on the population level of lung cancer would be minimal. So you can see where I’m going with this: common susceptibility alleles contribute disproportionately to the population attributable risk for a disease. In type II diabetes, for example, a single variant with a rather small effect but a moderate frequency accounts for 21% of all cases.
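
To put rough numbers on the smoking versus asbestos comparison, here is a minimal sketch using Levin's formula for the population attributable fraction, PAF = p(RR - 1) / (1 + p(RR - 1)), where p is the proportion of the population exposed and RR is the relative risk. Everything numeric beyond the 5% and 70% risks is an assumption made for illustration: the 0.5% baseline risk and both exposure proportions are invented, and the formula itself assumes a single exposure with no confounding.

```python
def attributable_fraction(exposed_proportion, relative_risk):
    """Levin's population attributable fraction for one exposure."""
    excess = exposed_proportion * (relative_risk - 1.0)
    return excess / (1.0 + excess)


BASELINE_RISK = 0.005  # assumed risk in unexposed people (hypothetical)

# Relative risks implied by the 5% and 70% figures under that baseline.
rr_smoking = 0.05 / BASELINE_RISK
rr_asbestos = 0.70 / BASELINE_RISK

# Assumed exposure proportions: smoking is common, factory work is rare.
paf_smoking = attributable_fraction(0.25, rr_smoking)
paf_asbestos = attributable_fraction(0.0001, rr_asbestos)

print(f"smoking:  {paf_smoking:.3f}")
print(f"asbestos: {paf_asbestos:.3f}")
```

With these made-up inputs the common, weaker exposure accounts for a far larger share of cases than the rare, stronger one. The same arithmetic applies to alleles: a variant with a modest relative risk can still carry a large attributable fraction if its frequency is high.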