Linkage Disequilibrium Datasets


Linkage disequilibrium (LD) was computed separately for each super population and sub population in the 1000 Genomes phase 3 variants, using the method defined in Box 1 of:

LD was computed for all pairs of variants within a window of 1,000,000 bp (1 megabase), and all pairs with an absolute allelic correlation of at least 0.4 were retained. See Compute Linkage Disequilibrium on a Variant Set for more detail.
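The windowed pairwise computation described above can be illustrated with a minimal sketch. It is not the pipeline that produced these datasets. It assumes phased, biallelic variants encoded as 0/1 alleles per haplotype, uses the standard allelic correlation r = D / sqrt(pA(1 - pA) pB(1 - pB)) with D = pAB - pA pB, and treats both the window and the 0.4 cutoff as inclusive bounds. The names `allelic_correlation`, `ld_pairs`, `window_bp`, and `min_abs_r` are hypothetical and chosen only for this illustration.

```python
from math import sqrt


def allelic_correlation(hap_a, hap_b):
    """Allelic correlation r between two biallelic variants.

    hap_a and hap_b are equal-length lists of 0/1 alleles, one entry per
    phased haplotype (an assumption of this sketch).
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n                                  # frequency of allele 1 at variant A
    p_b = sum(hap_b) / n                                  # frequency of allele 1 at variant B
    p_ab = sum(x & y for x, y in zip(hap_a, hap_b)) / n   # frequency of the 1-1 haplotype
    denom = sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    if denom == 0:
        return None                                       # monomorphic variant: r is undefined
    return (p_ab - p_a * p_b) / denom


def ld_pairs(variants, window_bp=1_000_000, min_abs_r=0.4):
    """Return (pos_i, pos_j, r) for retained pairs on one chromosome.

    variants is a list of (position, haplotypes) tuples sorted by position.
    Pairs more than window_bp apart are skipped; pairs with |r| below
    min_abs_r are dropped. Inclusive bounds are an assumption.
    """
    results = []
    for i, (pos_i, hap_i) in enumerate(variants):
        for pos_j, hap_j in variants[i + 1:]:
            if pos_j - pos_i > window_bp:
                break                                     # sorted input: later variants are farther away
            r = allelic_correlation(hap_i, hap_j)
            if r is not None and abs(r) >= min_abs_r:
                results.append((pos_i, pos_j, r))
    return results


# Toy data (made up for illustration): four variants, eight haplotypes each.
example_variants = [
    (100, [1, 1, 0, 0, 1, 0, 1, 0]),
    (200, [1, 1, 0, 0, 1, 0, 0, 0]),
    (500, [0, 1, 0, 1, 0, 1, 0, 1]),
    (2_000_000, [1, 1, 0, 0, 1, 0, 1, 0]),
]
example_pairs = ld_pairs(example_variants)
```

In this toy input, the variant at position 2,000,000 lies outside the window of every other variant, so it is never compared, and the pair at positions 200 and 500 is dropped because its absolute correlation falls below the cutoff.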

The output files were split by chromosome, with columns identifying each pair of variants and giving the resulting LD value. The output files have also been loaded into BigQuery with the same columns. Examples of using BigQuery to analyze LD are available as Datalab notebooks.

Google Cloud Platform data locations


Need more help? Please see https://cloud.google.com/genomics/support.