Relatedness Estimator

Project Member

David Chien

The goal of this quarter

To create an analysis tool to identify the best conditions to maximize the accuracy of the Relatedness Likelihood Ratio Method. This will involve computing the preliminary joint probabilities for both related and unrelated individuals. Lastly, we want to be able to simulate the relatedness estimator method for a given minor allele frequency and number of SNPs compared. The results of the simulation will tell us the accuracy of the method and allow us to find the optimal conditions to minimize costs while maximizing accuracy.

The Schedule for the quarter

Week 4: Pick project + Set up project wiki.
Week 5: Research the joint probabilities for related and unrelated individuals + Research the Relatedness Likelihood Ratio Method.
Week 6: Code the high level program and compute the preliminary computations.
Week 7: Compute the joint probabilities of related and unrelated individuals as well as the relatedness likelihood ratios of each SNP allele combination.
Week 8: Add simulation code (allowing user to simulate 1000 unrelated individuals and 1000 related individuals) for a given number of SNPs and minor allele frequency.
Week 9: Make any last-minute changes + Make presentation slides + Final Project Presentation!
Week 10: Type Report + Improve Presentation Slides + Update Wiki

Project Description

Given the genomes of two individuals, how can we determine if they are related? Each parent transmits one copy of each chromosome to each of their children. Thus, full siblings share approximately 50% of their DNA, while first cousins share about 12.5%. By analyzing the genomes of two individuals, one can estimate whether they are related and how closely. However, two unrelated individuals may also share DNA by chance, and the analysis needs to take this into account.

By calculating the joint probability of each allele combination at a given SNP for both related and unrelated individuals, one can calculate the relatedness likelihood ratio, which indicates whether two individuals are more likely to be related or unrelated. Computing these preliminary probabilities requires the minor allele frequency. By the end of the project, we hope to be able to simulate the relatedness estimator for a user-supplied minor allele frequency and number of SNPs. Through these simulations, we will be able to estimate the accuracy of the method under those conditions, in order to maximize accuracy while minimizing costs.
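
The page does not record the exact formulas used, so the following is only a minimal sketch of one standard way to obtain these quantities. It assumes a biallelic SNP in Hardy-Weinberg equilibrium, codes each genotype as the number of minor alleles (0, 1 or 2), and describes a relationship by the probabilities `k = (k0, k1, k2)` that the pair shares 0, 1 or 2 alleles identical by descent. The function names and the default full-sibling value `k = (0.25, 0.5, 0.25)` are illustrative assumptions, not taken from the project code.

```python
def genotype_prob(g, maf):
    """P(genotype) for g minor alleles (0, 1 or 2), assuming Hardy-Weinberg."""
    q = 1.0 - maf
    return (q * q, 2.0 * maf * q, maf * maf)[g]


def joint_given_ibd(g1, g2, maf, ibd):
    """P(g1, g2 | the pair shares `ibd` alleles identical by descent)."""
    if ibd == 0:
        # No shared ancestry at this SNP: the genotypes are independent.
        return genotype_prob(g1, maf) * genotype_prob(g2, maf)
    if ibd == 2:
        # Both alleles are shared: the genotypes must be identical.
        return genotype_prob(g1, maf) if g1 == g2 else 0.0
    # ibd == 1: one shared allele s plus one independent allele per person.
    allele = (1.0 - maf, maf)  # allele[0] = major, allele[1] = minor
    total = 0.0
    for s in (0, 1):
        x, y = g1 - s, g2 - s  # the non-shared allele of each person
        if x in (0, 1) and y in (0, 1):
            total += allele[s] * allele[x] * allele[y]
    return total


def joint_related(g1, g2, maf, k=(0.25, 0.5, 0.25)):
    """Joint genotype probability for a pair with IBD probabilities k."""
    return sum(k[i] * joint_given_ibd(g1, g2, maf, i) for i in range(3))


def likelihood_ratio(g1, g2, maf, k=(0.25, 0.5, 0.25)):
    """P(g1, g2 | related) / P(g1, g2 | unrelated); requires 0 < maf < 1."""
    return joint_related(g1, g2, maf, k) / joint_given_ibd(g1, g2, maf, 0)


if __name__ == "__main__":
    for g1 in range(3):
        for g2 in range(3):
            print(g1, g2, round(likelihood_ratio(g1, g2, maf=0.3), 4))
```

Under these assumptions, a ratio above 1 at a SNP favors the related hypothesis and a ratio below 1 favors the unrelated one. For example, opposite homozygotes (genotypes 0 and 2) can only occur when no alleles are shared by descent, so for full siblings the ratio there equals `k0`.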

Related Papers

April 21-27 2008 (Week 4)

  1. What I did this week: I picked a final project topic and set up my project wiki page.
  2. What I plan to do next week: Download the data, learn about analyzing genomes, and choose an approach for the relatedness estimator.
  3. How what you did compared to what you planned to do: Everything was completed on time.
  4. What grade you think you deserve for your work on the project for the week: A+.

April 28 - May 4 2008 (Week 5)

  1. What I did this week: Started looking into how to estimate whether two individuals are related given their genomes.
  2. What I plan to do next week: Begin coding the preliminary computations necessary to perform the analysis.
  3. How what you did compared to what you planned to do: Everything was completed on time.
  4. What grade you think you deserve for your work on the project for the week: A

May 5 - May 11 2008 (Week 6)

  1. What I did this week: Coded the high level program (user input, functions, etc…) and computed the preliminary probabilities (minor/major allele frequency, individual SNP probabilities).
  2. What I plan to do next week: Speak to professor about joint probabilities.
  3. How what you did compared to what you planned to do: Everything was completed on time.
  4. What grade you think you deserve for your work on the project for the week: A
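
The preliminary probabilities mentioned for this week can be illustrated with a short sketch. It assumes Hardy-Weinberg equilibrium, which the page does not state explicitly, and the function name `genotype_probs` is hypothetical rather than taken from the project code.

```python
def genotype_probs(maf):
    """Genotype probabilities at one biallelic SNP, assuming Hardy-Weinberg.

    Keys are the number of minor alleles carried (0, 1 or 2).
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    p = maf        # minor allele frequency
    q = 1.0 - maf  # major allele frequency
    return {0: q * q, 1: 2.0 * p * q, 2: p * p}


if __name__ == "__main__":
    print(genotype_probs(0.2))
```

The three probabilities always sum to 1, which is a quick sanity check on any user-supplied minor allele frequency.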

May 12 - May 18 2008 (Week 7)

  1. What I did this week: Talked to professor about what was needed to perform the relatedness likelihood ratio method and began coding the joint probabilities of related and unrelated individuals as well as the relatedness likelihood ratios.
  2. What I plan to do next week: Add the simulation code to allow testing of different conditions.
  3. How what you did compared to what you planned to do: Everything was completed on time.
  4. What grade you think you deserve for your work on the project for the week: A

May 19 - May 25 2008 (Week 8)

  1. What I did this week: Wasn't able to do much work during the week due to a midterm and other projects. Was in NY over the weekend as well.
  2. What I plan to do next week: Add the simulation code to allow testing of different conditions + Make my PowerPoint slides + Presentation.
  3. How what you did compared to what you planned to do: One week behind!
  4. What grade you think you deserve for your work on the project for the week: BAD :(

May 26 - June 1 2008 (Week 9)

  1. What I did this week: Coded the simulation function which allows the user to simulate 1000 related and 1000 unrelated individuals using the relatedness likelihood ratio method. The simulation allows the user to choose the minor allele frequency and the number of SNPs analyzed per individual. Created PowerPoint slides and presented in class.
  2. What I plan to do next week: Type the final report and update my slides/wiki.
  3. How what you did compared to what you planned to do: Everything was completed on time.
  4. What grade you think you deserve for your work on the project for the week: A
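
The simulation described this week could look roughly like the sketch below. It is not the project's actual code. It assumes that "related" means full siblings (`k = (0.25, 0.5, 0.25)`), that all SNPs are independent and share one minor allele frequency, that genotypes follow Hardy-Weinberg proportions, and that a pair is called related when its summed log likelihood ratio is positive. All function and parameter names are hypothetical.

```python
import math
import random

# All nine genotype pairs, each genotype coded as its minor-allele count.
PAIRS = [(g1, g2) for g1 in range(3) for g2 in range(3)]


def genotype_prob(g, maf):
    q = 1.0 - maf
    return (q * q, 2.0 * maf * q, maf * maf)[g]


def pair_prob(g1, g2, maf, k):
    """Joint genotype probability given IBD-sharing probabilities k."""
    a = (1.0 - maf, maf)  # a[0] = major allele frequency, a[1] = minor
    ibd0 = genotype_prob(g1, maf) * genotype_prob(g2, maf)
    ibd1 = sum(a[s] * a[g1 - s] * a[g2 - s] for s in (0, 1)
               if g1 - s in (0, 1) and g2 - s in (0, 1))
    ibd2 = genotype_prob(g1, maf) if g1 == g2 else 0.0
    return k[0] * ibd0 + k[1] * ibd1 + k[2] * ibd2


def simulate(maf, n_snps, n_pairs=1000, k=(0.25, 0.5, 0.25), seed=0):
    """Simulate n_pairs related and n_pairs unrelated pairs; report accuracy."""
    rng = random.Random(seed)
    unrelated = [pair_prob(g1, g2, maf, (1.0, 0.0, 0.0)) for g1, g2 in PAIRS]
    related = [pair_prob(g1, g2, maf, k) for g1, g2 in PAIRS]
    # Per-SNP log likelihood ratio for each of the nine genotype pairs.
    log_lr = [math.log(r / u) if r > 0 else float("-inf")
              for r, u in zip(related, unrelated)]

    def fraction_called_related(weights):
        calls = 0
        for _ in range(n_pairs):
            draws = rng.choices(range(len(PAIRS)), weights=weights, k=n_snps)
            if sum(log_lr[i] for i in draws) > 0:
                calls += 1
        return calls / n_pairs

    tpr = fraction_called_related(related)    # related pairs called related
    fpr = fraction_called_related(unrelated)  # unrelated pairs called related
    return {"true_positive_rate": tpr,
            "false_positive_rate": fpr,
            "accuracy": (tpr + (1.0 - fpr)) / 2.0}


if __name__ == "__main__":
    for n_snps in (10, 50, 200):
        print(n_snps, simulate(maf=0.3, n_snps=n_snps))
```

Running the sketch over a grid of minor allele frequencies and SNP counts gives a rough picture of how many SNPs are needed for a target accuracy, under the stated assumptions.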

June 2 - June 8 2008 (Week 10)

  1. What I did this week: Typed the final report and updated my slides/wiki.
  2. What I plan to do next week: Study for Finals!
  3. How what you did compared to what you planned to do: Everything was completed on time.
  4. What grade you think you deserve for your work on the project for the week: A