Inversions

Project Member

Phillip Tao

Project Description

Inversions can be found using SNPs by analyzing linkage disequilibrium (LD) data.

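Given breakpoints B1 and B2, with SNPs S1 to the left of B1, S2 to the right of B1, S3 to the left of B2, and S4 to the right of B2, S1 will often correlate more strongly with S3 than with S2, and likewise S4 will correlate more strongly with S2 than with S3. This is because on chromosomes carrying the inversion, the segment between B1 and B2 is reversed, so S3 sits next to S1 and S2 sits next to S4.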
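The four-SNP condition can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the function names r_squared and looks_inverted are hypothetical, and each SNP is assumed to be a list of 0/1 alleles with one entry per phased haplotype.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length 0/1 allele vectors."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("allele vectors must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # Assumption: a monomorphic SNP carries no LD signal, so report 0.
        return 0.0
    return cov * cov / (var_x * var_y)


def looks_inverted(s1, s2, s3, s4):
    """True if S1 correlates more with S3 than with S2, and S4 more with S2 than with S3."""
    return (r_squared(s1, s3) > r_squared(s1, s2)
            and r_squared(s4, s2) > r_squared(s4, s3))


# Toy haplotypes (made up for illustration, not HapMap data).
s1 = [0, 0, 1, 1, 0, 1]
s2 = [0, 1, 0, 1, 0, 1]
s3 = list(s1)  # S3 tracks S1 perfectly
s4 = list(s2)  # S4 tracks S2 perfectly
print(looks_inverted(s1, s2, s3, s4))
```

A strict comparison like this is sensitive to noise; in practice one would probably require the difference in r^2 to exceed some threshold, which is left out here.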

An inverted section also tends to recombine less, so the SNPs within it will be more strongly correlated with each other than the surrounding SNPs are.
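One way to look for this second property is to compare the average pairwise r^2 inside a candidate region with the average in the flanking SNPs. The sketch below assumes a precomputed symmetric matrix r2 (a list of lists indexed by SNP order along the chromosome); the names mean_pairwise_r2 and ld_excess, and the flank parameter, are hypothetical choices made for illustration.

```python
from itertools import combinations


def mean_pairwise_r2(r2, indices):
    """Average r^2 over all unordered pairs of the given SNP indices (0.0 if no pairs)."""
    pairs = list(combinations(indices, 2))
    if not pairs:
        return 0.0
    return sum(r2[i][j] for i, j in pairs) / len(pairs)


def ld_excess(r2, start, end, flank):
    """Mean r^2 inside SNPs [start, end) minus the mean r^2 of the two flanks.

    Each flank is up to `flank` SNPs wide and is scored separately, so that
    pairs spanning the candidate region do not count as background.
    """
    n = len(r2)
    inside = range(start, end)
    left = range(max(0, start - flank), start)
    right = range(end, min(n, end + flank))
    # Only flanks with at least two SNPs contribute a background estimate.
    flank_means = [mean_pairwise_r2(r2, f) for f in (left, right) if len(f) >= 2]
    background = sum(flank_means) / len(flank_means) if flank_means else 0.0
    return mean_pairwise_r2(r2, inside) - background


# Toy matrix (made up): SNPs 2 and 3 are tightly linked, all other pairs weakly.
size = 6
r2 = [[1.0 if i == j else 0.1 for j in range(size)] for i in range(size)]
r2[2][3] = r2[3][2] = 0.9
print(ld_excess(r2, 2, 4, 2))
```

A large positive value suggests the region is in stronger LD than its surroundings. That alone does not prove an inversion, since other causes of low recombination produce the same signal.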

Using these two properties, inversions of a certain length range can be found.

Goal for the quarter

Use simple r^2 correlation to detect inversions between 10 kb and 1 Mb long in the HapMap data.

The Schedule for the quarter

Week 4: Decide on project, read Bafna paper
Week 5: Do more research, plan out schedule for project
Week 6: Get small sample of HapMap data, try to replicate some of Bafna's results
Week 7: Start developing algorithm to systematically analyze correlation between very large numbers of SNPs
Week 8: Refine code, generate data with newest available HapMap data
Week 9: Find inverted sections
Week 10: Final talks

June 6 2008

  1. Project completed

June 2 — June 8 2008 (Week 10)

  1. Goal for this week:
    • Give talk
  2. Achievements for last week:
    • Minor tweaks to make program a bit faster
    • Started generating results
  3. Grade for last week
    • A
    • Followed the plan, although results are taking a while to generate

May 26 — June 1 2008 (Week 9)

  1. Goal for this week:
    • Speed up program
    • Generate results
    • Prepare presentation
  2. Achievements for last week:
    • Wrote function to generate sets of four SNPs
    • Wrote function to group sets
    • Pretty much done with program
  3. Grade for last week
    • A
    • More or less followed my plan

May 19 — May 25 2008 (Week 8)

  1. Goal for this week:
    • Write a function to take the data generated this week and find sets of four SNPs that satisfy the conditions listed in the project description
    • Add a function to look for areas of generally high correlation
  2. Achievements for last week:
    • Wrote functions to calculate the differences in correlation with SNP s for all other SNPs a and b
    • Wrote function to find SNP pairs a and b that have an unusual correlation with SNP s
  3. Grade for last week
    • A
    • Again, didn't quite follow my plan, but I did make pretty good progress.

May 12 — May 18 2008 (Week 7)

  1. Goal for this week:
    • Replicate some of Bafna's findings
    • Detect areas of abnormally high r-values
    • Think about a way to find r-value patterns indicative of inversions, such as the 4 SNP method
  2. Achievements for last week:
    • Got HapMap data
    • Wrote functions to parse data, calculate r values
  3. Grade for last week
    • A
    • Decided to change schedule a bit, so although I didn't replicate Bafna's results yet, I made progress in other areas

May 5 — May 11 2008 (Week 6)

  1. Goal for this week:
    • Get HapMap data
    • Replicate some of Bafna's results using r^2
    • Start planning for calculating correlation between many SNPs
  2. Achievements for last week:
    • Did a bit more research
  3. Grade for last week
    • B-
    • Overall, not much progress was made, partially because of midterms

April 28 — May 4 2008 (Week 5)

  1. Goal for this week:
    • Do additional research
  2. Achievements for last week:
    • Decided on project
    • Read Bafna's Paper
  3. Grade for last week
    • A

April 21 — April 27 2008 (Week 4)

  1. Goal for this week:
    • Read Bafna's paper
  2. Achievements for last week:
    • N/A
  3. Grade for last week
    • N/A

Related Papers
