Multi-Stage, Multi-Locus Design

Project Member

Brian Hackel

The goal of this quarter

Research, design and test statistical models for optimal power multi-stage association studies and investigate the possibilities of optimizing power for multi-locus, multi-stage designs.

The Schedule for the quarter

  • Week 4: Decide on a project topic and create a page on the project wiki
  • Week 5: Go through the lecture material from the multi-stage lecture and independently re-derive the statistics presented therein. Read the relevant research papers.
  • Week 6: Run through the mathematics of multi-stage designs and work toward optimization of power; perhaps work with an emphasis on three-stage designs.
  • Week 7: Translate the mathematical models into computer code, run data through it to verify the work so far, and keep the code extensible for future results.
  • Week 8: Depending on successes up to this point, work through models with emphasis either on abstraction to N-stage optimal power designs or with the possibility of multi-locus design.
  • Week 9: Iron out kinks in all successful work to date and outline directions for future study and application. Prepare presentation and/or write-up of final project.
  • Week 10: Wrap-up and present project to class.

Project Description

In large studies, we often find that after a fraction of the data has been collected and analyzed, the results so far suggest changing the direction of the study or refining it to maximize possible outcomes while staying cost-effective. In a large SNP association study, we might decide at a certain stage to narrow the study to the SNPs that show signs of possible causality, and thus balance the amount of work against the cost by collecting only a fraction of the SNPs during the next stage. We might also want to investigate the potential of using linkage disequilibrium to collect different SNPs in subsequent stages of the study (rather than just selecting a subset of the original SNPs). In this way we can hopefully converge iteratively to a set of causal SNPs for the study.
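As a rough illustration of the staged idea described above, the following Python sketch computes the power of a two-stage design with a joint test statistic for a single causal SNP. It is a minimal sketch under stated assumptions, not the project's actual R code: the stage z-scores are assumed independent and normal with unit variance, the function names (`single_stage_power`, `two_stage_joint_power`) and the thresholds `c1` and `c_joint` are hypothetical, and the thresholds are not calibrated to any particular false-positive rate.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def single_stage_power(ncp, c):
    """Two-sided power when all samples are genotyped in one stage.

    ncp is the expected z-score using every sample; c is the threshold.
    """
    return 1.0 - STD.cdf(c - ncp) + STD.cdf(-c - ncp)


def two_stage_joint_power(ncp, pi_samples, c1, c_joint, steps=4000):
    """P(|z1| > c1 and |z_joint| > c_joint) for one causal SNP.

    Assumptions (hypothetical, for illustration only):
      - a fraction pi_samples (strictly between 0 and 1) of the samples
        is genotyped in stage 1, the rest in stage 2
      - z1 ~ N(ncp*sqrt(pi_samples), 1), z2 ~ N(ncp*sqrt(1-pi_samples), 1)
      - z_joint = sqrt(pi_samples)*z1 + sqrt(1-pi_samples)*z2
    """
    w1, w2 = sqrt(pi_samples), sqrt(1.0 - pi_samples)
    mu1, mu2 = ncp * w1, ncp * w2

    def pass_joint_given(z1):
        # Probability over z2 that the joint statistic clears c_joint.
        upper = (c_joint - w1 * z1) / w2 - mu2
        lower = (-c_joint - w1 * z1) / w2 - mu2
        return 1.0 - STD.cdf(upper) + STD.cdf(lower)

    def integrate(a, b):
        # Midpoint rule over z1 in [a, b].
        h = (b - a) / steps
        total = 0.0
        for i in range(steps):
            z1 = a + (i + 0.5) * h
            total += STD.pdf(z1 - mu1) * pass_joint_given(z1)
        return total * h

    # Integrate over the stage-1 acceptance region, truncated to mu1 +/- 10.
    lo, hi = mu1 - 10.0, mu1 + 10.0
    power = 0.0
    if hi > c1:
        power += integrate(max(c1, lo), hi)
    if lo < -c1:
        power += integrate(lo, min(-c1, hi))
    return power
```

With `c1 = 0` every SNP passes stage 1 and the result reduces to `single_stage_power(ncp, c_joint)`; raising `c1` can only lower power for the same `c_joint`, which is the power-versus-cost trade-off a multi-stage design has to balance.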

Related Papers

Skol, et al. "Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies"
Skol, et al. "Optimal designs for two-stage genome-wide association studies"
"A Calculus for Design of Two-Stage Adaptive Procedures"

April 25 2008

Wiki page created. Project started!

April 20 2008 - April 26 2008

  1. What I did this week: Selected a project topic, looked up some of the related literature, and created my wiki page.
  2. What I plan to do next week: Go through the lecture material from the multi-stage lecture and independently re-derive the statistics presented therein. Read the relevant research papers.
  3. How what I did compared to what I planned to do: Finishing the wiki page right now, so I got it all done for this week.
  4. My grade for myself this week: A. Good work!

April 27 2008 - May 3 2008

  1. What I did this week: Went through the slides from the lecture presenting multi-stage analysis to the class, independently worked out some of the mathematics therein, and read the paper on joint analysis.
  2. What I plan to do next week: Meet with professor to discuss facets of the particular project and present ideas, discuss 3-stage designs.
  3. How what I did compared to what I planned to do: Wasn't able to meet with the professor last week, but otherwise got a good general idea of direction. So, mostly.
  4. My grade for myself this week: B+.

May 4 2008 - May 10 2008

  1. What I did this week: Read through the Joint Multi-Stage paper as well as the follow-up paper by Skol et al. including the supplemental material online regarding their derivations. Talked with professor about project design and Multi-Stage statistics.
  2. What I plan to do next week: Work on 3-stage design statistics, compare them with 4-stage designs in terms of power, and try to work in cost considerations. Write simple R or Matlab functions to calculate and compare power versus cost (or similar) diagrams and statistics for the above.
  3. How what I did compared to what I planned to do: Almost exactly.
  4. My grade for myself this week: A.

May 11 2008 - May 18 2008

  1. What I did this week: Translated the mathematics of joint statistics from the Skol papers. Verified my derivations and translations with the professor in office hours. Coded the majority of the joint statistic power calculations (including figuring out how to do double integration in R). Attempted to narrow and refine the scope of my overall project.
  2. What I plan to do next week: Work on 3-stage design statistics, compare them with 4-stage designs in terms of power, and try to work in cost considerations. Write simple R or Matlab functions to calculate and compare power versus cost (or similar) diagrams and statistics for the above. Think about an extension to indirect association or "multi-locus" study design.
  3. How what I did compared to what I planned to do: The setback of the poor mathematical and statistical notation in the papers made it difficult to get as much done as I wanted to last week. Ideally, I wanted to be on 3 stage design at this point, but instead I am just starting it.
  4. My grade for myself this week: Teasing out the math of the joint statistic and duplicating the paper results proved to be more difficult than I anticipated, but I'm pleased with what I came up with, and I should still have time to complete the majority of my project (barring any more unforeseen events). Since last week's setback was minor, and mostly not my fault, I give myself a B+/A-.

May 19 2008 - May 25 2008

  1. What I did this week: Finished the R implementation of power calculation of 2 stage studies and began to develop methods for 3 stage.
  2. What I plan to do next week: Work on 3-stage design statistics, compare them with 4-stage designs in terms of power, and try to work in cost considerations. Work on simplification of the power calculations to determine how certain approximations affect the outcome. Think about an extension to indirect association or "multi-locus" study design.
  3. How what I did compared to what I planned to do: Had a busy weekend with other homework and projects to work on, but so far I'm keeping up with my original schedule fairly well considering the lost week due to the lack of mathematical details in the original papers.
  4. My grade for myself this week: B+/A-

May 26 2008 - June 1 2008

  1. What I did this week: Finished the R implementation of power calculation of 3 stage studies and wrote up the slides on the theory and introduction. Compared the independence assumption approximation to the actual calculations.
  2. What I plan to do next week: Fine tune my results and work on optimization of maintaining single stage power in multistage designs. Finish my presentation and give it to the class on Wednesday.
  3. How what I did compared to what I planned to do: Decided to leave the "multi-locus" designs as "future work" and focus more on 3 stages (rather than 4) based on the outcomes I was getting in my calculations.
  4. My grade for myself this week: A
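
The entry above does not say which independence approximation was compared. As one hypothetical reading, the Python sketch below simulates a three-stage design and compares the simulated probability of passing all three stages with the product of the three marginal pass probabilities (which would be exact only if the stage tests were independent). The function name `three_stage_power`, the sample fractions, and the thresholds are all assumptions made for illustration, not the project's R implementation.

```python
import random
from math import sqrt


def three_stage_power(ncp, fractions, thresholds, n_sim=20000, seed=0):
    """Monte Carlo power of a three-stage design for one causal SNP.

    Assumptions (hypothetical, for illustration only):
      - fractions = (f1, f2, f3) are sample fractions summing to 1
      - stage z-scores are independent, z_k ~ N(ncp*sqrt(f_k), 1)
      - stage 1 tests z1, stage 2 tests the joint statistic of stages
        1 and 2, and stage 3 tests the joint statistic of all stages
    Returns (exact, approx): the simulated probability of passing every
    stage, and the product of the marginal pass probabilities.
    """
    rng = random.Random(seed)
    f1, f2, f3 = fractions
    c1, c2, c3 = thresholds
    n_pass1 = n_pass2 = n_pass3 = n_pass_all = 0
    for _ in range(n_sim):
        z1 = rng.gauss(ncp * sqrt(f1), 1.0)
        z2 = rng.gauss(ncp * sqrt(f2), 1.0)
        z3 = rng.gauss(ncp * sqrt(f3), 1.0)
        # Joint statistics, each scaled to unit variance.
        joint12 = (sqrt(f1) * z1 + sqrt(f2) * z2) / sqrt(f1 + f2)
        joint123 = sqrt(f1) * z1 + sqrt(f2) * z2 + sqrt(f3) * z3
        pass1 = abs(z1) > c1
        pass2 = abs(joint12) > c2
        pass3 = abs(joint123) > c3
        n_pass1 += pass1
        n_pass2 += pass2
        n_pass3 += pass3
        n_pass_all += pass1 and pass2 and pass3
    exact = n_pass_all / n_sim
    approx = (n_pass1 / n_sim) * (n_pass2 / n_sim) * (n_pass3 / n_sim)
    return exact, approx


exact, approx = three_stage_power(4.0, (0.3, 0.3, 0.4), (1.0, 2.0, 3.5))
print(f"simulated power: {exact:.3f}, independence approximation: {approx:.3f}")
```

Because the joint statistics reuse the earlier stages' data, the stage tests are positively correlated, so in this setup the product of marginals understates the simulated power.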

June 2 2008 - June 7 2008

  1. What I did this week: Completed my project by doing all the analysis of 2 stage and 3 stage studies and their respective approximations, finished the presentation and presented it in class and posted it on the Wiki, and wrote the corresponding paper.
  2. What I plan to do next week: RELAX!
  3. How what I did compared to what I planned to do: Ended up writing a bit more in the paper than I anticipated, but it works out, and I am proud of it.
  4. My grade for myself this week: A+