Estimation of the polyphyletic SNP rate using HapMap data

Evaluation of power for linkage disequilibrium mapping
LD mapping for case-control association studies.
N: number of samples
M: number of SNPs
An N x M table is observed.
M tests are applied to the N x M table.
Observations in the shape of an N x M table can be expressed in a df-dimensional space.
All df=1 tests on the table are directions in that space.
When the cut-off value of the null-hypothesis test is equal to the value of the
alternative hypothesis, power is 0.5.
Figure labels: Infinite ... i.e. df=2. Power = probability of the blue area < 0.5. Probability of the blue area = 0.5. Axis: Statistics.
Power is higher when df is larger: the case of df=2.
Figure labels: Alternative hypothesis; Cut-off; Truth; Surrogate. Power = probability of the blue area > 0.5. Probability of the blue area = 0.5. Axes: Power against the angle between “TRUTH” and “SURROGATE”.
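The claim that power rises with df can be checked numerically with a small Monte Carlo sketch. It rests on assumptions that the poster does not state explicitly: observations are standard normal around a mean that sits exactly at the cut-off distance along the truth direction, every direction in the space is tested with the same fixed cut-off (so the acceptance region is a ball), and no multiple-testing correction is applied. The function name `power_all_directions` and the cut-off 1.96 are illustrative choices, not taken from the poster.

```python
import random

def power_all_directions(df, cutoff, n_draws=50000, seed=0):
    """Monte Carlo power when every direction of a df-dimensional space is tested.

    Assumption: the acceptance region is the ball of radius `cutoff`, and the
    alternative hypothesis sits at distance `cutoff` along the first axis.
    """
    rng = random.Random(seed)
    rejected = 0
    for _ in range(n_draws):
        # First coordinate lies along the truth; its mean equals the cut-off.
        squared_norm = (cutoff + rng.gauss(0.0, 1.0)) ** 2
        # Remaining coordinates are pure noise.
        for _ in range(df - 1):
            squared_norm += rng.gauss(0.0, 1.0) ** 2
        if squared_norm > cutoff ** 2:
            rejected += 1
    return rejected / n_draws

cutoff = 1.96  # hypothetical cut-off, chosen only for illustration
powers_by_df = {df: power_all_directions(df, cutoff) for df in (1, 2, 4, 8)}
for df, power in powers_by_df.items():
    print(f"df={df}: power ~ {power:.3f}")
```

Under these assumptions the df=1 case stays near 0.5, and each extra dimension adds a non-negative term to the squared distance, so the estimated power grows with df. If the cut-off were instead adjusted for df, the picture would differ.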
Dimensions (= dfs), power, and the best angle to maximize power
Figure labels: Two surrogates; Direction of alternative hypothesis. Axis tick: π/2.
Power is measured geometrically when a surrogate marker is used.
Power is the sum of the probabilities of observations
outside the 3-dimensional figures.
Visual image of the borders when df=3
One surrogate
When a surrogate marker is used for the test, the cut-off line
is oblique to the direction of the true alternative hypothesis.
Figure labels: power = 0.5. Axis: Statistics.
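The effect of an oblique cut-off line can be sketched with a one-line formula. Assume, as an illustration rather than the poster's stated model, that the observation is standard normal around a mean of length `effect` along the truth, that the surrogate test is the projection onto a unit vector at angle `theta` from the truth, and that only the upper tail is counted. The projected mean is then `effect * cos(theta)`, and the power is the normal tail beyond the cut-off. The names `power_one_surrogate`, `effect` and `theta` are hypothetical.

```python
import math
from statistics import NormalDist

def power_one_surrogate(theta, cutoff, effect):
    """One-sided power of a df=1 test at angle `theta` from the truth.

    Assumption: unit-variance normal noise, so the test statistic along the
    surrogate direction is N(effect * cos(theta), 1).
    """
    projected_mean = effect * math.cos(theta)
    return 1.0 - NormalDist().cdf(cutoff - projected_mean)

cutoff = 1.96  # hypothetical cut-off, chosen only for illustration
power_truth = power_one_surrogate(0.0, cutoff, effect=cutoff)
power_oblique = power_one_surrogate(math.pi / 4, cutoff, effect=cutoff)
print(power_truth, power_oblique)
```

With the effect equal to the cut-off, testing in the truth direction gives power 0.5, and any nonzero angle shortens the projected mean and pushes the power below 0.5, which matches the single-surrogate picture.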
TAKE-HOME MESSAGES
(1) The dimension of the data affects the way surrogate markers influence
the power of tests.
(2) When the dimension is higher, weaker correlation between surrogate
markers and the true marker increases the power.
(3) When surrogate markers correlate with the true marker with higher
variation in strength and direction, the power gets higher.
Comparison of powers when 1, 2 or infinitely many
surrogate markers are tested together with
the test in the direction of the alternative
hypothesis.
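For the two-surrogate case, a rough simulation can show how power depends on the angle and where a best angle appears. The sketch assumes df=2, two surrogate directions placed symmetrically at +theta and -theta from the truth, a shared fixed two-sided cut-off with no multiple-testing correction, and a true effect equal to the cut-off. It tests the two surrogates only, not the truth direction itself. None of these settings is confirmed by the poster, and `power_two_surrogates` is a hypothetical helper.

```python
import math
import random

def power_two_surrogates(theta, cutoff, draws):
    """Fraction of draws rejected by at least one of two df=1 tests.

    Assumption: the tests point at angles +theta and -theta from the truth,
    and the alternative mean is (cutoff, 0) with unit-variance normal noise.
    """
    u1 = (math.cos(theta), math.sin(theta))
    u2 = (math.cos(theta), -math.sin(theta))
    rejected = 0
    for z1, z2 in draws:
        x, y = cutoff + z1, z2
        if abs(u1[0] * x + u1[1] * y) > cutoff or abs(u2[0] * x + u2[1] * y) > cutoff:
            rejected += 1
    return rejected / len(draws)

rng = random.Random(0)
# Common random numbers across angles keep the power curve smooth.
draws = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(20000)]
cutoff = 1.96  # hypothetical cut-off, chosen only for illustration
angles = [i * math.pi / 36 for i in range(19)]  # 0 to pi/2 in 5-degree steps
powers = [power_two_surrogates(a, cutoff, draws) for a in angles]
best_angle = angles[powers.index(max(powers))]
print(f"best angle ~ {math.degrees(best_angle):.0f} degrees, power ~ {max(powers):.3f}")
```

In this setting the power starts near 0.5 at angle 0, rises for moderate angles because the two tests cover different parts of the space, and falls towards the type I error rate as the surrogates approach a right angle with the truth.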
Power when the null hypothesis is tested against an
alternative hypothesis of df=1,
e.g. the additive test on 2x3 tables in case-control association studies with SNPs.
Figure axes: Statistics; Probability; Cumulative Prob.
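As a concrete instance of a df=1 test on a 2x3 table, the sketch below computes a trend statistic. It assumes that the "additive test" means the Cochran-Armitage trend test with genotype scores 0, 1, 2, which the poster does not spell out. The genotype counts are made up for illustration, and `additive_trend_chi2` is a hypothetical helper name.

```python
def additive_trend_chi2(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend chi-square (df=1) for a 2x3 genotype table.

    `cases` and `controls` hold genotype counts in the order of `scores`.
    Assumes both genotype variation and both phenotype groups are present,
    otherwise the variance is zero.
    """
    n_cases = sum(cases)
    n_controls = sum(controls)
    n_total = n_cases + n_controls
    columns = [r + s for r, s in zip(cases, controls)]
    # Score-weighted difference between case and control counts.
    trend = sum(t * (r * n_controls - s * n_cases)
                for t, r, s in zip(scores, cases, controls))
    sum_t = sum(t * c for t, c in zip(scores, columns))
    sum_t2 = sum(t * t * c for t, c in zip(scores, columns))
    variance = n_cases * n_controls / n_total * (n_total * sum_t2 - sum_t ** 2)
    return trend ** 2 / variance

# Hypothetical genotype counts (AA, Aa, aa), not data from the poster.
chi2 = additive_trend_chi2(cases=(30, 50, 20), controls=(45, 45, 10))
print(f"trend chi-square = {chi2:.4f}")
```

The score vector (0, 1, 2) is what fixes the direction of this test in the df=2 space of the 2x3 table. Other score vectors, such as dominant or recessive codings, would correspond to other directions.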
Genetic factors of various diseases have been studied with
SNP chips, and next-generation sequencing technology is
now being used for the same purpose. Although polymorphic
and potentially causal markers are distributed along the
chromosomes with relatively even density in studies
with SNP chips and sequence data, the number of markers,
their density and the linkage disequilibrium pattern
in each gene or locus vary substantially.
In this poster we propose a novel method that handles
multiple marker tests for a set of categorical phenotypes in
the context of geometric statistics, so that we are able to
estimate power when multiple tests are applied to identify
association between a locus with multiple markers and a
phenotype that might have multiple surrogate
phenotype-related criteria. First, we introduced a method to
define multiple tests in a higher-dimensional space.
Second, we studied power when one surrogate marker
was applied, using our geometric approach. Third, we
designed models of multiple-test conditions and
evaluated the relation between power and the
pattern of linkage disequilibrium. The results suggested that
some loci with higher variation in linkage disequilibrium
might have several times higher power than less variable
loci.
Basics 1: Multiple df=1 tests on NxM table
Power when surrogate marker is used.
When a table of df=2 is tested with a test of df=1:
Figure axes: Statistics; Probability
We appreciate any comments or questions on this poster: [email protected]
Yamada, R., Terao, C., Kawaguchi, T., Narahara, M.
(2) Unit of Statistical Genetics, Center for Genomic Medicine, Kyoto University, Kyoto, Japan
September 18-20, 2011, Strasbourg, Germany (IGES 2011)
SUMMARY
Basics 2: p-value, power, df=1
Figure legend: horizontal lines and curves for dfs 64, 32, 16, 8, 4 and 2; panel label: df=4. Axes: Power against Angle.
Finer coverage with more markers increases power.
The best angle gets wider as the number of markers increases.
With higher df (= dimension), power increases and the best
angle gets wider.
Figure axis: angle between “TRUTH” and “SURROGATE”, with a tick at π/2; label: Truth.
Multi-marker coverage and LD plot
Best angle to increase power