5/22/2011

Validation of an AIDS dataset

There is a recent post on anti-AIDS drugs, "Anti-AIDS drugs cut HIV transmission by 96 percent". The figures it reports form a standard discrete (count) data configuration, so I ran some analysis based on the data quoted from the post.

"The study enrolled 1,763 couples in which only one partner was infected. Those infected included 890 men and 873 women; 97 percent of the couples were heterosexual. "


"Among the 28 partners who developed new infections, 27 were in the study arm where treatment was deferred."


This is a classical multinomial design that can be arranged as the following 2x2 table.

                     Infected   Not infected   Total
Treatment                   1            881     882
Deferred treatment         27            854     881
Total                      28           1735    1763

Using Fisher's exact test, we get the following output from R:

Fisher's Exact Test for Count Data

data:  data
p-value = 9.125e-08
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.0008805044 0.2190549273
sample estimates:
odds ratio
0.03593919

The odds ratio is 0.03593919 with a 95% confidence interval of (0.0008805044, 0.2190549273). So the odds of the partner becoming infected when the infected person is treated immediately are about 0.036 times the odds under deferred treatment. Because new infections are rare in both arms (1 of 882 and 27 of 881), the odds ratio is approximately equal to the risk ratio. We therefore arrive at the conclusion in the title: taking the antiviral drugs immediately lowers the risk of transmitting HIV to the partner by around 96%.
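The step from odds ratio to risk ratio can be checked directly from the table. The following is a minimal sketch in base R; the variable names (infected, arm_size, and so on) are my own and are not part of the original script.

```r
# Counts taken from the table: new partner infections and arm sizes
infected <- c(treatment = 1, deferred = 27)
arm_size <- c(treatment = 882, deferred = 881)

# Risk = infected / total; odds = infected / not infected
risk <- infected / arm_size
odds <- infected / (arm_size - infected)

# Ratios of immediate treatment relative to deferred treatment
risk_ratio <- unname(risk["treatment"] / risk["deferred"])
odds_ratio <- unname(odds["treatment"] / odds["deferred"])

# Risk ratio, sample odds ratio, and the implied relative risk reduction
round(c(risk_ratio = risk_ratio,
        odds_ratio = odds_ratio,
        reduction  = 1 - risk_ratio), 4)
```

The two ratios come out at about 0.037 and 0.036, so the relative risk reduction is roughly 96% either way. The sample odds ratio computed here is the simple cross-product ratio; fisher.test reports the conditional maximum likelihood estimate instead, which is why its value differs slightly in the later decimals.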


# TODO: AIDS
# 
# Author: Roger Everett
###############################################################################

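# setwd('AIDS treatment')   # not needed: the script reads no files
N = 1763                    # couples enrolled

new_infection = 28          # partners newly infected during the study
deferred_treatment = 27     # of those, infections in the deferred-treatment arm
immediate_treatment = new_infection - deferred_treatment

# Arm sizes as in the table above: 882 treated immediately, 881 deferred
n_treatment = 882
n_deferred = N - n_treatment

# matrix() fills column by column: infected counts first, then not infected
data = matrix(c(immediate_treatment, deferred_treatment,
                n_treatment - immediate_treatment,
                n_deferred - deferred_treatment), nrow=2)
colnames(data) = c('infected', 'notinfected')
rownames(data) = c('treatment', 'deferredtreatment')

fisher.test(data)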
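The figures quoted in the text can also be read from the test object instead of being copied from the printed output. This is a small sketch, assuming the same four counts as in the table; the names tab and res are mine.

```r
# Same 2x2 table as above, filled column by column
tab <- matrix(c(1, 27, 881, 854), nrow = 2,
              dimnames = list(c("treatment", "deferredtreatment"),
                              c("infected", "notinfected")))

# fisher.test returns an "htest" object whose components can be accessed by name
res <- fisher.test(tab)

res$p.value    # two-sided p-value
res$estimate   # conditional MLE of the odds ratio
res$conf.int   # 95 percent confidence interval for the odds ratio
```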
