R을 이용한 통계 프로그래밍 기초 제10장 분산분석

투정이 2009. 3. 30. 00:35

2009. 3. 30. 00:35

프로그램 소스

# R을 이용한 통계 프로그래밍 기초

# 제10장 분산분석법

# 10.1 일원배치 분산분석

# 예 10.1 세 종류의 건전지 수명에 차이가?

a <- c(100, 96, 98, 96, 92) # 자료의 입력
b <- c(76, 80, 84, 84, 78)
c <- c(108, 100, 101, 98, 96)

life <- data.frame(a, b, c) # 3 벡터를 이용하여 데이터 프레임 만들기
b.life = stack(life) # 세 변수를 한 변수로 쌓기
b.life
jpeg('10_01.jpg')
op = par(mfrow=c(1,2)) # 그래프 그릴 프레임 정의
boxplot(values ~ ind, data = b.life)
stripchart(life)
par(op)
dev.off()
oneway.test(values ~ ind, data = b.life, var.equal = TRUE) # one-way analysis with equal variance

# 예 10.2 : 예10.1의 자료를 aov() 이용하여 분산분석 실시, TukeyHSD()를 이용하여 다중비교

type = c(rep('a',5), rep('b',5), rep('c',5))
y = c(100, 96, 98, 96, 92, 76, 80, 84, 84, 78, 108, 100, 101, 98, 96) # 자료입력
ty = as.factor(type) # type을 요인변수로 만들기
life.aov = aov(y ~ ty) # 일원분산분석
summary(life.aov) # 분산분석 결과 요약보고
life.tukey = TukeyHSD(life.aov, "ty", ordered = TRUE) # 다중비교
life.tukey # 다중비교 결과 보기
plot(life.tukey) # 다중비교 결과 그래프로 보기

pairwise.t.test(y, ty, p.adjust = 'none', pool.sd = TRUE) # LSD 다중비교

# 10.2 이요인 분산분석

# 예 10.3 디자인 종류와 광고 여부에 따른 매출액 차이?

design = as.factor(c('a', 'b', 'c', 'a', 'b', 'c')) # 자료입력, 요인변수로 만들기
ad = as.factor(c(1, 1, 1, 0, 0, 0)) # 자료입력, 요인변수로 만들기
y = c(23, 15, 18, 16, 9, 11) # 측정치
a = aov(y ~ design + ad) # 이요인 분산분석, 1 반복이라 no interaction
a
summary(a)

# 예 10.4 온도와 압력에 따른 반응값의 차이? 2반복 실험

pressure = as.factor(c(200, 220, 240, 200, 220, 240, 200, 220, 240, 200, 220, 240))
temp = as.factor(c(rep('low',6),rep('high',6)))
y = c(90.4, 90.7, 90.2,
90.2, 90.1, 90.4,
92.2, 91.6, 90.5,
93.7, 91.8, 92.8)
jpeg('10_02.jpg')
op = par(mfrow=c(2,2))
plot(y ~ temp)
plot(y ~ pressure)
stripchart(y ~ temp, vertical = TRUE, xlab = "temperature")
stripchart(y ~ pressure, vertical = TRUE, xlab = "pressure")
par(op)
dev.off()

jpeg('10_03.jpg')
op = par(mfrow = c(1,2))
interaction.plot(temp, pressure, y, bty = 'l', main = 'interaction plot')
interaction.plot(pressure, temp, y, bty = 'l')
par(op)
dev.off()

aov_pt = aov(y ~ temp + pressure + temp:pressure)
aov_pt
summary(aov_pt)

프로그램 실행 결과

> # R을 이용한 통계 프로그래밍 기초
>
> # 제10장 분산분석법
>
> # 10.1 일원배치 분산분석
>
> # 예 10.1 세 종류의 건전지 수명에 차이가?
>
> a <- c(100, 96, 98, 96, 92) # 자료의 입력
> b <- c(76, 80, 84, 84, 78)
> c <- c(108, 100, 101, 98, 96)
>
> life <- data.frame(a, b, c) # 3 벡터를 이용하여 데이터 프레임 만들기
> b.life = stack(life) # 세 변수를 한 변수로 쌓기
> b.life
values ind
1 100 a
2 96 a
3 98 a
4 96 a
5 92 a
6 76 b
7 80 b
8 84 b
9 84 b
10 78 b
11 108 c
12 100 c
13 101 c
14 98 c
15 96 c
> jpeg('10_01.jpg')
> op = par(mfrow=c(1,2)) # 그래프 그릴 프레임 정의
> boxplot(values ~ ind, data = b.life)
> stripchart(life)
> par(op)
> dev.off()
null device
1

> oneway.test(values ~ ind, data = b.life, var.equal = TRUE) # one-way analysis with equal variance

One-way analysis of means

data: values and ind
F = 40.1934, num df = 2, denom df = 12, p-value = 4.802e-06

>
> # 예 10.2 : 예10.1의 자료를 aov() 이용하여 분산분석 실시, TukeyHSD()를 이용하여 다중비교
>
> type = c(rep('a',5), rep('b',5), rep('c',5))
> y = c(100, 96, 98, 96, 92, 76, 80, 84, 84, 78, 108, 100, 101, 98, 96) # 자료입력
> ty = as.factor(type) # type을 요인변수로 만들기
> life.aov = aov(y ~ ty) # 일원분산분석
> summary(life.aov) # 분산분석 결과 요약보고
Df Sum Sq Mean Sq F value Pr(>F)
ty 2 1136.13 568.07 40.193 4.802e-06 ***
Residuals 12 169.60 14.13
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> life.tukey = TukeyHSD(life.aov, "ty", ordered = TRUE) # 다중비교
> life.tukey # 다중비교 결과 보기
Tukey multiple comparisons of means
95% family-wise confidence level
factor levels have been ordered

Fit: aov(formula = y ~ ty)

$ty
diff lwr upr p adj
a-b 16.0 9.65669 22.34331 0.0000580
c-b 20.2 13.85669 26.54331 0.0000056
c-a 4.2 -2.14331 10.54331 0.2220442

> plot(life.tukey) # 다중비교 결과 그래프로 보기
>
> pairwise.t.test(y, ty, p.adjust = 'none', pool.sd = TRUE) # LSD 다중비교

Pairwise comparisons using t tests with pooled SD

data: y and ty

a b
b 2.1e-05 -
c 0.10 2.0e-06

P value adjustment method: none
>
> # 10.2 이요인 분산분석
>
> # 예 10.3 디자인 종류와 광고 여부에 따른 매출액 차이?
>
> design = as.factor(c('a', 'b', 'c', 'a', 'b', 'c')) # 자료입력, 요인변수로 만들기
> ad = as.factor(c(1, 1, 1, 0, 0, 0)) # 자료입력, 요인변수로 만들기
> y = c(23, 15, 18, 16, 9, 11) # 측정치
> a = aov(y ~ design + ad) # 이요인 분산분석, 1 반복이라 no interaction
> a
Call:
aov(formula = y ~ design + ad)

Terms:
design ad Residuals
Sum of Squares 58.33333 66.66667 0.33333
Deg. of Freedom 2 1 2

Residual standard error: 0.4082483
Estimated effects may be unbalanced
> summary(a)
Df Sum Sq Mean Sq F value Pr(>F)
design 2 58.333 29.167 175 0.005682 **
ad 1 66.667 66.667 400 0.002491 **
Residuals 2 0.333 0.167
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> # 예 10.4 온도와 압력에 따른 반응값의 차이? 2반복 실험
>
> pressure = as.factor(c(200, 220, 240, 200, 220, 240, 200, 220, 240, 200, 220, 240))
> temp = as.factor(c(rep('low',6),rep('high',6)))
> y = c(90.4, 90.7, 90.2,
+ 90.2, 90.1, 90.4,
+ 92.2, 91.6, 90.5,
+ 93.7, 91.8, 92.8)
> jpeg('10_02.jpg')
> op = par(mfrow=c(2,2))
> plot(y ~ temp)
> plot(y ~ pressure)
> stripchart(y ~ temp, vertical = TRUE, xlab = "temperature")
> stripchart(y ~ pressure, vertical = TRUE, xlab = "pressure")
> par(op)
> dev.off()
windows
2
>

> jpeg('10_03.jpg')
> op = par(mfrow = c(1,2))
> interaction.plot(temp, pressure, y, bty = 'l', main = 'interaction plot')
> interaction.plot(pressure, temp, y, bty = 'l')
> par(op)
> dev.off()
windows
2
>

> aov_pt = aov(y ~ temp + pressure + temp:pressure)
> aov_pt
Call:
aov(formula = y ~ temp + pressure + temp:pressure)

Terms:
temp pressure temp:pressure Residuals
Sum of Squares 9.363333 1.011667 1.171667 4.010000
Deg. of Freedom 1 2 2 6

Residual standard error: 0.8175166
Estimated effects may be unbalanced
> summary(aov_pt)
Df Sum Sq Mean Sq F value Pr(>F)
temp 1 9.3633 9.3633 14.0100 0.009589 **
pressure 2 1.0117 0.5058 0.7569 0.509201
temp:pressure 2 1.1717 0.5858 0.8766 0.463473
Residuals 6 4.0100 0.6683
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
>

'Programming > R' 카테고리의 다른 글

R for SAS and SPSS Users 5. Installing and Updating R (0)	2010.12.02
R for SAS ans SPSS Users 1장 ~ 4장 (0)	2010.12.01
R을 이용한 통계 프로그래밍 기초 제9장 회귀 분석 (4)	2009.03.26
R을 이용한 통계 프로그래밍 기초 제8장 카이제곱 검정 (4)	2009.03.26
R을 이용한 통계 프로그래밍 기초 제7장 유의성 검정 (3)	2009.03.25

Let's Study Animal Breeding

R을 이용한 통계 프로그래밍 기초 제10장 분산분석

'Programming > R' 카테고리의 다른 글

+ Recent posts

티스토리툴바