blupf90을 이용하여 다형질 개체 모형(다변량 개체 모형)의 육종가를 구하는 것을 아래에서 설명한 바 있다.
2014/02/25 - [Animal Breeding/BLUPF90] - blupf90으로 다형질 모형(same model and no missing record)의 육종가 구하기
2014/03/12 - [Animal Breeding/BLUPF90] - blupf90으로 다형질 모형(same model with missing record)의 육종가 구하기
2014/03/12 - [Animal Breeding/BLUPF90] - blupf90으로 다형질 모형(different model)의 육종가 구하기
2014/03/13 - [Animal Breeding/BLUPF90] - blupf90으로 다형질 모형(no environmental covariance)의 육종가 구하기
여기서는 blupf90이 구한 결과를 R로 정리하는 것을 설명한다.
blupf90은 육종가(breeding value)와 육종가의 SE(SEP : Standard Error of Prediction)를 제공하는데 SE를 이용하여 정확도(accuracy)를 구해야 한다. 또한 renumber된 개체 ID이기 때문에 원래 ID를 찾아 주어야 한다.
blupf90이 구한 solutions는 다음과 같다.
trait/effect level solution s.e.
1 1 1 4.36086700 4.87905017
2 1 1 6.79989768 5.55495970
1 1 2 3.39726169 5.65657865
2 1 2 5.88029605 6.20348910
1 2 1 -0.31611757 4.13703273
2 2 1 -0.47898372 5.60031728
1 2 2 -0.01023894 4.06259227
2 2 2 -0.01267064 5.45205614
1 2 3 0.27580827 4.14143092
2 2 3 0.51723842 5.61277604
1 2 4 0.24375552 4.03539120
2 2 4 0.39196151 5.39995298
1 2 5 -0.27033146 4.06707098
2 2 5 -0.47783034 5.46485295
1 2 6 0.15091558 4.31332061
2 2 6 0.27959802 5.99180990
1 2 7 -0.01539252 4.42677932
2 2 7 -0.00761009 6.22636579
1 2 8 -0.07839191 4.22999837
2 2 8 -0.17034149 5.81371170
renumber된 혈통과 원래 ID는 renadd02.ped에 나와 있다.
1 2 5 1 0 2 1 0 0 7
7 0 0 3 0 0 0 0 2 2
2 6 0 2 0 1 1 1 0 4
3 6 7 1 0 2 1 0 1 6
6 0 0 3 0 0 0 2 0 1
4 8 3 1 0 2 1 0 0 8
8 0 0 3 0 0 0 2 0 3
5 8 7 1 0 2 1 0 1 5
1, 2, 3열이 리넘버된 혈통이고 10열이 원래 ID이다.
다형질 개체 모형(multiple trait animal model)에 대한 blupf90의 결과로부터 정확도를 계산하고 원래 ID를 붙이는 R 프로그램이다.
# multiple traits animal model의 blupf90의 분석 결과를 R로 정리하기 - Date :2020-07-02
# blupf90의 육종가와 SE(standard error of prediction)읽어
#정확도를 계산하고, 원래 ID를 붙임
# set working directory
setwd("D:/temp/04_multiple_traits_01_blupf90")
# print working directory
getwd()
# sigma_a
sigma_a_t1 = 20
sigma_a_t2 = 40
sigma_a_t1
sigma_a_t2
# read solutions
solutions = read.table("solutions", skip = 1, sep = "", col.names = c("trait", "effect", "level", "solution", "se"), stringsAsFactors = FALSE)
solutions
# read renumbered pedigree
pedi = read.table("renadd02.ped", sep = "", stringsAsFactors = FALSE)
pedi
# read renumbered data
data = read.table("renf90.dat", sep = "", stringsAsFactors = FALSE)
data
# read fixed solutions from solutions
sol_fixed_t1 = solutions[(solutions$trait == 1 & solutions$effect == 1), 4]
sol_fixed_t1
sol_fixed_t2 = solutions[(solutions$trait == 2 & solutions$effect == 1), 4]
sol_fixed_t2
# read animal solutions from solutions
sol_animal_t1 = solutions[(solutions$trait == 1 & solutions$effect == 2), 3:5]
rownames(sol_animal_t1) = sol_animal_t1[,1]
sol_animal_t1
sol_animal_t2 = solutions[(solutions$trait == 2 & solutions$effect == 2), 3:5]
rownames(sol_animal_t2) = sol_animal_t2[,1]
sol_animal_t2
# reliability
rel_t1 = 1 - (sol_animal_t1$se^2 / sigma_a_t1)
rel_t1
rel_t2 = 1 - (sol_animal_t2$se^2 / sigma_a_t2)
rel_t2
# accuracy
acc_t1 = sqrt(rel_t1)
acc_t1
acc_t2 = sqrt(rel_t2)
acc_t2
# 육종가 분할하기, DYD 구하기 생략
# result
mt01_result = data.frame(animal = sol_animal_t1$level, bv_t1 = sol_animal_t1$solution, rel_t1 = rel_t1, acc_t1 = acc_t1, sep_t1 = sol_animal_t1$se,
bv_t2 = sol_animal_t2$solution, rel_t2 = rel_t2, acc_t2 = acc_t2, sep_t2 = sol_animal_t2$se)
mt01_result
# join original animal id
aid_renum = pedi[,c(1,10)]
colnames(aid_renum) = c("animal", "orig_aid")
aid_renum
# 조인하기 위한 패키지 실행
#install.packages("plyr")
if (!("plyr" %in% installed.packages())){
install.packages('plyr', dependencies = TRUE)
}
library(plyr)
mt01_final = join(x = mt01_result, y = aid_renum, by = "animal", type = "left")
mt01_final = mt01_final[order(mt01_final$orig_aid),]
mt01_final
# 파일로 출력, 분리자는 ",", 따옴표 없고
output_filename = gsub("[ -]", "", paste("mt01_result_", Sys.Date(), ".csv"))
write.table(mt01_final, output_filename, sep=",", row.names = FALSE, quote = FALSE)
다음은 실행 결과이다.
> # multiple traits animal model의 blupf90의 분석 결과를 R로 정리하기 - Date :2020-07-02
>
> # blupf90의 육종가와 SE(standard error of prediction)읽어
>
> #정확도를 계산하고, 원래 ID를 붙임
>
> # set working directory
>
> setwd("D:/temp/04_multiple_traits_01_blupf90")
>
> # print working directory
>
> getwd()
[1] "D:/temp/04_multiple_traits_01_blupf90"
>
> # sigma_a
>
> sigma_a_t1 = 20
> sigma_a_t2 = 40
>
> sigma_a_t1
[1] 20
> sigma_a_t2
[1] 40
>
> # read solutions
>
> solutions = read.table("solutions", skip = 1, sep = "", col.names = c("trait", "effect", "level", "solution", "se"), stringsAsFactors = FALSE)
>
> solutions
trait effect level solution se
1 1 1 1 4.36086700 4.879050
2 2 1 1 6.79989768 5.554960
3 1 1 2 3.39726169 5.656579
4 2 1 2 5.88029605 6.203489
5 1 2 1 -0.31611757 4.137033
6 2 2 1 -0.47898372 5.600317
7 1 2 2 -0.01023894 4.062592
8 2 2 2 -0.01267064 5.452056
9 1 2 3 0.27580827 4.141431
10 2 2 3 0.51723842 5.612776
11 1 2 4 0.24375552 4.035391
12 2 2 4 0.39196151 5.399953
13 1 2 5 -0.27033146 4.067071
14 2 2 5 -0.47783034 5.464853
15 1 2 6 0.15091558 4.313321
16 2 2 6 0.27959802 5.991810
17 1 2 7 -0.01539252 4.426779
18 2 2 7 -0.00761009 6.226366
19 1 2 8 -0.07839191 4.229998
20 2 2 8 -0.17034149 5.813712
>
> # read renumbered pedigree
>
> pedi = read.table("renadd02.ped", sep = "", stringsAsFactors = FALSE)
>
> pedi
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1 1 2 5 1 0 2 1 0 0 7
2 7 0 0 3 0 0 0 0 2 2
3 2 6 0 2 0 1 1 1 0 4
4 3 6 7 1 0 2 1 0 1 6
5 6 0 0 3 0 0 0 2 0 1
6 4 8 3 1 0 2 1 0 0 8
7 8 0 0 3 0 0 0 2 0 3
8 5 8 7 1 0 2 1 0 1 5
>
> # read renumbered data
>
> data = read.table("renf90.dat", sep = "", stringsAsFactors = FALSE)
>
> data
V1 V2 V3 V4
1 4.5 6.8 1 2
2 2.9 5.0 2 5
3 3.9 6.8 2 3
4 3.5 6.0 1 1
5 5.0 7.5 1 4
>
> # read fixed solutions from solutions
>
> sol_fixed_t1 = solutions[(solutions$trait == 1 & solutions$effect == 1), 4]
>
> sol_fixed_t1
[1] 4.360867 3.397262
>
> sol_fixed_t2 = solutions[(solutions$trait == 2 & solutions$effect == 1), 4]
>
> sol_fixed_t2
[1] 6.799898 5.880296
>
> # read animal solutions from solutions
>
> sol_animal_t1 = solutions[(solutions$trait == 1 & solutions$effect == 2), 3:5]
>
> rownames(sol_animal_t1) = sol_animal_t1[,1]
>
> sol_animal_t1
level solution se
1 1 -0.31611757 4.137033
2 2 -0.01023894 4.062592
3 3 0.27580827 4.141431
4 4 0.24375552 4.035391
5 5 -0.27033146 4.067071
6 6 0.15091558 4.313321
7 7 -0.01539252 4.426779
8 8 -0.07839191 4.229998
>
> sol_animal_t2 = solutions[(solutions$trait == 2 & solutions$effect == 2), 3:5]
>
> rownames(sol_animal_t2) = sol_animal_t2[,1]
>
> sol_animal_t2
level solution se
1 1 -0.47898372 5.600317
2 2 -0.01267064 5.452056
3 3 0.51723842 5.612776
4 4 0.39196151 5.399953
5 5 -0.47783034 5.464853
6 6 0.27959802 5.991810
7 7 -0.00761009 6.226366
8 8 -0.17034149 5.813712
>
> # reliability
>
> rel_t1 = 1 - (sol_animal_t1$se^2 / sigma_a_t1)
>
> rel_t1
[1] 0.14424801 0.17476720 0.14242750 0.18578089 0.17294668 0.06976327 0.02018124 0.10535569
>
> rel_t2 = 1 - (sol_animal_t2$se^2 / sigma_a_t2)
>
> rel_t2
[1] 0.21591116 0.25687710 0.21241863 0.27101270 0.25338456 0.10245535 0.03080923 0.15501891
>
> # accuracy
>
> acc_t1 = sqrt(rel_t1)
>
> acc_t1
[1] 0.3798000 0.4180517 0.3773957 0.4310231 0.4158686 0.2641274 0.1420607 0.3245854
>
> acc_t2 = sqrt(rel_t2)
>
> acc_t2
[1] 0.4646624 0.5068304 0.4608890 0.5205888 0.5033732 0.3200865 0.1755256 0.3937244
>
> # 육종가 분할하기, DYD 구하기 생략
>
> # result
>
> mt01_result = data.frame(animal = sol_animal_t1$level, bv_t1 = sol_animal_t1$solution, rel_t1 = rel_t1, acc_t1 = acc_t1, sep_t1 = sol_animal_t1$se,
+ bv_t2 = sol_animal_t2$solution, rel_t2 = rel_t2, acc_t2 = acc_t2, sep_t2 = sol_animal_t2$se)
>
>
> mt01_result
animal bv_t1 rel_t1 acc_t1 sep_t1 bv_t2 rel_t2 acc_t2 sep_t2
1 1 -0.31611757 0.14424801 0.3798000 4.137033 -0.47898372 0.21591116 0.4646624 5.600317
2 2 -0.01023894 0.17476720 0.4180517 4.062592 -0.01267064 0.25687710 0.5068304 5.452056
3 3 0.27580827 0.14242750 0.3773957 4.141431 0.51723842 0.21241863 0.4608890 5.612776
4 4 0.24375552 0.18578089 0.4310231 4.035391 0.39196151 0.27101270 0.5205888 5.399953
5 5 -0.27033146 0.17294668 0.4158686 4.067071 -0.47783034 0.25338456 0.5033732 5.464853
6 6 0.15091558 0.06976327 0.2641274 4.313321 0.27959802 0.10245535 0.3200865 5.991810
7 7 -0.01539252 0.02018124 0.1420607 4.426779 -0.00761009 0.03080923 0.1755256 6.226366
8 8 -0.07839191 0.10535569 0.3245854 4.229998 -0.17034149 0.15501891 0.3937244 5.813712
>
> # join original animal id
>
> aid_renum = pedi[,c(1,10)]
>
> colnames(aid_renum) = c("animal", "orig_aid")
>
> aid_renum
animal orig_aid
1 1 7
2 7 2
3 2 4
4 3 6
5 6 1
6 4 8
7 8 3
8 5 5
>
> # 조인하기 위한 패키지 실행
>
> #install.packages("plyr")
> if (!("plyr" %in% installed.packages())){
+ install.packages('plyr', dependencies = TRUE)
+ }
> library(plyr)
>
> mt01_final = join(x = mt01_result, y = aid_renum, by = "animal", type = "left")
>
> mt01_final = mt01_final[order(mt01_final$orig_aid),]
>
> mt01_final
animal bv_t1 rel_t1 acc_t1 sep_t1 bv_t2 rel_t2 acc_t2 sep_t2 orig_aid
6 6 0.15091558 0.06976327 0.2641274 4.313321 0.27959802 0.10245535 0.3200865 5.991810 1
7 7 -0.01539252 0.02018124 0.1420607 4.426779 -0.00761009 0.03080923 0.1755256 6.226366 2
8 8 -0.07839191 0.10535569 0.3245854 4.229998 -0.17034149 0.15501891 0.3937244 5.813712 3
2 2 -0.01023894 0.17476720 0.4180517 4.062592 -0.01267064 0.25687710 0.5068304 5.452056 4
5 5 -0.27033146 0.17294668 0.4158686 4.067071 -0.47783034 0.25338456 0.5033732 5.464853 5
3 3 0.27580827 0.14242750 0.3773957 4.141431 0.51723842 0.21241863 0.4608890 5.612776 6
1 1 -0.31611757 0.14424801 0.3798000 4.137033 -0.47898372 0.21591116 0.4646624 5.600317 7
4 4 0.24375552 0.18578089 0.4310231 4.035391 0.39196151 0.27101270 0.5205888 5.399953 8
>
> # 파일로 출력, 분리자는 ",", 따옴표 없고
>
> output_filename = gsub("[ -]", "", paste("mt01_result_", Sys.Date(), ".csv"))
>
> write.table(mt01_final, output_filename, sep=",", row.names = FALSE, quote = FALSE)
다음은 저장한 파일이다.
animal,bv_t1,rel_t1,acc_t1,sep_t1,bv_t2,rel_t2,acc_t2,sep_t2,orig_aid
6,0.15091558,0.0697632657674614,0.264127366562917,4.31332061,0.27959802,0.10245535305655,0.320086477465934,5.9918099,1
7,-0.01539252,0.0201812426010169,0.142060700410131,4.42677932,-0.00761009,0.030809226227942,0.175525571436022,6.22636579,2
8,-0.07839191,0.105355689489867,0.32458541170217,4.22999837,-0.17034149,0.155018906732078,0.393724404542159,5.8137117,3
2,-0.01023894,0.174767202386813,0.418051674302128,4.06259227,-0.01267064,0.256877096157208,0.506830441229814,5.45205614,4
5,-0.27033146,0.172946682182092,0.415868587635676,4.06707098,-0.47783034,0.253384555871907,0.503373177545156,5.46485295,5
3,0.27580827,0.142427496743397,0.377395676635806,4.14143092,0.51723842,0.212418628120048,0.460888954217877,5.61277604,6
1,-0.31611757,0.144248009545437,0.379799959907104,4.13703273,-0.47898372,0.215911159083335,0.464662414106559,5.60031728,7
4,0.24375552,0.185780893148128,0.43102307728024,4.0353912,0.39196151,0.271012695344728,0.52058879679141,5.39995298,8
csv 파일이므로 엑셀에서 열면 다음과 같다.