R을 이용하여 다형질 개체 모형(multiple trait animal model)에 대한 blupf90 분석 결과 정리하기

투정이 2020. 7. 2. 17:24

2020. 7. 2. 17:24

blupf90을 이용하여 다형질 개체 모형(다변량 개체 모형)의 육종가를 구하는 것을 아래에서 설명한 바 있다.

2014/02/25 - [Animal Breeding/BLUPF90] - blupf90으로 다형질 모형(same model and no missing record)의 육종가 구하기

2014/03/12 - [Animal Breeding/BLUPF90] - blupf90으로 다형질 모형(same model with missing record)의 육종가 구하기

2014/03/12 - [Animal Breeding/BLUPF90] - blupf90으로 다형질 모형(different model)의 육종가 구하기

2014/03/13 - [Animal Breeding/BLUPF90] - blupf90으로 다형질 모형(no environmental covariance)의 육종가 구하기

여기서는 blupf90이 구한 결과를 R로 정리하는 것을 설명한다.

blupf90은 육종가(breeding value)와 육종가의 SE(SEP : Standard Error of Prediction)를 제공하는데 SE를 이용하여 정확도(accuracy)를 구해야 한다. 또한 renumber된 개체 ID이기 때문에 원래 ID를 찾아 주어야 한다.

blupf90이 구한 solutions는 다음과 같다.

trait/effect level  solution          s.e.
   1   1         1          4.36086700          4.87905017
   2   1         1          6.79989768          5.55495970
   1   1         2          3.39726169          5.65657865
   2   1         2          5.88029605          6.20348910
   1   2         1         -0.31611757          4.13703273
   2   2         1         -0.47898372          5.60031728
   1   2         2         -0.01023894          4.06259227
   2   2         2         -0.01267064          5.45205614
   1   2         3          0.27580827          4.14143092
   2   2         3          0.51723842          5.61277604
   1   2         4          0.24375552          4.03539120
   2   2         4          0.39196151          5.39995298
   1   2         5         -0.27033146          4.06707098
   2   2         5         -0.47783034          5.46485295
   1   2         6          0.15091558          4.31332061
   2   2         6          0.27959802          5.99180990
   1   2         7         -0.01539252          4.42677932
   2   2         7         -0.00761009          6.22636579
   1   2         8         -0.07839191          4.22999837
   2   2         8         -0.17034149          5.81371170

renumber된 혈통과 원래 ID는 renadd02.ped에 나와 있다.

1 2 5 1 0 2 1 0 0 7
7 0 0 3 0 0 0 0 2 2
2 6 0 2 0 1 1 1 0 4
3 6 7 1 0 2 1 0 1 6
6 0 0 3 0 0 0 2 0 1
4 8 3 1 0 2 1 0 0 8
8 0 0 3 0 0 0 2 0 3
5 8 7 1 0 2 1 0 1 5

1, 2, 3열이 리넘버된 혈통이고 10열이 원래 ID이다.

다형질 개체 모형(multiple trait animal model)에 대한 blupf90의 결과로부터 정확도를 계산하고 원래 ID를 붙이는 R 프로그램이다.

# multiple traits animal model의 blupf90의 분석 결과를 R로 정리하기 - Date :2020-07-02 

# blupf90의 육종가와 SE(standard error of prediction)읽어 

#정확도를 계산하고, 원래 ID를 붙임 

# set working directory 

setwd("D:/temp/04_multiple_traits_01_blupf90") 

# print working directory 

getwd() 

# sigma_a 

sigma_a_t1 = 20 
sigma_a_t2 = 40 

sigma_a_t1
sigma_a_t2

# read solutions 

solutions = read.table("solutions", skip = 1, sep = "", col.names = c("trait", "effect", "level", "solution", "se"), stringsAsFactors = FALSE) 

solutions 

# read renumbered pedigree 

pedi = read.table("renadd02.ped", sep = "", stringsAsFactors = FALSE) 

pedi 

# read renumbered data 

data = read.table("renf90.dat", sep = "", stringsAsFactors = FALSE) 

data 

# read fixed solutions from solutions 

sol_fixed_t1 = solutions[(solutions$trait == 1 & solutions$effect == 1), 4] 

sol_fixed_t1

sol_fixed_t2 = solutions[(solutions$trait == 2 & solutions$effect == 1), 4] 

sol_fixed_t2

# read animal solutions from solutions 

sol_animal_t1 = solutions[(solutions$trait == 1 & solutions$effect == 2), 3:5] 

rownames(sol_animal_t1) = sol_animal_t1[,1] 

sol_animal_t1

sol_animal_t2 = solutions[(solutions$trait == 2 & solutions$effect == 2), 3:5] 

rownames(sol_animal_t2) = sol_animal_t2[,1] 

sol_animal_t2

# reliability 

rel_t1 = 1 - (sol_animal_t1$se^2 / sigma_a_t1) 

rel_t1

rel_t2 = 1 - (sol_animal_t2$se^2 / sigma_a_t2) 

rel_t2

# accuracy 

acc_t1 = sqrt(rel_t1) 

acc_t1 

acc_t2 = sqrt(rel_t2) 

acc_t2

# 육종가 분할하기, DYD 구하기 생략 

# result 

mt01_result = data.frame(animal = sol_animal_t1$level, bv_t1 = sol_animal_t1$solution, rel_t1 = rel_t1, acc_t1 = acc_t1, sep_t1 = sol_animal_t1$se,
                                                       bv_t2 = sol_animal_t2$solution, rel_t2 = rel_t2, acc_t2 = acc_t2, sep_t2 = sol_animal_t2$se)


mt01_result 

# join original animal id 

aid_renum = pedi[,c(1,10)] 

colnames(aid_renum) = c("animal", "orig_aid") 

aid_renum 

# 조인하기 위한 패키지 실행 

#install.packages("plyr") 
if (!("plyr" %in% installed.packages())){
  install.packages('plyr', dependencies = TRUE)  
}
library(plyr) 

mt01_final = join(x = mt01_result, y = aid_renum, by = "animal", type = "left") 

mt01_final = mt01_final[order(mt01_final$orig_aid),] 

mt01_final

# 파일로 출력, 분리자는 ",", 따옴표 없고 

output_filename = gsub("[ -]", "", paste("mt01_result_", Sys.Date(), ".csv")) 

write.table(mt01_final, output_filename, sep=",", row.names = FALSE, quote = FALSE)

다음은 실행 결과이다.

> # multiple traits animal model의 blupf90의 분석 결과를 R로 정리하기 - Date :2020-07-02 
> 
> # blupf90의 육종가와 SE(standard error of prediction)읽어 
> 
> #정확도를 계산하고, 원래 ID를 붙임 
> 
> # set working directory 
> 
> setwd("D:/temp/04_multiple_traits_01_blupf90") 
> 
> # print working directory 
> 
> getwd() 
[1] "D:/temp/04_multiple_traits_01_blupf90"
> 
> # sigma_a 
> 
> sigma_a_t1 = 20 
> sigma_a_t2 = 40 
> 
> sigma_a_t1
[1] 20
> sigma_a_t2
[1] 40
> 
> # read solutions 
> 
> solutions = read.table("solutions", skip = 1, sep = "", col.names = c("trait", "effect", "level", "solution", "se"), stringsAsFactors = FALSE) 
> 
> solutions 
   trait effect level    solution       se
1      1      1     1  4.36086700 4.879050
2      2      1     1  6.79989768 5.554960
3      1      1     2  3.39726169 5.656579
4      2      1     2  5.88029605 6.203489
5      1      2     1 -0.31611757 4.137033
6      2      2     1 -0.47898372 5.600317
7      1      2     2 -0.01023894 4.062592
8      2      2     2 -0.01267064 5.452056
9      1      2     3  0.27580827 4.141431
10     2      2     3  0.51723842 5.612776
11     1      2     4  0.24375552 4.035391
12     2      2     4  0.39196151 5.399953
13     1      2     5 -0.27033146 4.067071
14     2      2     5 -0.47783034 5.464853
15     1      2     6  0.15091558 4.313321
16     2      2     6  0.27959802 5.991810
17     1      2     7 -0.01539252 4.426779
18     2      2     7 -0.00761009 6.226366
19     1      2     8 -0.07839191 4.229998
20     2      2     8 -0.17034149 5.813712
> 
> # read renumbered pedigree 
> 
> pedi = read.table("renadd02.ped", sep = "", stringsAsFactors = FALSE) 
> 
> pedi 
  V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1  1  2  5  1  0  2  1  0  0   7
2  7  0  0  3  0  0  0  0  2   2
3  2  6  0  2  0  1  1  1  0   4
4  3  6  7  1  0  2  1  0  1   6
5  6  0  0  3  0  0  0  2  0   1
6  4  8  3  1  0  2  1  0  0   8
7  8  0  0  3  0  0  0  2  0   3
8  5  8  7  1  0  2  1  0  1   5
> 
> # read renumbered data 
> 
> data = read.table("renf90.dat", sep = "", stringsAsFactors = FALSE) 
> 
> data 
   V1  V2 V3 V4
1 4.5 6.8  1  2
2 2.9 5.0  2  5
3 3.9 6.8  2  3
4 3.5 6.0  1  1
5 5.0 7.5  1  4
> 
> # read fixed solutions from solutions 
> 
> sol_fixed_t1 = solutions[(solutions$trait == 1 & solutions$effect == 1), 4] 
> 
> sol_fixed_t1
[1] 4.360867 3.397262
> 
> sol_fixed_t2 = solutions[(solutions$trait == 2 & solutions$effect == 1), 4] 
> 
> sol_fixed_t2
[1] 6.799898 5.880296
> 
> # read animal solutions from solutions 
> 
> sol_animal_t1 = solutions[(solutions$trait == 1 & solutions$effect == 2), 3:5] 
> 
> rownames(sol_animal_t1) = sol_animal_t1[,1] 
> 
> sol_animal_t1
  level    solution       se
1     1 -0.31611757 4.137033
2     2 -0.01023894 4.062592
3     3  0.27580827 4.141431
4     4  0.24375552 4.035391
5     5 -0.27033146 4.067071
6     6  0.15091558 4.313321
7     7 -0.01539252 4.426779
8     8 -0.07839191 4.229998
> 
> sol_animal_t2 = solutions[(solutions$trait == 2 & solutions$effect == 2), 3:5] 
> 
> rownames(sol_animal_t2) = sol_animal_t2[,1] 
> 
> sol_animal_t2
  level    solution       se
1     1 -0.47898372 5.600317
2     2 -0.01267064 5.452056
3     3  0.51723842 5.612776
4     4  0.39196151 5.399953
5     5 -0.47783034 5.464853
6     6  0.27959802 5.991810
7     7 -0.00761009 6.226366
8     8 -0.17034149 5.813712
> 
> # reliability 
> 
> rel_t1 = 1 - (sol_animal_t1$se^2 / sigma_a_t1) 
> 
> rel_t1
[1] 0.14424801 0.17476720 0.14242750 0.18578089 0.17294668 0.06976327 0.02018124 0.10535569
> 
> rel_t2 = 1 - (sol_animal_t2$se^2 / sigma_a_t2) 
> 
> rel_t2
[1] 0.21591116 0.25687710 0.21241863 0.27101270 0.25338456 0.10245535 0.03080923 0.15501891
> 
> # accuracy 
> 
> acc_t1 = sqrt(rel_t1) 
> 
> acc_t1 
[1] 0.3798000 0.4180517 0.3773957 0.4310231 0.4158686 0.2641274 0.1420607 0.3245854
> 
> acc_t2 = sqrt(rel_t2) 
> 
> acc_t2
[1] 0.4646624 0.5068304 0.4608890 0.5205888 0.5033732 0.3200865 0.1755256 0.3937244
> 
> # 육종가 분할하기, DYD 구하기 생략 
> 
> # result 
> 
> mt01_result = data.frame(animal = sol_animal_t1$level, bv_t1 = sol_animal_t1$solution, rel_t1 = rel_t1, acc_t1 = acc_t1, sep_t1 = sol_animal_t1$se,
+                                                        bv_t2 = sol_animal_t2$solution, rel_t2 = rel_t2, acc_t2 = acc_t2, sep_t2 = sol_animal_t2$se)
> 
> 
> mt01_result 
  animal       bv_t1     rel_t1    acc_t1   sep_t1       bv_t2     rel_t2    acc_t2   sep_t2
1      1 -0.31611757 0.14424801 0.3798000 4.137033 -0.47898372 0.21591116 0.4646624 5.600317
2      2 -0.01023894 0.17476720 0.4180517 4.062592 -0.01267064 0.25687710 0.5068304 5.452056
3      3  0.27580827 0.14242750 0.3773957 4.141431  0.51723842 0.21241863 0.4608890 5.612776
4      4  0.24375552 0.18578089 0.4310231 4.035391  0.39196151 0.27101270 0.5205888 5.399953
5      5 -0.27033146 0.17294668 0.4158686 4.067071 -0.47783034 0.25338456 0.5033732 5.464853
6      6  0.15091558 0.06976327 0.2641274 4.313321  0.27959802 0.10245535 0.3200865 5.991810
7      7 -0.01539252 0.02018124 0.1420607 4.426779 -0.00761009 0.03080923 0.1755256 6.226366
8      8 -0.07839191 0.10535569 0.3245854 4.229998 -0.17034149 0.15501891 0.3937244 5.813712
> 
> # join original animal id 
> 
> aid_renum = pedi[,c(1,10)] 
> 
> colnames(aid_renum) = c("animal", "orig_aid") 
> 
> aid_renum 
  animal orig_aid
1      1        7
2      7        2
3      2        4
4      3        6
5      6        1
6      4        8
7      8        3
8      5        5
> 
> # 조인하기 위한 패키지 실행 
> 
> #install.packages("plyr") 
> if (!("plyr" %in% installed.packages())){
+   install.packages('plyr', dependencies = TRUE)  
+ }
> library(plyr) 
> 
> mt01_final = join(x = mt01_result, y = aid_renum, by = "animal", type = "left") 
> 
> mt01_final = mt01_final[order(mt01_final$orig_aid),] 
> 
> mt01_final
  animal       bv_t1     rel_t1    acc_t1   sep_t1       bv_t2     rel_t2    acc_t2   sep_t2 orig_aid
6      6  0.15091558 0.06976327 0.2641274 4.313321  0.27959802 0.10245535 0.3200865 5.991810        1
7      7 -0.01539252 0.02018124 0.1420607 4.426779 -0.00761009 0.03080923 0.1755256 6.226366        2
8      8 -0.07839191 0.10535569 0.3245854 4.229998 -0.17034149 0.15501891 0.3937244 5.813712        3
2      2 -0.01023894 0.17476720 0.4180517 4.062592 -0.01267064 0.25687710 0.5068304 5.452056        4
5      5 -0.27033146 0.17294668 0.4158686 4.067071 -0.47783034 0.25338456 0.5033732 5.464853        5
3      3  0.27580827 0.14242750 0.3773957 4.141431  0.51723842 0.21241863 0.4608890 5.612776        6
1      1 -0.31611757 0.14424801 0.3798000 4.137033 -0.47898372 0.21591116 0.4646624 5.600317        7
4      4  0.24375552 0.18578089 0.4310231 4.035391  0.39196151 0.27101270 0.5205888 5.399953        8
> 
> # 파일로 출력, 분리자는 ",", 따옴표 없고 
> 
> output_filename = gsub("[ -]", "", paste("mt01_result_", Sys.Date(), ".csv")) 
> 
> write.table(mt01_final, output_filename, sep=",", row.names = FALSE, quote = FALSE)

다음은 저장한 파일이다.

animal,bv_t1,rel_t1,acc_t1,sep_t1,bv_t2,rel_t2,acc_t2,sep_t2,orig_aid
6,0.15091558,0.0697632657674614,0.264127366562917,4.31332061,0.27959802,0.10245535305655,0.320086477465934,5.9918099,1
7,-0.01539252,0.0201812426010169,0.142060700410131,4.42677932,-0.00761009,0.030809226227942,0.175525571436022,6.22636579,2
8,-0.07839191,0.105355689489867,0.32458541170217,4.22999837,-0.17034149,0.155018906732078,0.393724404542159,5.8137117,3
2,-0.01023894,0.174767202386813,0.418051674302128,4.06259227,-0.01267064,0.256877096157208,0.506830441229814,5.45205614,4
5,-0.27033146,0.172946682182092,0.415868587635676,4.06707098,-0.47783034,0.253384555871907,0.503373177545156,5.46485295,5
3,0.27580827,0.142427496743397,0.377395676635806,4.14143092,0.51723842,0.212418628120048,0.460888954217877,5.61277604,6
1,-0.31611757,0.144248009545437,0.379799959907104,4.13703273,-0.47898372,0.215911159083335,0.464662414106559,5.60031728,7
4,0.24375552,0.185780893148128,0.43102307728024,4.0353912,0.39196151,0.271012695344728,0.52058879679141,5.39995298,8

csv 파일이므로 엑셀에서 열면 다음과 같다.

저작자표시 (새창열림)

'Animal Breeding > R for Genetic Evaluation' 카테고리의 다른 글

R을 이용한 다형질 개체 모형(same model and missing record)의 육종가 구하기 프로그래밍 (0)	2020.07.26
R을 이용한 다형질 개체 모형(same model and no missing record)의 육종가 구하기 프로그래밍 (0)	2020.07.03
R을 이용한 공통 환경 효과 모형(Common Environmental Effect Model)의 육종가 구하기 프로그래밍 (0)	2020.07.01
R을 이용하여 공통 환경 효과 모형(common environmental effect model)에 대한 blupf90 분석 결과 정리하기 (0)	2020.07.01
R을 이용한 반복 모형(Repeatability Model)의 육종가 구하기 프로그래밍 (0)	2020.06.14

Let's Study Animal Breeding

R을 이용하여 다형질 개체 모형(multiple trait animal model)에 대한 blupf90 분석 결과 정리하기

'Animal Breeding > R for Genetic Evaluation' 카테고리의 다른 글

+ Recent posts

티스토리툴바