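R has a whole family of functions (apply, lapply, sapply, tapply, and so on) whose main purpose is to let you avoid writing explicit loops.
Many of R's built-in functions are implemented in compiled code, so they often run faster than an equivalent loop you write yourself.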
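To make the idea concrete, here is a small sketch that computes row sums twice, once with an explicit for loop and once with apply(). The names m, loop_sums and apply_sums are made up for this illustration, and the sketch only shows that both forms give the same answer; it does not measure speed.

```r
# Hypothetical example data: a 3 x 5 matrix filled column by column.
m <- array(1:15, c(3, 5))

# Explicit loop: preallocate the result, then fill it one row at a time.
loop_sums <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  loop_sums[i] <- sum(m[i, ])
}

# The same computation with apply() over rows (margin 1).
apply_sums <- apply(m, 1, sum)

print(loop_sums)
print(apply_sums)
```

The apply() version states what is computed (a sum per row) without the bookkeeping of an index and a preallocated result.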
apply(array, margin, function, ...) applies a function to every row or every column of a matrix: margin = 1 works over rows, margin = 2 over columns.
> x<-array(1:15,c(3,5)) ;x
[,1] [,2] [,3] [,4] [,5]
[1,] 1 4 7 10 13
[2,] 2 5 8 11 14
[3,] 3 6 9 12 15
> apply(x,1,sum)
[1] 35 40 45
> apply(x,2,mean)
[1] 2 5 8 11 14
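The `...` in the signature means that any extra arguments are handed on to the function being applied, and the function itself can be an anonymous one. The following is a minimal sketch under those assumptions; the missing value and the names row_sums and col_spread are invented for the example.

```r
x <- array(1:15, c(3, 5))
x[2, 3] <- NA  # introduce a missing value (this cell was 8)

# na.rm = TRUE is not an argument of apply(); it is passed through to sum().
row_sums <- apply(x, 1, sum, na.rm = TRUE)

# An anonymous function: spread (max minus min) of each column, ignoring NA.
col_spread <- apply(x, 2, function(col) {
  max(col, na.rm = TRUE) - min(col, na.rm = TRUE)
})

print(row_sums)
print(col_spread)
```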
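lapply(list, function, ...) applies a function to each top-level element of a list and returns a list. A data frame is a list of columns, so each column is processed in turn.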
> y<-data.frame(x);y
X1 X2 X3 X4 X5
1 1 4 7 10 13
2 2 5 8 11 14
3 3 6 9 12 15
> lapply(y,mean)
$X1
[1] 2
$X2
[1] 5
$X3
[1] 8
$X4
[1] 11
$X5
[1] 14
> z<-list("a"=t(x),"b"=y);z
$a
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
[4,] 10 11 12
[5,] 13 14 15
$b
X1 X2 X3 X4 X5
1 1 4 7 10 13
2 2 5 8 11 14
3 3 6 9 12 15
> lapply(z,sum)
$a
[1] 120
$b
[1] 120
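lapply is not limited to data frames and matrices. As a rough sketch, the example below runs it over an ordinary list whose elements have different lengths; the list scores and its element names are hypothetical.

```r
# Hypothetical list with elements of different lengths.
scores <- list(first = c(1, 2, 3), second = c(10, 20), third = 5)

# One result per element; the result is always a list with the same names.
lens <- lapply(scores, length)

# An anonymous function: rescale each element so that it sums to 1.
scaled <- lapply(scores, function(v) v / sum(v))

str(lens)
str(scaled)
```

Because the results can differ in length from element to element, returning a list is the safe default.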
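sapply(list, function, ..., simplify) works like lapply but has a few extra options; by default it simplifies the result to a vector or matrix when it can.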
> sapply(y,sum)
X1 X2 X3 X4 X5
6 15 24 33 42
> sapply(y,sum,simplify=F)
$X1
[1] 6
$X2
[1] 15
$X3
[1] 24
$X4
[1] 33
$X5
[1] 42
> sapply(z,sum,simplify=T)
a b
120 120
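The simplification goes one step further when the function returns more than one value per element. A minimal sketch, assuming the same data frame y as in the examples above (rebuilt here so the block runs on its own); the names r and r_list are made up.

```r
# Rebuild the data frame used earlier: columns X1..X5, three rows each.
y <- data.frame(array(1:15, c(3, 5)))

# range() returns two values (min, max) per column,
# so sapply() simplifies the result to a 2 x 5 matrix.
r <- sapply(y, range)

# With simplify = FALSE the result stays a list, just like lapply().
r_list <- sapply(y, range, simplify = FALSE)

print(r)
str(r_list)
```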
tapply(array, indices, function, ..., simplify) applies a function to groups of values, where the groups are defined by one or more classifying factors.
> v<-runif(20)
> s<-rep(c("boy","girl"),each=10)
> g<-rep(4:1,times=5)
> vsg<-data.frame(v,s,g);vsg
v s g
1 0.35767679 boy 4
2 0.95739151 boy 3
3 0.01877919 boy 2
4 0.53033287 boy 1
5 0.04086881 boy 4
6 0.61796854 boy 3
7 0.51664062 boy 2
8 0.58096978 boy 1
9 0.87296081 boy 4
10 0.91295169 boy 3
11 0.36905117 girl 2
12 0.31114506 girl 1
13 0.47866735 girl 4
14 0.67597893 girl 3
15 0.63960258 girl 2
16 0.61138231 girl 1
17 0.10823282 girl 4
18 0.70282047 girl 3
19 0.81825796 girl 2
20 0.64503892 girl 1
> tapply(vsg$v,vsg$s,max)
boy girl
0.9573915 0.8182580
> tapply(vsg$v,list(vsg$s,vsg$g),min)
1 2 3 4
boy 0.5303329 0.01877919 0.6179685 0.04086881
girl 0.3111451 0.36905117 0.6759789 0.10823282
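Since runif() gives different numbers on every run, the output above cannot be reproduced exactly. Here is a small sketch of the same two calls on fixed, made-up values, so the grouping is easy to check by hand; the numbers and the names by_s and by_sg are hypothetical.

```r
# Hypothetical fixed data instead of runif(), so results are repeatable.
v <- c(5, 3, 8, 1, 9, 2)
s <- rep(c("boy", "girl"), each = 3)  # boy boy boy girl girl girl
g <- rep(2:1, times = 3)              # 2 1 2 1 2 1

# One grouping factor: mean of v within each level of s.
by_s <- tapply(v, s, mean)

# Two grouping factors: the result is a table with s as rows, g as columns.
by_sg <- tapply(v, list(s, g), sum)

print(by_s)
print(by_sg)
```

With one factor the result is a named vector; with a list of two factors it is a matrix whose row and column names are the factor levels.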