I have two large data frames. One (df1) has this structure:
chr init
1 12 25289552
2 3 180418785
3 3 180434779
The other (df2) looks like this:
V1 V2 V3
10 1 69094 medium
11 1 69094 medium
12 12 25289552 high
13 1 69095 medium
14 3 180418785 medium
15 3 180434779 low
What I want to do is add column V3 of df2 to df1, so that each row carries its mutation information:
chr init Mut
1 12 25289552 high
2 3 180418785 medium
3 3 180434779 low
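I am trying to load both into R and then run a for loop using match, but it doesn't work. Do you know of a good way to do this? I am also open to using awk or something similar.
Use merge: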
df1 <- read.table(text=' chr init
1 12 25289552
2 3 180418785
3 3 180434779', header=TRUE)
df2 <- read.table(text=' V1 V2 V3
10 1 69094 medium
11 1 69094 medium
12 12 25289552 high
13 1 69095 medium
14 3 180418785 medium
15 3 180434779 low', header=TRUE)
merge(df1, df2, by.x='init', by.y='V2') # this works!
init chr V1 V3
1 25289552 12 12 high
2 180418785 3 3 medium
3 180434779 3 3 low
To get the output in the form you showed:
output <- merge(df1, df2, by.x='init', by.y='V2')[, c(2,1,4)]
colnames(output)[3] <- 'Mut'
output
chr init Mut
1 12 25289552 high
2 3 180418785 medium
3 3 180434779 low
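The merge above joins on position only. As a minimal sketch, assuming V1 in df2 is the chromosome and that the same position could appear on more than one chromosome in the real data, the join can use both columns. The name df2u is introduced here for illustration only; unique() drops the repeated rows in df2 (such as the two identical 1 69094 medium rows) so they cannot multiply rows in the result.

```r
df1 <- data.frame(chr  = c(12, 3, 3),
                  init = c(25289552, 180418785, 180434779))
df2 <- data.frame(V1 = c(1, 1, 12, 1, 3, 3),
                  V2 = c(69094, 69094, 25289552, 69095, 180418785, 180434779),
                  V3 = c("medium", "medium", "high", "medium", "medium", "low"),
                  stringsAsFactors = FALSE)

# Drop fully repeated (chromosome, position, label) rows from df2.
df2u <- unique(df2)

# Join on chromosome AND position. all.x = TRUE keeps every row of df1
# and fills Mut with NA when df2 has no matching variant.
output <- merge(df1, df2u,
                by.x = c("chr", "init"), by.y = c("V1", "V2"),
                all.x = TRUE)
colnames(output)[3] <- "Mut"
output
```

Note that merge sorts the result by the key columns by default (sort = TRUE), so the rows may not come back in df1's original order.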
df1 <- read.table(textConnection(" chr init
1 12 25289552
2 3 180418785
3 3 180434779"), header=TRUE)
df2 <- read.table(textConnection(" V1 V2 V3
10 1 69094 medium
11 1 69094 medium
12 12 25289552 high
13 1 69095 medium
14 3 180418785 medium
15 3 180434779 low"), header=TRUE)
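# For each df1$init, find the row of df2 whose V2 equals it and take its V3.
# match() keeps df1's row order and returns NA where there is no match;
# indexing with %in% would only work if df2 happened to be in the same order.
df1$Mut <- df2$V3[match(df1$init, df2$V2)]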
df1
chr init Mut
1 12 25289552 high
2 3 180418785 medium
3 3 180434779 low
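A lookup like this is only safe if it does not depend on the row order of df2. The sketch below checks that with match() by shuffling df2 first. The shuffle order and the helper name idx are illustrative assumptions, not part of the original data.

```r
df1 <- data.frame(chr  = c(12, 3, 3),
                  init = c(25289552, 180418785, 180434779))
df2 <- data.frame(V1 = c(1, 1, 12, 1, 3, 3),
                  V2 = c(69094, 69094, 25289552, 69095, 180418785, 180434779),
                  V3 = c("medium", "medium", "high", "medium", "medium", "low"),
                  stringsAsFactors = FALSE)

# Shuffle df2 to show the lookup does not rely on its row order.
df2 <- df2[c(6, 3, 1, 5, 2, 4), ]

# For each df1$init, match() gives the index of the first equal df2$V2,
# or NA if there is none, so labels come back in df1's row order.
idx <- match(df1$init, df2$V2)
df1$Mut <- df2$V3[idx]
df1
```

If positions can repeat across chromosomes, one option is to match on a combined key instead, for example paste(df1$chr, df1$init) against paste(df2$V1, df2$V2).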
Does
df3 <- merge( df1, df2, by.x = "init", by.y = "V2" )
df3 <- df3[-3]
colnames( df3 )[3] <- "Mut"
give you what you want?