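Setup/Problem:
Using dplyr, I can't work out the best way to return the row indices of the filtered rows rather than the contents of the filtered rows.
Problem:
I can use dplyr::filter() to extract rows from a data frame... The problem is that I want to extract the index values of the filtered rows and add them to a list of index entries that match the search criteria.
Question:
Is there a simple way, using dplyr, to search a data frame by specific criteria and return the numeric index of each row found? The code below uses base::which() to extract the row indices into a vector...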
requiredPackages <- c("dplyr")
ipak <- function(pkg) {
  new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])]
  if (length(new.pkg))
    install.packages(new.pkg, dependencies = TRUE)
  sapply(pkg, require, character.only = TRUE)
}
ipak(requiredPackages)
if (!file.exists("./week3/data")) {
  # recursive = TRUE also creates ./week3 if it does not exist yet
  dir.create("./week3/data", recursive = TRUE)
}
# CSV Download
if (!file.exists("./week3/data/americancommunitySurvey.csv")) {
  fileUrl <- "https://d396qusza40orc.cloudfront.net/getdata%2Fdata%2Fss06hid.csv?accessType=DOWNLOAD"
  download.file(fileUrl, destfile = "./week3/data/americancommunitySurvey.csv", method = "curl")
}
housingData <- tbl_df(read.csv("./week3/data/americancommunitySurvey.csv",
                               stringsAsFactors = TRUE))
# Now we have to extract the relevant data
#
# Create a logical vector that identifies the households on greater than 10
# acres who sold more than $10,000 worth of agriculture products. Assign that
# logical vector to the variable agricultureLogical. Apply the which() function
# like this to identify the rows of the data frame where the logical vector is
# TRUE. which(agricultureLogical) What are the first 3 values that result?
#
# ACR 1
# Lot size
# b .N/A (GQ/not a one-family house or mobile home)
# 1 .House on less than one acre
# 2 .House on one to less than ten acres
# 3 .House on ten or more acres ACR == 3
#
# AGS 1
# Sales of Agriculture Products
# b .N/A (less than 1 acre/GQ/vacant/
# .2 or more units in structure)
# 1 .None
# 2 .$ 1 - $ 999
# 3 .$ 1000 - $ 2499
# 4 .$ 2500 - $ 4999
# 5 .$ 5000 - $ 9999
# 6 .$10000+ AGS == 6
#
# Thus, we need to select only the rows that have ACR == 3 AND AGS == 6
#
agricultureLogical <- which(housingData$ACR == 3 & housingData$AGS == 6)
agricultureLogical
# Now we can display the first three values of the resulting vector
head(agricultureLogical, 3)
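To make the which() step easier to follow, here is a minimal sketch on made-up data. The data frame `toy` and its values are assumptions for illustration only, not rows from the survey file. It shows that which() returns the positions of the TRUE entries and silently drops any NA produced by missing values.

```r
# Toy data standing in for the ACR and AGS columns (values are made up)
toy <- data.frame(
  ACR = c(3, 1, 3, NA, 3),
  AGS = c(6, 6, 2, 6, 6)
)

# Logical vector: TRUE where both conditions hold;
# a missing value can make the result NA instead of TRUE/FALSE
is_match <- toy$ACR == 3 & toy$AGS == 6

# which() returns the positions of the TRUE values and ignores NA
match_idx <- which(is_match)
match_idx
# [1] 1 5
```

Row 4 has a missing ACR, so its comparison is NA and it does not appear in the result.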
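The code above gives me the result I want, but I'd like to understand how to do this with dplyr. This is what's bugging me... I can use dplyr::filter() to extract the rows as follows; how do I extract the index of each row?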
agricultureLogical <- filter(housingData, ACR == 3 & AGS == 6)
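One possible way to keep the row positions when filtering, sketched here on made-up data, is to record each row's position in a new column before calling filter(). The data frame `toy` and the column name `row_id` are hypothetical, and this is only one approach under the assumption that `mutate()`, `n()` and `%>%` are available in the installed dplyr version; it is not necessarily the canonical answer.

```r
library(dplyr)

# Toy data standing in for the ACR and AGS columns (values are made up)
toy <- data.frame(
  ACR = c(3, 1, 3, NA, 3),
  AGS = c(6, 6, 2, 6, 6)
)

matched <- toy %>%
  mutate(row_id = seq_len(n())) %>%  # store the original row position (hypothetical column name)
  filter(ACR == 3 & AGS == 6)        # filter() keeps only rows where the condition is TRUE

# The original row positions of the rows that survived the filter
matched$row_id
# [1] 1 5
```

Like which(), filter() drops rows where the condition evaluates to NA, so both approaches give the same positions on this toy data.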
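R Settings
version
               _
platform       x86_64-apple-darwin13.4.0
arch           x86_64
os             darwin13.4.0
system         x86_64, darwin13.4.0
status
major          3
minor          1.2
year           2014
month          10
day            31
svn rev        66913
language       R
version.string R version 3.1.2 (2014-10-31)
nickname       Pumpkin Helmet
dplyr version 0.3.0.2
Setup: Mac OS X
Model Name: MacBook Pro
Model Identifier: MacBookPro10,1
Processor Name: Intel Core i7
Processor Speed: 2.7 GHz
Number of Processors: 1
Total Number of Cores: 4
L2 Cache (per Core): 256 KB
L3 Cache: 8 MB
Memory: 16 GB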