I looked for similar questions; this one suggests indexing options.
Below is how I timed the subsetting approaches.
# Dummy data
dat <- data.frame(x = runif(1000000, 1, 1000), y = runif(1000000, 1, 1000))
# Subset and time
system.time(x <- dat[dat$x > 500, ])
#  user  system elapsed
# 0.092   0.000   0.090
system.time(x <- dat[which(dat$x > 500), ])
#  user  system elapsed
# 0.040   0.032   0.070
system.time(x <- subset(dat, x > 500))
#  user  system elapsed
# 0.108   0.004   0.109
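The three calls timed above return the same rows for this dummy data, because runif() produces no missing values. As a side note, here is a minimal sketch of where they can differ when the condition column contains NA. The small data frame `d` is a hypothetical example and is not part of the timings above.

```r
# Hypothetical four-row data frame with one missing value in x
d <- data.frame(x = c(100, 600, NA, 900), y = 1:4)

a <- d[d$x > 500, ]          # logical index: the NA condition yields an all-NA row
b <- d[which(d$x > 500), ]   # which() keeps only the positions that are TRUE
s <- subset(d, x > 500)      # subset() also treats an NA condition as FALSE

nrow(a)                      # 3 (rows 2 and 4 plus one NA row)
nrow(b)                      # 2
isTRUE(all.equal(b, s))      # TRUE
```

So, besides speed, the which() form may also be the safer choice whenever the column can contain NA.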
Edit:
As Roland suggested, I used microbenchmark. It seems which() performs best.
library("ggplot2")
library("microbenchmark")
# Dummy data
dat <- data.frame(x = runif(1000000, 1, 1000), y = runif(1000000, 1, 1000))
# Benchmark the three subsetting approaches
res <- microbenchmark(dat[dat$x > 500, ],
                      dat[which(dat$x > 500), ],
                      subset(dat, x > 500))
# Plot the timing distributions (autoplot() dispatches to the microbenchmark method)
autoplot(res)
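To compare the approaches numerically rather than only from the plot, the benchmark result can also be summarised as a table. This is a rough sketch under a few assumptions of mine: a smaller data frame and `times = 20` so it runs quickly, and the labels `logical`, `which`, and `subset` as arbitrary names for the three expressions.

```r
library("microbenchmark")

# Smaller dummy data than in the question, so the sketch runs quickly
set.seed(1)
dat <- data.frame(x = runif(1e5, 1, 1000), y = runif(1e5, 1, 1000))

# Name each expression so the summary table is easier to read
res <- microbenchmark(
  logical = dat[dat$x > 500, ],
  which   = dat[which(dat$x > 500), ],
  subset  = subset(dat, x > 500),
  times = 20
)

# One row per expression, timings reported in milliseconds,
# sorted so the fastest median comes first
tab <- summary(res, unit = "ms")
tab[order(tab$median), c("expr", "median", "neval")]
```

The ordering will vary with the machine, R version, and data size, so it is worth rerunning on the real data before drawing conclusions.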
Can anyone suggest a more efficient way to subset data frames without using SQL, indexing, or data.table options?