I have a simple problem, with a slightly more complicated twist at the end.
I have 2 datasets, A & B (separate when imported into R):
Dataset A is pulled from a DAQ that samples 2000 times a second, while dataset B is pulled from a scope that samples 500 times a second. I have a test that records data from the DAQ and the scope for 5 seconds.
In RStudio I want to time-synchronize this data. For the sake of learning, how can I do it in both of the following ways?
1) Without duplicating values so filtering doesn't stair step:
A B
1 1 1
2 2 NA
3 3 NA
4 4 NA
5 5 2
6 6 NA
7 7 NA
8 8 NA
9 9 3
10 10 NA
11 11 NA
12 12 NA
2) With duplicating numbers if I don't want NA's in the functions I apply to the frame:
A B
1 1 1
2 2 1
3 3 1
4 4 1
5 5 2
6 6 2
7 7 2
8 8 2
9 9 3
10 10 3
11 11 3
12 12 3
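A minimal sketch of both layouts, assuming the two datasets start at the same instant and the rate ratio is a whole number (2000 / 500 = 4). The data frames, the `idx` column, and the variable names below are hypothetical stand-ins, not the real imported data:

```r
# Hypothetical stand-ins for the two imported datasets (5 s test).
fs_a <- 2000  # DAQ sample rate, Hz
fs_b <- 500   # scope sample rate, Hz
dur  <- 5     # test length, s

A <- data.frame(idx = seq_len(fs_a * dur), A = seq_len(fs_a * dur))
B <- data.frame(B = seq_len(fs_b * dur))

# Each B sample lines up with every (fs_a / fs_b)-th A sample: rows 1, 5, 9, ...
ratio <- fs_a / fs_b
B$idx <- (seq_len(nrow(B)) - 1) * ratio + 1

# 1) Left join on the shared sample index: unmatched rows get NA in B.
sync_na <- merge(A, B, by = "idx", all.x = TRUE)

# 2) Repeat each B value `ratio` times (sample-and-hold), so there are no NAs.
sync_hold <- data.frame(A = A$A, B = rep(B$B, each = ratio))

head(sync_na[, c("A", "B")], 12)
head(sync_hold, 12)
```

If the real data carries timestamps instead of sample counts, the same join should work on a rounded time column in place of `idx`.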
Now here is the twist that makes my problem unusual. Let's say Dataset A records a bit before & after the 5 second test. Dataset A also has an extra column, "Trigger", which is either 0 or 1. A 1 is a high that represents recording, and its first occurrence is where Dataset B starts. When it switches back to 0, Dataset B has finished recording.
Is there a way I can do the above time sync against the trigger window in Dataset A? The reason I want to keep the data before & after the "true" recording section is to make sure a filter or a filtfilt sweep will level out before the data truly starts.
Thanks for any help!
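One way to approach the trigger twist, sketched under these assumptions: the trigger goes high exactly once, the first high row coincides with the first B sample, and the rate ratio is 4. The toy sizes and the column names `B_na` and `B_hold` are made up for illustration:

```r
ratio <- 4  # 2000 Hz / 500 Hz

# Hypothetical Dataset A: 8 pre-trigger rows, 12 recording rows, 8 post-trigger rows.
A <- data.frame(A = 1:28, Trigger = rep(c(0, 1, 0), times = c(8, 12, 8)))
B <- data.frame(B = 1:3)  # 12 / 4 = 3 scope samples

# Rows of A where the trigger is high; B's first sample aligns with the first of them.
rec    <- which(A$Trigger == 1)
b_rows <- rec[1] + (seq_len(nrow(B)) - 1) * ratio

# 1) NA everywhere except the aligned rows.
A$B_na <- NA_real_
A$B_na[b_rows] <- B$B

# 2) Hold each B value for `ratio` rows, inside the trigger window only.
A$B_hold <- NA_real_
A$B_hold[rec] <- rep(B$B, each = ratio)[seq_along(rec)]
```

The padding rows stay in the frame, so a filter can be run over the whole A column first and the result trimmed afterwards with `A[A$Trigger == 1, ]`.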
I have a column A with the names of various areas, say Area 1 to Area 10 (repeated throughout the column, one per cell). Column B holds the date on which something was done in that area; some cells have no date yet because nothing was done. I need a summary that counts how many times something was done in each area (Area 1, Area 2, Area 3, etc.). I know an action was done when there is a date in column B. I need a formula that can calculate this.
Is this what you're looking for?
library(tidyverse)
# create sample data
df <- tibble(A=rep(c(1:10),3), B=rep(c(Sys.Date(), NA),15))
df
A B
1 1 2019-02-06
2 2 NA
3 3 2019-02-06
4 4 NA
5 5 2019-02-06
6 6 NA
7 7 2019-02-06
8 8 NA
9 9 2019-02-06
10 10 NA
...
# grouping and summarising it for column A
df %>%
mutate(count=ifelse(!is.na(B), 1, 0)) %>%
group_by(A) %>%
summarise(count=sum(count,na.rm=T))
A count
1 1 3
2 2 0
3 3 3
4 4 0
5 5 3
6 6 0
7 7 3
8 8 0
9 9 3
10 10 0
If I understand you correctly (replace your_table with the name of your table):
SELECT area_name, COUNT(action_date) FROM your_table WHERE action_date <> '' GROUP BY area_name;
I'm trying to merge 7 complete data frames into one large wide data frame. I figured I have to do this stepwise: merge 2 frames into 1, then merge that frame with another, and so forth until all 7 original frames become one.
fil2005: "ID" "abr_2005" "lop_2005" "ins_2005"
fil2006: "ID" "abr_2006" "lop_2006" "ins_2006"
The variables "abr_2006", "lop_2006", "ins_2006" and their 2005 counterparts are all either 0 or 1.
Now the thing is, I want to either merge or do a dcast of some sort (I think) to turn these two long data frames into one wide data frame where both "abr_2005" "lop_2005" "ins_2005" and "abr_2006" "lop_2006" "ins_2006" are in the final file.
When I try
fil_2006.1 <- merge(x=fil_2005, y=fil_2006, by="ID__", all.y=T)
all the variables ending in _2005 are saved to fil_2006.1, but the variables ending in _2006 are not.
I'm apparently doing something wrong. Any idea?
Is there a reason you put those underscores after ID (ID__)? Without them, the code you provided will work.
An example:
dat1 <- data.frame("ID"=seq(1,20,by=2),"varx2005"=1:10, "vary2005"=2:11)
dat2 <- data.frame("ID"=5:14,"varx2006"=1:20, "vary2006"=21:40)
# create data frames of differing lengths
head(dat1)
ID varx2005 vary2005
1 1 1 2
2 3 2 3
3 5 3 4
4 7 4 5
5 9 5 6
6 11 6 7
head(dat2)
ID varx2006 vary2006
1 5 1 21
2 6 2 22
3 7 3 23
4 8 4 24
5 9 5 25
6 10 6 26
merged <- merge(dat2,dat1,by="ID",all=TRUE)
head(merged)
ID varx2006 vary2006 varx2005 vary2005
1 1 NA NA 1 2
2 3 NA NA 2 3
3 5 1 21 3 4
4 5 11 31 3 4
5 7 13 33 4 5
6 7 3 23 4 5
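For more than two frames, the stepwise merge described in the question can be folded over a list with Reduce(). This is only a sketch: the three small frames below are invented for illustration, and the real ones are assumed to share a key column called "ID":

```r
# Hypothetical yearly frames that share an "ID" column (values are illustrative).
fil_2005 <- data.frame(ID = 1:3, abr_2005 = c(0, 1, 0), lop_2005 = c(1, 1, 0))
fil_2006 <- data.frame(ID = 2:4, abr_2006 = c(1, 0, 0), lop_2006 = c(0, 0, 1))
fil_2007 <- data.frame(ID = c(1, 4), abr_2007 = c(1, 1), lop_2007 = c(0, 1))

frames <- list(fil_2005, fil_2006, fil_2007)

# Fold merge() over the list: ((2005 + 2006) + 2007), keeping every ID.
wide <- Reduce(function(x, y) merge(x, y, by = "ID", all = TRUE), frames)
wide
```

With all = TRUE, an ID missing from one year gets NA in that year's columns; drop the argument to keep only IDs present in every frame.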
I would like to import the data into R as intervals, then count all the numbers falling within these intervals and draw a histogram from these counts.
Example:
start end freq
1 8 3
5 10 2
7 11 5
.
.
.
Result:
number freq
1 3
2 3
3 3
4 3
5 5
6 5
7 10
8 10
9 7
10 7
11 5
Any suggestions?
Thank you very much!
Assuming your data is in df, you can create a data set that has each number in the range repeated by freq. Once you have that it's trivial to use the summarizing functions in R. This is a little roundabout, but a lot easier than explicitly computing the sum of the overlaps (though that isn't that hard either).
dat <- unlist(apply(df, 1, function(x) rep(x[[1]]:x[[2]], x[[3]])))
hist(dat, breaks=0:max(df$end))
You can also do table(dat)
dat
1 2 3 4 5 6 7 8 9 10 11
3 3 3 3 5 5 10 10 7 7 5
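The explicit alternative mentioned above, summing the overlaps directly, might look like the sketch below. It assumes df has the columns start, end and freq from the question; the names `numbers`, `counts` and `result` are hypothetical:

```r
# The three intervals from the question.
df <- data.frame(start = c(1, 5, 7), end = c(8, 10, 11), freq = c(3, 2, 5))

numbers <- min(df$start):max(df$end)

# For each number, add up freq over every interval that contains it.
counts <- sapply(numbers, function(n) sum(df$freq[df$start <= n & n <= df$end]))

result <- data.frame(number = numbers, freq = counts)
result
```

Since the counts are already aggregated, barplot(result$freq, names.arg = result$number) is a closer fit than hist() for drawing them.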
I have 11 datasets.
Data 1 is made up of several sequences, and each sequence in data 1 is made up of 2 numbers.
It looks like:
[1] 1 2
[1] 1 3
[1] 1 4
Data 2 is made up of several sequences, and each sequence in data 2 is made up of 3 numbers.
It looks like:
[1] 1 2 3
[1] 1 2 4
[1] 1 3 4
[1] 1 2 5
Data 3 is made up of several sequences, and each sequence in data 3 is made up of 4 numbers.
It looks like:
[1] 1 2 3 5
[1] 1 3 4 5
[1] 1 2 3 4
[1] 1 2 4 5
Data 4 is made up of several sequences, and each sequence in data 4 is made up of 5 numbers.
.........
.........
Data 11 is made up of several sequences, and each sequence in data 11 is made up of 12 numbers.
Take any sequence from data 1, then add a number to it to get a new sequence (sequence 2).
This new sequence (sequence 2) exists exactly in data 2. Then add another number to sequence 2 to get a new sequence (sequence 3).
This new sequence (sequence 3) exists exactly in data 3..........
...........
The final step is to take any sequence from data 10, then add a number to it to get a new sequence (sequence 11). This new sequence (sequence 11) exists exactly in data 11.
ex:
data 5 → [1] 4 5 8 9 14 15 add "10" → get [1] 4 5 8 9 10 14 15 (exist in data 6)
data 6 → [1] 4 5 8 9 10 14 15 add "11" → get [1] 4 5 8 9 10 11 14 15 (exist in data 7)
I already have a data matrix for each of data 1 to data 11. Data 1 has 48 sequences. Data 2 has 48 sequences. Data 3 has 147 sequences. Data 4 has 73 sequences. Data 5 has 12 sequences... Data 11 has 1 sequence.
I wonder if there is code that can help me find out the relationship between the 11 datasets.
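A possible starting point, sketched on the small examples above rather than the real matrices. It assumes each dataset is held as a list of numeric vectors with no repeated numbers inside a sequence (the rows of a matrix m can be turned into such a list with asplit(m, 1)). The function name `find_links` and the column names are hypothetical:

```r
# Miniature versions of data 1..3, taken from the examples in the question.
data1 <- list(c(1, 2), c(1, 3), c(1, 4))
data2 <- list(c(1, 2, 3), c(1, 2, 4), c(1, 3, 4), c(1, 2, 5))
data3 <- list(c(1, 2, 3, 5), c(1, 3, 4, 5), c(1, 2, 3, 4), c(1, 2, 4, 5))
levels_list <- list(data1, data2, data3)

# For every sequence in `shorter`, list each sequence in `longer` that
# contains it plus exactly one extra number, and record that number.
find_links <- function(shorter, longer) {
  out <- list()
  for (i in seq_along(shorter)) {
    for (j in seq_along(longer)) {
      extra <- setdiff(longer[[j]], shorter[[i]])
      if (length(extra) == 1 && all(shorter[[i]] %in% longer[[j]])) {
        out[[length(out) + 1]] <- data.frame(
          from  = paste(shorter[[i]], collapse = " "),
          added = extra,
          to    = paste(longer[[j]], collapse = " ")
        )
      }
    }
  }
  do.call(rbind, out)
}

# Link each dataset to the next one: data 1 -> data 2, data 2 -> data 3, ...
all_links <- Map(find_links, head(levels_list, -1), tail(levels_list, -1))
all_links[[1]]
```

Each row of a result says which sequence was extended, which number was added, and which longer sequence it became, so tabulating the `added` column per level is one way to look for a pattern.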
I printed out the summary of a column variable as follows.
Please see below the summary table printed out from R:
I would like to turn it into a data.frame. However, there are so many subject names that it is very difficult to list them all. Also, the term "OTHER" with number 31 means that there are 319 subjects which appear only 1 time in the original data.frame.
So, the new data.frame I hope to produce would look like below:
Here is one possible solution.
Table<-table(rpois(100,5))
as.data.frame(Table)
Var1 Freq
1 1 2
2 2 11
3 3 9
4 4 18
5 5 13
6 6 20
7 7 14
8 8 8
9 9 3
10 10 1
11 11 1
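If the subjects that appear only once should be pooled into a single "OTHER" row, as the question describes, the table can be split before it is converted. This is a sketch on made-up names, and the column names `subject` and `freq` are hypothetical:

```r
# Hypothetical column of subject names; "d" and "e" each appear once.
subjects <- c("a", "a", "a", "b", "b", "c", "c", "d", "e")

counts <- table(subjects)

# Keep names seen more than once; pool the singletons under one "OTHER" row.
keep   <- counts[counts > 1]
result <- data.frame(
  subject = c(names(keep), "OTHER"),
  freq    = c(as.vector(keep), sum(counts == 1))
)
result
```

Here the "OTHER" row holds the number of distinct subjects that occur once, which is an assumption about what the summary's "OTHER" figure is meant to show.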