I need to distribute some days across the year.
I have 213 activities and 247 days. I need to plan these activities so that they cover as much of the period as possible.
I subtract the activities from the total days (247 - 213 = 34) and divide the total days by that result: 247/34 = 7.26...
This number tells me that roughly every seventh day has no activity.
To code this part I am doing the following,
where day is the loop variable of a "for" loop over a data frame of dates, and integer is the integer part of 7.26, in this case 7:
if (day %% integer == 0) {
  aditional <- 0
} else {
  aditional <- 1
}
# the same check with integer written out as 7
if (day %% 7 == 0) {
  aditional <- 0
} else {
  aditional <- 1
}
The result will be:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Days 7 and 14 (shown in bold in the original post) are the days without activity.
This works, but it is not as precise as I want.
I know I need to use the decimal part of 7.26 (the .26), but I don't know how to do it.
Can you help me please?
Thanks, and sorry for my English.
Make these 34 days the non-activity days:
round((247/34) * seq(34))
giving:
[1] 7 15 22 29 36 44 51 58 65 73 80 87 94 102 109 116 124 131 138
[20] 145 153 160 167 174 182 189 196 203 211 218 225 232 240 247
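As a minimal sketch of how these indices could feed back into the question's 0/1 flag, the following builds one flag per day and checks the totals. The names total_days, activities, rest_days and additional are chosen here for illustration and are not from the original post.

```r
total_days <- 247
activities <- 213
rest_count <- total_days - activities          # 34 days without activity

# Spread the rest days evenly by rounding multiples of 247/34.
rest_days <- round(total_days / rest_count * seq_len(rest_count))

# 0 on a rest day, 1 on a day with an activity.
additional <- ifelse(seq_len(total_days) %in% rest_days, 0, 1)

sum(additional)   # number of activity days
```

Because the fractional part of 7.26 is carried along before rounding, the gaps between rest days alternate between 7 and 8 instead of always being 7.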
I have to rename several tens of thousands of audio files of 5 seconds each, every one of them cut from a 5-minute recording (5 minutes / 5 seconds = 60 files). To do this I need to set the time (hour, minute and second) at which the 5-minute recording begins. I tried to build a clock that advances in steps of 5 seconds and keeps the seconds, minutes and hours in vectors, so that I can rename the files using these vectors, like this:
stwd("")
name = "Car041512-2021-Pass1-Z2_20210914_211000_" #file name prefix
hours = 21
minutes = 19
seconde = 9
for (i in 0:59) {
seconde[i+1] = secondes + i*5
if(seconde[i+1] >= 60)
seconde[i+1] = seconde[i+1] - 60
minute[i+1]= minutes+1
if (minute >= 60)
minutes = 0
hour[i+1] = hours + 1
}
time = as.character(paste0(hour,minute,seconde))
list =list.files(all.files=F)
rename = paste0(name,time,".wav")
file.rename(list, rename)
I have a problem at the beginning of the loop. The seconds vector wraps around at 60 only during the first two cycles, and I do not see why. This is the first time I have written loops in R, and I must have made a lot of mistakes.
seconde
[1] 9 14 19 24 29 34 39 44 49 54 59 4 9 14 19 24 29 34 39 44 49 54 59 64 69 74 79 84 89 94 99 104 109 114 119 124 129 134 139 144 149 154 159 164 169 174 179 184 189 194 199 204 209 214 219 224 229 234 239 244
The renaming of the files works correctly, it's just the loop that doesn't work correctly. Can you help me?
Thanks in advance.
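One way to avoid the manual carry from seconds to minutes to hours is to let R's date-time arithmetic do it. The sketch below is only an illustration under assumptions: the start time 21:19:09 is taken from the values in the question, the date and the UTC time zone are arbitrary, and the variable names are hypothetical.

```r
prefix <- "Car041512-2021-Pass1-Z2_20210914_211000_"   # file name prefix from the question

# Start of the 5-minute recording (date and time zone are assumptions).
start <- as.POSIXct("2021-09-14 21:19:09", tz = "UTC")

# 60 time stamps, 5 seconds apart; for date-times a numeric `by` is in seconds.
times <- seq(start, by = 5, length.out = 60)

# Format as HHMMSS; minutes and hours roll over automatically.
stamp <- format(times, "%H%M%S")

new_names <- paste0(prefix, stamp, ".wav")
head(new_names, 3)
```

The resulting new_names vector could then be passed to file.rename() together with the output of list.files(), provided both are in the same order.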
I have an excel sheet that I imported into RStudio which contains data for every subject of a certain population. Each subject has their own set of data with corresponding dates, but I only want to look at the data and perform statistical analyses on the dates past a unique date for each subject.
I'm assuming I can use the split function to create smaller dataframes, with each corresponding to that of each subject, and then use some function to analyze the data in a loop to run on all of the smaller dataframes I created from the split.
Some of these subjects will have over 1000 data points. My two main questions are:
1) Is there a function I can use to analyze the data for each subject past a specific unique date to each subject?
2) Is the strategy I proposed above a viable one?
Unfortunately I have very little experience in data analysis and no extensive background in computer science. Thanks for any help.
Edit: There was a request about the type of data I was talking about. I was wondering whether I could still use the above strategy with data similar to this, where P1 and P2 each have their own data set that I want to analyze after the TxDate.
>data
1 Date BMI Glucose Cholesterol TxDate
2 P1 3/3/15
3 12/1/14 24 145 99
4 3/18/15 26 123 101
5 4/21/15 28 111 85
6 6/2/15 25 133 90
7
8
9 P2 4/6/16
10 1/3/16 33 145 200
11 3/30/16 31 162 178
12 5/13/16 34 190 134
13 6/12/16 34 183 168
14 7/9/16 35 200 189
15 9/10/16 31 175 190
16 11/23/17 27 121 120
17
18
Here are some suggestions to get you started:
1) Tidy your data. To do this you could look into ways to modify your input data so it looks more like this:
ID Date BMI Glucose Cholesterol TxDate
3 P1 12/1/14 24 145 99 3/3/15
4 P1 3/18/15 26 123 101 3/3/15
5 P1 4/21/15 28 111 85 3/3/15
6 P1 6/2/15 25 133 90 3/3/15
10 P2 1/3/16 33 145 200 4/6/16
11 P2 3/30/16 31 162 178 4/6/16
12 P2 5/13/16 34 190 134 4/6/16
13 P2 6/12/16 34 183 168 4/6/16
14 P2 7/9/16 35 200 189 4/6/16
15 P2 9/10/16 31 175 190 4/6/16
16 P2 11/23/17 27 121 120 4/6/16
Notice that the ID and TxDate columns are filled in with the appropriate value and that several rows were dropped. Also, the row holding ID, Date, etc. is a header row, not a data row. Don't be too surprised if the tidying step takes longer than the analysis.
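A rough base R sketch of that fill-down step follows. It assumes the import yields a data frame in which ID and TxDate are present only on each subject's header row and NA elsewhere; the real spreadsheet layout may differ, and the small raw data frame here is made up for illustration.

```r
# Hypothetical raw import: ID and TxDate appear only on each subject's header row.
raw <- data.frame(
  ID     = c("P1", NA, NA, "P2", NA, NA),
  Date   = c(NA, "12/1/14", "3/18/15", NA, "1/3/16", "3/30/16"),
  BMI    = c(NA, 24, 26, NA, 33, 31),
  TxDate = c("3/3/15", NA, NA, "4/6/16", NA, NA),
  stringsAsFactors = FALSE
)

is_header <- !is.na(raw$ID)      # TRUE on the rows that introduce a subject
subject   <- cumsum(is_header)   # 1 for P1's rows, 2 for P2's rows

# Copy each header row's ID and TxDate down to the rows below it.
raw$ID     <- raw$ID[is_header][subject]
raw$TxDate <- raw$TxDate[is_header][subject]

# Keep only measurement rows: drop header rows and any blank rows without a Date.
tidy <- raw[!is_header & !is.na(raw$Date), ]
tidy
```

The cumsum() trick relies on the first row being a subject header; if it is not, the leading rows would need to be removed first.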
Now, for the purpose of this example, let's use this as your data:
library(lubridate)  # provides mdy()
df <- data.frame(
  ID = c(rep("P1", 4), rep("P2", 7)),
  Date = as.Date(mdy(c("12/1/14", "3/18/15", "4/21/15", "6/2/15", "1/3/16", "3/30/16", "5/13/16", "6/12/16", "7/9/16", "9/10/16", "11/23/17"))),
  BMI = c(24, 26, 28, 25, 33, 31, 34, 34, 35, 31, 27),
  Glucose = c(145, 123, 111, 133, 145, 162, 190, 183, 200, 175, 121),
  Cholesterol = c(99, 101, 85, 90, 200, 178, 134, 168, 189, 190, 120),
  TxDate = as.Date(mdy(c("3/3/15", "3/3/15", "3/3/15", "3/3/15", "4/6/16", "4/6/16", "4/6/16", "4/6/16", "4/6/16", "4/6/16", "4/6/16"))),
  stringsAsFactors = FALSE)
2) Check to see if your Date and TxDate columns are being represented as date objects. If your data.frame is named 'df', then something like is.Date(df$Date) and is.Date(df$TxDate) (from the lubridate package) will tell you. So will str(df).
If not, read about ways to convert them to date objects, perhaps with the as.Date() function combined with mdy() from the lubridate package.
3) Once you have the dates represented as date objects you could subset the data frame with a simple logical statement, like this
# subset dataframe
df1 <- df[df$Date > df$TxDate, ]
Now df1 should look like this:
ID Date BMI Glucose Cholesterol TxDate
2 P1 2015-03-18 26 123 101 2015-03-03
3 P1 2015-04-21 28 111 85 2015-03-03
4 P1 2015-06-02 25 133 90 2015-03-03
7 P2 2016-05-13 34 190 134 2016-04-06
8 P2 2016-06-12 34 183 168 2016-04-06
9 P2 2016-07-09 35 200 189 2016-04-06
10 P2 2016-09-10 31 175 190 2016-04-06
11 P2 2017-11-23 27 121 120 2016-04-06
What's left is the data you seem to need for your analysis.
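To connect this back to the split-and-loop strategy from the question, here is a hedged sketch that subsets first and then summarises each subject separately. It uses base R date parsing instead of lubridate so that it runs on its own, and the mean BMI is just a stand-in for whatever analysis is actually needed.

```r
df <- data.frame(
  ID = c(rep("P1", 4), rep("P2", 7)),
  Date = as.Date(c("12/1/14", "3/18/15", "4/21/15", "6/2/15", "1/3/16", "3/30/16",
                   "5/13/16", "6/12/16", "7/9/16", "9/10/16", "11/23/17"),
                 format = "%m/%d/%y"),
  BMI = c(24, 26, 28, 25, 33, 31, 34, 34, 35, 31, 27),
  TxDate = as.Date(rep(c("3/3/15", "4/6/16"), times = c(4, 7)), format = "%m/%d/%y"),
  stringsAsFactors = FALSE
)

# Keep only the rows recorded after each subject's own TxDate.
after <- df[df$Date > df$TxDate, ]

# One data frame per subject, then one summary value per subject.
by_subject <- split(after, after$ID)
mean_bmi <- sapply(by_subject, function(d) mean(d$BMI))
mean_bmi
```

So the strategy in the question is viable; filtering before splitting just keeps the per-subject function simple.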
I have the following R script
optioncost =c(5,52,23,15,134,996,2033,18)
options=c(0,1,1,1,1,0,1,1)
cip=c()
for (options_ind in options)
{
if(options_ind==1)
{
cip=append(cip,optioncost[which(options==options_ind)])
}
}
cip
I am trying to get (52 23 15 134 2033 18), whereas when I run the above script I get an output that is 6 times the length of the expected result. The output for cip is " 52 23 15 134 2033 18 52 23 15 134 2033 18 52 23 15 134 2033 18 52 23 15 134 2033 18 52 23 15 134 2033 18 52 23 15 134 2033 18".
Can you please help me find out where I have gone wrong?
optioncost[which(options==options_ind)] selects the information you want on its own.
The for loop is superfluous: it simply repeats that selection once for every "1" in options, and there are 6 of them. That is why your output is 6 times longer than the data you want. Logical indexing gives the same result more directly:
optioncost[as.logical(options)]
If you do want to work with a for loop, this is one way to do it:
cip=c()
for (i in seq_along(options))
{
if(options[i]==1)
{
cip=append(cip,optioncost[i])
}
}
cip
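As a quick check, assuming the same two vectors as in the question, the vectorized selections and the corrected loop can be compared directly. The names picked_logical, picked_which and cip_loop are introduced here only for the comparison.

```r
optioncost <- c(5, 52, 23, 15, 134, 996, 2033, 18)
options <- c(0, 1, 1, 1, 1, 0, 1, 1)

picked_logical <- optioncost[options == 1]         # logical indexing
picked_which   <- optioncost[which(options == 1)]  # integer positions

# Loop version: test one position at a time and append one value at a time.
cip_loop <- c()
for (i in seq_along(options)) {
  if (options[i] == 1) {
    cip_loop <- append(cip_loop, optioncost[i])
  }
}

identical(picked_logical, picked_which)   # both give 52 23 15 134 2033 18
identical(picked_logical, cip_loop)
```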
I am trying to process DSC (differential scanning calorimetry) data with R, but I have run into some trouble. In my lab this used to be done tediously in Origin or QtiPlot, and I wonder whether it can be done in batch instead. So far it has not gone well. For example, maybe I have used the wrong column names for my data.frame, because the code
dat$0.5min
Error: unexpected numeric constant in "dat$0.5"
cannot reach my data.
Below is the full description of what I am trying to do. Thank you in advance!
The DSC data looks like this (I store the CSV file in my GoogleDrive Link):
T1 0.5min T2 1min
40.59 -0.2904 40.59 -0.2545
40.81 -0.281 40.81 -0.2455
41.04 -0.2747 41.04 -0.2389
41.29 -0.2728 41.29 -0.2361
41.54 -0.2553 41.54 -0.2239
41.8 -0.07 41.8 -0.0732
42.06 0.1687 42.06 0.1414
42.32 0.3194 42.32 0.2817
42.58 0.3814 42.58 0.3421
42.84 0.3863 42.84 0.3493
43.1 0.3665 43.11 0.3322
43.37 0.3438 43.37 0.3109
43.64 0.3265 43.64 0.2937
43.9 0.3151 43.9 0.2819
44.17 0.3072 44.17 0.2735
44.43 0.2995 44.43 0.2656
44.7 0.2899 44.7 0.2563
44.96 0.2779 44.96 0.245
In fact I have merged the data into a data.frame and hope I can adjust it and process it further.
The command is:
dat<-read.csv("Book1.csv",header=F)
colnames(dat)<-c('T1','0.5min','T2','1min','T3','2min','T4','4min','T5','8min','T6','10min',
'T7','20min','T8','ascast1','T9','ascast2','T10','ascast3','T11','ascast4',
'T12','ascast5'
)
So dat is a data.frame with 1163 obs. of 24 variables.
T1, T2, T3, ..., T12 are the temperatures at which the samples were measured by DSC. Although they cover the same interval, they differ slightly because of the instability of the machine.
The column next to each of T1~T12 is the heat flow recorded by the machine for a different heat-treatment duration. ascast1~ascast5 are samples that received no treatment, used to check the accuracy of the machine.
Now I need to do the following:
T1~T12 are in degrees Celsius, and I need to convert them to kelvin, which means adding 273.15 to every value.
Two temperatures are chosen to compare the results: Ts=180.25 and Te=240.45 (both in degrees Celsius; I have checked them in QtiPlot). To be clear, I list the rows for these two temperatures and the first 8 columns of data.
T1 0.5min T2 1min T3 2min T4 4min
180.25 -0.01710000 180.25 -0.01780000 180.25 -0.02120000 180.25 -0.02020000
. . . .
. . . .
240.45 0.05700000 240.45 0.04500000 240.45 0.05780000 240.45 0.05580000
The heat flow at Ts should be the same in every column, and it can be set to 0 for convenience. So for each heat flow column (0.5min, 1min, 2min, 4min, 8min, 10min, 20min and ascast1~ascast5), every value should have that column's heat flow at Ts subtracted from it.
The heat flow at Te should then be adjusted so that all heat flow columns have the same value at Te. The procedure is: (1) Calculate the mean of the 12 heat flow values at Te and call it Hmean. Hmean is the value every heat flow column should have at Te. (2) For the data in column 0.5min, which I denote col("0.5min"), the linear transformation is:
col("0.5min")-[([0.05700000-(-0.01710000)]-Hmean)/(Te-Ts)]*(col(T1)-Ts)
Here [0.05700000-(-0.01710000)] is the heat flow at Te after the subtraction in the previous step; I write it out for your reference. The same formula is applied to each pair of temperature and heat flow columns, (T1, 0.5min), (T2, 1min), (T3, 2min), ..., 12 pairs in all.
Then the 12 pairs can be plotted on the same plot over the interval 180~240 (also in degrees Celsius) to magnify the differences between the DSC scans.
I have been stuck on this problem for 2 days, so I am turning to Stack Overflow for help.
Thanks!
I am assuming that your question is the one at the beginning, where you got the following error:
dat$0.5min
Error: unexpected numeric constant in "dat$0.5"
I could not find a question in the rest of the steps; they read like a step-by-step procedure for an experiment.
The error occurs because the column name starts with a digit, so it is not a syntactic R name. To reference such a column with $, wrap the name in backticks (`):
>dataF <- data.frame("0.5min"=1:10,"T2"=11:20,check.names = F)
> dataF$`0.5min`
[1] 1 2 3 4 5 6 7 8 9 10
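For completeness, a small sketch of two alternatives to backticks, using the same toy data frame: indexing with [[ ]] and a character string, or converting the names to syntactic ones with make.names(). Which is preferable depends on how the columns are used later.

```r
dataF <- data.frame("0.5min" = 1:10, "T2" = 11:20, check.names = FALSE)

by_backtick <- dataF$`0.5min`      # backticks around the non-syntactic name
by_string   <- dataF[["0.5min"]]   # the same column, selected by a string

# make.names() prefixes names that start with a digit with "X".
syntactic <- make.names(colnames(dataF))
syntactic
```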
Based on the additional information in the comments:
You can add a constant to alternate columns in the following manner,
dataF <- data.frame(matrix(1:100,10,10))
const <- 237
> print(dataF)
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 1 11 21 31 41 51 61 71 81 91
2 2 12 22 32 42 52 62 72 82 92
3 3 13 23 33 43 53 63 73 83 93
4 4 14 24 34 44 54 64 74 84 94
5 5 15 25 35 45 55 65 75 85 95
6 6 16 26 36 46 56 66 76 86 96
7 7 17 27 37 47 57 67 77 87 97
8 8 18 28 38 48 58 68 78 88 98
9 9 19 29 39 49 59 69 79 89 99
10 10 20 30 40 50 60 70 80 90 100
dataF[,seq(1,ncol(dataF),by = 2)] <- dataF[,seq(1,ncol(dataF),by = 2)] + const
> print(dataF)
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 238 11 258 31 278 51 298 71 318 91
2 239 12 259 32 279 52 299 72 319 92
3 240 13 260 33 280 53 300 73 320 93
4 241 14 261 34 281 54 301 74 321 94
5 242 15 262 35 282 55 302 75 322 95
6 243 16 263 36 283 56 303 76 323 96
7 244 17 264 37 284 57 304 77 324 97
8 245 18 265 38 285 58 305 78 325 98
9 246 19 266 39 286 59 306 79 326 99
10 247 20 267 40 287 60 307 80 327 100
To generalize, we know that the columns of a dataframe can be referenced with a vector of numbers/column names. Most operations in R are vectorized. You can use column names or numbers based on the pattern you are looking for.
For example, if I change the names of my first two columns and want to access just those, I do this:
colnames(dataF)[c(1,2)] <- c("Y1","Y2")
#Reference all column names with "Y" in it. You can do any operation you want on this.
dataF[,grep("Y",colnames(dataF))]
Y1 Y2
1 238 11
2 239 12
3 240 13
4 241 14
5 242 15
6 243 16
7 244 17
8 245 18
9 246 19
10 247 20
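The baseline adjustment described in the question can be sketched along the same lines. This is an illustration under assumptions, not a tested DSC workflow: the data frame below is a made-up two-pair example (the real data has 12 pairs), temperature and heat flow columns are assumed to alternate, and the row closest to Ts or Te is used because the recorded temperatures differ slightly between scans. The helper nearest() and the names t_cols, h_cols, h_te and h_mean are hypothetical.

```r
# Made-up data: two (temperature, heat flow) pairs instead of twelve.
temp <- c(180.25, 200, 220, 240.45)
dat <- data.frame(T1 = temp, "0.5min" = c(-0.0171, 0.01, 0.03, 0.0570),
                  T2 = temp, "1min"   = c(-0.0178, 0.00, 0.02, 0.0450),
                  check.names = FALSE)
Ts <- 180.25
Te <- 240.45

t_cols <- seq(1, ncol(dat), by = 2)   # temperature columns
h_cols <- t_cols + 1                  # matching heat flow columns
nearest <- function(x, target) which.min(abs(x - target))

# Step 1: shift every heat flow column so that its value at Ts is 0.
for (k in seq_along(t_cols)) {
  h <- dat[[h_cols[k]]]
  dat[[h_cols[k]]] <- h - h[nearest(dat[[t_cols[k]]], Ts)]
}

# Step 2: mean of the shifted heat flows at Te.
h_te <- sapply(seq_along(t_cols), function(k) {
  dat[[h_cols[k]]][nearest(dat[[t_cols[k]]], Te)]
})
h_mean <- mean(h_te)

# Step 3: linear correction so that every column equals h_mean at Te.
for (k in seq_along(t_cols)) {
  slope <- (h_te[k] - h_mean) / (Te - Ts)
  dat[[h_cols[k]]] <- dat[[h_cols[k]]] - slope * (dat[[t_cols[k]]] - Ts)
}
dat
```

Because the correction is proportional to (T - Ts), the value at Ts stays at 0 while the value at Te moves to h_mean, which is what the formula in the question asks for.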
Here is a subset of my data:
Fr Sig Code NumDet Date.Time Aerial
62 150102 102 15 195 2012-09-14 18:28:00 1
63 150102 102 15 189 2012-09-14 18:32:00 1
64 150102 106 15 213 2012-09-14 18:36:00 1
65 150102 102 15 152 2012-09-14 18:40:00 1
66 150102 105 15 190 2012-09-14 18:46:00 1
67 150102 97 15 4 2012-09-14 18:51:00 2
I am trying to calculate the time between the first detection on Aerial 1 and the first detection on Aerial 2. In this data set that would be 23 mins.
I have tried variations of difftime but can't seem to select specific times based on the Aerial number.
I have tried:
a <- difftime(table$Date.Time[2:length(table$Aerial == "1")],
table$Date.Time[2:length(table$Aerial == "2")])
but it's not even close.
This command using difftime
difftime(table$Date.Time[table$Aerial == "2"][1],
table$Date.Time[table$Aerial == "1"][1])
will return
Time difference of 23 mins
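A self-contained version of the same idea, rebuilt from the rows shown in the question, is sketched below. The data frame name detections and the UTC time zone are assumptions, and the rows are assumed to be sorted by time so that the first match for each aerial is also the earliest detection.

```r
detections <- data.frame(
  Date.Time = as.POSIXct(c("2012-09-14 18:28:00", "2012-09-14 18:32:00",
                           "2012-09-14 18:36:00", "2012-09-14 18:40:00",
                           "2012-09-14 18:46:00", "2012-09-14 18:51:00"),
                         tz = "UTC"),
  Aerial = c(1, 1, 1, 1, 1, 2)
)

# match() returns the position of the first row for each aerial.
first1 <- detections$Date.Time[match(1, detections$Aerial)]
first2 <- detections$Date.Time[match(2, detections$Aerial)]

gap <- difftime(first2, first1, units = "mins")
gap
```

Setting units explicitly keeps the result in minutes even when the gap is long enough that difftime() would otherwise switch to hours or days.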