Batch Editing TXT to CSV in R - r

I have a series of .txt files that look like this:
Button,Intensity,Acc,Intensity,RT,Time
0,30,0,0,0,77987.931
1,30,1,13.5,0,78084.57
1,30,1,15,0,78098.624
1,30,1,6,0,78114.132
1,30,1,15,0,78120.669
They have file names like 1531_Day49.txt, 1531_Day50.txt, 1532_Day49.txt, 1532_Day50.txt, etc.
I want to load all the files in this directory into data frames and append three columns to each: Tdelta, the difference between each row's Time and the Time in the row above; ID, the first four digits of the file name (e.g. 1531, 1532); and PrePost, the decoded Day code, which is "Pre" if the file name contains Day49 and "Post" if it contains Day50.
So ideal output for a 1531 Day 49 file would be:
Button,Intensity,Acc,Intensity,RT,Time,Tdelta,ID,PrePost
0,30,0,0,0,77987.931,0,1531,Pre
1,30,1,13.5,0,78084.57,96.639,1531,Pre
1,30,1,15,0,78098.624,14.054,1531,Pre
So far I have:
#call library
library(data.table)
#batch enter .txt files and put them into a data frame
setwd("~/Documents/PVTPASAT/PVT")
temp = list.files(pattern="*.txt")
list.DFs <- lapply(temp,fread)
#view print out to visually check
View(list.DFs)
#add column of time difference
list.DFs <- lapply(list.DFs, function(d) cbind(d, tDelta = c(0, diff(d$Time))))
#Add empty columns for ID and PrePost
list.DFs <- lapply(list.DFs, cbind, ID = c(""))
list.DFs <- lapply(list.DFs, cbind, PrePost = c(""))
#print one to visually check
View(list.DFs[3])

I would create a function to do the processing and then apply it to your list of files like so:
example <- read.delim(textConnection('
Button, Intensity, Acc, Intensity, RT, Time
0,30,0,0,0,77987.931
1,30,1,13.5,0,78084.57
1,30,1,15,0,78098.624
1,30,1,6,0,78114.132
1,30,1,15,0,78120.669'),
header = T,
sep = ','
)
write.table(example, '1531_Day49.txt', row.names = F)
temp <- list.files(pattern="*.txt")
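process_txt <- function(x) {
  # read one file; the first row holds the column names
  dat <- data.table::fread(x, header = TRUE)
  # time difference to the previous row, 0 for the first row
  dat$tdelta <- c(0, diff(dat$Time))
  # the first four characters of the file name are the ID
  dat$ID <- substr(x, 1, 4)
  # file names containing "49." are Pre, everything else is Post
  dat$PrePost <- if (grepl('49\\.', x)) {'Pre'} else {'Post'}
  dat
}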
out <- lapply(temp, process_txt)
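As a rough sketch of the file-name handling on its own, the ID and the Day code can be split out with base R before any data is read. The helper name parse_name and the Day column are made up for illustration, and the sketch assumes every file follows the ID_DayNN.txt pattern from the question:

```r
# Hypothetical helper: split a name like "1531_Day49.txt" into its parts.
# Assumes the pattern <ID>_Day<NN>.txt described in the question.
parse_name <- function(path) {
  stem  <- tools::file_path_sans_ext(basename(path))  # "1531_Day49"
  parts <- strsplit(stem, "_", fixed = TRUE)[[1]]     # c("1531", "Day49")
  day   <- as.integer(sub("^Day", "", parts[2]))      # 49
  data.frame(ID = parts[1],
             Day = day,
             PrePost = if (day == 49) "Pre" else "Post",
             stringsAsFactors = FALSE)
}

# One row of metadata per file name
meta <- do.call(rbind, lapply(c("1531_Day49.txt", "1532_Day50.txt"), parse_name))
meta
```

Splitting on the underscore rather than taking the first four characters keeps working if an ID ever has a different number of digits.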

Heather, the main guidance is to first solve the problem properly for one file, then place all that working code into a function.
library(data.table) ## for fread
library(dplyr) ## for lag function
library(stringr) ## for str_detect
# make two test files
dt <- read.csv(text=
'Button,Intensity,Acc,Intensity,RT,Time
0,30,0,0,0,77987.931
1,30,1,13.5,0,78084.57
1,30,1,15,0,78098.624
1,30,1,6,0,78114.132
1,30,1,15,0,78120.669
')
write.csv(dt,"1531_Day49.txt")
write.csv(dt,"1532_Day50.txt")
# function to do the work for one file name - returns a dataframe
doOne <- function (file) {
# read
contents <- fread(file)
# compute delta
contents$Tdelta <- contents$Time - lag(contents$Time)
# prefix up to underscore
contents$ID <- strsplit(file, c("_"))[[1]][[1]]
# add the prepost using ifelse and str_detect
contents$PrePost <- ifelse(str_detect(file, "Day49"), "Pre", "Post")
return(contents)
}
#test files
files <- c("1531_Day49.txt", "1532_Day50.txt")
# call the function for each file -- result is
# a list of dataframes
lapply(files, doOne)
# better get them all into a single data frame for analysis
do.call(rbind, lapply(files, doOne))
# V1 Button Intensity Acc Intensity.1 RT Time Tdelta ID PrePost
# 1: 1 0 30 0 0.0 0 77987.93 NA 1531 Pre
# 2: 2 1 30 1 13.5 0 78084.57 96.639 1531 Pre
# 3: 3 1 30 1 15.0 0 78098.62 14.054 1531 Pre
# 4: 4 1 30 1 6.0 0 78114.13 15.508 1531 Pre
# 5: 5 1 30 1 15.0 0 78120.67 6.537 1531 Pre
# 6: 1 0 30 0 0.0 0 77987.93 NA 1532 Post
# 7: 2 1 30 1 13.5 0 78084.57 96.639 1532 Post
# 8: 3 1 30 1 15.0 0 78098.62 14.054 1532 Post
# 9: 4 1 30 1 6.0 0 78114.13 15.508 1532 Post
# 10: 5 1 30 1 15.0 0 78120.67 6.537 1532 Post
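Note that the lag-based difference leaves NA in the first row of each file, while the question asks for 0 there. A small base R sketch of the two variants, using the Time values from the question (the variable names are only for illustration):

```r
# Time values copied from the sample file in the question
time <- c(77987.931, 78084.57, 78098.624, 78114.132, 78120.669)

# Equivalent to time - dplyr::lag(time): the first element has no predecessor
tdelta_na <- time - c(NA, head(time, -1))

# diff() returns n - 1 differences, so prepend 0 for the first row
tdelta_zero <- c(0, diff(time))

tdelta_na
tdelta_zero
```

Which of the two is appropriate depends on whether a missing first difference should count as zero in later averages.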

Related

When a variable switches from 1 to 2, delete some data from the other variables and average what's left?

I am analysing some data and need help.
Basically, I have a dataset that looks like this:
date <- seq(as.Date("2017-04-01"),as.Date("2017-05-09"),length.out=40)
switch <- c(rep(1:2,each=10),rep(1:2,each=10))
O2 <- runif(40,min=21.02,max=21.06)
CO2 <- runif(40,min=0.076,max=0.080)
test.data <- data.frame(date,switch,O2,CO2)
As can be seen, there's a switch column that switches between 1 and 2 every 10 data points. I want to write a code that does: when the "switch" column changes its value (from 1 to 2, or 2 to 1), delete the first 5 rows of data after the switch (i.e. leaving the 5 last data points for all the 4 variables), average the rest of the data points for O2 and CO2, and put them in 2 new columns (avg.O2 and avg.CO2) before the next switch. Then repeat this process until the end.
It's quite easy to do manually on paper or in Excel, but my real dataset comprises thousands of data points and I would like R to do it automatically. Does anyone have any ideas that could help me?
Please find my edits below, which should work for both regular and irregular switching:
date <- seq(as.Date("2017-04-01"),as.Date("2017-05-09"),length.out=40)
switch <- c(rep(1:2,each=10),rep(1:2,each=10))
O2 <- runif(40,min=21.02,max=21.06)
CO2 <- runif(40,min=0.076,max=0.080)
test.data <- data.frame(date,switch,O2,CO2)
CleanMachineData <- function(Data, SwitchData, UnreliableRows = 5){
# First, we can properly turn your switch column into a grouping column (1,2,1,2)->(1,2,3,4)
grouplength <- rle(Data[,"switch"])$lengths
# mapply lets us input vector arguments into typically one/first-element only argument functions.
# In this case we create a sequence of lengths (output is a list/vector)
grouping <- mapply(seq, grouplength)
# Here we want it to become a single vector representing groups
groups <- mapply(rep, 1:length(grouplength), each = grouplength)
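# if frequency was irregular, it will be a list, if regular it will be a matrix
# convert either into a vector by doing as follows
# (is.list() is used because class() of a matrix has length 2 in R >= 4.0,
# which breaks a class(x) == "list" test inside if())
if(is.list(groups)){
groups <- unlist(groups)
} else {
groups <- as.vector(groups)
}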
Data$group <- groups
#
# vector of the first row of each new switch (except the starting 0)
switchRow <- c(0,which(abs(diff(SwitchData)) == 1))+1
# I use "as.vector" to turn the matrix output of mapply into a sequence of numbers.
# "ToRemove" will have all the row numbers to get rid of from your original data, except for what happens before (in this case) row 10
ToRemove <- c(1:UnreliableRows, as.vector(mapply(seq, switchRow, switchRow+(UnreliableRows)-1)))
# Keep holds the complementary row numbers: I concatenate the missing beginning (1,2,3,4,5) and ToRemove with c() and then drop them from the row sequence with "-"
Keep <- seq(nrow(Data))[-c(1:UnreliableRows,ToRemove)]
# Create the new data, (in case you don't know: data[<ROW>,<COLUMN>])
newdat <- Data[-ToRemove,]
# print the results
newdat
}
dat <- CleanMachineData(test.data, test.data$switch, 5)
dat
date switch O2 CO2 group
6 2017-04-05 1 21.03922 0.07648886 1
7 2017-04-06 1 21.04071 0.07747368 1
8 2017-04-07 1 21.05742 0.07946615 1
9 2017-04-08 1 21.04673 0.07782362 1
10 2017-04-09 1 21.04966 0.07936446 1
16 2017-04-15 2 21.02526 0.07833825 2
17 2017-04-16 2 21.04511 0.07747774 2
18 2017-04-17 2 21.03165 0.07662803 2
19 2017-04-18 2 21.03252 0.07960098 2
20 2017-04-19 2 21.04032 0.07892145 2
26 2017-04-25 1 21.03691 0.07691438 3
27 2017-04-26 1 21.05846 0.07857017 3
28 2017-04-27 1 21.04128 0.07891908 3
29 2017-04-28 1 21.03837 0.07817021 3
30 2017-04-29 1 21.02334 0.07917546 3
36 2017-05-05 2 21.02890 0.07723042 4
37 2017-05-06 2 21.04606 0.07979641 4
38 2017-05-07 2 21.03822 0.07985775 4
39 2017-05-08 2 21.04136 0.07781525 4
40 2017-05-09 2 21.05375 0.07941123 4
aggregate(cbind(O2,CO2) ~ group, dat, mean)
group O2 CO2
1 1 21.04675 0.07812336
2 2 21.03497 0.07819329
3 3 21.03967 0.07834986
4 4 21.04166 0.07882221
# crazier, irregular switching
test.data2 <- test.data
test.data2$switch <- unlist(mapply(rep, 1:2, times = 1, each = c(10,8,10,5,3,10)))[1:20]
dat2 <- CleanMachineData(test.data2, test.data2$switch, 5)
dat2
date switch O2 CO2 group
6 2017-04-05 1 21.03922 0.07648886 1
7 2017-04-06 1 21.04071 0.07747368 1
8 2017-04-07 1 21.05742 0.07946615 1
9 2017-04-08 1 21.04673 0.07782362 1
10 2017-04-09 1 21.04966 0.07936446 1
16 2017-04-15 2 21.02526 0.07833825 2
17 2017-04-16 2 21.04511 0.07747774 2
18 2017-04-17 2 21.03165 0.07662803 2
24 2017-04-23 1 21.05658 0.07669662 3
25 2017-04-24 1 21.04452 0.07983165 3
26 2017-04-25 1 21.03691 0.07691438 3
27 2017-04-26 1 21.05846 0.07857017 3
28 2017-04-27 1 21.04128 0.07891908 3
29 2017-04-28 1 21.03837 0.07817021 3
30 2017-04-29 1 21.02334 0.07917546 3
36 2017-05-05 2 21.02890 0.07723042 4
37 2017-05-06 2 21.04606 0.07979641 4
38 2017-05-07 2 21.03822 0.07985775 4
# You can try several values of UnreliableRows (here 5, 6 and 7) with the following
lapply(5:7, function(x) {
dat <- CleanMachineData(test.data2, test.data2$switch, x)
list(data = dat, means = aggregate(cbind(O2,CO2)~group, dat, mean))
})
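For comparison, here is a shorter base R sketch of the same idea: number the runs with rle(), compute each row's position within its run, and keep only rows past the unreliable ones. It is an alternative formulation, not the function above; the seed, the data frame d, and the columns group and pos are assumptions made for the example:

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Same shape as the question's data: the switch flips every 10 rows
d <- data.frame(switch = c(rep(1:2, each = 10), rep(1:2, each = 10)),
                O2  = runif(40, min = 21.02, max = 21.06),
                CO2 = runif(40, min = 0.076, max = 0.080))

# Run lengths of the switch column, then one group id per run (1, 2, 3, ...)
runs    <- rle(d$switch)$lengths
d$group <- rep(seq_along(runs), times = runs)

# Position of each row inside its run (1, 2, ..., run length)
d$pos <- ave(seq_len(nrow(d)), d$group, FUN = seq_along)

# Drop the first 5 rows after every switch, then average what is left
kept <- d[d$pos > 5, ]
avg  <- aggregate(cbind(O2, CO2) ~ group, data = kept, FUN = mean)
avg
```

Because the group ids come from the run lengths, runs of unequal length need no special handling; a run with 5 or fewer rows simply contributes nothing.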
Use
test.data[rep(c(FALSE, TRUE), each=5),]
to select always the last five rows from the group of 10 rows.
Then you can use aggregate:
d2 <- test.data[rep(c(FALSE, TRUE), each=5),]
aggregate(cbind(O2, CO2) ~ 1, data=d2, FUN=mean)
If you want the average for every 5-rows-group:
aggregate(cbind(O2, CO2) ~ gl(k=5, n=nrow(d2)/5L), data=d2, FUN=mean)
Here is a generalization for the situation of arbitrary number of rows in test.data:
stay <- rep(c(FALSE, TRUE), each=5, length.out=nrow(test.data))
d2 <- test.data[stay,]
group <- gl(k=5, n=nrow(d2)/5L+1L, length=nrow(d2))
aggregate(cbind(O2, CO2) ~ group, data=d2, FUN=mean)
Here is a variant for mixing the data with the averages:
group <- gl(k=10, n=nrow(test.data)/10L+1L, length=nrow(test.data))
L <- split(test.data, group)
mySummary <- function(x) {
if (nrow(x) <= 5) return(NULL)
x <- x[-(1:5),]
d.avg <- aggregate(cbind(O2, CO2) ~ 1, data=x, FUN=mean)
rbind(x, cbind(date=NA, switch=-1, d.avg))
}
lapply(L, mySummary) # as list of dataframes
do.call(rbind, lapply(L, mySummary)) # as one dataframe
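If the averages should sit next to the kept rows as a new column, as the question describes with avg.O2, ave() is one way to do it. This is only a sketch with made-up O2 values, and it assumes regular blocks of 10 rows; block and keep are illustrative names:

```r
# Made-up data: 40 rows, switch flips every 10 rows
d <- data.frame(switch = rep(rep(1:2, each = 10), 2),
                O2 = seq(21.02, 21.06, length.out = 40))

# Block id for every row (assumes regular blocks of 10 rows)
block <- rep(seq_len(nrow(d) / 10), each = 10)

# Keep the last five rows of every block
keep <- rep(c(FALSE, TRUE), each = 5, length.out = nrow(d))
d2 <- d[keep, ]

# ave() returns the block mean repeated for every row of that block
d2$avg.O2 <- ave(d2$O2, block[keep], FUN = mean)
head(d2)
```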

List to dataframe using names as values for column in R

I have 88 tab separated files that I need to import into R.
They are named "Study-1-12"
Study: name of study
1: subject id
[1]2: experimental day (either 1 or 2)
1[2]: trial (either 1 or 2)
The data in each one looks like
START: dd.mm.yyy hh:mm:ss
WAITING 3780 ms REACTION 1230 ms
WAITING 9700 ms REACTION 377 ms
WAITING 5538 ms REACTION 310 ms
WAITING 4599 ms REACTION 361 ms
WAITING 9579 ms REACTION 338 ms
END: dd.mm.yyy hh:mm:ss
So far I imported all of them into a list and summarised each one, so the end results is a table with two columns "waiting" and "reaction" both with a single mean value.
# Load filepaths and names
filepath <- list.files(path = "rawdata/", pattern = "*.dat", all.files = TRUE, full.names = TRUE) # Load full path
filenames <- list.files(path = "rawdata/", pattern = "*.dat", all.files = TRUE, full.names = FALSE) # load names of files
# load all files into list with named col headers
ldf <- lapply(filepath, function(x) read_tsv(file = x, skip = 1,
col_names = c("waiting", "valueW", "ms", "ws", "reaction", "valueR", "ms1")))
names(ldf) <- filenames # rename items in list
# select only relevant cols and do the math
ldf <- lapply(ldf, function(x) x %>%
select(waiting, valueW, reaction, valueR) %>%
filter(waiting == "WAITING") %>%
summarise(waiting = mean(valueW), reaction = mean(valueR))
)
Now what I would like to do is create a data frame with columns based on the file name (as above: study-1-12):
id: the first 1
exp: 1 or 2
trial: 1 or 2
waiting: the value from each data frame in the list
reaction: the value from each data frame in the list
Any way of doing this in R?
library(purrr)
library(stringi)
fils <- list.files("~/Data/so", full.names=TRUE)
fils
## [1] "/Some/path/to/data/studyA-1-12" "/Some/path/to/data/studyB-30-31"
map_df(fils, function(x) {
stri_match_all_regex(x, "([[:alnum:]]+)-([[:digit:]]+)-([[:digit:]])([[:digit:]])")[[1]] %>%
as.list() %>%
.[2:5] %>%
set_names(c("study_name", "subject_id", "experiment_day", "trial")) -> meta
readLines(x) %>%
grep("WAITING", ., value=TRUE) %>%
map(~scan(text=., quiet=TRUE,
what=list(character(), double(), character(),
character(), double(), character()))[c(2,5)]) %>%
map_df(~set_names(as.list(.), c("waiting", "reaction"))) -> df
df$study_name <- meta$study_name
df$subject_id <- meta$subject_id
df$experiment_day <- meta$experiment_day
df$trial <- meta$trial
df
})
## # A tibble: 10 × 6
## waiting reaction study_name subject_id experiment_day trial
## <dbl> <dbl> <chr> <chr> <chr> <chr>
## 1 3780 1230 studyA 1 1 2
## 2 9700 377 studyA 1 1 2
## 3 5538 310 studyA 1 1 2
## 4 4599 361 studyA 1 1 2
## 5 9579 338 studyA 1 1 2
## 6 3780 1230 studyB 30 3 1
## 7 9700 377 studyB 30 3 1
## 8 5538 310 studyB 30 3 1
## 9 4599 361 studyB 30 3 1
## 10 9579 338 studyB 30 3 1
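The file-name part can also be done with base R alone. The following is a minimal sketch, assuming names of the form study-id-daytrial as in the question; the helper parse_study_name is hypothetical and the two example names are taken from the output above:

```r
# Hypothetical helper: turn "studyA-1-12" into study / id / exp / trial.
# Assumes the last two digits are experimental day and trial, as in the question.
parse_study_name <- function(path) {
  stem <- tools::file_path_sans_ext(basename(path))  # drops ".dat" if present
  m <- regmatches(stem,
                  regexec("^([[:alnum:]]+)-([0-9]+)-([0-9])([0-9])$", stem))[[1]]
  data.frame(study = m[2],
             id    = as.integer(m[3]),
             exp   = as.integer(m[4]),
             trial = as.integer(m[5]),
             stringsAsFactors = FALSE)
}

meta <- do.call(rbind, lapply(c("studyA-1-12", "studyB-30-31"), parse_study_name))
meta
```

The resulting metadata rows could then be bound column-wise to the per-file summaries, since both are built in the same file order.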

Is there a way to stop table from sorting in R

Problem setup: Creating a function to take multiple CSV files selected by ID column and combine into 1 csv, then create an output of number of observations by ID.
Expected:
complete("specdata", 30:25) ##notice descending order of IDs requested
## id nobs
## 1 30 932
## 2 29 711
## 3 28 475
## 4 27 338
## 5 26 586
## 6 25 463
I get:
> complete("specdata", 30:25)
id nobs
1 25 463
2 26 586
3 27 338
4 28 475
5 29 711
6 30 932
Which is "wrong" because it has been sorted by id.
The CSV file I read from does have the data in descending order. My snippet:
dfTable<-read.csv("~/progAssign1/specdata/tmpdata.csv")
ccTab<-complete.cases(dfTable)
xTab3<-as.data.frame(table(dfTable$ID[ccTab]),)
colnames(xTab3)<-c("id","nobs")
And as near as I can tell, the third line is where sorting occurs. I broke out the expression and it happens in the table() call. I've not found any option or parameter I can pass to make something like sort=FALSE. You'd think...
Anyway. Any help appreciated!
So, the problem is in the output of table, which is sorted by default. For example:
> r = sample(5,15,replace = T)
> r
[1] 1 4 1 1 3 5 3 2 1 4 2 4 2 4 4
> table(r)
r
1 2 3 4 5
4 3 2 5 1
If you want to take the order of first appearance, you are going to get your hands a little bit dirty by recoding the table function:
unique_r = unique(r)
table_r = rbind(label=unique_r, count=sapply(unique_r,function(x)sum(r==x)))
table_r
[,1] [,2] [,3] [,4] [,5]
label 1 4 3 5 2
count 4 5 2 1 3
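A different way to get the same ordering, sketched here with the vector r from the example, is to keep table() but hand it a factor whose levels are already in order of first appearance, since table() follows the level order:

```r
# The sample vector printed above
r <- c(1, 4, 1, 1, 3, 5, 3, 2, 1, 4, 2, 4, 2, 4, 4)

# Levels in order of first appearance instead of sorted order
counts <- table(factor(r, levels = unique(r)))

# Same two-column layout as in the question
res <- as.data.frame(counts, stringsAsFactors = FALSE)
names(res) <- c("id", "nobs")
res
```

The id column comes back as character here, so it would need as.integer() if numeric IDs are required.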
One way to get around this is...don't use table. Here's an example where I create three one-line data sets from your data. Then I read them in with a descending sequence, with read.table and it seems to be okay.
The real big thing here is that multiple data sets should be placed in a list upon being read into R. You'll get the exact order of data sets you want that way, among other benefits.
Once you've read them into R the way you want them, it's much easier to order them at the very end. Ordering of rows (for me) is usually the very last step.
> dat <- read.table(h=T, text = "id nobs
1 25 463
2 26 586
3 27 338
4 28 475
5 29 711
6 30 932")
Write three one-line files:
> write.table(dat[3,], "dat3.csv", row.names = FALSE)
> write.table(dat[2,], "dat2.csv", row.names = FALSE)
> write.table(dat[1,], "dat1.csv", row.names = FALSE)
Read them in using a 3:1 order:
> do.call(rbind, lapply(3:1, function(x){
read.table(paste0("dat", x, ".csv"), header = TRUE)
}))
# id nobs
# 1 27 338
# 2 26 586
# 3 25 463
Then, if we change 3:1 to 1:3 the rows "comply" with our request
> do.call(rbind, lapply(1:3, function(x){
read.table(paste0("dat", x, ".csv"), header = TRUE)
}))
# id nobs
# 1 25 463
# 2 26 586
# 3 27 338
And just for fun
> fun <- function(z){
do.call(rbind, lapply(z, function(x){
read.table(paste0("dat", x, ".csv"), header = TRUE) }))
}
> fun(c(2, 3, 1))
# id nobs
# 1 26 586
# 2 27 338
# 3 25 463
You may try something like this:
t1 <- c(5,3,1,3,5,5,5)
as.data.frame(table(t1)) ##result in ascending order
# t1 Freq
#1 1 1
#2 3 2
#3 5 4
t1 <- factor(t1)
# reorder the levels by descending frequency (each occurrence contributes -1 to the sum)
as.data.frame(table(reorder(t1, rep(-1, length(t1)),sum)))
# Var1 Freq
#1 5 4
#2 3 2
#3 1 1
In your case, the complaint is that the table function with a single argument returns the items with the names in ascending order, and you want them in descending order. You could have simply used the rev() function around the table call.
xTab3<-as.data.frame( rev( table( dfTable$ID[ccTab] ) ),)
(I'm not sure what that last comma is doing in there.) The sort order in the original would not be expected to determine the order of a table operation. Generally R will return results with discrete labels sorted in alpha (ascending) order unless the levels of a factor item have been specified differently. That's one of those R-specific rules that may be difficult to intuit. The other R-specific rule that may be difficult to grasp (although not really a problem here) is that arguments are often expected to be in the form of R-lists.
It's probably wise to think about R table objects at this point (and what happens with the as.data.frame call). Table objects are actually R arrays, so the feature that you wanted to sort by was actually the rownames of that table object, which are of class character:
r = sample(5,15,replace = T)
table(r)
#r
#2 3 4 5
#5 3 2 5
rownames(table(r))
#[1] "2" "3" "4" "5"
str(as.data.frame(table(r)))
#-------
'data.frame': 4 obs. of 2 variables:
$ r : Factor w/ 4 levels "2","3","4","5": 1 2 3 4
$ Freq: int 5 3 2 5
I just wanna share this homework I've done
complete <- function(directory, id=1:332){
setwd("E:/Coursera")
files <- dir(directory, full.names = TRUE)
data <- lapply(files, read.csv)
specdata <- do.call(rbind, data)
cleandata <- specdata[!is.na(specdata$sulfate) & !is.na(specdata$nitrate),]
targetdata <- data.frame(Date=numeric(0), sulfate=numeric(0), nitrate=numeric(0), ID=numeric(0))
result<-data.frame(id=numeric(0), nobs=numeric(0))
for(i in id){
targetdata <- cleandata[cleandata$ID == i, ]
result <- rbind(result, data.frame(table(targetdata$ID)))
}
names(result) <- c("id","nobs")
result
}
A simple solution that no one has proposed yet is combining table() with the unique() function. unique() gives the behaviour you are looking for (listing unique IDs in order of appearance).
In your case it would be something like this:
dfTable<-read.csv("~/progAssign1/specdata/tmpdata.csv")
ccTab<-complete.cases(dfTable)
x<-dfTable$ID[ccTab] #IDs of the complete cases
xTab3<-as.data.frame(table(x)[as.character(unique(x))]) #index the "table()" result by name, in order of appearance (a numeric index would select by position)
colnames(xTab3)<-c("id","nobs")

column combination from separate data.frames

I have multiple text files that I have imported using
colnames<-c("cellID", "X", "Y", "Area", "AVGFP", "DeviationGFP", "AvgRFP", "DeviationsRFP", "Slice", "GUI-ID")
stats <- apply(data.frame(list.files()), 1, read.table,sep="", header=F, col.names=colnames)
names(stats) <- paste0("slice",seq_along(1:40))
This is what slice1 from stats looks like :
cellID X Y Area AVGFP DeviationGFP AvgRFP DeviationsRFP Slice GUI.ID
1 1 18.20775 26.309859 568 5.389085 7.803248 12.13028 5.569880 0 1
2 2 39.78755 9.505495 546 5.260073 6.638375 17.44505 17.220153 0 1
3 3 30.50000 28.250000 4 6.000000 4.000000 8.50000 1.914854 0 1
4 4 38.20233 132.338521 257 3.206226 5.124264 14.04669 4.318130 0 1
5 5 43.22467 35.092511 454 6.744493 9.028574 11.49119 5.186897 0 1
6 6 57.06534 130.355114 352 3.781250 5.713022 20.96591 14.303546 0 1
7 7 86.81765 15.123529 1020 6.043137 8.022179 16.36471 19.194279 0 1
8 8 75.81932 132.146417 321 3.666667 5.852172 99.47040 55.234726 0 1
9 9 110.54277 36.339233 678 4.159292 6.689660 12.65782 4.264624 0 1
10 10 127.83480 11.384886 569 4.637961 6.992881 11.39192 4.287963 0 1
All of the other data sets look the same except they all have varying row length (some go up to 2000 cells)
I want to take 1 column from each data.frame (slice1....slice40) and put it into a new data.frame. I want the new data.frame to have the column name and I want the column names in the new data.frame to be called slice1...slice40.
To summarize with specifics:
From each slice1-40, I want to take all of the values from AVGFP and put them in a new data.frame
The new data.frame should be called "AVGFP"
There should be 40 columns with headers "slice1, slice2, ... , slice40"
There should be "NA" in each empty cell that arises from one slice being shorter than another.
I really appreciate any and all help. I have been fumbling around with apply, plyr, split, reshape, melt, merge, and aggregate with no luck.
If you want to match by cellID then try this:
L <- lapply(stats, `[`, c("cellID","AVGFP"))
AVGFP <- Reduce(function(x,y)
merge(x,y,by="cellID",all=TRUE,suffixes=c(ncol(x),ncol(x)+1)), L)
names(AVGFP)[-1] <- paste0("slice", 1:40)
If you want to simply paste the columns together, try this:
First get the max length of the dataframes:
maxL <- max(sapply(stats, nrow))
Now create a list where each column is extended with NAs to the maximum length:
L <- lapply(stats, function(x) c(x$AVGFP, rep(NA, maxL-nrow(x))))
Put the columns together in a matrix:
M <- do.call(cbind, L)
Coerce to dataframe:
AVGFP <- as.data.frame(M)
Add the names you want:
names(AVGFP) <- paste0("slice", 1:40)
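The padding step can also be written with the length<- replacement function, which extends a vector with NA. A small sketch on two made-up slices (stats_demo and its values are placeholders, not the real data):

```r
# Two made-up slices of different lengths standing in for the real list
stats_demo <- list(slice1 = data.frame(AVGFP = c(5.4, 5.3, 6.0)),
                   slice2 = data.frame(AVGFP = c(3.2, 6.7)))

maxL <- max(sapply(stats_demo, nrow))

# Setting length() beyond the current length pads the vector with NA
AVGFP <- as.data.frame(lapply(stats_demo, function(x) {
  v <- x$AVGFP
  length(v) <- maxL
  v
}))
AVGFP
```

Because the list is named, the columns pick up slice1, slice2, and so on without a separate renaming step.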

Open and plot multiple files

I have multiple files of this type (same number of columns, different rows)
A 1 1 1 43.50 12.50
A 1 1 5 44.50 12.50
A 1 1 9 44.50 12.50
A 1 1 13 45.50 12.50
A 1 1 17 45.50 12.50
A 1 1 21 46.50 12.50
A 1 2 1 47.50 12.50
A 1 2 5 47.50 12.50
A 1 2 9 48.50 12.50
I would like to open all those files and plot for every one the last two columns. I managed to open them using lapply
myfiles <- list.files(pattern="*.dat")
myfilesContent <- lapply(myfiles, read.table, header=T, sep = "\t")
but then I am stuck.
Many thanks!
You have already used lapply once, so why not use it twice?
Reading your description, I guess you are unsure about the size of your data.frames and thus need to identify the last two columns, which are to be plotted, automatically and hand them to your plot function.
I would use the following solution:
myfiles <- lapply(list.files(pattern = "*.dat"),
  read.table, header = TRUE, sep = "\t"
)
# No check whether dim() will work correctly with your data!
listplot <- function(x) {
  col1 <- dim(x)[2] - 1
  col2 <- dim(x)[2]
  plot(x[,col1], x[,col2], type = "p")
}
lapply(myfiles, listplot)
This will do all the plots in one go; further arguments to plot as well as any other stuff such as saving the images would go into the listplot function.
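As one possible way to handle the saving part, each plot can be written to its own file by opening a graphics device inside the function. This is a sketch under assumptions: the data frames, the column names lon and lat, and the output folder (a temporary directory) are all made up for illustration:

```r
# Made-up stand-ins for the data frames read from the .dat files
demo <- list(a = data.frame(id = 1:3, lon = c(43.5, 44.5, 44.5), lat = c(12.5, 12.5, 12.5)),
             b = data.frame(id = 1:2, lon = c(47.5, 48.5), lat = c(12.5, 12.5)))

out_dir <- tempdir()  # assumption: write into a temporary folder

# Plot the last two columns of one data frame into <name>.pdf
plot_last_two <- function(df, name) {
  n <- ncol(df)
  path <- file.path(out_dir, paste0(name, ".pdf"))
  pdf(path)
  on.exit(dev.off())  # close the device even if plot() fails
  plot(df[[n - 1]], df[[n]], type = "p",
       xlab = names(df)[n - 1], ylab = names(df)[n])
  path
}

# One file per data frame; returns the paths that were written
paths <- mapply(plot_last_two, demo, names(demo))
paths
```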
