I am fairly new to RStudio. I like analyzing data by viewing it in a separate window or tab inside the IDE. I am using RStudio to read specific columns from a dataset, and I want to open the value stored in a variable in a separate window or tab. The variable contains only one column. For example, I want to open the values of Y in a separate window or tab, just like X. How can I do it?
Here is the complete code I am using:
dataset<-read.csv('C:\\Users\\datasets\\landprice1.csv')
X = dataset[ 1:14, (1:3)]
Y = dataset[ 1:14,(4)]
Note: the variable Y contains a single column with several observations, while X contains 3 columns with several observations. I also searched a lot on Google before coming here but could not find an answer.
If you want to open a matrix, a data frame, or a similar object, you can just use the View() function:
dataset<-read.csv('C:\\Users\\datasets\\landprice1.csv')
X = dataset[ 1:14, (1:3)]
Y = dataset[ 1:14,(4)]
View(X)
View(Y)
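One detail that may matter here: selecting a single column with `[` returns a plain vector by default, so Y is not a data frame like X. The sketch below keeps Y as a one-column data frame with `drop = FALSE`, so that the viewer shows the column name. The built-in `mtcars` data is only a stand-in for the asker's CSV file, which is not available here.

```r
# mtcars stands in for the asker's landprice1.csv (an assumption for illustration)
dataset <- mtcars

X <- dataset[1:14, 1:3]
# drop = FALSE keeps the single column as a data frame instead of a vector
Y <- dataset[1:14, 4, drop = FALSE]

# View() needs an interactive session such as RStudio
if (interactive()) {
  View(X)
  View(Y)
}
```

Without `drop = FALSE`, `View(Y)` should still open the values, but the column header in the viewer will not carry the original column name.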
I have an Xarray Dataset which looks like
I would like to be able to select a data variable (which I already know how to do) and plot a quadmesh of the selected variable. The problem is that different data variables have different numbers and types of coordinates. The closest I have got is a very long hard-coded switch statement that handles each possible data variable and calls sel on it with the appropriate number and type of coordinates (for ES I would need to select on pres1 and time). How should I approach this so that the user can visualize the geospatial data no matter which data variable is selected?
Here is my current attempt:
# assumed imports, matching the pn / pnw / hvplot names used below
import hvplot
import hvplot.xarray  # registers the .hvplot accessor on xarray objects
import panel as pn
import panel.widgets as pnw

# xds is the xarray Dataset described above
select_field = pnw.Select(name="Field", options=list(xds.data_vars))
select_time = pnw.Select(name="Time", options=list(xds.coords['time'].values))
select_pres1 = pnw.Select(name="Pressure 1", options=list(xds.coords['pres1'].values))

def fieldFiltered(select_field):
    return xds[select_field]

xdsi = hvplot.bind(fieldFiltered, select_field).interactive(sizing_mode='stretch_both')
wid = xdsi.sel(pres1=select_pres1, time=select_time).widgets()
ploti = xdsi.sel(time=select_time).hvplot()
pn.Row(
    wid[1],
    wid[2],
    ploti
)
which gives me something similar to
but breaks for any other variable
I am using the package qualtRics in TERR in Spotfire to pull in data directly from specific surveys in Qualtrics. The code I am using is:
registerApiKey(API.TOKEN = "xxxx")
df <- getSurvey(surveyID = "xxxx",
                root_url = "https://az1.qualtrics.com", verbose = TRUE)
My output df is a data table. I have 2 different surveys that I pull in 4 times in total; for 2 of those pulls I unpivot the data, which gives a total of 4 data tables.
I want to be able to refresh this data. If I click Reload Data or try to refresh each table individually, nothing happens. I'm assuming I need to add some code that refreshes the data function (?), and I am trying to avoid replacing the data tables each time because, for 2 of those, I have to manually select which columns I am unpivoting (and I have 75+ columns).
Is there a way I can accomplish what I'm looking for? I am a beginner Spotfire/R user, so I am learning as I go!
I am not able to comment on your question because I don't have enough permission, so I am posting this as a separate answer.
Replacing the table each time is a good idea.
That way you can fix the number of columns used for pivoting/unpivoting.
R code:
row <- data.frame(Data_Points = nrows,
Col1 = col1, Col2 = col2, YStart = y1, YEnd = y2)
row <- cbind(df, row)
return(row)
You can also list your fixed columns in a document property and loop over them in your data function.
Instead of using spotfire's pivot/unpivot, you can try doing the unpivot within the R code of the data function.
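A minimal sketch of what unpivoting inside the data function could look like, using base R only. The table and the column names (`ResponseId`, `Q1`, `Q2`) are made up for illustration and are not taken from the asker's surveys. The point is that every column not listed as an identifier is unpivoted automatically, so nothing has to be picked by hand when the table is refreshed.

```r
# Hypothetical wide survey table; names and values are illustrative only
wide <- data.frame(
  ResponseId = c("R1", "R2"),
  Q1 = c(5, 3),
  Q2 = c(4, 2),
  stringsAsFactors = FALSE
)

# Columns to keep as identifiers; everything else is unpivoted
id_cols <- "ResponseId"
value_cols <- setdiff(names(wide), id_cols)

# Repeat the identifier rows once per unpivoted column and stack the values
long <- data.frame(
  wide[rep(seq_len(nrow(wide)), times = length(value_cols)), id_cols, drop = FALSE],
  Question = rep(value_cols, each = nrow(wide)),
  Value = unlist(wide[value_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)
rownames(long) <- NULL
```

Note that `unlist()` coerces all unpivoted columns to one common type, so mixed numeric and text columns would end up as text in `Value`.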
The simplest description of what I am trying to do is this: I have a column in a data.frame like 1,2,3,...,n, 1,2,3,...,n, ... and I want to group the first 1...n as 1, the second 1...n as 2, and so on.
The full context is as follows. I am using the R spcosa package to do equal-area stratification and composite sampling on parcels of land. I start with a shapefile from a GIS that contains a number of polygons (land parcels). The end result I want is a GIS file containing each of the strata and sample locations, with each stratum and sample location labeled by land parcel, stratum and sample ID. So far I can do all of this except one bit, which is identifying the stratum that each sample belongs to and including it in the sample label. The sample label needs to look like "parcel#-strata#-composite#" (where # is the number). In practice I don't need this actual label, only the parts as separate attributes in the GIS file.
The basic workflow is as follows.
For each individual polygon, I use spcosa::stratify to divide it into a number of equal-area strata, like this:
strata.CSEA <- stratify(poly[i,], nStrata = n, nTry = 1, equalArea = TRUE, nGridCells = x)
Note that spcosa::stratify generates a CompactStratificationEqualArea object. I coerce this to a SpatialPixelsDataFrame and then use rasterToPolygons so that I can output it as a GIS file.
I then generate the sample locations as follows:
samples.SPRC <- spsample(strata.CSEA, n = n, type = "composite")
spcosa::spsample creates a SamplingPatternRandomComposite object. I coerce this to a SpatialPointsDataFrame:
samples.SPDF <- as(samples.SPRC, "SpatialPointsDataFrame")
and add two columns to the @data slot:
samples.SPDF@data$Strata <- "this is the bit I can't do yet"
samples.SPDF@data$CEA <- poly[i,]$name
I can then write samples.SPDF as a GIS file (i.e. with writeOGR) with all the wanted attributes.
As above, the part I can't sort out is how the sample IDs relate to the stratum IDs. The sample points are numbered like 1,2,3...n, 1,2,3...n, .... How do I extract which sample goes with which stratum? As the actual stratum numbers are arbitrary, I could just group them (as per my simple question above), but ideally I would like to use the numbering of the actual strata so that everything lines up.
To give any contributors access to a hands-on example, I copy below the code from the spcosa documentation, slightly modified to generate the correct objects.
# Note: the example below requires the 'rgdal' package. You may consider the 'maptools' package as an alternative
library(spcosa)
if (require(rgdal)) {
    # read a vector representation of the `Farmsum' field
    shpFarmsum <- readOGR(
        dsn = system.file("maps", package = "spcosa"),
        layer = "farmsum"
    )
    # stratify `Farmsum' into 50 strata
    # NB: increase argument 'nTry' to get better results
    set.seed(314)
    myStratification <- stratify(shpFarmsum, nStrata = 50, nTry = 1, equalArea = TRUE)
    # sample two sampling units per stratum
    mySamplingPattern <- spsample(myStratification, n = 2, type = "composite")
    # plot the resulting sampling pattern on
    # top of the stratification
    plot(myStratification, mySamplingPattern)
}
Maybe the order() function can help you:
n <- 10
dat <- data.frame(col1 = rep(1:n, 2), col2 = rnorm(2*n))
head(dat)
dat[order(dat$col1), ]
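For the simple version of the question (labelling the first 1...n run as 1, the second as 2, and so on), a different approach from sorting is to count how many times the sequence has restarted. This is only a sketch and assumes every run starts at 1; the names `ids` and `group` are illustrative. It does not recover the actual spcosa stratum numbers.

```r
n <- 3
ids <- rep(seq_len(n), times = 4)   # 1,2,3, 1,2,3, 1,2,3, 1,2,3

# Each time the sequence restarts at 1, the running count increases by one
group <- cumsum(ids == 1)

dat <- data.frame(id = ids, group = group)
```

If n is known and every run has exactly n elements, `(seq_along(ids) - 1) %/% n + 1` gives the same grouping without relying on the value 1.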
I did not understand where the "ID" (1,2,3...n) is to be found, so let's assume you have a SpatialPolygonsDataFrame called shpFarmsum with an attribute column "ID". You can access this column via shpFarmsum$ID. If you want to create an individual subset for each ID, this is one way to go:
for (i in unique(shpFarmsum$ID)) {
    # keep only the polygons that carry the current ID
    tempSubset <- shpFarmsum[shpFarmsum$ID == i,]
    writeOGR(tempSubset, ".", paste0("subset_", i), driver = "ESRI Shapefile")
}
I added the writeOGR(...) line so that all subsets are written to your working directory. However, you can change this line or add further analysis inside the for loop.
How it works
unique(shpFarmsum$ID) extracts all occurring IDs (comparable to your 1,2,3...n).
In each iteration of the for loop, the next of these IDs is used to create a subset of the whole SpatialPolygonsDataFrame, which you can use for further analysis.
I have a .CSV file that contains multiple data frames. It looks like this:
# A;Date;Price;Volume;Country
# B;Company;Available;StartDate;EndDate;Published;Modified
# C;ID;Timestamp;Capacity
# D;Rownumbers
#
A;2016-01-01 00:00:00;75.18;2500;DK
A;2016-01-01 00:00:00;55.25;8500;DE
A;2016-01-01 00:00:00;125.00;6500;UK
A;2016-01-01 01:00:00;65.28;2400;DK
# A; etc....
B;PRETZELS;TRUE;2016-01-01;2016-01-02;YES;2016-01-03
B;FAKES;FALSE;2016-01-01;2016-01-02;NO;2016-01-03
# B; etc....
C;11;2016-01-01 23:00:00;25
C;16;2016-01-01 22:00:00;15
# C; etc....
D;1175
So, the first part of the file contains information about the data in the file. From this you can see that the number of columns differs depending on the record type, in this case from A to D.
I've tried doing:
df <- read.table(file = "x.csv", sep = ";", fill = TRUE)
But fill can't cope with a varying number of columns, for example when the number of columns increases later in the file.
Ideally, I would create a number of data frames based on the row name (A, B, C and D in this case).
Or I would have a single data frame with as many columns as the widest record, padded with NA values, which I could then filter out into individual data frames later, i.e. just read everything in with a specified number of columns.
df <- read.delim(file.choose(), header = FALSE, sep = ";", fill = TRUE) # choose x.csv from your PC
file.choose() opens up a dialog box for selecting the input file. Hope this helped.
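One reason `fill = TRUE` alone may fall short is that read.table decides on the number of columns from the first five lines of input, so wider records further down do not fit. A possible way around this, sketched below, is to read the file as raw lines and parse each record type separately, which also gives one data frame per letter. The sample lines are copied from the question; the names `data_lines`, `record_type` and `tables` are illustrative.

```r
# In practice this would be: lines <- readLines("x.csv")
lines <- c(
  "# A;Date;Price;Volume;Country",
  "# D;Rownumbers",
  "#",
  "A;2016-01-01 00:00:00;75.18;2500;DK",
  "A;2016-01-01 00:00:00;55.25;8500;DE",
  "B;PRETZELS;TRUE;2016-01-01;2016-01-02;YES;2016-01-03",
  "C;11;2016-01-01 23:00:00;25",
  "D;1175"
)

# Drop the descriptive comment lines, then take the leading letter of each record
data_lines <- lines[!grepl("^#", lines)]
record_type <- sub(";.*$", "", data_lines)

# Parse each group of lines on its own, dropping the leading type column
tables <- lapply(split(data_lines, record_type), function(block) {
  read.table(text = block, sep = ";", header = FALSE,
             stringsAsFactors = FALSE)[-1]
})
```

`tables$A`, `tables$B` and so on are then separate data frames, each with its own number of columns. Column names could be assigned afterwards from the commented header lines.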
I am looking for the most convenient way of creating boxplots for different values and groups read from a CSV file in R.
First, I read my Sheet into memory:
Sheet <- read.csv("D:/mydata/Table.csv", sep = ";")
This works fine.
names(Sheet)
correctly gives me the headers of the different columns.
I can also access and filter different groups into separate vectors, like
myData1 <- Sheet[Sheet$Group == 'Group1',]$MyValue
myData2 <- Sheet[Sheet$Group == 'Group2',]$MyValue
...
and draw a boxplot using
boxplot(myData1, myData2, ..., main = "Distribution")
where the ... stands for more vectors I have filled using the selection method above.
However, I have seen that using a formula can do these steps of selection and boxplotting in one go. But when I use something like
boxplot(Sheet~Group, Sheet)
it won't work because I get the following error:
invalid type (list) for variable 'Sheet'
The data in the CSV looks like this:
No;Gender;Type;Volume;Survival
1;m;HCM;150;45
2;m;UCM;202;103
3;f;HCM;192;5
4;m;T4;204;101
...
So I have multiple possible groups and different values which I'd like to represent as a box plot for each group. For example, I could group by gender or group by type.
How can I easily draw multiple boxes from my CSV data without having to grab them all manually out of the data?
Thanks for your help.
Try it like this:
Sheet <- data.frame(Group = gl(2, 50, labels=c("Group1", "Group2")),
MyValue = runif(100))
boxplot(MyValue ~ Group, data=Sheet)
Using ggplot2:
library(ggplot2)
ggplot(Sheet, aes(x = Group, y = MyValue)) +
  geom_boxplot()
The advantage of using ggplot2 is that you have lots of possibilities for customizing the appearance of your boxplot.
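To connect the formula approach back to the CSV in the question: the error arises because `Sheet ~ Group` puts the whole data frame on the left-hand side, whereas the formula expects a value column on the left and a grouping column on the right. The sketch below uses the four sample rows shown in the question, read from an inline string instead of the real file.

```r
# The four sample rows from the question, inlined for a self-contained example
csv_text <- "No;Gender;Type;Volume;Survival
1;m;HCM;150;45
2;m;UCM;202;103
3;f;HCM;192;5
4;m;T4;204;101"
Sheet <- read.csv(text = csv_text, sep = ";")

# value column ~ grouping column; one box per group, no manual subsetting
by_type <- boxplot(Volume ~ Type, data = Sheet, main = "Volume by Type")
by_gender <- boxplot(Survival ~ Gender, data = Sheet, main = "Survival by Gender")
```

Switching the grouping only requires changing the column name on the right-hand side of the formula.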