Column data types classification in R

I have a database. How can I get the types of all its columns and save them to a file? The types I need to distinguish are:
- Float
- Integer
- BigInteger
- String
My code is:
library(foreign)
library(memisc)
data <- read.spss("data.sav", use.value.labels = FALSE, max.value.labels = 100)
write.table(summary(data), "out.txt")
However, this code only distinguishes between two types of data (numeric and character).
Sample output:
           Length Class  Mode
SubsID     20582  -none- numeric
SubsID_RN  20582  -none- character
responseid 20582  -none- numeric
Required output:
SubsID     BigInteger
SubsID_RN  String
responseid Integer

In R, the type system works differently from many other common languages. First off, everything in R is an object, and one of the basic object types is the vector. The type of a vector is determined by the data it contains. There are six atomic vector types, which the typeof function reports. The R documentation gives the following table:
+-----------+-----------+--------------+
| typeof    | mode      | storage.mode |
+-----------+-----------+--------------+
| logical   | logical   | logical      |
| integer   | numeric   | integer      |
| double    | numeric   | double       |
| complex   | complex   | complex      |
| character | character | character    |
| raw       | raw       | raw          |
+-----------+-----------+--------------+
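As you can see, R does not distinguish between float and double (every floating-point number is a double), nor between Integer and BigInteger (integer is 32-bit, and larger whole numbers are stored as double). Likewise, a String is simply a character vector in R.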
So in your case, if you want to know the specific basic type of each of your variables, you could use
lapply(data, typeof)
The R documentation has more information about objects and basic types:
http://cran.r-project.org/doc/manuals/r-release/R-lang.html#Objects
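If the four labels from the question are still required, they cannot come from the storage type alone and would have to be derived from the values themselves. The following is a minimal Python sketch of one possible rule set. The function name classify_column, the 32-bit cut-off used for "BigInteger", and the sample values are all assumptions made for illustration; none of them come from the original data or from any R API.

```python
INT32_MAX = 2**31 - 1  # largest value a 32-bit integer (such as R's integer) can hold


def classify_column(values):
    """Label a column as String, Float, Integer or BigInteger (hypothetical rules)."""
    present = [v for v in values if v is not None]  # ignore missing values
    if any(isinstance(v, str) for v in present):
        return "String"
    if any(float(v) != int(v) for v in present):
        return "Float"       # at least one value has a fractional part
    if any(abs(int(v)) > INT32_MAX for v in present):
        return "BigInteger"  # whole numbers that do not fit in 32 bits
    return "Integer"


# Made-up sample values; numeric columns arrive as doubles, as they would from read.spss.
columns = {
    "SubsID": [9000000001.0, 9000000002.0],
    "SubsID_RN": ["A-1", "B-2"],
    "responseid": [1.0, 2.0, 3.0],
}
report = [f"{name} {classify_column(vals)}" for name, vals in columns.items()]
```

The sketch assumes the numeric values contain no NaN or infinite entries; those would need to be filtered out alongside the missing values.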

You can get the class or the type of your columns like this:
lapply(your_data_frame, class)
lapply(your_data_frame, typeof)
There's no such thing as 'BigInteger' in R. See the data structures chapter of Hadley Wickham's Advanced R for a more detailed explanation.


How to get average of last N numbers in a stream with static memory

I have a stream of numbers, and in every cycle I need to compute the average of the last N of them. This can, of course, be solved using an array in which I store the last N numbers: in every cycle I shift it, add the new number, and compute the average.
N = 3
+---+-----+
| a | avg |
+---+-----+
| 1 | |
| 2 | |
| 3 | 2.0 |
| 4 | 3.0 |
| 3 | 3.3 |
| 3 | 3.3 |
| 5 | 3.7 |
| 4 | 4.0 |
| 5 | 4.7 |
+---+-----+
The first N numbers (where there "isn't enough data to compute the average") don't interest me much, so the results there may be anything/undefined.
My question is: can this be done without using an array, that is, with a static amount of memory? If so, how?
I'll do the coding myself - I just need to know the theory.
Thanks
Think of this as a black box containing some state. If you control the input stream, you can draw conclusions about the state. In your sliding-window, array-based approach, it is fairly obvious that if you feed a bunch of zeros into the algorithm after the original input, you get a series of averages that take a decreasing number of non-zero values into account. The last one covers just one original non-zero value, so if you multiply it by N you get the last input back. Using that and the second-to-last output, which accounts for two non-zero inputs, you can reconstruct the second-to-last input, and so on.
So essentially your algorithm needs to maintain sufficient state to reconstruct the last N elements of the input, at least if you formulate it as an on-line algorithm. I don't think an off-line algorithm can do any better, except if you allow it to read the input multiple times, but I don't have as strong an argument for that.
Of course, in some theoretical models you can avoid the array and e.g. encode all the state into a single arbitrary length integer, but that's just cheating the theory, and doesn't make any difference in practice.
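The reconstruction argument can be tried out directly. The sketch below, in Python, is only an illustration under stated assumptions: SlidingAverage is a hypothetical array-based averager that keeps a running sum, and recover_last_inputs is a hypothetical helper that looks only at the averager's outputs while feeding it zeros. The input values are the ones from the table in the question.

```python
from collections import deque


class SlidingAverage:
    """Average of the last n values, kept in a fixed-size buffer."""

    def __init__(self, n):
        self.n = n
        self.window = deque(maxlen=n)  # holds at most n values
        self.total = 0.0               # running sum of the window

    def push(self, x):
        if len(self.window) == self.n:
            self.total -= self.window[0]  # value that is about to be evicted
        self.window.append(x)             # a full deque drops its oldest item
        self.total += x
        # Undefined (None) until n values have arrived, as the question allows.
        return self.total / self.n if len(self.window) == self.n else None


def recover_last_inputs(averager, last_avg):
    """Rebuild the last n inputs from outputs alone by feeding zeros."""
    n = averager.n
    sums = [last_avg * n]                  # sum of the last n inputs
    for _ in range(n - 1):
        sums.append(averager.push(0) * n)  # each zero pushes one input out
    sums.append(0.0)                       # after n zeros nothing would remain
    # Consecutive sums differ by exactly the input that was evicted.
    return [sums[i] - sums[i + 1] for i in range(n)]


avg = SlidingAverage(3)
outputs = [avg.push(x) for x in [1, 2, 3, 4, 3, 3, 5, 4, 5]]
recovered = recover_last_inputs(avg, outputs[-1])  # oldest input first
```

Because the last N inputs can be read back out of the outputs (up to floating-point rounding), the averager must be storing the equivalent of those N values, whichever data structure it uses internally.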

Sequence numbers for varchar data type?

This syntax is used to generate sequence numbers for an integer data type:
CREATE SEQUENCE sequence
[INCREMENT BY n]
[START WITH n]
[{MAXVALUE n | NOMAXVALUE}]
[{MINVALUE n | NOMINVALUE}]
[{CYCLE | NOCYCLE}]
[{CACHE n | NOCACHE}];
I want to generate a sequence ID of varchar data type.
Can anyone help me?

How to subtract a number in Robot Framework?

How do I subtract a number in Robot Framework?
What is the keyword for it?
For example, if I am getting a count, I want to subtract 1 from it and use the resulting value with other keywords.
If your variable contains an actual number, you can use extended variable syntax. For example, this test will pass:
*** Variables ***
| ${count} | ${99} | # using ${} syntax coerces value to number
*** Test cases ***
| Example
| | Should be equal as numbers | ${count-1} | 98
You can also use the Evaluate keyword to evaluate a Python expression. For example:
*** Variables ***
| ${count} | 99
*** Test cases ***
| Example
| | ${count}= | Evaluate | ${count} - 1
| | Should be equal as numbers | ${count} | 98
Note: using Evaluate will work whether ${count} is a number or the string representation of a number.
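One way to picture the difference between the two approaches is the rough Python model below. It is not Robot Framework's implementation: evaluate_like is a hypothetical helper that mimics Evaluate by substituting the variable's text into the expression before evaluating it, while the final try block mimics arithmetic applied to the variable's value itself, which is what the extended variable syntax relies on.

```python
import re


def evaluate_like(expression, variables):
    """Replace ${name} with the value's text, then evaluate the resulting string."""
    text = re.sub(r"\$\{(\w+)\}", lambda m: str(variables[m.group(1)]), expression)
    return eval(text)  # acceptable in a demo; never eval untrusted input


# Both a real number and its string form end up as the text "99 - 1".
from_number = evaluate_like("${count} - 1", {"count": 99})
from_string = evaluate_like("${count} - 1", {"count": "99"})

# Arithmetic on the value itself only works when the value is a number.
count = "99"
try:
    count - 1
    direct_subtraction_failed = False
except TypeError:
    direct_subtraction_failed = True
```

In this model both calls to evaluate_like give the same number, whereas subtracting directly from the string raises a TypeError, which matches the behaviour described in the answer.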
You could use the Evaluate keyword:
*** Test Cases ***
Stackoverflow
    ${x} =    Set Variable    1
    ${y} =    Evaluate    ${x} - 1
An expression like this should work:
${token_expire_time} =    Evaluate    ${token_generate_time}-${expires_in}
If for some reason the conversion with ${} doesn't seem to work, you can use the Convert To Integer or Convert To Number keyword instead.

Constructing an object using the genoset package in R

The genoset R package has a function for building a GenoSet by putting together several matrices and a RangedData object that specifies co-ordinates.
I have the following objects: three matrices, all with the same row names, and a RangedData object (called locData) of the following format.
              space                 ranges |
           <factor>              <IRanges> |
cg00000957        1 [  5937253,   5937253] |
cg00001349        1 [166958439, 166958439] |
cg00001583        1 [200011786, 200011786] |
cg00002028        1 [ 20960010,  20960010] |
cg00002719        1 [169396706, 169396706] |
cg00002837        1 [ 44513358,  44513358] |
When I try to create a GenoSet, though, I get the following error.
DMRSet=GenoSet(locData,Exprs,meth,unmeth,universe=NULL)
Error in .Call2("IRanges_from_integer", from, PACKAGE = "IRanges") :
cannot create an IRanges object from an integer vector with missing values.
What am I doing wrong? All the objects I'm putting together have the same rownames, except for the IRanges object itself, which I don't think has rownames since it isn't a matrix.
Additionally, the "column" of locData has non-integer characters.
Thank you!
It sounds like your "locData" may not be a RangedData. It can alternatively be a GRanges. Either way, you will want to name all of your arguments: the underlying eSet class will complain about unnamed arguments once you get past the locData trouble.
DMRSet=GenoSet(locData=locData,exprs=Exprs,meth=meth,unmeth=unmeth,universe=NULL)
Pete

How to create a matrix with dynamic rows and columns in ASP.NET?

I have to make a control in ASP.NET that allows me to create a matrix. I have a list of strings (obtained from one method) that will be the rows (each string is one row), and another list of strings (obtained from another method) that will be the columns (each string is one column). Then, depending on the row-column intersection, I have to put an image in that position, something like this:
  | x   | y   | z   |
---------------------
a | OK  | OK  | BAD |
---------------------
b | OK  | BAD | OK  |
---------------------
c | BAD | BAD | BAD |
How can I achieve this? Thanks a lot in advance!
You can use nested Repeaters.
The outer repeater for rows, the inner one for columns/cells.
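The nesting itself can be sketched independently of ASP.NET. The Python below is only an illustration of the outer-rows, inner-columns structure, not Repeater markup: render_matrix, the statuses lookup, and the ok.png/bad.png image names are hypothetical, and the OK/BAD values are taken from the example matrix in the question.

```python
from html import escape


def render_matrix(rows, columns, status_of):
    """Build an HTML table with one <tr> per row name and one <td> per column name."""
    header_cells = "".join(f"<th>{escape(c)}</th>" for c in columns)
    parts = [f"<tr><th></th>{header_cells}</tr>"]
    for r in rows:              # outer loop: the role of the outer Repeater
        cells = []
        for c in columns:       # inner loop: the role of the inner Repeater
            status = status_of(r, c)
            image = escape(status.lower()) + ".png"  # hypothetical image name
            cells.append(f'<td><img src="{image}" alt="{escape(status)}"></td>')
        parts.append(f"<tr><th>{escape(r)}</th>{''.join(cells)}</tr>")
    return "<table>" + "".join(parts) + "</table>"


# Statuses copied from the example matrix in the question.
statuses = {
    ("a", "x"): "OK", ("a", "y"): "OK", ("a", "z"): "BAD",
    ("b", "x"): "OK", ("b", "y"): "BAD", ("b", "z"): "OK",
    ("c", "x"): "BAD", ("c", "y"): "BAD", ("c", "z"): "BAD",
}
table_html = render_matrix(["a", "b", "c"], ["x", "y", "z"],
                           lambda r, c: statuses[(r, c)])
```

In the Repeater version, the outer control would be bound to the row list and the inner control to the column list, with the per-cell lookup deciding which image to show.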
