Multiple responses in SPSS - count

I have multiple response questions with 5 categories (values). I want to find the respondents who selected only one category.
For example, respondents who chose category 1 (A) and none of categories 2, 3, 4 or 5.
That is, I want only the A mentions: the respondents who checked category A alone. I need a count of these.
Help, please.

The following solution assumes the data has 5 dichotomous variables, one for each of the multiple response categories.
* creating some sample data to demonstrate on.
data list list/cat1 to cat5.
begin data
1 0 0 0 1
0 1 1 0 0
1 0 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 0 1
1 0 0 0 0
1 1 1 0 0
end data.
* now checking in which cases only category 1 was chosen.
compute NumCats=sum(cat1 to cat5).
if cat1=1 and NumCats=1 onlyCat1=1.
execute.
* if instead you wish to do the same check for each of the 5 categories,
use `do repeat` this way.
do repeat cat=cat1 to cat5/only=only1 to only5.
compute only=(cat=1 and NumCats=1).
end repeat.
execute.

But ditch the EXECUTE commands. In this case they only cause an unnecessary data pass; their sole effect is to update the Data Editor immediately instead of on the next data pass.
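For readers who want to cross-check the SPSS result outside SPSS, the same logic can be sketched in R. This is only an illustration, assuming the same five 0/1 variables; the names resp, num_cats, only and only_counts are made up for this example.

```r
# Same sample data as in the SPSS example: one 0/1 column per category.
resp <- data.frame(
  cat1 = c(1, 0, 1, 0, 0, 0, 1, 1),
  cat2 = c(0, 1, 0, 1, 0, 0, 0, 1),
  cat3 = c(0, 1, 0, 0, 1, 0, 0, 1),
  cat4 = c(0, 0, 0, 0, 0, 0, 0, 0),
  cat5 = c(1, 0, 0, 0, 0, 1, 0, 0)
)

# Number of categories each respondent selected (the NumCats step).
num_cats <- rowSums(resp)

# TRUE where a category was selected AND it was the only one selected
# (the do repeat step, done for all five categories at once).
only <- (resp == 1) & (num_cats == 1)

# Count of respondents who selected each category alone.
only_counts <- colSums(only)
print(only_counts)
```

With this sample data, two respondents chose category 1 alone, which is the count the question asks for.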


How can I pull player stats from a tabbed ESPN table?

I've been reading through a couple of the other useful guides on pulling player and match data from ESPN using R, however I have come across a problem with tabbed tables. As shown here on the player stats for a recent rugby game, the player statistics table is tabbed into 'Scoring', 'Attacking', 'Defending' and 'Discipline'.
Using the following code (with the help of two lovely packages, RCurl and htmltab), I can pull out the first tab ('Scoring') from that page ...
# install & attach RCurl
if (!base::require(package="RCurl")) utils::install.packages("RCurl")
library(RCurl)
# install & attach htmltab
if (!base::require(package="htmltab")) utils::install.packages("htmltab")
library(htmltab)
# assign URL
theurl <- RCurl::getURL("https://www.espn.co.uk/rugby/playerstats?gameId=294854&league=270557",.opts = list(ssl.verifypeer = FALSE))
# pull tables from url
team1 <- htmltab::htmltab(theurl,which=1)
team2 <- htmltab::htmltab(theurl,which=2)
league <- htmltab::htmltab(theurl,which=3)
... in the following format, which is exactly what I wanted ...
team1
rowID LEINS Tx TA CG PG PTS
2 J LarmourFB 0 0 0 0 0 0
3 H KeenanW 0 0 0 0 0 0
4 G RingroseC 0 0 0 0 0 0
5 R HenshawC 1 0 0 0 0 5
6 J LoweW 1 0 0 0 0 5
7 R ByrneFH 0 0 2 2 0 10
8 J Gibson-ParkSH 0 1 0 0 0 0
9 C HealyP 0 0 0 0 0 0
10 R KelleherH 0 0 0 0 0 0
11 A PorterP 0 0 0 0 0 0
... however I seem unable to pull out any tab other than 'Scoring'. I'm sure I'm missing something really obvious, so would appreciate someone pointing out where I'm going wrong!
Thanks in advance!
If you check the HTML source of the page, you will see that the data is not there at the start. You can find a data-reactid attribute that indicates the data is only loaded once you click on the new tab. So you will need to find a way to trigger that click on the second tab.
One option for you might be to use Selenium: https://www.rdocumentation.org/packages/RSelenium/versions/1.7.7
This would enable you to make the necessary button click.
A sample can be found here: https://www.r-bloggers.com/2014/12/scraping-with-selenium/
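A rough sketch of what that could look like with RSelenium is below. It is untested against the ESPN page and assumes RSelenium is installed along with a local browser and driver; the function name click_tab_and_get_source and the XPath in the usage comment are hypothetical placeholders, so inspect the page to find the real selector for the tab.

```r
# Sketch only: needs the RSelenium package plus a local browser and driver.
# `tab_xpath` must be worked out by inspecting the page; nothing here is
# specific to ESPN.
click_tab_and_get_source <- function(url, tab_xpath,
                                     browser = "firefox", wait_secs = 2) {
  drv <- RSelenium::rsDriver(browser = browser, verbose = FALSE)
  remDr <- drv$client
  # Always shut the browser and the driver server down again.
  on.exit({
    remDr$close()
    drv$server$stop()
  }, add = TRUE)

  remDr$navigate(url)
  tab <- remDr$findElement(using = "xpath", value = tab_xpath)
  tab$clickElement()
  Sys.sleep(wait_secs)  # crude pause while the tab content loads

  # Page source after the click, as a single HTML string.
  remDr$getPageSource()[[1]]
}

# Hypothetical usage (the XPath is a guess, not taken from the real page):
# html <- click_tab_and_get_source(
#   "https://www.espn.co.uk/rugby/playerstats?gameId=294854&league=270557",
#   "//*[text()='Attacking']"
# )
```

The returned HTML string could then be handed to htmltab::htmltab() in the same way as theurl in the question.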

How to fix rows order with pheatmap?

I have generated a heatmap with pheatmap and, for some reason, I want the rows to appear in a predefined order.
I saw in previous posts that the solution is to set the parameter cluster_rows to FALSE and to arrange the matrix in the order we want, like this in my case:
        Otu0085 Otu0086 Otu0087 Otu0088 Otu0091
AB200         0       0       0       0       0
2            91       0       2       1       0
20CF360       0       1       0       1       0
19CF359       0       0       0       2       0
11VP12        0       0       0       0     155
11VP04        4       1       0       0     345
However, when I do:
pheatmap(shared,cluster_rows = F)
My rows are sorted alphabetically, like this:
10CF278a
11
11AA07
11CF278b
11VP03
11VP04
11VP05
11VP06
11VP08
11VP09
Any suggestions would be welcome.
Thanks in advance.
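One thing that may be worth checking is whether the object passed to pheatmap really is in the desired row order at the time of the call. The sketch below is not a confirmed fix. It reuses the six rows shown above, and the names shared_sorted, desired_order and reordered are made up for illustration. It reorders a matrix by row name and only draws the heatmap if pheatmap is installed.

```r
# The six rows printed in the question, as a numeric matrix.
shared <- matrix(
  c(0,  0, 0, 0,   0,
    91, 0, 2, 1,   0,
    0,  1, 0, 1,   0,
    0,  0, 0, 2,   0,
    0,  0, 0, 0, 155,
    4,  1, 0, 0, 345),
  nrow = 6, byrow = TRUE,
  dimnames = list(
    c("AB200", "2", "20CF360", "19CF359", "11VP12", "11VP04"),
    c("Otu0085", "Otu0086", "Otu0087", "Otu0088", "Otu0091")
  )
)

# Simulate a matrix whose rows ended up sorted by name.
shared_sorted <- shared[order(rownames(shared)), ]

# The predefined order, given as a vector of row names.
desired_order <- c("AB200", "2", "20CF360", "19CF359", "11VP12", "11VP04")

# Indexing by name puts the rows back in the predefined order.
reordered <- shared_sorted[desired_order, , drop = FALSE]
print(rownames(reordered))

# With row clustering switched off, the rows are drawn in matrix order.
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(reordered, cluster_rows = FALSE)
}
```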

How do I make a selected table confined to a matrix, rather than a running list?

My previous lines of code for making tables from column names successfully produced short, dense matrices that let me readily process data from two survey questions (2nd example).
However, when I use the same line of code with a new column, I don't get that sleek matrix. I end up with a list of unlinked tables, which I do not want. Perhaps it's because the new column contains only 0s and 1s, whereas the others have more than two values (1st example).
[Please forgive my formatting issues (StackOverflow Status: Newbie). Also, many thanks in advance to those checking in on and answering my question!]
>table(select(data_final, `Relationship 2Affected Individual`, Satisfied_Treatments))
Relationship 2Affected Individual 1
1 0
2 0
3 0
6 0
Other (please specify) 0
, , 1 = 1, Response = 10679308122
0
Relationship 2Affected Individual 1
1 0
2 0
3 0
6 0
Other (please specify) 0
, ,
...
> table(select(data_final, `Relationship 2Affected Individual`, Indirect_Benefits))
Indirect_Benefits
Relationship 2Affected Individual 0 1 2 3
1 4 1 0 0
2 42 17 9 3
3 12 1 1 0
6 5 2 2 0
Other (please specify) 1 0 0 0
>#rstudioapi::versionInfo()
>#packageVersion("dplyr")
table(data_final$`Relationship 2Affected Individual`, data_final$Satisfied_Treatments)
Problem Solved^
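The reason this form gives a compact matrix is that table() builds one dimension per vector it receives, so passing exactly two vectors yields a two-way table. A minimal sketch on made-up data follows; the values in data_final here are invented purely for illustration and are not the survey data.

```r
# Toy stand-in for data_final; check.names = FALSE keeps the column
# name with spaces exactly as written.
data_final <- data.frame(
  `Relationship 2Affected Individual` = c("1", "2", "2", "3", "6", "2"),
  Satisfied_Treatments = c(0, 1, 0, 0, 1, 1),
  check.names = FALSE
)

# A column name containing spaces must be wrapped in backticks after `$`.
tab <- table(
  data_final$`Relationship 2Affected Individual`,
  data_final$Satisfied_Treatments
)
print(tab)

# Two input vectors give a two-dimensional table (a matrix of counts).
print(dim(tab))
```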

Frequency Distribution Plot of Document Term Matrix

I have created a document term matrix that looks something like this:
inspect(dtm[1:4,1:6])
allowed allowing almost alone companyunder companywide
Doc1.txt 1 1 1 0 1 0
Doc2.txt 0 1 1 0 1 1
Doc3.txt 0 0 0 1 0 1
Doc4.txt 1 0 1 0 1 1
Taking its column sums gives me:
colSums(dtm)
allowed 2
allowing 2
almost 3
alone 1
companyunder 3
companywide 3
This essentially tells me how many documents each word is found in (e.g. allowed 2 tells me that allowed is found in two documents).
I'm having difficulty creating a frequency distribution plot with the document number on the x-axis and the number of words the document contains on the y-axis.
Is this what you're looking for?
dtm = array(c(1,0,0,1,1,1,0,0,1,1,0,1,0,0,1,0,1,1,0,1,0,1,1,1),dim=c(4,6))
dimnames(dtm) = list(c("Doc1","Doc2","Doc3","Doc4"),c("allowed","allowing","almost","alone","companyunder","companywide"))
print(dtm)
plot(rowSums(dtm))
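If a labelled bar chart reads better than a scatter of points, the same row sums can be passed to barplot(). This is only a sketch on the small example matrix; words_per_doc is a name made up here, and for a real DocumentTermMatrix from the tm package a conversion such as as.matrix(dtm) may be needed first (an assumption, not something shown in the question).

```r
# The 4 x 6 example matrix from the question (1 = term occurs in document).
dtm <- matrix(
  c(1, 0, 0, 1,  1, 1, 0, 0,  1, 1, 0, 1,
    0, 0, 1, 0,  1, 1, 0, 1,  0, 1, 1, 1),
  nrow = 4,
  dimnames = list(
    c("Doc1", "Doc2", "Doc3", "Doc4"),
    c("allowed", "allowing", "almost", "alone", "companyunder", "companywide")
  )
)

# Number of distinct terms found in each document.
words_per_doc <- rowSums(dtm)
print(words_per_doc)

# One bar per document, labelled with the document name.
barplot(words_per_doc, xlab = "Document", ylab = "Number of terms")
```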

T test to find differentially expressed genes in R

I have a matrix which contains the genes and the mrna.
ID_REF GSM362168 GSM362169 GSM362170 GSM362171 GSM362172 GSM362173 GSM362174
244901_at 5.171072 5.207896 5.191145 5.067809 5.010239 5.556884 4.879528
244902_at 5.296012 5.460796 5.419633 5.440318 5.234789 7.567894 6.908795
I wanted to find the differentially expressed genes from the matrix using a t-test, and I carried out the following.
stat=mt.teststat(control,classlabel,test="t",na=.mt.naNUM,nonpara="n")
and I get the following error
Error in is.factor(classlabel) : object 'classlabel' not found.
I am not sure how I have to assign the class labels. Is this the right way to find the differentially expressed genes?
The classlabel should be a vector of integers corresponding to observation (column) class labels. I do not understand what that is.
If you open the documentation for mt.teststat:
?mt.teststat
and scroll down to the end, you'll see an example using the "Golub data":
data(golub)
teststat <- mt.teststat(golub, golub.cl)
If you look at golub.cl, it will become clear what the classlabel vector should look like:
golub.cl
[1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1
In this case, 0 or 1 are labels for two classes of sample. There should be as many values in the vector as you have samples, in the same order that the samples appear in the data matrix. You can also look at:
?golub
golub.cl: numeric vector indicating the tumor class, 27 acute
lymphoblastic leukemia (ALL) cases (code 0) and 11 acute
myeloid leukemia (AML) cases (code 1).
So you need to create a similar vector, with labels (0, 1, ...) for however many classes you have for your own data.
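As a sketch of what that might look like for the seven samples in the question: the split into four samples of class 0 and three of class 1 below is purely an assumption for illustration, so replace it with the real grouping of your samples. The call to mt.teststat only runs if the multtest package is installed.

```r
# The two example rows from the question, as a genes x samples matrix.
expr <- matrix(
  c(5.171072, 5.207896, 5.191145, 5.067809, 5.010239, 5.556884, 4.879528,
    5.296012, 5.460796, 5.419633, 5.440318, 5.234789, 7.567894, 6.908795),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("244901_at", "244902_at"),
    c("GSM362168", "GSM362169", "GSM362170", "GSM362171",
      "GSM362172", "GSM362173", "GSM362174")
  )
)

# ASSUMPTION: the first four samples are class 0 and the last three class 1.
# One label per column, in the same order as the columns of `expr`.
classlabel <- c(0, 0, 0, 0, 1, 1, 1)
stopifnot(length(classlabel) == ncol(expr))

# One t-statistic per gene (row), if multtest is available.
if (requireNamespace("multtest", quietly = TRUE)) {
  stat <- multtest::mt.teststat(expr, classlabel, test = "t")
  print(stat)
}
```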
