Qualtrics: Have specific subjects receive specific pictures to rate

I created a Qualtrics project to have subjects rate faces. I uploaded faces to the graphics library. I used Loop & Merge to repeat the same question for different faces.
How can I have a specific subject rate specific faces?
For example, consider a comma-separated-value file (or JavaScript code, or Python code, ...), with the following two lines:
123,31,41,59
124,26,31,41
Subject 123 should rate face 31, face 41, and face 59.
Subject 124 should rate face 26, face 31, and face 41.
(A face could have an alternate name, such as a Qualtrics-generated image name.)
Thank you in advance.

Add the faces to your contact list, so your csv contact list upload would look like:
ExternalDataReference,face1,face2,face3
123,31,41,59
124,26,31,41
Then pipe the face variables into your loop & merge setup:
1 ${e://Field/face1}
2 ${e://Field/face2}
3 ${e://Field/face3}
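If the subject-to-face assignments already exist in code, the contact list file can be generated rather than typed by hand. The following is a minimal Python sketch, not a Qualtrics API call: the `assignments` dictionary and the `build_contact_csv` helper are hypothetical names made up for illustration, and the column names simply mirror the example above.

```python
import csv
import io

# Hypothetical assignment table: subject ID -> faces to rate (values from the question).
assignments = {
    "123": ["31", "41", "59"],
    "124": ["26", "31", "41"],
}


def build_contact_csv(assignments):
    """Return CSV text with one row per subject and one column per face slot."""
    n_faces = max(len(faces) for faces in assignments.values())
    header = ["ExternalDataReference"] + [f"face{i}" for i in range(1, n_faces + 1)]
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    for subject, faces in assignments.items():
        # Pad short rows with empty cells so every row has the same width.
        writer.writerow([subject] + list(faces) + [""] * (n_faces - len(faces)))
    return buf.getvalue()


contact_csv = build_contact_csv(assignments)
print(contact_csv)
```

The resulting text can be saved as a .csv file and uploaded as the contact list; the face1, face2, face3 columns then become the embedded data fields piped into the loop.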


numbers are outside valid range when computing MACD and error handling

I'm trying to compute (MACD - signal) / signal for the prices of the Russell 1000 (an index of 1,000 US large-cap stocks). I keep getting this error message and simply can't figure out why:
Error in EMA(c(49.85, 48.98, 48.6, 49.15, 48.85, 50.1, 50.85, 51.63, 53.5, :n = 360 is outside valid range: [1, 198]
I'm still relatively new in R although I'm proficient in Python. I suppose I could've used "try" to just work around this error, but I do want to understand at least what the cause of it is.
Without further ado, this is the code :
N <- 1000
DF_t <- data.frame(ticker = rep("", N), macd = rep(NA, N), stringsAsFactors = FALSE)
stock <- test[['Ticker']]
i <- 0
for (val in stock) {
  dfpx <- bdh(c(val), c("px_last"), start.date = as.Date("2018-1-01"), end.date = as.Date("2019-12-30"))
  macd <- MACD(dfpx[, "px_last"], 60, 360, 45, maType = "EMA")
  num <- dim(macd)[1]
  ma <- (macd[num, ][1] - macd[num, ][2]) / macd[num, ][2]
  i <- i + 1
  DF_t[i, ] <- list(val, ma)
}
For your information, bdh() is a Bloomberg command to fetch historic data. dfpx[] is a dataframe. MACD() is a function that takes a time series of prices and outputs a matrix, where the first column contains the MACD values and the second column contains the signal values.
Thank you very much! Any advice would be really appreciated. Btw, the code works with a small sample of a few stocks but it will cause the error message when I try to apply it to the universe of one thousand stocks. In addition, the number of data points is about 500, which should be large enough for my setup of the parameters to compute MACD.
Q : "...and error handling"
If I may add a grain of salt: preventing the error is far better than any after-the-fact error handling.
For this, there is a cheap step, constant O(1) in both the [TIME] and [SPACE] domains, that prevents such error-related crashes in principle:
Prepend to the time-series vector as many constant, process-invariant value cells as are needed to reach the maximum depth of any vector processing applied to it, so that any such error or exception is avoided.
The process-invariant value is, in most cases, the first known value, repeated as many times as needed backwards in time, towards older (not present) bars. This avoids relying on NaNs, which can cause trouble in methods that are sensitive to missing data, as described above. Q.E.D.
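A minimal sketch of that padding step, written in Python because the asker mentions being proficient in it. The `pad_front` helper and the sample prices are made up for illustration and are not part of any MACD library. Note that the repeated values are synthetic, so an indicator computed over a heavily padded series reflects the padding as much as the real prices.

```python
def pad_front(prices, min_length):
    """Prepend copies of the first known price until the series has min_length values."""
    if not prices:
        raise ValueError("need at least one known price to pad with")
    missing = max(0, min_length - len(prices))
    # Repeat the oldest known value backwards in time; the real data stays untouched.
    return [prices[0]] * missing + list(prices)


short_series = [49.85, 48.98, 48.60]
padded = pad_front(short_series, 5)
print(padded)  # [49.85, 49.85, 49.85, 48.98, 48.6]
```

A series that is already long enough is returned unchanged, so the helper can be applied to every ticker before the moving averages are computed.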
For those who are interested, I have found the cause of the error: some stocks have missing prices. It's as simple as that. For instance, Dow US Equity has only about 180 daily prices (for whatever reason) over the past one and a half years, which definitely can't be used to compute a moving average of 360 days.
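One way to catch this before the moving average is computed is to count the non-missing prices per ticker and skip any series shorter than the longest window. This is only a sketch in Python with made-up tickers and prices; `has_enough_history` is a hypothetical helper, and 360 is the slow window from the question.

```python
import math


def has_enough_history(prices, longest_window):
    """Return True if the count of non-missing prices covers the longest window."""
    valid = [p for p in prices if p is not None and not math.isnan(p)]
    return len(valid) >= longest_window


# Hypothetical data: one ticker with 500 prices, one with only 180.
series_by_ticker = {
    "AAA": [100.0 + 0.1 * k for k in range(500)],
    "BBB": [50.0 + 0.1 * k for k in range(180)],
}

LONGEST_WINDOW = 360  # slow EMA length used in the question

usable = [t for t, p in series_by_ticker.items() if has_enough_history(p, LONGEST_WINDOW)]
skipped = [t for t, p in series_by_ticker.items() if not has_enough_history(p, LONGEST_WINDOW)]
print("usable:", usable)
print("skipped:", skipped)
```

Logging the skipped tickers makes it easy to see which stocks have short histories instead of discovering them through a crash halfway through the loop.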
I basically ran small samples until I eventually pinpointed what caused the error message. Generally speaking, unless you are trying to extract data for more than 6,000 stocks or so and you are querying, say, 50 fields, you are okay. A rule of thumb for the daily usage limit of Bloomberg is said to be around 500,000 for a school console. A PhD colleague working in a trading firm also told me that professional Bloomberg consoles are more forgiving.

Numbers from Google sheet document are different after importing to R

I have a table in Google Sheets which I have imported to R with the help of the googlesheets package. The numbers in the Google Sheets table are correct, but once I have imported them to R, they have a different structure.
For example, I am analyzing the performance of players and I have data for the minutes played during the game, like 93 (meaning the player played 93 minutes). But the data in R are shown like this: 9.302000e+01.
Some numbers are correct and some are in the format below. I am really struggling to figure out what the problem can be. I have checked the format in Sheets as well as in Excel, and it is numeric.
As said, I have numbers in the table like 93, 22, 93, 16, 45, 93, 46, 93 (minutes played).
But in R some of them are incorrect and some are correct.
For example:
93.0000000 2.200000e+01 93.0200000 1.600000e+01 4.500000e+01 93.0200000 46.0000000 9.302000e+01
install.packages("googlesheets")
library(googlesheets)
gs_auth(new_user = TRUE)
be <- gs_title("zilina_player_overview")
My expected result is, of course, to have a table in the format like I have it in google sheets or excel. I mean when a player played 93 minutes I will see there number 93 and not 9.302000e+01.
Any help is welcome, and thanks in advance for any advice

value from unique elements

I'm writing a QC program in R to handle data from an instrument that reports its own error codes. The codes are reported as bit values, so
0
means "all OK", while:-
1, 2, 4, 8, 16, 32, 64, 128
Each represent a unique error. Multiple errors can occur simultaneously, in which case the codes are summed to give a new number, e.g:-
error "2" + error "32" = code "34"
And because these sums are each unique, any given code value can be broken down into its constituent errors. I'm looking for a way to program the identification of errors from these codes. I'm struggling with an approach, but everything I can think of involves either look-up-tables or a big stack of loops... neither of which seems very elegant.
Rather than re-invent the wheel, I'm wondering if there's an R function that already exists to do this.
Has anyone come across this sort of problem before?
You could convert the number to bits, and use that representation to find the errors.
2^(which(intToBits(34)==1)-1)
returns
2 32
Hope this helps!
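For comparison, the same bit test can be sketched in Python. `decode_errors` is a hypothetical helper name, and the sketch assumes the codes are non-negative integers as described in the question.

```python
def decode_errors(code):
    """Split a summed error code into the individual power-of-two error values."""
    if code < 0:
        raise ValueError("error codes are expected to be non-negative")
    # Test each bit position; a set bit means that error value is part of the sum.
    return [1 << bit for bit in range(code.bit_length()) if code & (1 << bit)]


print(decode_errors(34))  # [2, 32]
print(decode_errors(0))   # []
```

An empty list corresponds to code 0, i.e. "all OK".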

R set.seed does not produce consistent results

I am using a main R function to call a series of R functions from different scripts. In order to reproduce results, I call set.seed at the beginning of my main script. In the code, the sample() function is used to randomly select a couple of rows from a dataframe in function_8, and rand() is used in function_6. So a simple workflow looks like this:
### Main R Function
library(dplyr)
set.seed(111)
### Begin calling other R scripts
output_1 <- function_1(...)
...
output_10 <- function_10(...)
### End Main R Function
Recently, I realized that if I make changes to my function_9, which does not contain any randomization, the random numbers generated in function_8 change. For example,
sample() in function_8 will get Row 2, 15, 23, 50, 54 before updating function_9.
sample() in function_8 will get Row 23, 44, 50, 95, 98 after updating function_9
However, results can be reproduced by starting a new R session.
So, I am wondering if anyone can give me some suggestions on how to properly set.seed in this situation? THX in advance!
Update
Per a deleted comment, I changed the seed number to 123, which produces a set of consistent results. But I would appreciate it if someone could provide an in-depth explanation!
Maybe the seed 111 just happens to behave in a way that does not carry over to function_8. You may want to generate a time-based random seed instead; a previous answer on using the system time may help.

Data compression scheme, math

I have about 42,000 lists of 24 random numbers, all in the range [0, 255]. For example, the first list might be [32, 15, 26, 27, ... 11]. The second list might be [44, 44, 18, 19, .. 113]. How can I choose one number from each of the lists (so I will end up with a new list of about 42,000 numbers) such that this new list is most compressible using ZIP?
-- this question has to do with math, data compression
The ZIP file format uses DEFLATE for its compression algorithm. So you need to consider how that algorithm works and pick data such that the algorithm finds it easy to compress. According to the Wikipedia article, there are two stages of compression. The first uses LZ77 to find repeated sections of data and replace them with short references. The second uses Huffman coding to take the remaining data and strip out redundancy across the whole block. This is called entropy coding: if the information isn't very random (has low entropy), the code replaces common symbols with short codes, increasing the entropy of the output.
In general, then, lists with lots of repeated runs (e.g., [111,2,44,93,111,2,44,93...]) will compress well in the first pass. Lists with lots of repeated numbers within other random stuff (e.g., [111,34,43,50,111,34,111,111,2,34,22,60,111,98,2], where 34 and 111 show up often) will compress well in the second pass.
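The two effects can be checked with Python's zlib module, which implements DEFLATE. This is only an illustrative sketch: the byte sequences are made up (a repeated four-value run, a stream dominated by 111 and 34, and uniformly random bytes), and the exact sizes depend on the zlib build, so none are quoted here.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the made-up data is repeatable

# Repeated runs: the LZ77 stage can replace most of this with back-references.
repeated = bytes([111, 2, 44, 93] * 250)

# A few values dominate: the Huffman stage can give them short codes.
skewed = bytes(
    rng.choice([111, 34]) if rng.random() < 0.8 else rng.randrange(256)
    for _ in range(1000)
)

# Uniformly random bytes: neither stage has much to work with.
uniform = bytes(rng.randrange(256) for _ in range(1000))

repeated_size = len(zlib.compress(repeated, 9))
skewed_size = len(zlib.compress(skewed, 9))
random_size = len(zlib.compress(uniform, 9))

print("repeated runs :", repeated_size, "bytes")
print("skewed values :", skewed_size, "bytes")
print("uniform random:", random_size, "bytes")
```

All three inputs are 1000 bytes long, so the compressed sizes can be compared directly.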
To find suitable numbers, I think the easiest thing to do is just sort each list, then merge them, keeping the merge sorted, until you get to 42,000 output numbers. You'll get runs as they happen. This won't be optimal (you might have the number 255 in each input list and you'd miss them using this technique), but it would be easy.
Another approach would be to histogram the numbers into 256 bins. Any bins that stand out indicate numbers that should be grouped. After that, I guess you have to search for sequences. Again, sorting the inputs will probably make this easier.
I just noticed you had the constraint that you have to pick one number from each list. So in both cases you could sort each list then remove duplicates.
Additionally, Huffman codes can be generated using a tree, so I wonder if there's some magic tree structure you could put the numbers into that would automatically give the right answer.
This smells NP-complete to me, but I am nowhere near able to prove it. On the outside, there are approximately 7.45e+57968 (!) possible configurations to test. It doesn't seem that you can opt out of a particular configuration early, as an incompressible initial section could be greatly compressible later on.
My best guess for "good" compression would be to count the number of occurrences of each number across the entire million-element set and select from each list the number with the most occurrences. For example, if every list has 42 present in it, selecting only that value would give you a very compressible array of 42,000 instances of the same value.
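A rough Python sketch of that counting heuristic, assuming each chosen number is stored as one byte and using zlib (DEFLATE) as a stand-in for ZIP. It uses 2,000 made-up lists instead of 42,000 to keep the run short, `pick_most_common` is a hypothetical helper, and it is a heuristic only, with no claim of optimality.

```python
import random
import zlib
from collections import Counter


def pick_most_common(lists):
    """From each list, pick the value that occurs most often across all lists."""
    counts = Counter(v for lst in lists for v in lst)
    # Ties are broken in favour of the smaller value so the result is deterministic.
    return [max(lst, key=lambda v: (counts[v], -v)) for lst in lists]


rng = random.Random(0)  # fixed seed so the made-up data is repeatable
lists = [[rng.randrange(256) for _ in range(24)] for _ in range(2000)]

greedy = pick_most_common(lists)
naive = [lst[0] for lst in lists]  # baseline: always take the first element

greedy_size = len(zlib.compress(bytes(greedy), 9))
naive_size = len(zlib.compress(bytes(naive), 9))

print("greedy pick:", greedy_size, "bytes")
print("first pick :", naive_size, "bytes")
```

The heuristic concentrates the output on a small set of frequent values, which mainly helps the Huffman stage; it does not try to create repeated runs for the LZ77 stage.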
