Is R code available for creating 3D plots of our galaxy or universe? I have searched a few times over the last six months and not found any.
This news article includes some very nice 3D plots that look like they may have been created with R:
A short video can be viewed at the above link, but I do not see a link to R code there. The video was created by people at the University of Lyon and the University of Hawaii. Here is a link to a longer video related to the same project:
I just thought it would be neat to explore space from within a 3D plot in R but I cannot find any relevant code.
Locations for objects likely are found within the Redshift Catalog, and perhaps can be downloaded, but I have no idea whether I would need to adjust those location data in various ways if I tried to create my own 3D map. Here is one possible source of data if I were to try creating my own map:
I have read something to the effect that asking for relevant packages does not make for an appropriate post. Sorry if this post is not appropriate.
The problem is not the modelling but the data. Here's a database made available by . - but probably you'll need to dig around for what you want specifically.
Here's a simple SQL query to pull out some of the star data:
SELECT Positions.OwnerID, Positions.RA_hr, Positions.RA_min, Positions.RA_sec, Positions.Dec_deg, Positions.Dec_arcmin, Positions.Dec_arcsec, Positions.Distance, Spectra.SpectralClass, Spectra.LuminosityClass, qryProps.Name
FROM (Positions LEFT JOIN Spectra ON Positions.OwnerID = Spectra.OwnerID) LEFT JOIN qryProps ON Positions.OwnerID = qryProps.OwnerID
WHERE (((Positions.Distance)>=0));
Then save it as a csv and import it:
Write a function to translate the coords to X, Y, Z
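The coordinate translation can be sketched as follows. This is a minimal illustration in Python rather than R (the arithmetic ports directly), not the original answer's function. The function name equatorial_to_cartesian is hypothetical, the arguments mirror the column names in the SQL query above, and it assumes the sign of the declination is carried by the degrees field only.

```python
import math

def equatorial_to_cartesian(ra_hr, ra_min, ra_sec,
                            dec_deg, dec_arcmin, dec_arcsec, distance):
    """Convert right ascension / declination / distance to (X, Y, Z)."""
    # Right ascension: 24 h correspond to 360 degrees, so 1 h = 15 degrees.
    ra = math.radians(15.0 * (ra_hr + ra_min / 60.0 + ra_sec / 3600.0))
    # Assumption: only dec_deg carries the sign; arcmin/arcsec are magnitudes.
    sign = math.copysign(1.0, dec_deg)
    dec = math.radians(
        sign * (abs(dec_deg) + dec_arcmin / 60.0 + dec_arcsec / 3600.0)
    )
    # Standard spherical-to-Cartesian conversion; the result is in the
    # same unit as `distance`.
    x = distance * math.cos(dec) * math.cos(ra)
    y = distance * math.cos(dec) * math.sin(ra)
    z = distance * math.sin(dec)
    return x, y, z

# Example: an object at RA 6h, Dec 0 deg, distance 10 lies along the Y axis.
print(equatorial_to_cartesian(6, 0, 0, 0, 0, 0, 10.0))
```

Applied row by row to the imported table, this yields the X, Y and Z columns used in the filter and plot steps.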
Filter the data.frame
sf<-starcoords[abs(starcoords$Z)<2000 & abs(starcoords$X)<1000,] # apply a filter
Then plot using rgl
You can obviously add more data for luminosity, size, type, etc. if it's available, and then use those parameters to set size, color, etc.
Consider creating exams using the exams package in R.
When using exams2nops there is a parameter showpoints that, when set to TRUE, will show the points of each exercise. However, this parameter is not available for exams2pdf.
How to display the points per exercise when using exams2pdf?
(The answer below is adapted from the R/exams forum at
There is currently no built-in solution to automatically display the number of points in exams2pdf(). The points= argument only stores the number of points in the R object that exams2pdf() creates (as in other exams2xyz() interfaces) but not in the individual PDF files.
Thus, if you want the points to be displayed you need to do it in some way yourself. A simple solution would be to include it in the individual exercises already, possibly depending on the kind of interface used, e.g., something like this for an .Rmd exercise:
pts <- 17
pts_string <- if(match_exams_call() == "exams2pdf") {
  sprintf("_(%s points)_", pts)
} else {
  ""
}
And then at the beginning of the "Question":
`r pts_string` And here starts the question text...
Finally in the meta-information
expoints: `r pts`
This always includes the desired points in the meta-information but only displays them in the question when using exams2pdf(...). This is very flexible and can be easily customized further. The only downside is that it doesn't react to the exams2pdf(..., points = ...) argument.
In .Rnw exercises one would have to use \Sexpr{...} instead of `r ...`. Also, the pts_string should be something like sprintf("\\emph{(%s points)}", pts).
Finally, a more elaborate solution would be to create a suitable \newcommand in the .tex template you use. If all exercises have the same number of points, this is not hard to do. But if all the different exercises could have different numbers of points, it would need to be more involved.
The main reason for supporting this in exams2nops() but not exams2pdf() is that the former has a rather restrictive format and vocabulary. In the latter case, however, the point is to give users all freedom regarding layout, language, etc. Hence, I didn't see a solution that is simple enough but also flexible enough to cover all use-cases of exams2pdf().
Is there a way to include open-ended/free-form questions that are ungraded or skipped by r-exams?
Use case: we want to have an exam with mostly multiple choice questions using the package and its grading capability, but also have 5-10 open ended questions that are printed in the same exam. Ideally, r-exams would provide the grade for the first MCQ section, and we could manually add the grade of the open-ended questions.
I forked the package and made some small changes that allow one to control how many questions are printed on the first page and to remove the string-question pages.
The new parameters are number_of_closed_questions and include_string_pages. It is far away from being ideal, but works for me.
As an example let us have 6 mpc/single-choice questions and one essay question (essayreg):
# install devtools if you do not have it!
# install the fork
myexam <- list(
  c("boxplots.Rnw", "scatterplot.Rnw")
  # (the remaining exercises of the list are not shown here)
)
ex1 <- exams2nops(myexam, n = 2,
  dir = "nops_pdf", name = "demo", date = "2015-07-29",
  number_of_closed_questions = 6, include_string_pages = FALSE)
This will produce only 6 questions on the front page (instead of 7) and will also exclude the string-question pages.
If you want normal behavior, just exclude the new parameters. Obviously, one will have to set the number of closed questions manually, so one should be really careful.
I guess one could automatically detect how many string questions are loaded and from this determine the number of open-ended/closed-ended questions, but I currently do not have the time to write this and the presented solution is usable for my case.
I am not 100% sure that the scans will work this way, but I assume there should not be any bigger problems as I did not really change much. Maybe Achim Zeileis could comment on that? See my commit:
There is built-in support for up to three open-ended "string" questions that are printed on a separate sheet that has to be marked by hand. The resulting sheet can then be scanned and evaluated along with the main sheet using nops_scan() and nops_eval(). It's on the wish list for the package to extend that number but it hasn't been implemented yet.
Another "trick" you could do is to use the pages= argument of exams2nops() to include a separate PDF sheet with the extra questions. But this would have to be handled completely separately "by hand" afterwards.
I have some 4-dimensional MR data in DICOM. The fourth dimension can be time, b-value in DWI or whatever. How do I determine how many slices and how many series in the fourth dimension I have?
For example, I have 400 images. How can I decide if there are 100 series and 4 slices or vice versa?
I have figured it out by checking the slice position. If a given position repeats, I increment the number of stacks. My Python code below:
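import pydicom

def getNumOfStacks(self, someImage):
    # Count how often each slice position occurs across the acquisition
    sliceDict = dict()
    for n in range(0, someImage.ImagesInAcquisition):
        location = pydicom.dcmread(self.path + self.fileList[n]).SliceLocation
        if location in sliceDict:
            sliceDict[location] = sliceDict.get(location) + 1
        else:
            sliceDict[location] = 1
    # Every position should repeat equally often, so the first count suffices
    return list(sliceDict.values())[0]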
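The same idea can be sketched independently of any DICOM library, so that the 400-image example from the question can be checked directly. This is only an illustration under assumptions: the function name count_slices_and_repeats is hypothetical, the input is a plain list of slice positions that have already been read from the files, and the rounding tolerance of three decimals is an arbitrary choice.

```python
from collections import Counter

def count_slices_and_repeats(slice_locations):
    """Return (number of distinct slice positions, repeats per position)."""
    # Round to tolerate tiny floating-point differences between files
    # (the 3-decimal tolerance is an assumption).
    counts = Counter(round(float(loc), 3) for loc in slice_locations)
    repeats = set(counts.values())
    if len(repeats) != 1:
        # Positions that do not repeat equally often suggest missing or
        # mixed data, so refuse to guess.
        raise ValueError("slice positions do not repeat equally often")
    return len(counts), repeats.pop()

# 400 images made of 4 positions, each acquired 100 times
locations = [0.0, 5.0, 10.0, 15.0] * 100
n_slices, n_repeats = count_slices_and_repeats(locations)
print(n_slices, "slices x", n_repeats, "repeats")
```

Unlike returning only the first count, this version raises an error when the positions do not all repeat the same number of times.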
The only way is to inspect the value of SeriesInstanceUID (tag number 0020,000e) for each single instance.
Depending on the tool you are using, the solution may vary. For example, if you have dcmtk or gdcm, then in bash it would look like this:
find /path/to/dicom/files -exec dcmdump "{}" 2>/dev/null ";" | grep SeriesInstanceUID |sort -u
If you use gdcm, then put gdcmdump instead of dcmdump above.
MRI images in DICOM come in two different flavors:
"Traditional" MR Image Storage (SOP Class UID 1.2.840.10008.
Enhanced MR Image Storage (SOP Class UID 1.2.840.10008.
For both, like Bartlomiej wrote, the Series Instance UID can be used to determine which of the slices belong to the same series, and usually one series represents one stack of images.
For Enhanced MR, the concept of stacks was introduced. That is, a single DICOM object ("file") contains multiple frames ("images") which can be subdivided into stacks. In the Per Frame Functional Groups Sequence (5200,9230) you can find attributes which are specific to individual frames. In this case, you should read the Stack ID (0020,9056) and the In Stack Position Number (0020,9057) to separate the stacks and order the slices within each stack.
I'm not new to R but am very new to machine learning.
For work I collect data by writing on a datasheet printed on waterproof paper which I then have to transcribe to the database manually. This takes a long time at the end of a long day and is a process prone to mistakes.
The entire datasheet is shown below
What I would like to do is simply take a photo of the sheet and have keras read it and input the results into a database
And the section of the datasheet that I am interested in getting Keras to read is shown here
Each row of the datasheet represents a species of coral that was found and each column represents the transect it was found on, i.e. 7 Acropora were found on T1.
Each of these cells is given a unique entry in the database in a format similar to this, which shows how the Acropora row is recorded
For each datasheet that we have entered in the past (probably somewhere between 1000 and 2500) there are corresponding database entries which can be exported to csv and linked to each datasheet
Ultimately, what I would like to do is simply take a photo of the sheet and have keras read the part I'm interested in (shown in second image) and input the results into a CSV in a similar format shown in the third image
The questions
What I've been thinking about is getting it to identify the borders of the parts of the datasheet I'm interested in (shown in the second image) and extract it. This would mean that I could then put in coordinates for each cell, ie Acropora T1 (as shown in the image below) and identify the number counted in that cell and export it to a database
Does this process sound possible? If so, would anyone know of any examples I could look up or even what you would call this process so I can look it up
Otherwise I was thinking about scanning each sheet as a whole (As shown in the first image) and simply training from that, however I feel that would be more prone to errors
I really hope this makes sense and would very much appreciate any help and/or suggestions either specifically to the questions that I asked or about my project in general
This uses OpenCV and Python.
According to the chapter on 'Hough Line Transform' you could detect lines like this.
import cv2
import numpy as np
img = cv2.imread('D:/Books/lines1.jpg', cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img,50,150,apertureSize = 3)
But based on my simple research I think counting is possible using code like this.
More knowledge of OpenCV is required at this stage. I think this just dilates the edges so that the borders of the lines become more pronounced.
img = cv2.imread('D:/Books/lines1.jpg', cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img,50,150,apertureSize = 3)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (4, 4))
dilated_Edges = cv2.dilate(edges, kernel, iterations=1)
cv2.imwrite("D:/Books/dilated_Edges.jpg", dilated_Edges)
lines = cv2.HoughLines(image=dilated_Edges, rho=1, theta=np.pi/180, threshold=100)
print(len(lines))
This prints 8 for me which isn't correct.
I pursued this further, and the code below is based on help from the OpenCV forum (Suleyman TURKMEN).
Images I tested with are these. Prints the correct count.
import cv2

img = cv2.imread('D:/Books/lines1.jpg', cv2.IMREAD_GRAYSCALE)
# Otsu threshold, inverted so that the dark lines become white foreground
ret, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
cv2.imshow("bw", bw)

# A wide, flat kernel removes vertical strokes and keeps horizontal lines
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 2))
eroded_Edges = cv2.erode(bw, kernel, iterations=3)
dilated_Edges = cv2.dilate(eroded_Edges, kernel, iterations=4)
# [-2:] works with both OpenCV 3 (three return values) and OpenCV 4 (two)
contours, hierarchy = cv2.findContours(dilated_Edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2:]
print(len(contours), "horizontal lines")
cv2.imshow("horizontal lines", eroded_Edges)

# A narrow, tall kernel removes horizontal strokes and keeps vertical lines
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 5))
eroded_Edges = cv2.erode(bw, kernel, iterations=3)
contours, hierarchy = cv2.findContours(eroded_Edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2:]
print(len(contours), "vertical lines")
cv2.imshow("vertical lines", eroded_Edges)

cv2.waitKey(0)  # keep the windows open until a key is pressed
cv2.destroyAllWindows()
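Once the grid lines have been located, the cell coordinates the question asks about could be derived from them. The sketch below is only an illustration under assumptions: it takes the pixel positions of the horizontal and vertical lines as already known (for example, from the bounding boxes of the contours found above), and the function name cell_boxes, the row and column labels and all coordinates are hypothetical.

```python
def cell_boxes(row_lines, col_lines, row_names, col_names):
    """Map (row_name, col_name) -> (left, top, right, bottom) pixel box."""
    ys = sorted(row_lines)   # y-coordinates of the horizontal grid lines
    xs = sorted(col_lines)   # x-coordinates of the vertical grid lines
    # N rows are bounded by N + 1 horizontal lines, likewise for columns
    if len(ys) != len(row_names) + 1 or len(xs) != len(col_names) + 1:
        raise ValueError("need one more line than there are rows/columns")
    boxes = {}
    for r, row in enumerate(row_names):
        for c, col in enumerate(col_names):
            boxes[(row, col)] = (xs[c], ys[r], xs[c + 1], ys[r + 1])
    return boxes

# Hypothetical line positions for a 2 x 2 excerpt of the datasheet
boxes = cell_boxes(
    row_lines=[100, 140, 180],
    col_lines=[50, 120, 190],
    row_names=["Acropora", "OtherSpecies"],
    col_names=["T1", "T2"],
)
print(boxes[("Acropora", "T1")])
```

Each box could then be cropped from the image (img[top:bottom, left:right] for a NumPy image array) and passed to a digit recognizer, which is the part where a trained model would come in.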
Is there an easy way to automate the conversion of an R data frame to a pretty Word table in APA format for publishing manuscripts? I'm currently doing this by saving the table as a csv, opening that in Excel, copying the Excel table to Word, and formatting it there, but I'm hoping there is a way to automate the formatting in R, so that when I convert it to Word, it is already in APA format, because Word sucks at automation.
Basically, I want to continue writing the manuscript itself in Word, while doing my analyses in R. Then gather all the results in R to a table (with manually modifiable formatting) by a script and convert it to whatever format I could then simply copy-paste to Word (so that the formatting actually holds). When I need to modify the table, I would make the changes in R and then just run the script again without the need to do any changes in Word.
I don't want to learn LaTeX, because everyone in my field uses Word with features like track changes, and I use Zotero add-in for citations, so it's simpler to just keep the writing separate from the analyses. Also, I am a psychologist, not a coder, so learning a lot of new technologies just for this is probably not worth the effort for me. Typically with new technologies come new technical problems, and I am aiming to make my workflow quicker, but not at the cost of unpredictability (which may make it slower exactly at the moment when I cannot afford it).
I found an R+knitr+rmarkdown+pander+pandoc solution "with as little overhead as possible", but it still seems quite heavy because I don't know any of those technologies apart from R. And I'm not eager to start learning all that, as it seems to be aimed at doing the writing and everything else in R to the very end, while I want to separate my writing and my code - I never need code in my writing, only the result tables. In addition, based on the examples, it seems to fetch the values directly from R code (e.g., from summary() to create a descriptive table), while I need to be able to tinker with my table manually before converting it, for instance, writing the title and notes (like a specific note to one cell and explaining it at the bottom). I also found R2wd, but it seems to be an older attempt at the same "whole workflow in R" problem as the solution above. SWord does not seem to be working anymore.
Any suggestions?
(Just to let you know, I am the author of the packages I am recommending here...)
You can use the ReporteRs package to output your table to Word. See here a tutorial (not mine):
FlexTable objects let you format and arrange tables easily with some standard R code. For example, to set the 2nd column in bold, the code looks like:
myFlexTable[, 2] = textBold()
There are (old) examples here:
These objects can be added to a Word report using the function addFlexTable. The Word report can be generated with the function writeDoc.
If you are working in RStudio, you can print the object and it will be rendered in the html viewer so you can export it in Word when you are satisfied with its content.
You can even add real Word footnotes (see the link below)
If you need more tabular output, I also recommend the rtable package, which handles xtable objects (and other things I have to develop to satisfy my colleagues or customers) - a quick demo can be seen here:
Hope it helps...
I have had the same need, and I have ended up using the package htmlTable, which is quite 'cost-efficient'. This creates an HTML table (in RStudio it is created in the "Viewer" window in the bottom right), which I just mark using the mouse and copy-paste to Word. (Start marking from the bottom of the table and drag the mouse upwards; that way you are sure to include the start of the HTML code.) Word handles these tables quite nicely. The syntax is quite simple, involving just the function htmlTable(), but it is still able to make somewhat more complex tables, such as grouped rows and primary and secondary column headers (i.e. column headers spanning more than one column). Check out the examples in the vignette.
One note of caution: htmlTable will not work with factor variables, i.e., they will come out as integer numbers (according to factor levels). So read the data using stringsAsFactors = FALSE or convert them using as.character().
Including trailing zeroes can be done using the txtRound function. Example:
mini_table <- data.frame(Name="A", x=runif(20), stringsAsFactors = FALSE)
txt <- txtRound(mini_table, 2)
It is not completely straightforward to assign formatting such as bold or italics, but it can be done by wrapping the table contents in HTML code. If you for instance want to make an entire column bold, it can be done like this (please note the use of single and double quotation marks inside paste0):
library(htmlTable)  # txtRound()
library(plyr)       # aaply()
mini_table <- data.frame(Name="A", x=runif(20), stringsAsFactors = FALSE)
txt <- txtRound(mini_table, 2)
txt$x <- aaply(txt$x, 1, function(x)
  paste0("<span style='font-weight:bold'>", x, "</span>")
)
Of course, that would be easier to do in Word. However, it is more interesting to add formatting to numbers according to some criteria. For instance, if we want to emphasize all values of x that are less than 0.2 by applying bold font, we can modify the code above as follows:
mini_table <- data.frame(Name="A", x=runif(20), stringsAsFactors = FALSE)
txt <- txtRound(mini_table, 2)
txt$x <- aaply(txt$x, 1, function(x)
  if (as.numeric(x) < 0.2) {
    paste0("<span style='font-weight:bold'>", x, "</span>")
  } else {
    paste0("<span>", x, "</span>")
  }
)
If you want even fancier emphasis, you can for instance replace the bold font by red background color by using span style='background-color: red' in the code above. All these changes carry over to Word, at least on my computer (Windows 7).
The short answer is "not really." I've never had much luck getting well-formatted tables into MS Word. The best approach I can offer you requires using Rmarkdown to render your tables into an HTML file. You can copy and paste your results from the HTML file to MS Word, but I make no guarantees about how well the formatting will follow.
To format your tables, you can try something like the xtable package, or the pixiedust package. But again, no guarantees that the formatting will transfer.