Links to local notebooks in Google Colab

I have a notebook which links to other, local notebooks. Something like this:
This is a series of tutorials about X.
Link to [Tutorial 1](tutorial1.ipynb)
Link to [Tutorial 2](tutorial2.ipynb)
...
This works fine in normal Jupyter, but not in Google Colab. The links get resolved to URLs like https://colab.research.google.com/drive/tutorial1.ipynb, which is obviously not where the notebook actually lives.
Is there another way to have local links that works in Google Colab?

If the Notebook is only in Colab, then this works for my use case, maybe for you too:
Right-click on the notebook in the Drive file list view.
Select Get shareable link.
Copy the link, and use that:
Link to [Tutorial 1](https://drive.google.com/open?id=12bdw7VdhRhGFw2463wJ)
(That's a made-up link so it doesn't work.)
On the other hand, if it's in GitHub, you can use the link this way — using one of mine (in this repo) as an example:
[My notebook](https://colab.research.google.com/github/agile-geoscience/xlines/blob/master/notebooks/11_Gridding_map_data.ipynb)
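If you generate these GitHub-backed links for many notebooks (for example in an index notebook), the URL pattern above can be built from the repository path. Here is a minimal Python sketch of that pattern; it simply reproduces the structure of the example link, and the owner/repo/branch/path arguments are whatever your repository uses:

# Sketch: build a Colab link for a notebook hosted in a public GitHub repo.
# Colab opens GitHub notebooks from URLs of the form
#   https://colab.research.google.com/github/<owner>/<repo>/blob/<branch>/<path>
# which is the same structure as the example link above.

def colab_url(owner: str, repo: str, path: str, branch: str = "master") -> str:
    return (f"https://colab.research.google.com/github/"
            f"{owner}/{repo}/blob/{branch}/{path}")

# Reproduces the example link used above:
print(colab_url("agile-geoscience", "xlines", "notebooks/11_Gridding_map_data.ipynb"))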

If you shared a Google Drive folder full of notebooks and data files, then it seems like the best way forward is to instruct people to open the other file from the Google Drive tab. That way you are all still working on the same copies of notebook files, which seems like the big advantage of Colab. It's not a link, but it may be the best workflow and it also keeps your notebooks a bit more platform neutral.

Related

Downloading documents from a password-protected site

I've learnt about wget and have downloaded a few directories from the web. However, I've hit some roadblocks.
I'm trying to download from a site that requires a username and password, which I have.
There are no apparent directories that I could find from inspecting elements. The site was loading up the documents in a reader.js (whatever that is) and it seemed that each page was being fetched as I clicked the arrow button instead of the whole document.
Any ideas would be helpful :)
The site was loading up the documents in a reader.js (whatever that is) and it seemed that each page was being fetched as I clicked the arrow button instead of the whole document.
This suggests that JavaScript is used to fetch the document, so wget is not the right tool in this situation. You would need a tool that provides browser automation. I suggest trying PhantomJS (if you don't mind writing in a JavaScript-like language) or Selenium (if you are comfortable with at least one of its officially supported languages).
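To make the suggestion concrete, here is a minimal Selenium sketch in Python. It only illustrates the approach: the URLs, form-field names and the submit-button name are placeholders that you would replace after inspecting the real login form.

# Sketch only: log in through a real browser session, then save the HTML that
# the browser actually rendered (including anything fetched by JavaScript).
# All URLs and element names below are placeholders.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()                      # or webdriver.Chrome()
driver.get("https://example.com/login")

driver.find_element(By.NAME, "username").send_keys("my_user")
driver.find_element(By.NAME, "password").send_keys("my_password")
driver.find_element(By.NAME, "login").click()     # placeholder submit button

# Open the page served by the document reader; page_source reflects the
# DOM after the page's JavaScript has run.
driver.get("https://example.com/documents/reader?page=1")
with open("page1.html", "w", encoding="utf-8") as f:
    f.write(driver.page_source)

driver.quit()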

Problems with Sharing Colab Notebooks

I've just encountered a strange issue with Google Colab notebooks I shared: after clicking the link, instead of opening the notebook in the Colab app, only the raw code is displayed.
In some cases the notebook could still be opened by clicking the link shown at the top of the preview. In other cases, the code had to be downloaded and then manually imported into Colab. I find this strange because previously I could simply click on the links and the notebooks opened correctly inside the Colab app. I've set the sharing permissions appropriately so that anyone with the share link can view it.
Has there been a change to this or is this some sort of bug? Thanks!
That's because the notebook is being displayed with Drive Preview. If you click 'Open with Google Colaboratory', it will open normally in Colab.

Download a file from a permalink URL, not a direct .exe URL

So I am using Inno Setup 6, which natively supports downloading files from the internet during installation. I have figured out how to download files given a direct link, from this thread: Inno Setup: Install file from Internet.
However, I can't for the life of me figure out how to download the latest version of a file given a permalink URL. My specific example is to download the Microsoft Hosting package.
https://dotnet.microsoft.com/permalink/dotnetcore-current-windows-runtime-bundle-installer
Going to this page automatically downloads the latest package.
Inno doesn't like this link (or I don't know how to get Inno to use it) since it doesn't point to the direct file. If I use the direct link (https://download.visualstudio.microsoft.com/download/pr/24847c36-9f3a-40c1-8e3f-4389d954086d/0e8ae4f4a8e604a6575702819334d703/dotnet-hosting-5.0.6-win.exe) this works for obvious reasons.
I'd like to always download the latest, but I'm not sure how to accomplish this. Any suggestions?
Adding super basic code being used...
DownloadPage.Clear;
DownloadPage.Add('https://dotnet.microsoft.com/permalink/dotnetcore-current-windows-runtime-bundle-installer', 'dotnet-hosting.exe', '');
DownloadPage.Show;
You would have to retrieve the HTML page, find the URL in the HTML code and use it in your download code.
See Inno Setup - HTTP request - Get www/web content
It would be quite unreliable. Microsoft can change the HTML any time.
You would be better off setting up your own web page (web service) that provides an up-to-date link to your installer. That web page can even do what I suggested above: retrieve the URL from Microsoft's download page. If Microsoft changes the HTML, you can fix your web page at any time, which you cannot do with an installer that has already shipped.
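To illustrate the web-service idea (this is only a sketch, not part of the original answer, and Python's standard library stands in for whatever stack you actually host it on): the service returns the installer URL you currently consider "latest" as plain text, and the installer fetches that text (see the "Get www/web content" question linked above) and passes it to DownloadPage.Add. The hard-coded URL is simply the example from the question.

# Sketch: a tiny web service that hands out the current "latest" installer URL.
# Update LATEST_INSTALLER_URL (or compute it server-side) whenever Microsoft
# changes it - without touching installers that have already shipped.
from http.server import BaseHTTPRequestHandler, HTTPServer

LATEST_INSTALLER_URL = (
    "https://download.visualstudio.microsoft.com/download/pr/"
    "24847c36-9f3a-40c1-8e3f-4389d954086d/"
    "0e8ae4f4a8e604a6575702819334d703/dotnet-hosting-5.0.6-win.exe"
)

class LatestLinkHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = LATEST_INSTALLER_URL.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), LatestLinkHandler).serve_forever()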
Without realizing it, you are asking two different questions here. That is because these "permalinks" aren't really permalinks; they redirect to a dynamic page that contains a link to what you are looking for.
So first, addressing the Microsoft "permalink": under the hood you are accessing a URL that redirects to a page which points to the latest version. That page then invokes a JavaScript function, if you are accessing it via a web browser, to download the installer. Note that both the page pointed to and the code that invokes the installer WILL eventually change. In fact, the code itself logs a "warning" when people attempt to download directly:
If you do a view source you'll see:
<script>
    $(function () {
        recordDownload('.NET', 'runtime-aspnetcore-5.0.6-windows-hosting-bundle-installer');
        window.open("https://download.visualstudio.microsoft.com/download/pr/24847c36-9f3a-40c1-8e3f-4389d954086d/0e8ae4f4a8e604a6575702819334d703/dotnet-hosting-5.0.6-win.exe", "_self");
    });

    function recordManualDownload() {
        ga("send", "event", "Download.Warning", "Direct Link Used", "runtime-aspnetcore-5.0.6-windows-hosting-bundle-installer");
    }
</script>
So you can download the HTML from this page and use some regex to get the direct download link, but beware: the link is going to change every time Microsoft releases a new version. Furthermore, when (not if) MS decides to rebrand, this entire process might break. So the best you can do here is download the HTML and try to parse the download URL out of this "permalink".
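As a concrete illustration of that scrape-and-parse approach (a sketch only, in Python rather than Inno's Pascal Script; the regex depends on the window.open(...) call shown above and will break whenever Microsoft changes the page):

# Sketch: follow the permalink's redirect and pull the direct .exe URL out of
# the page's window.open(...) call. The page markup is not a stable API.
import re
import urllib.request

PERMALINK = ("https://dotnet.microsoft.com/permalink/"
             "dotnetcore-current-windows-runtime-bundle-installer")

with urllib.request.urlopen(PERMALINK) as resp:   # redirects are followed
    html = resp.read().decode("utf-8", errors="replace")

match = re.search(r'window\.open\("([^"]+\.exe)"', html)
if match:
    print(match.group(1))   # direct URL of the current hosting-bundle installer
else:
    raise RuntimeError("Download link not found - the page layout has changed")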
As an alternative, you can download the latest .NET PowerShell install script as described here.
If possible, execute that script directly. If not, look at the function Get-AkaMSDownloadLink within the install script to see how it builds the URL for the latest version. You would probably be better served building and using that URL rather than trying to scrape it out of arbitrary HTML.
Now, on to the second question you might not have realized you were asking: how do you automate this for any random installer? The answer is that you can't. Some vendors might have a permalink that points directly to the latest version, but you are always going to find cases like Microsoft's. The best you can do is hard-code some links in a service of your own, as @martin-prikryl suggested, and update the links in that service when they break.

Blogdown site pages do not render properly

I'm using RStudio Blogdown/GitHub/Netlify to maintain my blog site, with the Academic theme. When I push the changed .Rmd files to GitHub, the changed pages do not seem to deploy, but if I build the entire site and push it, then the site deploys on Netlify without any problem. Unfortunately, it takes about three minutes to build the entire site, so I'm looking for a faster solution.
I think I should be able to build a single directory, which would be super fast, but when I build a directory with blogdown::build_dir("content/project/cont_imp"), the HTML document does not build properly. It seems to render as a single long piece of JavaScript, and since all of the metadata in the YAML header is wrapped into the script, the page on Netlify does not deploy properly: things like the date and subtitle are missing and it is not formatted like the rest of my site.
I have one bad page that I built with build_dir on GitHub, so you can view both the .Rmd source and the rendered .html documents: https://github.com/grself/icochise/tree/master/content/project/cont_imp. You can see this project page on my live site at https://icochise.com/ (scroll down to the "Projects" section and notice the difference between the "Continuous Improvement" link, which has no text, just an image of a hand and a whiteboard, and the "Blogdown and Bookdown" link). I just now noticed that the HTML document seems to be some sort of self-extracting JavaScript, so after a couple of seconds the source code looks normal. Maybe there is some kind of setting on Netlify I need to change so it will extract the JavaScript as it is deploying the page?
I checked the settings in my "Configure Build Tools" and unchecked "Preview site after building" and "Re-knit current preview..." but that didn't help. I also tried changing the Project build tools dropdown from "Website" to "Custom" and specified the Hugo executable. None of these things helped.
I also tried running "Serve Site" while I worked, thinking that would continuously render the HTML page, but that tool seemed to hang and would not display the site once I made changes to an .Rmd file. In fact, it was hung up so badly that I had to kill RStudio with the Windows Task Manager.
Finally, I also tried to update Hugo, hoping that there was something fouled up in my Hugo install, but that did not help.
I suspect that I'm doing some simple thing wrong, but have tried everything I can think of to fix this and would appreciate any suggestions.

How to download the source code of a website

#include <IE.au3>
#include <Inet.au3>

Call("Logowanie")
Sleep(4000)

Func Logowanie()
    ; Open the login page in a new Internet Explorer instance
    Global $oie = _IECreate("http://pl.ikariam.gameforge.com/")
    ; Grab the login form fields and the submit button
    Local $login = _IEGetObjByName($oie, "name")
    Local $haslo = _IEGetObjByName($oie, "password")
    Local $przycisk = _IEGetObjById($oie, "loginBtn")
    Local $serwer = _IEGetObjByName($oie, "uni_url")
    ; Fill in the credentials and the server, then submit the form
    _IEFormElementSetValue($login, "<mylogin>")
    _IEFormElementSetValue($haslo, "<mypassword>")
    _IEFormElementSetValue($serwer, "s30-pl.ikariam.gameforge.com")
    _IEAction($przycisk, "click")
EndFunc
This code logs me in to the website but I don't know how to download the website's source code to do some stuff. Could you help?
You can read the source of a website using _IEDocReadHTML:
$sHtml = _IEDocReadHTML($oie)
ConsoleWrite($sHtml)
If you only want to download a single, relatively simple page, the easiest way to get the code is: right-click (or Ctrl-click) > View Page Source. You can also get it from the Sources tab of the browser's developer tools ("inspect element"), which will also give you some of the other files in the site's file structure. If you use this method, make sure to convert any relative links in your downloaded code to absolute links (i.e. convert '../../style.css' to its full URL starting with https://).
This won't give you everything, because code on the back-end is not accessible through any means. For a simple website though, there may not be any code on the back end and this will give you exactly what you're looking for.
This can be very finicky and will not scale to downloading a large website with many pages. The most robust way to get code from any website is by using dedicated web-scrapers rather than trying to go into inspect element and look at the “site sources” tab.
For Mac, SiteSucker is your best option if you don't care about having all of the site assets (videos, images, etc. hosted on the website) downloaded locally on your computer. Videos especially could take up a lot of space, so this is sometimes helpful. (SiteSucker is not free, but pretty cheap.)
If you do want all assets to be locally downloaded on your computer (you may want to do this if you want to access a site’s content offline, for example), HTTrack is the best option, in my opinion, for Mac and Windows. (Free)
You could also use wget (Free) to download content, but wget does not have a GUI and has less flexibility, so I prefer HTTrack.
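If you prefer to script a one-page download instead of using a GUI tool (an alternative to the single-page wget case, not something mentioned in the answers above), a minimal Python sketch looks like this; the URL is a placeholder:

# Sketch: save the server-rendered HTML of a single page - roughly what
# "view page source" or a one-page wget fetch gives you. Like those, it does
# not capture any back-end code.
import urllib.request

URL = "https://example.com/"   # placeholder

with urllib.request.urlopen(URL) as resp:
    html = resp.read().decode("utf-8", errors="replace")

with open("page.html", "w", encoding="utf-8") as f:
    f.write(html)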
