How to extract video data from video tag? - web-scraping

Hey folks, I'm trying to save the video from a video tag.
This is part of a scraping exercise on TikTok, and I cannot use the URL directly to save the video, either in the browser or on the server.
I also cannot use a canvas, as I get a tainted canvas error.
What would be the best way to save the video data?
// Example code
let video = document.querySelector('video');
let src = video.src;
// Do something with the el or the source??

Related

Dynamically update a youtube playlist on a website

OK, I'm trying to develop a website that includes a YouTube playlist. Later, it should be possible for every person (on my local network) who visits the website to add songs to this playlist. I will use the YouTube API and probably PHP or Python for this.
Now I've included the playlist via an <iframe> tag and stumbled across a big problem. If I add more songs to this playlist (on YouTube itself), they only show up in the playlist on my website when I reload the whole page. And if I reload just the div containing the <iframe>, the playlist starts all over again at the first song.
Is there a way I can dynamically update the playlist on my website, so that the playlist does not start again from the beginning, the currently playing song keeps playing, and newly added songs are added "live" to the playlist?
Here is the <iframe> tag, if it helps:
<iframe id="playlist" src="https://www.youtube.com/embed/videoseries?list=PLmdXGEzOwfzb0fq7S2iP-t4i7x19R1ZN2"
frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture"
allowfullscreen>
</iframe>
One approach I'd suggest:
Before updating the div that contains the iframe, get both the currently playing video and its current playback time. [1]
Once you have added the video to the playlist and have those two values, update the div by changing the src value of your iframe.
An example:
You have a playlist whose ID is: PLmdXGEzOwfzb0fq7S2iP-t4i7x19R1ZN2.
The current playing video is: 51iquRYKPbs
When you right-click on the playing video 51iquRYKPbs, you'll get the "Copy video URL at current time" option.
Once you select that option, you'll get a URL like this:
youtu.be/51iquRYKPbs?list=PLmdXGEzOwfzb0fq7S2iP-t4i7x19R1ZN2&t=80
Note the video ID (51iquRYKPbs), the playlist ID (the list value), and the time (t=80); you'll need them to reload your iframe.
Your current iframe src value is as follows:
https://www.youtube.com/embed/videoseries?list=PLmdXGEzOwfzb0fq7S2iP-t4i7x19R1ZN2
(No modifications yet).
To reload the iframe with the changes applied, set your iframe src as follows:
https://www.youtube.com/embed/51iquRYKPbs?list=PLmdXGEzOwfzb0fq7S2iP-t4i7x19R1ZN2&autoplay=1&start=80
(Once you change the src value of your iframe, it reloads with the changes and continues playing the video from the specified time.)
The changes made to the src value are:
Set the videoId to the video that was playing before the reload.
Set the start value so the video continues playing from the specified time.
autoplay=1 is needed so the video starts playing at the specified time.
[1] How to get the currently playing video and its current time is a separate question, and it depends on which language you use to handle the YouTube integration on your website (you mentioned in your question that you will use PHP or Python).
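The src rewrite described above can be sketched as a small helper. This is only an illustration: buildResumeSrc is a hypothetical name, not part of any YouTube API, and it assumes you have already obtained the video ID and playback time by some other means.

```javascript
// Hypothetical helper: builds an iframe src that resumes a playlist
// at a given video and time, using the parameters from the answer above.
function buildResumeSrc(videoId, playlistId, startSeconds) {
  // Fall back to 0 if the time is missing or not a finite number.
  const start = Number.isFinite(startSeconds)
    ? Math.max(0, Math.floor(startSeconds))
    : 0;
  const params = new URLSearchParams({
    list: playlistId,     // keeps the playlist context
    autoplay: "1",        // start playing right after the reload
    start: String(start), // whole seconds into the video
  });
  return `https://www.youtube.com/embed/${encodeURIComponent(videoId)}?${params}`;
}

// In the page, the result would be assigned to the existing iframe:
// document.getElementById("playlist").src =
//   buildResumeSrc("51iquRYKPbs", "PLmdXGEzOwfzb0fq7S2iP-t4i7x19R1ZN2", 80);
```

Note that assigning a new src still reloads the player, so there will be a short interruption before playback resumes.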

scraping video links from lazy loaded videos

I am trying to scrape a video from a page using a package called icrawler, but the video is not rendered immediately when the page loads. When I fetch the HTML of the page, the video tag doesn't exist, but it does if I open the page in the browser and inspect it.
How do I wait for the page to load the video before crawling it?
The page most likely loads the video using JavaScript, so you would need a library capable of rendering HTML and executing JavaScript.
I took a quick look at icrawler, and according to its docs it uses Cheerio, which, quoting from its docs, "does not produce a visual rendering, apply CSS, load external resources, or execute JavaScript".
The same docs mention that you could use something like PhantomJS (which seems to be abandoned) or jsdom. Another alternative is to use Selenium.

Passing parameters to swf without ExternalInterface

I'm using a simple audio player in Flex 3 that plays a given mp3 from a given url.
My swf is intended to be embedded in Facebook walls.
<meta property="og:video"
content="http://url/of/my/Player.swf?file=url_of_my_file" />
The player works fine, as I'm getting my parameters from the url:
http://url/of/my/Player.swf?file=url_of_my_file
using
var _queryStringFromUrl:String =
ExternalInterface.call("window.location.search.substring", 1);
My problem is that when the swf is embedded in the facebook wall, the ExternalInterface is disabled, so I can't get my url variables.
Is there an alternative for getting a parameter inside my swf player?
How does YouTube handle that in the following URL?
<meta property="og:video"
content="http://www.youtube.com/v/dQw4w9WgXcQ?version=3&autohide=1">
Please note that the swf is not embedded in an HTML page.
EDIT:
It seems that YouTube compiles a new swf for every video, with the parameters inside. Can someone confirm?
Thank you
Maybe you can try using flashvars for this? They are initialized and passed to the swf when it starts. Parse the URL parameters with JavaScript and add them to flashvars.
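A minimal sketch of the "parse the URL parameters with JavaScript" step, assuming the swf sits in an HTML page you control (which, as the question notes, is not the case on a Facebook wall). queryToFlashVars is a hypothetical helper name, and the allow-list of keys is an added precaution, not something from the answer.

```javascript
// Hypothetical helper: turns a page query string into a flashvars string,
// copying only the keys you explicitly allow.
function queryToFlashVars(search, allowedKeys) {
  const params = new URLSearchParams(search); // a leading "?" is ignored
  const pairs = [];
  for (const key of allowedKeys) {
    const value = params.get(key); // decoded value, or null if absent
    if (value !== null) {
      pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(value)}`);
    }
  }
  return pairs.join("&");
}

// In the embedding page this would typically be called as:
// const flashvars = queryToFlashVars(window.location.search, ["file"]);
// and the result placed in the <param name="flashvars"> / embed attribute.
```

On the Flex 3 side, values passed this way should be readable from Application.application.parameters instead of going through ExternalInterface; treat that as something to verify against your Flex version.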

youtube videos not playing when embedded with "normal" URL

I'm trying to embed a video on my page, depending on which one the user selects after being presented with a list.
On my page I have:
<div id="vidContent" style="text-align:left">
<object width="550px" height="350px" >
<asp:Literal ID="ltlVideo" runat="server"></asp:Literal>
</object>
</div>
And in the code behind I have:
Dim strVidPath As String = "http://www.youtube.com/v/" & strVidID
ltlVideo.Text = "<embed src='" & strVidPath & "' type='application/x-shockwave-flash' allowscriptaccess='always' allowfullscreen='true' height='350' width='470'></embed>"
phVideoBanner.Visible = True
..
which works OK... if you have the "strVidID".
It only seems to display and play if strVidPath = www.youtube.com/v/_O7iUiftbKU
but does not play if strVidPath = www.youtube.com/watch?v=_O7iUiftbKU, which seems to be the normal URL in the address bar when watching a YouTube video.
I want the user to be able to add a video to the page, and I was thinking it would be easier if they pasted in the URL of the video. But now it seems I'll have to get them to paste in the video ID instead, as it only seems to play when I use www.youtube.com/v/_O7iUiftbKU
Does anyone know why this is?
Rather than trying to parse a YouTube watch page URL and construct an embed code yourself, you can use the oEmbed service to do it for you.
If you need to get back legacy embed codes rather than iframe embed codes, you'd need to pass iframe=0 as one of the URL parameters to the oEmbed service, like: http://www.youtube.com/oembed?url=http%3A//www.youtube.com/watch%3Fv%3DbDOYN-6gdRE&format=json&iframe=0
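As a rough sketch, the oEmbed request URL from the example above could be built like this. buildOEmbedUrl and its legacyEmbed option are hypothetical names; the endpoint and the url, format, and iframe parameters are the ones shown in this answer.

```javascript
// Hypothetical helper: builds the oEmbed request URL for a watch-page URL.
function buildOEmbedUrl(watchUrl, { legacyEmbed = false } = {}) {
  const params = new URLSearchParams({ url: watchUrl, format: "json" });
  if (legacyEmbed) {
    params.set("iframe", "0"); // ask for the legacy embed code, per the answer
  }
  return `http://www.youtube.com/oembed?${params}`;
}

// Example:
// buildOEmbedUrl("http://www.youtube.com/watch?v=bDOYN-6gdRE", { legacyEmbed: true })
```

The JSON that comes back should carry the ready-made embed markup in its html field, which you could then write into the page instead of assembling the embed tag yourself.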
The URL structure with the word "watch" in it is, as you point out, YouTube's public-facing web page, which includes a lot more than the video: it includes all the other content you see on the page as well. In essence, it's a pointer that resolves to an HTML page, and you can't have an HTML page as the source of an embed element.
The URL structure that works is not a pointer to an HTML page but one that resolves directly to the player itself, and thus can serve as the source of an embed element.
Here's a link to a Stack Overflow question whose answer includes a C# code block that takes a regular YouTube URL (in any of its forms) as input, does a regex match, and returns just the YouTube ID. It should be pretty simple to modify for your needs, so your users can still paste in the whole video URL:
C# regex to get video id from youtube and vimeo by url
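For illustration, the same idea in JavaScript might look like the sketch below. It is not the linked C# answer; extractYouTubeId and toPlayerUrl are hypothetical names, only the URL forms mentioned on this page are handled, and the check that IDs are 11 characters from A-Z, a-z, 0-9, _ and - is an assumption based on the IDs seen here.

```javascript
// Hypothetical helper: pulls the video ID out of a pasted YouTube URL.
// Returns null when no ID can be found.
function extractYouTubeId(input) {
  const text = String(input).trim();
  let url;
  try {
    // Accept URLs pasted without a scheme, e.g. "www.youtube.com/watch?v=..."
    url = new URL(/^https?:\/\//i.test(text) ? text : `http://${text}`);
  } catch {
    return null;
  }
  const host = url.hostname.replace(/^www\./, "");
  let id = null;
  if (host === "youtu.be") {
    id = url.pathname.split("/")[1]; // youtu.be/<id>
  } else if (host === "youtube.com" || host === "m.youtube.com") {
    if (url.pathname === "/watch") {
      id = url.searchParams.get("v"); // /watch?v=<id>
    } else {
      const match = url.pathname.match(/^\/(?:v|embed)\/([^/]+)/); // /v/<id>, /embed/<id>
      id = match ? match[1] : null;
    }
  }
  // Assumption: IDs are 11 characters from [A-Za-z0-9_-].
  return id && /^[A-Za-z0-9_-]{11}$/.test(id) ? id : null;
}

// Builds the player URL form that the question found to work in <embed>.
function toPlayerUrl(videoId) {
  return `http://www.youtube.com/v/${videoId}`;
}
```

With something like this on the server or client, users can keep pasting the address-bar URL and the code behind only ever sees the ID.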

Stop iframe autoplaying with mp4 files

I want to stop a .mp4 file in an iframe from autoplaying.
The code is simple:
var siteVidUrl = document.createElement("iframe"); // create iframe
siteVidUrl.src = Obj.URL + "?autoplay=0"; // create src attribute for iframe
link_col.appendChild(siteVidUrl); // append to DOM
An example of Obj.URL would be
media.website.com/theVideo.mp4
The video continues to autoplay, despite my best attempts. As you can see, I am appending "?autoplay=0" to the end of the src attribute URL, but no luck there. I have read a bit about the HTML5 video tag, but no luck there either...
The majority of the videos are .mp4 files.
Can anyone point me in the right direction?
Is it necessary to use an iframe for this? If not, have a look at the HTML5 video tag: https://www.w3schools.com/htmL/html5_video.asp
If the iframe is needed (for whatever reason), you could set its src to a page that receives the URL of the video as a parameter and just produces HTML containing a simple video tag. That way you can control video playback and disable autoplay.
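A minimal sketch of the markup such a wrapper page (or the page itself, if you drop the iframe) could produce. buildVideoMarkup and escapeHtml are hypothetical helper names; the point is simply that the video tag is emitted without the autoplay attribute.

```javascript
// Escapes characters that are unsafe inside an HTML attribute value.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: returns a <video> tag for the given URL.
// There is no `autoplay` attribute, so playback starts only when the
// user presses play; `controls` shows the browser's player controls.
function buildVideoMarkup(videoUrl) {
  return `<video src="${escapeHtml(videoUrl)}" controls preload="metadata"></video>`;
}

// Example, using the Obj.URL value from the question:
// link_col.insertAdjacentHTML("beforeend", buildVideoMarkup(Obj.URL));
```

An iframe pointed straight at an .mp4 hands playback to the browser's built-in media viewer, which is probably why the "?autoplay=0" query string has no effect there.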
