cheerioJS not grabbing element from page - web-scraping

I'm having trouble grabbing an element with Cheerio.
I opened the following website -> https://www.voobly.com/ladder/view/Age-of-Mythology-The-Titans/1v1-Supremacy
When I view the page source I can see an element with the id pagebrowser1, but when I try to select it with Cheerio, html() returns null.
const axios = require('axios');
const cheerio = require('cheerio');
axios.get('https://www.voobly.com/ladder/view/Age-of-Mythology-The-Titans/1v1-Supremacy').then((res) => {
const $ = cheerio.load(res.data)
console.log($("#pagebrowser1").html());
});

Related

How to remove data from the item page when moving to another page using nuxt 3?

I just started learning Nuxt 3. In my project I get a list of movies from an API:
<script setup>
const config = useAppConfig();
let page = ref(1);
let year = 2022;
let url = computed(() => `https://api.themoviedb.org/3/discover/movie?api_key=${config.apiKey}&sort_by=popularity.desc&page=${page.value}&year=${year}`);
const { data: list } = await useAsyncData("list", () => $fetch(url.value));
const next = () => {
page.value++;
refreshNuxtData("list");
};
const prev = () => {
if (page.value > 1) {
page.value--;
refreshNuxtData("list");
}
};
</script>
Then I have a page for each movie where I get information about it:
<script setup>
const config = useAppConfig();
const route = useRoute();
const movieId = route.params.id;
const url = `https://api.themoviedb.org/3/movie/${movieId}?api_key=${config.apiKey}`;
const { data: movie } = await useAsyncData("movie", () => $fetch(url));
refreshNuxtData("movie");
</script>
My problem is that when I open a new movie page, I first see information about the previous movie, and only after a second does it change. How can I fix this?
I'm also not sure whether I'm using refreshNuxtData() correctly. If not, could you show me a correct example of working with an API in Nuxt 3?
The OP fixed the issue by passing a dynamic key:
const { data: movie } = await useFetch(url, { key: movieId })
Because movieId is dynamic, each movie gets its own key, so calls are de-duplicated per movie instead of all sharing one entry, as explained in the documentation for the key parameter: https://v3.nuxtjs.org/api/composables/use-async-data/#params
key: a unique key to ensure that data fetching can be properly de-duplicated across requests. If you do not provide a key, then a key that is unique to the file name and line number of the instance of useAsyncData will be generated for you.
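To illustrate why a static key such as "movie" can briefly show the previous movie, here is a minimal sketch of a keyed cache in plain JavaScript. It is only a simplified model, not Nuxt's actual implementation; `loadWithKey` and the sample values are hypothetical names invented for this illustration.

```javascript
// Simplified model of a keyed async-data cache (NOT Nuxt's real implementation).
const cache = new Map();

// Returns what a page would show immediately, then stores the fresh result.
function loadWithKey(key, freshValue) {
  // If an entry already exists under this key, it is shown first (possibly stale).
  const shownFirst = cache.has(key) ? cache.get(key) : null;
  // The simulated fetch then resolves and overwrites the entry.
  cache.set(key, freshValue);
  return { shownFirst, shownAfterFetch: freshValue };
}

// Static key: every movie page shares the single "movie" entry.
loadWithKey("movie", "Movie 1");
const staticKey = loadWithKey("movie", "Movie 2");

// Dynamic key: each movie id gets its own entry.
loadWithKey("movie-1", "Movie 1");
const dynamicKey = loadWithKey("movie-2", "Movie 2");

console.log(staticKey.shownFirst);  // "Movie 1" (stale data from the previous page)
console.log(dynamicKey.shownFirst); // null (nothing stale to show)
```

With one key per movie id there is no earlier entry to display, which matches the behaviour the OP saw after switching to a dynamic key.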

Vue 3 - onMounted doesn't load array

In Vue 3 I need to fill some arrays with the result from a store. I import the store like this:
Imports
import { onMounted, ref, watch } from "vue";
import { useTableStore } from "../stores/table";
Then I declare the values and try to fill them:
const search = ref(null);
const searchInput = ref("");
const edition = ref([]);
const compilation = ref([]);
const debug = ref([]);
const navigation = ref([]);
const refactoring = ref([]);
const store = useTableStore();
onMounted(() => {
store.fetchTable();
edition.value = store.getEdition;
compilation.value = store.getCompilation;
debug.value = store.getDebug;
navigation.value = store.getNavigation;
refactoring.value = store.getRefactoring;
});
The values are not filled. Strangely, if I use a watcher like this:
edition.value = store.getEdition.filter((edition: String) => {
for (let key in edition) {
if (
edition[key].toLowerCase().includes(searchInput.value.toLowerCase())
) {
return true;
}
}
});
the array gets its values.
So the question is: how can I get the store values when the view loads?
Maybe the problem is that the store returns a Proxy object...
UPDATE 1
I created a gist with full code
https://gist.github.com/ElHombreSinNombre/4796da5bcdcf6bf4f36f009132dd9f48
UPDATE 2
Pinia loads the array data, but setup can't get it.
UPDATE 3: SOLUTION
I finally resolved the problem and uploaded the code to my GitHub. I used computed to keep the data updated. There may be a better solution.
https://github.com/ElHombreSinNombre/vue-shortcuts
Your onMounted callback needs to be async, and you need to await the fetchTable call. Edit: Try using reactive instead of ref for your arrays. A rule of thumb is ref for primitive values and reactive for objects and arrays.
const search = ref(null);
const searchInput = ref("");
const edition = reactive([]);
const compilation = reactive([]);
const debug = reactive([]);
const navigation = reactive([]);
const refactoring = reactive([]);
const store = useTableStore();
onMounted(async () => {
await store.fetchTable();
edition.push(...store.getEdition);
compilation.push(...store.getCompilation);
debug.push(...store.getDebug);
navigation.push(...store.getNavigation);
refactoring.push(...store.getRefactoring);
});
If what you need is the component to not be rendered until data is ready, you'll need a flag in your data that works along with a v-if to render the component when everything is ready, something like this:
// in your template
<div v-if="dataReady">
// your html code
</div>
// inside your script
const dataReady = ref(false)
onMounted(async () => {
await store.fetchTable();
dataReady.value = true;
});
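To show why the await matters, here is a small sketch in plain JavaScript with no Vue or Pinia involved. `createFakeStore` and its `rows` field are hypothetical stand-ins for the real store; the only assumption is that fetchTable() fills the store asynchronously.

```javascript
// Hypothetical stand-in for the Pinia store: fetchTable() fills `rows` asynchronously.
function createFakeStore() {
  return {
    rows: [],
    fetchTable() {
      return new Promise((resolve) => {
        // Simulate a network delay before the data arrives.
        setTimeout(() => {
          this.rows = ["Ctrl+C", "Ctrl+V"];
          resolve();
        }, 10);
      });
    },
  };
}

async function compare() {
  const a = createFakeStore();
  a.fetchTable();              // not awaited: the data has not arrived yet
  const withoutAwait = a.rows; // still the initial empty array

  const b = createFakeStore();
  await b.fetchTable();        // wait until the data has arrived
  const withAwait = b.rows;

  return { withoutAwait, withAwait };
}

const result = compare();
result.then(({ withoutAwait, withAwait }) => {
  console.log(withoutAwait.length); // 0
  console.log(withAwait.length);    // 2
});
```

Reading the store right after calling fetchTable() without await captures the empty initial state, which is what happened in the original onMounted.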

How to export data from a Puppeteer script?

I have this script that collects titles from a news website.
The result of the scraping is pushed into the initially empty array x.
const puppeteer = require('puppeteer');
export let x = []
async function scrapeNewsTitles(url){
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto(url);
const [el] = await page.$x('/html/body/main/div[1]/div[1]/article/figure/a/img');
const src = await el.getProperty('src');
const srcTxt = await src.jsonValue();
console.log(srcTxt);
const [el2] = await page.$x('/html/body/main/div[1]/div[1]/article/div/h1/a');
const txt = await el2.getProperty('textContent');
const rawTxt = await txt.jsonValue();
const newArticle = {srcTxt, rawTxt};
x.push(newArticle);
browser.close();
console.log(x)
}
scrapeNewsTitles('https://www.lmneuquen.com');
What I want now is to export the x array, which contains the collected data, so I can use it in another script. The problem is that if I do this...
export let x = []
and then I import it into another file like this...
import {x} from './file.js'
...it gives me the following error:
SyntaxError: Cannot use import statement outside a module
Would you point me in the right direction to do it?
Thank you in advance! Have a nice day.
Instead of using "export let x = []", use "module.exports".
So, change your code to:
let x = []
and at the end of the code, write
module.exports = {"x": x};
When you import this array in the new file, destructure it from the exported object:
const { x } = require("./index.js"); // Replace index.js with the name of your first file.
console.log(x);
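The reason is that "export" and "import ... from" are ES module syntax, while Node.js treats .js files as CommonJS by default (unless package.json sets "type": "module" or the file uses the .mjs extension). Your script already uses require, so it is a CommonJS module and should export with module.exports.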
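One thing to watch for: x is only filled after the scraping finishes, so a file that requires it immediately may still see an empty array. A possible alternative, sketched below, is to export the async function and let the importing file wait for its result. The Puppeteer work is replaced by a hypothetical stub (`fakeScrape`) so the sketch runs on its own, and the file name scraper.js is an assumption.

```javascript
// scraper.js (CommonJS). The Puppeteer work is replaced by a hypothetical stub.
async function fakeScrape(url) {
  // Placeholder for the page.goto() / page.$x() logic from the question.
  return { srcTxt: `${url}/image.jpg`, rawTxt: "Example headline" };
}

async function scrapeNewsTitles(url) {
  const articles = [];
  articles.push(await fakeScrape(url));
  // Resolve with the data instead of mutating a shared exported array.
  return articles;
}

module.exports = { scrapeNewsTitles };

// In another file (hypothetical usage):
//   const { scrapeNewsTitles } = require("./scraper.js");
//   scrapeNewsTitles("https://www.lmneuquen.com").then((x) => console.log(x));
```

This way the importing file only receives the array once it has actually been populated.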

How can I get my http URL from the PDF I've uploaded in Firebase?

Sample image
I'm creating a project that needs an HTTP URL, but Firebase gave me a gs:// URL. How can I get the HTTP URL for my uploaded PDF files?
In that image, the name of the file on the right-hand side is shown as a hyperlink.
As long as a valid access token exists (see the bottom of that same side panel), you can access the file. Just make sure the URL ends with the access token.
Example: ?alt=media&token=53063556-5482-4c09-bd6f-732533b3bfdb
If you want to open the link manually, just click the document name (Ang-katipunan.pdf) in the right sidebar.
If you want to store that link automatically, in a database document for example,
use getDownloadURL and save the result as a field in the Firestore document (if you are using Firestore).
Here's an example that stores the URL of a photo in a field of a Firestore document.
const imageUpload = async () => {
const uri = img;
const childPath = `imgs/${Math.random().toString(36)}`; // The random suffix only serves to generate a unique path
const resp = await fetch(uri);
const blob = await resp.blob();
const task = firebase
.storage()
.ref()
.child(childPath)
.put(blob);
const onTaskCompleted = () => {
task.snapshot.ref.getDownloadURL().then((downloadURL) => { // this is the important part
saveImage(downloadURL);
});
};
const onTaskError = (error) => {
console.log(error);
};
// Argument order is (event, next, error, complete); no progress handler is needed here.
task.on("state_changed", null, onTaskError, onTaskCompleted);
};
then
const saveImage = (downloadURL) => {
firebase.firestore()
.collection('allImages')
.add({
downloadURL, //this is the link
name,
})
}
Adjust the collection and field names to match your own documents.
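For reference, the HTTP link shown in the console follows a recognisable pattern, so the relationship between the gs:// URL and the HTTP URL can be sketched as below. This is only an illustration under assumptions: `gsToHttpUrl` is a hypothetical helper, the bucket name and path are made up, and the token still has to come from the console or from getDownloadURL, which remains the more reliable route.

```javascript
// Builds a Firebase Storage HTTP download URL from a gs:// URL and an access token.
// Hypothetical helper for illustration; prefer getDownloadURL() in real code.
function gsToHttpUrl(gsUrl, token) {
  const match = /^gs:\/\/([^/]+)\/(.+)$/.exec(gsUrl);
  if (!match) throw new Error(`Not a gs:// URL: ${gsUrl}`);
  const [, bucket, objectPath] = match;
  // The object path is URL-encoded as a whole, so "/" becomes "%2F".
  const encodedPath = encodeURIComponent(objectPath);
  return `https://firebasestorage.googleapis.com/v0/b/${bucket}/o/${encodedPath}?alt=media&token=${token}`;
}

// Hypothetical bucket and path; the token is the example value quoted above.
const httpUrl = gsToHttpUrl(
  "gs://my-project.appspot.com/pdfs/Ang-katipunan.pdf",
  "53063556-5482-4c09-bd6f-732533b3bfdb"
);
console.log(httpUrl);
```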

Problem scraping a url that has src = embed # async_embed using [cheerio]

Hello devs,
I am trying to scrape some COVID-19 data for my country from the following website
const url = https://e.infogram.com/dab81851-e3af-4767-b1f5-9b54eb900274?parent_url=https%3A%2F%2Festadisticas.pr%2Fen%2Fcovid-19&src=embed#async_embed
using the cheerio library, but apparently I cannot access the data.
If there is a way to access the data, I would appreciate the help.
index.js
const cheerio = require('cheerio');
const axios = require('axios').default;
const main = async() =>{
const url = 'https://e.infogram.com/dab81851-e3af-4767-b1f5-9b54eb900274?parent_url=https%3A%2F%2Festadisticas.pr%2Fen%2Fcovid-19&src=embed#async_embed'
const {data} = await axios.get(url, {method: 'GET'});
const $ = cheerio.load(data);
console.log($.html())
}
main();
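That data is embedded in the page as a JSON blob assigned to window.infographicData, so you can extract it from the raw response with a regular expression: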
let match = data.match(/window.infographicData=(\{.*?\});/)
let parsed = JSON.parse(match[1])
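A slightly more defensive version of the same idea might look like the sketch below. `extractInfographicData` is a hypothetical helper name, and the sample HTML is invented so that the snippet runs without a network request; the real page's payload is far larger.

```javascript
// Extracts the JSON assigned to window.infographicData from an HTML string.
// Returns null when the assignment is not found.
function extractInfographicData(html) {
  const match = html.match(/window\.infographicData=(\{.*?\});/);
  if (!match) return null;
  return JSON.parse(match[1]);
}

// Hypothetical sample standing in for the response body from axios.
const sampleHtml =
  '<script>window.infographicData={"title":"covid","cases":[1,2,3]};</script>';
const parsed = extractInfographicData(sampleHtml);
console.log(parsed.cases.length); // 3
```

Note that the lazy match stops at the first "};" it finds, so it would cut the blob short if that sequence ever appeared inside the JSON itself.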
