How to webscrape Trustpilot reviews? - web-scraping

I want to scrape (with Puppeteer) all the reviews of a company on trustpilot.com, but I have a small problem: I can't get the content of the reviews!
Here is my code:
const puppeteer = require("puppeteer")
const getData = async () => {
const browser = await puppeteer.launch({
headless: true,
args: ['--no-sandbox'],
slowMo: 1000
})
const page = await browser.newPage()
await page.goto("https://www.trustpilot.fr/review/ovh.com", {waitUntil: 'networkidle2'})
await page.click("#onetrust-accept-btn-handler")
const result = await page.evaluate(() => {
window.scrollBy(0, window.innerHeight)
let score = parseFloat(document.querySelector("body > main > div > div.placeholder > div > div.company-profile-header > section.company-summary > div > div.right-section > div > div.trustscore_container > p").innerText.replace(',', '.'))
let reviewsElements = document.querySelectorAll("body > main > div > div.company-profile-body > section > div.review-list > div.review-card")
let reviews = []
reviewsElements.forEach(reviewElement => {
reviews.push({
title: reviewElement.querySelector("a.link").innerHTML,
content: reviewElement.querySelector("p").innerHTML,
user: reviewElement.querySelector("div.consumer-information__name").innerHTML.split("\n ")[1],
stars: parseInt(reviewElement.innerHTML.match(/(?<=\"stars\"\:)\d+/)[0]),
date: reviewElement.innerHTML.match(/(?<=datetime=")\S+(?=\")/)[0]
})
});
return {
score: score,
reviews: reviews
}
})
browser.close()
return result
}
getData().then(value => {
console.log(value)
})
The problem is on this line:
content: reviewElement.querySelector("p").innerHTML,
Stacktrace:
[2020-08-20 13:17:49]: ERROR Unhandled rejection: Error: Evaluation failed: TypeError: Cannot read property 'innerHTML' of null
at reviewsElements.forEach.reviewElement (<anonymous>:9:58)
at NodeList.forEach (<anonymous>)
at <anonymous>:6:25
Error: Evaluation failed: TypeError: Cannot read property 'innerHTML' of null
at reviewsElements.forEach.reviewElement (<anonymous>:9:58)
at NodeList.forEach (<anonymous>)
at <anonymous>:6:25
at ExecutionContext.evaluateHandle (/home/container/node_modules/puppeteer/lib/ExecutionContext.js:88:13)
at async ExecutionContext.evaluate (/home/container/node_modules/puppeteer/lib/ExecutionContext.js:46:20)
at async getData (/home/container/index.js:95:20)
The odd part is that the p element is there when I output reviewElement.innerHTML.
Thank you for your help!
Sincerely,
Arnaud L.

This happens because of a review from 30 juil. 2020 that has no content, so its card does not contain a <p> paragraph element.
If you run reviewsElements.forEach(reviewElement => { console.log(reviewElement.querySelector('p')); }); with headless mode turned off for a moment, you'll get the following output:
<p class="review-content__text">...</p>
<p class="review-content__text">...</p>
<p class="review-content__text">...</p>
<p class="review-content__text">...</p>
<p class="review-content__text">...</p>
<p class="review-content__text">...</p>
<p class="review-content__text">...</p>
<p class="review-content__text">...</p>
<p class="review-content__text">...</p>
<p class="review-content__text">...</p>
<p class="review-content__text">...</p>
<p class="review-content__text">...</p>
null
<p class="review-content__text">...</p>
<p class="review-content__text">...</p>
...
So you have to check for null before accessing the <p>'s inner HTML:
reviewsElements.forEach(reviewElement => {
reviews.push({
title: reviewElement.querySelector("a.link").innerHTML,
content: !!reviewElement.querySelector("p") ? reviewElement.querySelector("p").innerHTML : "",
user: reviewElement.querySelector("div.consumer-information__name").innerHTML.split("\n ")[1],
stars: parseInt(reviewElement.innerHTML.match(/(?<=\"stars\"\:)\d+/)[0]),
date: reviewElement.innerHTML.match(/(?<=datetime=")\S+(?=\")/)[0]
})
});
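The same null check can be factored into a small helper so that every field is read defensively. This is only a sketch: innerHtmlOrDefault is a hypothetical name, not part of Puppeteer or the DOM API, and the two stand-in objects below merely imitate review cards so the snippet runs outside a browser.

```javascript
// Hypothetical helper (not part of the original answer): read a child's
// innerHTML, falling back to a default when the selector matches nothing.
function innerHtmlOrDefault(parent, selector, fallback = "") {
  const child = parent.querySelector(selector);
  return child ? child.innerHTML.trim() : fallback;
}

// Stand-ins for two review cards: one with a body paragraph, one without.
const withText = {
  querySelector: (s) => (s === "p" ? { innerHTML: " Great host " } : null),
};
const withoutText = { querySelector: () => null };

console.log(innerHtmlOrDefault(withText, "p"));    // "Great host"
console.log(innerHtmlOrDefault(withoutText, "p")); // ""
```

In the scraper itself the helper would have to be declared inside the page.evaluate callback, because that function body runs in the browser context and cannot see variables defined in the Node script.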

Related

how can i order by snapshot's child data containing a timestamp index with vue js

I saved the timestamp as a data field in the Firebase DB, so I can now retrieve it with {{ post.timestamp }} to display it. How do I go from here to ordering the posts by timestamp, regardless of the order of the user objects? At the moment the UI shows the posts ordered by user and not by time.
data on firebase looks like this:
Code looks like this :
<template>
<div>
<div
v-for="post in allPosts.slice().reverse()"
:key="post._key">
<v-card class=" feed-card my-3">
<v-row no-gutters>
<v-col cols="1">
<v-img
class="align-center rounded-xl "
width="30"
:src="post.photoURL">
</v-img>
</v-col>
<v-col cols="10">
<p class="">{{post.postText}}</p>
<p class="blue-grey--text ">{{post.displayName}}</p>
<p class="mb-n1 mt-n5 d-flex flex-row-reverse"> {{post.date}} {{post.time}}</p>
</v-col>
</v-row>
</v-card>
</div>
</div>
</template>
<script>
import firebase from '@/plugins/firebase'
let db = firebase.database();
//let usersRef = db.ref('users');
let postRef = db.ref('posts');
export default {
name: 'FintechSocialFeed',
data: () => ({
authUser: null,
allPosts: [] // initialise an array
}),
methods: {
},
created: function() {
data => console.log(data.user, data.credential.accessToken)
firebase.auth().onAuthStateChanged(user => {
if (user) {
postRef.on('value', snapshot => {
const val = snapshot.val()
if (val) {
this.allPosts = Object.values(val).flatMap(posts =>
Object.entries(posts).map(([ _key, post ]) => ({ _key, ...post})))
}
console.log(snapshot.val())
});
}
})
}
}
</script>
here is the UI showing the latest post at the bottom because it is sorting by the user and not date:
I don't use Firebase, but it looks like the database reference provides orderByChild for ordering by a child field (orderByKey takes no argument), so...
let postRef = db.ref('posts').orderByChild('timestamp');
An alternative would be sorting yourself, after retrieval...
this.allPosts = Object.values(val).flatMap(posts =>
Object.entries(posts).map(([ _key, post ]) => ({ _key, ...post}))
).sort((a, b) => a.timestamp.toMillis() - b.timestamp.toMillis());
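The client-side sort can be tried in isolation. The sketch below assumes each post stores timestamp as a plain number (milliseconds since the epoch), which the question does not confirm; if the field holds a different type, the comparison needs adjusting. flattenAndSortPosts and the sample data are made up for illustration.

```javascript
// Flatten { userId: { postId: post } } into one array and sort newest first.
// Assumption: `timestamp` is a number (milliseconds since the epoch).
function flattenAndSortPosts(val) {
  return Object.values(val || {})
    .flatMap((posts) =>
      Object.entries(posts).map(([_key, post]) => ({ _key, ...post }))
    )
    .sort((a, b) => b.timestamp - a.timestamp);
}

// Made-up sample data shaped like the nested posts in the question.
const sample = {
  userA: {
    p1: { postText: "first", timestamp: 1000 },
    p3: { postText: "third", timestamp: 3000 },
  },
  userB: { p2: { postText: "second", timestamp: 2000 } },
};

console.log(flattenAndSortPosts(sample).map((p) => p.postText));
// [ 'third', 'second', 'first' ]
```

Because this sorts newest first, the .slice().reverse() in the template would no longer be needed with this variant.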

Fetch and display lists of an user

I have a profile page that displays the user info. The page shows the user name / email and a button to create a list.
I can also edit the name and email correctly, and it reflects in the firebase instantaneously. Ok. I get the user data and I can edit it.
What I'm trying to do now is to show the lists that the user has created.
For example, this user has created one list, yet the page tells me that he has no lists.
I'll try to shorten the code as much as possible:
<script>
imports.....
import { db } from '../../firebase.config.js'
let listings = []
let auth = getAuth()
// fetch the user's listings
const fetchUserListings = async () => {
const listingsRef = collection(db, 'listings')
const q = query(
listingsRef,
where('userRef', '==', auth.currentUser.uid),
orderBy('timestamp', 'desc')
)
const querySnap = await getDocs(q)
querySnap.forEach((doc) => {
return listings.push({
id: doc.id,
data: doc.data()
})
})
}
fetchUserListings()
</script>
<!-- display the user's listings -->
<div>
{#if listings.length > 0}
<p class="listingText">My lists</p>
{#each listings as listing}
<ListingItem listing={listing.data} id={listing.id} />
{/each}
{:else}
<p class="noListings">You have no lists</p>
{/if}
</div>
My ListingItem component:
<script>
export let listing
export let id
export let handleDelete
import DeleteIcon from '../../static/assets/svg/deleteIcon.svg'
</script>
<li class="categoryListing">
<a href={`/category/${listing.type}/${id}`} class="categoryListingLink">
<img src={listing.imgUrls[0]} alt={listing.name} class="categoryListingImg" />
<div class="categoryListingDetails">
<p class="categoryListingLocation">
{listing.location}
</p>
<p class="CategoryListingName">
{listing.name}
</p>
<p class="categoryListingPrice">
${listing.offer ? listing.discountedPrice : listing.regularPrice}
{listing.type === 'rent' ? '/ por mês' : ''}
</p>
<div class="categoryListingInfoDiv">
<img src="/assets/svg/bedIcon.svg" alt="cama" />
<p class="categoryListingInfoText">
{listing.bedrooms > 1 ? `${listing.bedrooms} camas` : `${listing.bedrooms} cama`}
</p>
<img src="/assets/svg/bathtubIcon.svg" alt="banheiro" />
<p class="categoryListingInfoText">
{listing.bathrooms > 1
? `${listing.bathrooms} banheiros`
: `${listing.bathrooms} banheiro`}
</p>
</div>
</div>
</a>
{#if handleDelete}
<DeleteIcon
class="removeIcon"
fill="rgb(231, 76, 60)"
onClick={() => {
handleDelete(listing.id, listing.name)
}}
/>
{/if}
</li>
Just when you think you've reached the simplest part, it's still tough.
Update:
I think that the problem is in firebase. The "docs" are empty:
Now I am in serious trouble!
querySnap.forEach((doc) => {
return listings.push({
id: doc.id,
data: doc.data()
})
})
I see two things here. The less important one: the .forEach() method returns undefined, so the return is redundant. The more important one: .push() alone won't trigger an update, because Svelte's reactivity is triggered by assignments. Have a look at this section in the docs.
Did you try logging listings? I assume the data is there but just isn't displayed, so I propose changing this part to
querySnap.forEach((doc) => {
listings = [...listings, {
id: doc.id,
data: doc.data()
}]
})
or
querySnap.forEach((doc) => {
listings.push({
id: doc.id,
data: doc.data()
})
listings = listings
})
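A third variant builds the whole array first and assigns it once, which avoids reassigning inside the loop. This is a sketch: toListings is a hypothetical helper, and fakeDocs stands in for querySnap.docs so that the snippet runs without Firebase.

```javascript
// Hypothetical helper: convert Firestore document snapshots to plain objects.
function toListings(docs) {
  return docs.map((doc) => ({ id: doc.id, data: doc.data() }));
}

// Stand-in for querySnap.docs, so the sketch runs without Firebase.
const fakeDocs = [
  { id: "a1", data: () => ({ name: "Flat" }) },
  { id: "b2", data: () => ({ name: "House" }) },
];

console.log(toListings(fakeDocs));
```

In the component this would become listings = toListings(querySnap.docs), a single assignment that Svelte can react to.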

vue js realtime chat app without refreshing / firebase

I'm creating a chat app.
When a user enters a message and presses the send button, the app works fine: the information goes to the database and I read the data back.
When the user refreshes the page there is no problem either: in the mounted() hook I fetch the data from the database (Firebase) and show it in the app. If another user joins the chat, there is also no problem; all messages appear.
The problem is this: when a new message arrives, the other user cannot see it without refreshing the page or pressing the send button. Only when the other user sends a message do they see all the messages.
I illustrate the problem with a GIF. I would be glad if you could help me.
<template>
<div>
<div class="container">
<div class="row">
<div v-if="isLogin" class="offset-3 col-md-6 msg-area">
<header>
<h1>Group Chat</h1>
<p class="sm">Welcome, {{this.username }} </p>
</header>
<div class="msg">
<p class="mssgs" v-for="(message,index) in messages" :key="index">{{ message.msg }} <br> <span> {{ message.name }} - {{ message.time }} </span> </p>
</div>
<div class="sendMsg">
<form @submit.prevent="sendFunc">
<div class="form-group d-flex">
<input type="text" class="form-control" placeholder="Enter message.." v-model="msgInput">
<button class="btn">Send </button>
</div>
</form>
</div>
</div>
<div class="offset-3 col-md-6 login" v-else>
<form @submit.prevent="joinFunc">
<div class="form-group d-flex">
<input type="text" class="form-control" placeholder="Enter username.." v-model="username">
<button class="btn">Join to Group </button>
</div>
</form>
</div>
</div>
</div>
</div>
</template>
<script>
import firebase from "./firebase";
import 'firebase/firestore';
export default {
data() {
return {
db : firebase.firestore(),
isLogin: false,
username: '',
messages : [
],
msgInput: '',
}
},
methods: {
joinFunc() {
this.isLogin = true;
},
sendFunc() {
let date = new Date();
let hour = date.getHours();
let minute = date.getMinutes();
let nowTime = hour + ':' + minute;
this.db.collection("messages")
.add({ message: this.msgInput, name: this.username, time: nowTime, date: date })
.then(() => {
this.db.collection("messages").orderBy("date", "asc")
.get()
.then((querySnapshot) => {
querySnapshot.forEach((doc) => {
this.messages.push({
name: doc.data().name,
msg: doc.data().message,
time: doc.data().time
});
});
})
})
.catch((error) => {
console.error("Error writing document: ", error);
});
},
},
mounted: function() {
this.db.collection("messages").orderBy("date", "asc")
.get()
.then((querySnapshot) => {
querySnapshot.forEach((doc) => {
this.messages.push({
name: doc.data().name,
msg: doc.data().message,
time: doc.data().time
});
});
})
}
}
</script>
You're using get() to read the data from Firestore. As the documentation I linked explains, that reads the value from the database once, and does nothing more.
If you want to continue listening for updates to the database, you'll want to use a realtime listener. By using onSnapshot() your code will get called with a querySnapshot of the current state of the database right away, and will then also be called whenever the database changes. This is the perfect way to then update your UI.
So instead of
...
.get()
.then((querySnapshot) => {
querySnapshot.forEach((doc) => {
Do the following:
...
.onSnapshot((querySnapshot) => {
querySnapshot.forEach((doc) => {
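One thing to watch with a realtime listener: each call receives the full current result of the query, so pushing into this.messages on every call would append the same messages again. A sketch of one way around that is to rebuild the array from each snapshot. snapshotToMessages is a hypothetical helper, and fakeSnapshot only imitates the forEach method of a query snapshot so that the snippet runs without Firebase.

```javascript
// Hypothetical helper: build a fresh messages array from a query snapshot.
function snapshotToMessages(querySnapshot) {
  const messages = [];
  querySnapshot.forEach((doc) => {
    const data = doc.data();
    messages.push({ name: data.name, msg: data.message, time: data.time });
  });
  return messages;
}

// Stand-in that imitates the forEach method of a query snapshot.
const fakeDocs = [
  { data: () => ({ name: "Ann", message: "hi", time: "10:05" }) },
  { data: () => ({ name: "Bob", message: "hello", time: "10:06" }) },
];
const fakeSnapshot = { forEach: (callback) => fakeDocs.forEach(callback) };

console.log(snapshotToMessages(fakeSnapshot));
```

Inside the listener this would read this.messages = snapshotToMessages(querySnapshot). With the listener in place, the extra get() after add() in sendFunc should no longer be necessary.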

Cypress problems take value and compare it. Scope variable

I have this HTML structure:
<tr id="post-7053" class="iedit author-other level-0 post-7053 type-poi status-publish hentry webmapp_category-corbezzolo" data-id="7053">
<th scope="row" class="check-column">
<label class="screen-reader-text" for="cb-select-7053">
Seleziona 594 </label>
<input id="cb-select-7053" type="checkbox" name="post[]" value="7053">
<div class="locked-indicator">
<span class="locked-indicator-icon" aria-hidden="true"></span>
<span class="screen-reader-text">
“594” è bloccato </span>
</div>
</th>
<td class="5da0bb937bd9f column-5da0bb937bd9f has-row-actions column-primary column-postid" data-colname="ID">7053
I have to take the value of an ID and compare it on another site.
I need the first ID in the table, and I managed to get it with this Cypress command:
id = cy.get('tbody#the-list td').first().invoke('val')
However, when I compare against the variable id, the test never enters the if branch. If I put in a literal value such as 7156 instead, it enters the if branch and makes the comparison.
Below is the test code:
describe('Registration', () => {
const email = 'nedo@go.br'
const password = 'pedhu'
var id
it('create new Nedo', () => {
cy.visit('https://test.nedo/wp-admin')
cy.get('input[name=log]').type(email)
cy.get('input[name=pwd]').type(password)
cy.get('input#wp-submit').click()
cy.visit('https://test.nedo/edit.php?post_type=nedo')
id = cy.get('tbody#the-list td').first().invoke('val')
})
it('id', () => {
cy.visit('https://nedostaging.z.hu/login')
cy.get('input[name=email]').type('team@nedo.hi')
cy.get('input[name=password]').type('nedo')
cy.get('button').contains('Login').click()
cy.get('#hometable > tbody > tr > td:nth-child(4)').each(($e, index, $list) => {
const text = $e.text()
cy.log(id)
if (text.includes(id)) {//if I put a number instead of id it works
assert.strictEqual(text, '{"id":'+id+'}', 'id nedo ok')
}
})
})
cy.log(id):
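Cypress commands are asynchronous and do not return the value they yield, so id = cy.get(...) stores a chainable object rather than the ID. Store the value as an alias instead and read it back inside the test. To work around the same-origin policy when visiting two different sites, you can set "chromeWebSecurity": false in your cypress.json file, but this only works in the Chrome browser.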
describe('Registration', () => {
const email = 'nedo@go.br'
const password = 'pedhu'
before(() => {
cy.visit('https://test.nedo/wp-admin')
cy.get('input[name=log]').type(email)
cy.get('input[name=pwd]').type(password)
cy.get('input#wp-submit').click()
cy.visit('https://test.nedo/edit.php?post_type=nedo')
cy.get('tbody#the-list td').first().invoke('val').as('id')
})
it('id', () => {
cy.visit('https://nedostaging.z.hu/login')
cy.get('input[name=email]').type('team@nedo.hi')
cy.get('input[name=password]').type('nedo')
cy.get('button').contains('Login').click()
cy.get('@id').then((id) => {
cy.get('#hometable > tbody > tr > td:nth-child(4)').each(($e, index, $list) => {
const text = $e.text()
cy.log(id)
if (text.includes(id)) { //if I put a number instead of id it works
assert.strictEqual(text, '{"id":' + id + '}', 'id nedo ok')
}
})
})
})
})

How to use search in react js and get the result in particular div?

While searching, the results should appear in a div like the one below:
I use jQuery to search in a table. How can I get a result like the one above?
My component code is:
<div id="modaldash" style={{ display: (searching ? 'block' : 'none') }}>
<p className="font-weight-medium" id="name"> <img id="logo" className="logo" src={jessica} alt="pam-logo" /> Jessica James </p>
<button id="Addlist" onClick={this.onSubmitdata} className="btn info">{this.state.shown ? "Added" : "Add to list"}</button>
<p id="mailid">jessicajames@gmail.com </p>
<p id= "address">Mountain view,Ave</p>
</div>
It's just static content for the CSS. How can I use search and get results like the above?
export default function App() {
// considering the data object to search on name
const [searchedData, setSearchedData] = useState([]);
const users = [
{
name: "abc1",
emailid: "abc1@gmail.com",
address: "23rd main, 2nd street"
},
{
name: "adb2",
emailid: "abc2@gmail.com",
address: "23rd main, 2nd street"
},
{
name: "adc3",
emailid: "abc3@gmail.com",
address: "23rd main, 2nd street"
}
];
const handleSearch = event => {
const data = users.filter(
user => user.name.indexOf(event.target.value) !== -1
);
setSearchedData(data);
};
const showSearchedData = () => {
return searchedData.map(user => (
<div key={user.emailid}>
<p className="font-weight-medium" id="name">
{" "}
<img id="logo" className="logo" src="" alt="pam-logo" />
{user.name}
</p>
<button id="Addlist"> Added/ add to list</button>
<p id="mailid">{user.emailid} </p>
<p id="address">{user.address}</p>
</div>
));
};
return (
<div className="App">
<input type="text" onChange={handleSearch} />
<div id="modaldash">{searchedData.length > 0 && showSearchedData()}</div>
</div>
);
}
You can add CSS to make a look and feel like shown in image attached.
Check the working example here https://codesandbox.io/s/falling-sun-r3rim
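The filter in handleSearch is case-sensitive and, for an empty input, matches every user. If that is not wanted, the matching logic can be pulled out into a plain function, sketched below. filterUsersByName is a hypothetical name, and returning no results for an empty query is an assumption about the desired behaviour.

```javascript
// Hypothetical helper: case-insensitive substring match on user.name.
// Assumption: an empty or whitespace-only query should show no results.
function filterUsersByName(users, query) {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [];
  return users.filter((user) => user.name.toLowerCase().includes(needle));
}

// Made-up sample data using the same names as the answer above.
const sampleUsers = [{ name: "abc1" }, { name: "adb2" }, { name: "adc3" }];

console.log(filterUsersByName(sampleUsers, "AD").map((u) => u.name));
// [ 'adb2', 'adc3' ]
```

Inside the component, handleSearch would then call setSearchedData(filterUsersByName(users, event.target.value)).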
