My Scrapy Shell Commands Work but Output is Empty - web-scraping

I tested the following commands in the Scrapy shell and they work fine:
fetch('https://www.livescores.com/?tz=3')
response.css('div.dh')
gununMaclari = response.css('div.dh')
gununMaclari.css('span.hh span.ih span.kh::text').get()
gununMaclari.css('span.hh span.jh span.kh::text').get()
These commands show me the home and away teams. If I use getall(), I get all the data for both home and away.
But when I run the code below, the output is empty. What is the problem? I could not solve it. Could someone help me find it? Thanks.
import scrapy
from scrapy.crawler import CrawlerRunner
class LivescoresTodayList(scrapy.Spider):
    name = 'todayMatcheslist'
    custom_settings = {'CONCURRENT_REQUESTS': '1'}
    def start_requests(self):
        yield scrapy.Request('https://www.livescores.com/?tz=3')
    def parse(self, response):
        for gununMaclari in response.css('div.dh'):
            yield {
                'Home': gununMaclari.css('span.hh span.ih span.kh::text').get(),
                'Away': gununMaclari.css('span.hh span.jh span.kh::text').get()
            }
runnerTodayList = CrawlerRunner(settings={
    "FEEDS": {
        "todayMatcheslist.json": {"format": "json", "overwrite": True},
    },
})
runnerTodayList.crawl(LivescoresTodayList)

Read this.
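The spider itself is fine. If you're using CrawlerRunner, you need to configure logging and settings yourself and start the Twisted reactor; otherwise the crawl is only scheduled and never runs. CrawlerProcess does this for you.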
Example with CrawlerProcess:
import scrapy
from scrapy.crawler import CrawlerProcess
class LivescoresTodayList(scrapy.Spider):
    name = 'todayMatcheslist'
    custom_settings = {'CONCURRENT_REQUESTS': '1'}
    def start_requests(self):
        yield scrapy.Request('https://www.livescores.com/?tz=3')
    def parse(self, response):
        for gununMaclari in response.css('div.dh'):
            yield {
                'Home': gununMaclari.css('span.hh span.ih span.kh::text').get(),
                'Away': gununMaclari.css('span.hh span.jh span.kh::text').get()
            }
process = CrawlerProcess(settings={
    "FEEDS": {
        "todayMatcheslist.json": {"format": "json", "overwrite": True},
    },
})
process.crawl(LivescoresTodayList)
# start() runs the reactor and blocks until the crawl has finished
process.start()
Example with CrawlerRunner:
import scrapy
from scrapy.crawler import CrawlerRunner
from scrapy.utils.log import configure_logging
from twisted.internet import reactor
class LivescoresTodayList(scrapy.Spider):
    name = 'todayMatcheslist'
    custom_settings = {'CONCURRENT_REQUESTS': '1'}
    def start_requests(self):
        yield scrapy.Request('https://www.livescores.com/?tz=3')
    def parse(self, response):
        for gununMaclari in response.css('div.dh'):
            yield {
                'Home': gununMaclari.css('span.hh span.ih span.kh::text').get(),
                'Away': gununMaclari.css('span.hh span.jh span.kh::text').get()
            }
configure_logging({'LOG_FORMAT': '%(levelname)s: %(message)s'})
runnerTodayList = CrawlerRunner(settings={
    "FEEDS": {
        "todayMatcheslist.json": {"format": "json", "overwrite": True},
    },
})
d = runnerTodayList.crawl(LivescoresTodayList)
# stop the reactor once the crawl finishes, whether it succeeded or failed
d.addBoth(lambda _: reactor.stop())
reactor.run()


'next-session' is not working with 'connect-pg-simple'

The documentation and examples for connecting 'next-session' to Postgres are sparse.
Following the compatibility example on npm did not work:
const session = require("express-session");
const RedisStore = require("connect-redis")(session);
// Use `expressSession` from `next-session/lib/compat` as the replacement
import { expressSession } from "next-session/lib/compat";
const pgSession = require("connect-pg-simple")(expressSession)
export const getSession = nextSession({
  cookie: {
    maxAge: 432000,
  },
  store: new pgSession({...config}),
});
Error: ...failed to prune "session"
Not as simple as 'express-session' to get it working but it was a lot simpler than learning how to use jwt 'iron-session' library...
I managed to get it working by doing this:
import nextSession from "next-session";
import pgPool from "./db";
import { expressSession, promisifyStore } from "next-session/lib/compat";
const pgSession = require("connect-pg-simple")(expressSession);
// Create the connect-style store on top of the existing pg pool.
const connectStore = new pgSession({
  pool: pgPool,
  tableName: "session",
});
export const getSession = nextSession({
  cookie: {
    maxAge: 432000,
  },
  // Wrap the callback-based store so next-session can use it.
  store: promisifyStore(connectStore),
});

Testing redux-saga

I am trying to write a test for a redux-saga generator as follows, but I ran into a problem. The error I get is "Cannot read property 'payload' of undefined". The 'message' variable that I pass to the saga function is undefined for some reason. Can anyone tell me why? Thanks.
saga.spec.js
import test from 'tape'
import { put, take, call } from 'redux-saga/effects'
import { onCreateMessage } from '../src/common/sagas/messages'
import { addMessage, createMessage } from '../src/common/reducers/messages'
test('createMessage', assert => {
  const gen = onCreateMessage()
  var message = {
    id: 1234,
    channelID: "AA",
    text: "text",
    user: "user"
  }
  assert.deepEqual(
    gen.next(message).value,
    put(addMessage(message)),
    'createMessage should dispatch addMessage action'
  )
})
saga/index.js
export default function* rootSaga(){
yield [
takeEvery('CREATE_MESSAGE', onCreateMessage),
......
]
When I console.log message inside the saga below, I get 'undefined':
export function* onCreateMessage(message) {
  console.log(message)
  yield put(addMessage(message.payload))
  try {
    yield call(fetch, '/api/newmessage',
      {
        method: 'post',
        headers: {
          'Content-Type': 'application/json'
        },
        body: JSON.stringify(message.payload)
      }
    )
  } catch (error) {
    throw error
  }
}
I am using actionCreator from redux-actions:
export const addMessage = createAction(ADD_MESSAGE);
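Change
const gen = onCreateMessage()
to
const gen = onCreateMessage(message)
Explanation:
message is an argument passed to the generator function, not the result of a yield. A value passed to gen.next(value) becomes the result of the yield expression the generator is paused at, and the value passed to the first next() call is discarded.
Note that the saga reads message.payload, so the argument needs the shape of the dispatched action, e.g. { payload: message }, for the assertion to match.
The original test would only work if the saga read the message from a yielded result, e.g.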
export function* onCreateMessage() {
  const message = yield
  console.log(message)
  yield put(addMessage(message.payload))
}
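To make the difference concrete, here is a minimal sketch in plain TypeScript, without redux-saga. The MessageAction shape and both generator names are made up for illustration; only the argument-versus-yield behaviour is the point.

```typescript
// Hypothetical action shape, used only for this illustration.
interface MessageAction {
  type: string;
  payload: string;
}

// Variant A: the action arrives as a function argument.
function* readFromArgument(action: MessageAction): Generator<string, void, unknown> {
  yield `arg:${action.payload}`;
}

// Variant B: the action arrives as the result of a bare `yield`.
function* readFromYield(): Generator<string | undefined, void, MessageAction> {
  const received: MessageAction = yield; // gets the value given to the second next()
  yield `yield:${received.payload}`;
}

const sampleAction: MessageAction = { type: "CREATE_MESSAGE", payload: "hello" };

// Variant A: pass the action when creating the generator.
const genA = readFromArgument(sampleAction);
const resultA = genA.next().value;

// Variant B: the first next() only runs up to the bare `yield`;
// the second next(value) supplies what that `yield` evaluates to.
const genB = readFromYield();
genB.next();
const resultB = genB.next(sampleAction).value;

console.log(resultA, resultB);
```

In a saga started by takeEvery, the action is passed as an argument, so a test should create the generator the way Variant A does.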

Angular2 http subscribe component

I have an Angular 2 app that tries to register an email.
import {Component, Directive, provide, Host} from '@angular/core';
import {NG_VALIDATORS, NgForm} from '@angular/forms';
import {ChangeDetectorRef, ChangeDetectionStrategy} from '@angular/core';
import {ApiService} from '../../services/api.service';
import {actions} from '../../common/actions';
import {EmailValidator} from '../../directives/email-validater.directive';
import * as _ from 'lodash';
import * as Rx from 'rxjs';
@Component({
  selector: 'register-step1',
  directives: [EmailValidator],
  styleUrls: ['app/components/register-step1/register.step1.css'],
  templateUrl: 'app/components/register-step1/register.step1.html'
})
export class RegisterStep1 {
  email: string;
  userType: number;
  errorMessage: string;
  successMessage: string;
  constructor(private _api: ApiService, private ref: ChangeDetectorRef) {
    this.successMessage = 'success';
    this.errorMessage = 'error';
  }
  submit() {
    var params = {
      email: this.email,
      type: +this.userType
    };
    params = {
      email: '1@qq.com',
      type: 3
    };
    this._api.query(actions.register_email, params).subscribe({
      next: function(data) {
        if(data.status) {
          console.log("success register");
          this.successMessage = "ok ,success";
          console.log(this.errorMessage, this.successMessage);
        }else{
          this.errorMessage = data.message;
          console.warn(data.message)
        }
      },
      error: err => console.log(err),
      complete: () => console.log('done')
    });
  }
}
My ApiService is simple:
import {Injectable} from '@angular/core';
import {Http, Headers, RequestOptions} from '@angular/http';
import 'rxjs/add/operator/map';
import 'rxjs/add/operator/toPromise';
import {AjaxCreationMethod, AjaxObservable} from 'rxjs/observable/dom/AjaxObservable';
import {logError} from '../services/log.service';
import {AuthHttp, AuthConfig, AUTH_PROVIDERS} from 'angular2-jwt';
@Injectable()
export class ApiService {
  _jwt_token:string;
  constructor(private http:Http) {
  }
  toParams(paramObj) {
    let arr = [];
    for(var key in paramObj) {
      arr.push(key + '=' + paramObj[key]);
    }
    return arr.join('&')
  }
  query(url:string, paramObj:any) {
    let headers = new Headers({'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8'});
    let options = new RequestOptions({headers: headers});
    return this.http.post(url, this.toParams(paramObj), options).map(res=>res.json())
  }
}
This is my HTML:
<form #f="ngForm">
usertype<input type="text" name="userType" [(ngModel)]="userType"><br/>
<input type="text" name="email" ngControl="email" email-input required [(ngModel)]="email">
<button [disabled]="!f.form.valid" (click)="submit(f.email, f.userType)">add</button>
</form>
{{f.form.errors}}
<span *ngIf="errorMessage">error message: {{errorMessage}}</span>
<span *ngIf="successMessage">success message: {{successMessage}}</span>
I can successfully send the request to the server and receive a response. I subscribe an observer to the HTTP response, which is an Observable. Inside the next function I console.log() successMessage, but I get 'undefined', and when I change successMessage my HTML does not update.
It seems like I have lost the scope of my component, so I can't use the this keyword.
That's because you declared the callback with the function keyword. A function expression gets its own this, determined by how it is called, so inside it this no longer refers to the component. Use the arrow notation () => {} for callbacks; an arrow function keeps the this of the enclosing method.
You should change your next function to:
next: (data) => {
  if(data.status) {
    console.log("success register");
    this.successMessage = "ok ,success";
    console.log(this.errorMessage, this.successMessage);
  }else{
    this.errorMessage = data.message;
    console.warn(data.message)
  }
}
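The effect can be reproduced without Angular or RxJS. The sketch below is only an illustration: SimpleObserver, emit and RegisterForm are hypothetical names, and emit merely mimics a library calling next as a method of the observer object.

```typescript
// Hypothetical observer shape; it only mimics how a subscribe({...}) call
// would invoke `next` as a method of the observer object.
interface SimpleObserver {
  next(value: string): void;
}

function emit(observer: SimpleObserver, value: string): void {
  observer.next(value);
}

class RegisterForm {
  successMessage = "initial";

  submitWithFunction(): void {
    emit({
      // Inside a function expression, `this` is the object the function is
      // called on (the observer literal), not the RegisterForm instance.
      next: function (value: string) {
        (this as any).successMessage = value;
      },
    }, "set by function");
  }

  submitWithArrow(): void {
    emit({
      // An arrow function has no `this` of its own; it uses the one from
      // submitWithArrow(), which is the RegisterForm instance.
      next: (value: string) => {
        this.successMessage = value;
      },
    }, "set by arrow");
  }
}

const viaFunction = new RegisterForm();
viaFunction.submitWithFunction();

const viaArrow = new RegisterForm();
viaArrow.submitWithArrow();

console.log(viaFunction.successMessage, viaArrow.successMessage);
```

With the function version the component field keeps its old value, which matches the symptom in the question: the assignment succeeds, but on the wrong object.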

Angular2 # TypeScript Observable error

I have an input field and when the user types a search string I want to wait for the user to stop typing for at least 300 milliseconds (debounce) before doing a _heroService http request. Only changed search values make it through to the service (distinctUntilChanged). The switchMap returns a new observable that combines these _heroService observables, re-arranges them in their original request order, and delivers to subscribers only the most recent search results.
I am using Angular 2.0.0-beta.0 and TypeScript 1.7.5.
How do I get this working correctly?
I get these compile errors:
Error:(33, 20) TS2345: Argument of type '(value: string) => Subscription<Hero[]>' is not assignable to parameter of type '(x: {}, ix: number) => Observable<any>'. Type 'Subscription<Hero[]>' is not assignable to type 'Observable<any>'. Property 'source' is missing in type 'Subscription<Hero[]>'.
Error:(36, 31) TS2322: Type 'Hero[]' is not assignable to type 'Observable<Hero[]>'. Property 'source' is missing in type 'Hero[]'.
Run time error (after typing first key in search input field):
EXCEPTION: TypeError: unknown type returned
STACKTRACE:
TypeError: unknown type returned
at Object.subscribeToResult (http://localhost:3000/rxjs/bundles/Rx.js:7082:25)
at SwitchMapSubscriber._next (http://localhost:3000/rxjs/bundles/Rx.js:5523:63)
at SwitchMapSubscriber.Subscriber.next (http://localhost:3000/rxjs/bundles/Rx.js:9500:14)
...
-----async gap----- Error at _getStacktraceWithUncaughtError
EXCEPTION: Invalid argument '[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]' for pipe 'AsyncPipe' in [heroes | async in Test@4:16]
test1.ts
import {bootstrap} from 'angular2/platform/browser';
import {Component} from 'angular2/core';
import {HTTP_PROVIDERS} from 'angular2/http';
import {Observable} from 'rxjs/Observable';
import {Subject} from 'rxjs/Subject';
import 'rxjs/Rx';
import {Hero} from './hero';
import {HeroService} from './hero.service';
@Component({
  selector: 'my-app',
  template: `
    <h3>Test</h3>
    Search <input #inputUser (keyup)="search(inputUser.value)"/><br>
    <ul>
      <li *ngFor="#hero of heroes | async">{{hero.name}}</li>
    </ul>
  `,
  providers: [HeroService, HTTP_PROVIDERS]
})
export class Test {
  public errorMessage: string;
  private _searchTermStream = new Subject<string>();
  private heroes: Observable<Hero[]> = this._searchTermStream
    .debounceTime(300)
    .distinctUntilChanged()
    .switchMap((value: string) =>
      this._heroService.searchHeroes(value)
        .subscribe(
          heroes => this.heroes = heroes,
          error => this.errorMessage = <any>error)
    )
  constructor (private _heroService: HeroService) {}
  search(value: string) {
    this._searchTermStream.next(value);
  }
}
bootstrap(Test);
hero.ts
export interface Hero {
  _id: number,
  name: string
}
hero.service.ts
import {Injectable} from 'angular2/core';
import {Http, Response} from 'angular2/http';
import {Headers, RequestOptions} from 'angular2/http';
import {Observable} from 'rxjs/Observable';
import 'rxjs/Rx';
import {Hero} from './hero';
@Injectable()
export class HeroService {
  private _heroesUrl = 'api/heroes';
  constructor (private http: Http) {}
  getHeroes () {
    return this.http.get(this._heroesUrl)
      .map(res => <Hero[]> res.json())
      .do(data => console.log(data))
      .catch(this.handleError);
  }
  searchHeroes (value) {
    return this.http.get(this._heroesUrl + '/search/' + value )
      .map(res => <Hero[]> res.json())
      .do(data => console.log(data))
      .catch(this.handleError);
  }
  addHero (name: string) : Observable<Hero> {
    let body = JSON.stringify({name});
    let headers = new Headers({ 'Content-Type': 'application/json' });
    let options = new RequestOptions({ headers: headers });
    return this.http.post(this._heroesUrl, body, options)
      .map(res => <Hero> res.json())
      .do(data => console.log(data))
      .catch(this.handleError)
  }
  private handleError (error: Response) {
    // in a real world app, we may send the error to some remote logging infrastructure
    // instead of just logging it to the console
    console.log(error);
    return Observable.throw('Internal server error');
  }
}
index.html
<!DOCTYPE html>
<html>
<head>
<base href="/">
<script src="angular2/bundles/angular2-polyfills.js"></script>
<script src="typescript/lib/typescript.js"></script>
<script src="systemjs/dist/system.js"></script>
<script src="angular2/bundles/router.dev.js"></script>
<script src="rxjs/bundles/Rx.js"></script>
<script src="angular2/bundles/angular2.js"></script>
<script src="angular2/bundles/http.dev.js"></script>
<link rel="stylesheet" href="node_modules/bootstrap/dist/css/bootstrap.min.css">
<script>
System.config({
transpiler: 'typescript',
typescriptOptions: { emitDecoratorMetadata: true },
packages: {'components': {defaultExtension: 'ts'}}
});
System.import('components/test1')
.then(null, console.error.bind(console));
</script>
</head>
<body>
<my-app>Loading...</my-app>
</body>
</html>
Here is another version 'test2.ts' that works fine doing a http request after every (keyup) event:
import {bootstrap} from 'angular2/platform/browser';
import {Component} from 'angular2/core';
import {HTTP_PROVIDERS} from 'angular2/http';
import {Hero} from './hero';
import {HeroService} from './hero.service';
@Component({
  selector: 'my-app',
  template: `
    <h3>Test</h3>
    Search <input #inputUser (keyup)="search(inputUser.value)"/><br>
    <ul>
      <li *ngFor="#hero of heroes">{{hero.name}}</li>
    </ul>
  `,
  providers: [HeroService, HTTP_PROVIDERS]
})
export class Test {
  public heroes:Hero[] = [];
  public errorMessage: string;
  constructor (private _heroService: HeroService) {}
  search(value: string) {
    if (value) {
      this._heroService.searchHeroes(value)
        .subscribe(
          heroes => this.heroes = heroes,
          error => this.errorMessage = <any>error);
    }
    else {
      this.heroes = [];
    }
  }
}
bootstrap(Test);
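.subscribe(...) returns a Subscription, not an Observable, but switchMap expects its callback to return an Observable.
Remove the subscribe(...) inside switchMap, or replace it with a .map(...), so that the callback returns the Observable from searchHeroes. Call .subscribe(...) only where you consume the values; in test1.ts the async pipe in the template already subscribes to heroes.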

How to catch exception correctly from http.request()?

Part of my code:
import {Injectable} from 'angular2/core';
import {Http, Headers, Request, Response} from 'angular2/http';
import {Observable} from 'rxjs/Observable';
import 'rxjs/add/operator/map';
@Injectable()
export class myClass {
  constructor(protected http: Http) {}
  public myMethod() {
    let request = new Request({
      method: "GET",
      url: "http://my_url"
    });
    return this.http.request(request)
      .map(res => res.json())
      .catch(this.handleError); // Trouble line.
                                // Without this line code works perfectly.
  }
  public handleError(error: Response) {
    console.error(error);
    return Observable.throw(error.json().error || 'Server error');
  }
}
myMethod() produces this exception in the browser console:
ORIGINAL EXCEPTION: TypeError: this.http.request(...).map(...).catch is not a function
Perhaps you can try adding this in your imports:
import 'rxjs/add/operator/catch';
You can also do:
return this.http.request(request)
  .map(res => res.json())
  .subscribe(
    data => console.log(data),
    err => console.log(err),
    () => console.log('yay')
  );
Per comments:
EXCEPTION: TypeError: Observable_1.Observable.throw is not a function
Similarly, for that, you can use:
import 'rxjs/add/observable/throw';
New service updated to use the HttpClientModule and RxJS v5.5.x:
import { Injectable } from '@angular/core';
import { HttpClient, HttpErrorResponse } from '@angular/common/http';
import { Observable } from 'rxjs/Observable';
import { catchError, tap } from 'rxjs/operators';
import { SomeClassOrInterface } from './interfaces';
import 'rxjs/add/observable/throw';
@Injectable()
export class MyService {
  url = 'http://my_url';
  constructor(private _http:HttpClient) {}
  private handleError(operation: string) {
    return (err: any) => {
      let errMsg = `error in ${operation}() retrieving ${this.url}`;
      console.log(`${errMsg}:`, err)
      if(err instanceof HttpErrorResponse) {
        // you could extract more info about the error if you want, e.g.:
        console.log(`status: ${err.status}, ${err.statusText}`);
        // errMsg = ...
      }
      return Observable.throw(errMsg);
    }
  }
  // public API
  public getData() : Observable<SomeClassOrInterface> {
    // HttpClient.get() returns the body of the response as an untyped JSON object.
    // We specify the type as SomeClassOrInterface to get a typed result.
    return this._http.get<SomeClassOrInterface>(this.url)
      .pipe(
        tap(data => console.log('server data:', data)),
        catchError(this.handleError('getData'))
      );
  }
}
Old service, which uses the deprecated HttpModule:
import {Injectable} from 'angular2/core';
import {Http, Response, Request} from 'angular2/http';
import {Observable} from 'rxjs/Observable';
import 'rxjs/add/observable/throw';
//import 'rxjs/Rx'; // use this line if you want to be lazy, otherwise:
import 'rxjs/add/operator/map';
import 'rxjs/add/operator/do'; // debug
import 'rxjs/add/operator/catch';
@Injectable()
export class MyService {
  constructor(private _http:Http) {}
  private _serverError(err: any) {
    console.log('server error:', err); // debug
    if(err instanceof Response) {
      return Observable.throw(err.json().error || 'backend server error');
      // if you're using lite-server, use the following line
      // instead of the line above:
      //return Observable.throw(err.text() || 'backend server error');
    }
    return Observable.throw(err || 'backend server error');
  }
  private _request = new Request({
    method: "GET",
    // change url to "./data/data.junk" to generate an error
    url: "./data/data.json"
  });
  // public API
  public getData() {
    return this._http.request(this._request)
      // modify file data.json to contain invalid JSON to have .json() raise an error
      .map(res => res.json()) // could raise an error if invalid JSON
      .do(data => console.log('server data:', data)) // debug
      .catch(this._serverError);
  }
}
I use .do() (now .tap()) for debugging.
When there is a server error, the body of the Response object I get from the server I'm using (lite-server) contains just text, hence the reason I use err.text() above rather than err.json().error. You may need to adjust that line for your server.
If res.json() raises an error because it could not parse the JSON data, _serverError will not get a Response object, hence the reason for the instanceof check.
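The instanceof check can be tried in isolation. This is only a sketch: FakeResponse and toErrorMessage are hypothetical names, and FakeResponse is a stand-in with a text() method, not the Angular Response class.

```typescript
// Hypothetical stand-in for an HTTP error response object; this is not the
// Angular Response class, just enough of its shape for the illustration.
class FakeResponse {
  body: string;
  constructor(body: string) {
    this.body = body;
  }
  text(): string {
    return this.body;
  }
}

// Turns whatever reached the error handler into a message string.
function toErrorMessage(err: unknown): string {
  if (err instanceof FakeResponse) {
    // The server answered with an error status: use the response body.
    return err.text() || "backend server error";
  }
  if (err instanceof Error) {
    // Something threw while handling the response, e.g. a JSON parse failure.
    return err.message;
  }
  return "backend server error";
}

// Provoke a parse failure the same way invalid JSON from the server would.
let parseFailure: unknown;
try {
  JSON.parse("{ not valid json");
} catch (e) {
  parseFailure = e;
}

const fromServer = toErrorMessage(new FakeResponse("Not Found"));
const fromParser = toErrorMessage(parseFailure);
const fromEmptyBody = toErrorMessage(new FakeResponse(""));

console.log(fromServer, fromParser, fromEmptyBody);
```

The point is that a single handler may see two kinds of value, so it should not call response-only methods before checking which kind it received.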
In this plunker, change url to ./data/data.junk to generate an error.
Users of either service should have code that can handle the error:
@Component({
  selector: 'my-app',
  template: `<div>{{data}}</div>
    <div>{{errorMsg}}</div>`
})
export class AppComponent {
  data: any;
  errorMsg: string;
  constructor(private _myService: MyService ) {}
  ngOnInit() {
    this._myService.getData()
      .subscribe(
        data => this.data = data,
        err => this.errorMsg = <any>err
      );
  }
}
There are two ways to do this, and both are simple. You can copy either example into your project and test it.
The first method is preferable; the second is somewhat outdated, but it still works.
1) Solution 1
// File - app.module.ts
import { BrowserModule } from '@angular/platform-browser';
import { NgModule } from '@angular/core';
import { HttpClientModule } from '@angular/common/http';
import { AppComponent } from './app.component';
import { ProductService } from './product.service';
import { ProductModule } from './product.module';
@NgModule({
  declarations: [
    AppComponent
  ],
  imports: [
    BrowserModule,
    HttpClientModule
  ],
  providers: [ProductService, ProductModule],
  bootstrap: [AppComponent]
})
export class AppModule { }
// File - product.service.ts
import { Injectable } from '@angular/core';
import { HttpClient } from '@angular/common/http';
// Importing rxjs
import 'rxjs/Rx';
import { Observable } from 'rxjs/Rx';
import { catchError, tap } from 'rxjs/operators'; // Important! Be sure to import the operators
// Any object of yours can go here. For example, we use a product object
import { ProductModule } from './product.module';
@Injectable()
export class ProductService{
  // Initialize the properties.
  constructor(private http: HttpClient, private product: ProductModule){}
  // If there are no errors, the object is returned with the product data.
  // If there are errors, we end up in catchError and handle them there.
  getProducts(): Observable<ProductModule[]>{
    const url = 'YOUR URL HERE';
    return this.http.get<ProductModule[]>(url).pipe(
      tap((data: any) => {
        console.log(data);
      }),
      catchError((err) => {
        throw 'Error in source. Details: ' + err; // Use console.log(err) for detail
      })
    );
  }
}
2) Solution 2. This is the old way, but it still works.
// File - app.module.ts
import { BrowserModule } from '@angular/platform-browser';
import { NgModule } from '@angular/core';
import { HttpModule } from '@angular/http';
import { AppComponent } from './app.component';
import { ProductService } from './product.service';
import { ProductModule } from './product.module';
@NgModule({
  declarations: [
    AppComponent
  ],
  imports: [
    BrowserModule,
    HttpModule
  ],
  providers: [ProductService, ProductModule],
  bootstrap: [AppComponent]
})
export class AppModule { }
// File - product.service.ts
import { Injectable } from '@angular/core';
import { Http, Response } from '@angular/http';
// Importing rxjs
import 'rxjs/Rx';
import { Observable } from 'rxjs/Rx';
@Injectable()
export class ProductService{
  // Initialize the properties.
  constructor(private http: Http){}
  // If there are no errors, the object is returned with the product data.
  // If there are errors, we go into the catch section and handle the error.
  getProducts(){
    const url = '';
    return this.http.get(url).map(
      (response: Response) => {
        const data = response.json();
        console.log(data);
        return data;
      }
    ).catch(
      (error: Response) => {
        console.log(error);
        return Observable.throw(error);
      }
    );
  }
}
The RxJS functions need to be imported explicitly. An easy way to do this is to import all of its features with import * as Rx from "rxjs/Rx"
Then make sure to access the Observable class as Rx.Observable.
In the latest version of Angular 4, use
import { Observable } from 'rxjs/Rx'
It imports all the required operators.
