Moving from BeautifulSoup to Scrapy - web-scraping

I am learning BeautifulSoup right now, but I need to switch to Scrapy because I will need more powerful features later on, such as form interaction.
Would the correct strategy be to port my BeautifulSoup script to Scrapy, or to somehow integrate the BeautifulSoup code within Scrapy?
Beautiful Soup Code
# Imports
from bs4 import BeautifulSoup
import requests
import pandas as pd
html = """<div class="box1">
<table class="table1">
<tr><td class="label">Item1</td><td>Value1</td></tr>
<tr><td class="label">Item2</td><td>Value2</td></tr>
<tr><td class="label">Item3</td><td>Value3</td></tr>
<tr><td class="label">Item4</td><td>Value4</td></tr>
</table>
</div>"""
# Parse the HTML string
soup = BeautifulSoup(html, "html.parser")
# Target the container we want
div = soup.find("div", class_="box1")
# Collect the cell text of every table row
columns = []
for tr in div.find_all('tr'):
    columns.append([td.text for td in tr.find_all("td")])
# Transpose the rows into columns
columns = list(zip(*columns))
# Write the results to a CSV file
df = pd.DataFrame(columns)
df.to_csv('index.csv', index=False, encoding='utf-8')

Related

How to properly set up Ckeditor5 for project VueJS?

I have several problems right now. I generated a custom build with the online builder, including as many modules as possible, and connected it to my Vue 3 project. The editor appears, but splitting the text into numbered paragraphs with the corresponding button does not work. The button responds, but the text is not marked up into paragraphs (1, 2, 3, and so on).
The next issue is with images. I can't insert an image; I get this error in the console: 'filerepository-no-upload-adapter'. I'll attach the build code.
Vue component code:
import Editor from '../../../../../../../ckeditor5/build/ckeditor';
export default {
  data() {
    return {
      selectedProducts: null,
      createDialog: false,
      editDialog: false,
      deleteDialog: false,
      editor: Editor,
      editorConfig: {},
    }
  },
Build code, located at 'ckeditor5/build/ckeditor.js':
/**
 * @license Copyright (c) 2014-2022, CKSource Holding sp. z o.o. All rights reserved.
 * For licensing, see LICENSE.md or https://ckeditor.com/legal/ckeditor-oss-license
 */
import ClassicEditor from '@ckeditor/ckeditor5-editor-classic/src/classiceditor.js';
import Alignment from '@ckeditor/ckeditor5-alignment/src/alignment.js';
import Autoformat from '@ckeditor/ckeditor5-autoformat/src/autoformat.js';
import AutoImage from '@ckeditor/ckeditor5-image/src/autoimage.js';
import AutoLink from '@ckeditor/ckeditor5-link/src/autolink.js';
import Autosave from '@ckeditor/ckeditor5-autosave/src/autosave.js';
import BlockQuote from '@ckeditor/ckeditor5-block-quote/src/blockquote.js';
import Bold from '@ckeditor/ckeditor5-basic-styles/src/bold.js';
import CKFinderUploadAdapter from '@ckeditor/ckeditor5-adapter-ckfinder/src/uploadadapter.js';
import CloudServices from '@ckeditor/ckeditor5-cloud-services/src/cloudservices.js';
import Code from '@ckeditor/ckeditor5-basic-styles/src/code.js';
import CodeBlock from '@ckeditor/ckeditor5-code-block/src/codeblock.js';
import DataFilter from '@ckeditor/ckeditor5-html-support/src/datafilter.js';
import DataSchema from '@ckeditor/ckeditor5-html-support/src/dataschema.js';
import Essentials from '@ckeditor/ckeditor5-essentials/src/essentials.js';
import FindAndReplace from '@ckeditor/ckeditor5-find-and-replace/src/findandreplace.js';
import FontBackgroundColor from '@ckeditor/ckeditor5-font/src/fontbackgroundcolor.js';
import FontColor from '@ckeditor/ckeditor5-font/src/fontcolor.js';
import FontFamily from '@ckeditor/ckeditor5-font/src/fontfamily.js';
import FontSize from '@ckeditor/ckeditor5-font/src/fontsize.js';
import GeneralHtmlSupport from '@ckeditor/ckeditor5-html-support/src/generalhtmlsupport.js';
import Heading from '@ckeditor/ckeditor5-heading/src/heading.js';
import Highlight from '@ckeditor/ckeditor5-highlight/src/highlight.js';
import HorizontalLine from '@ckeditor/ckeditor5-horizontal-line/src/horizontalline.js';
import HtmlComment from '@ckeditor/ckeditor5-html-support/src/htmlcomment.js';
import HtmlEmbed from '@ckeditor/ckeditor5-html-embed/src/htmlembed.js';
import Image from '@ckeditor/ckeditor5-image/src/image.js';
import ImageCaption from '@ckeditor/ckeditor5-image/src/imagecaption.js';
import ImageInsert from '@ckeditor/ckeditor5-image/src/imageinsert.js';
import ImageResize from '@ckeditor/ckeditor5-image/src/imageresize.js';
import ImageStyle from '@ckeditor/ckeditor5-image/src/imagestyle.js';
import ImageToolbar from '@ckeditor/ckeditor5-image/src/imagetoolbar.js';
import ImageUpload from '@ckeditor/ckeditor5-image/src/imageupload.js';
import Indent from '@ckeditor/ckeditor5-indent/src/indent.js';
import IndentBlock from '@ckeditor/ckeditor5-indent/src/indentblock.js';
import Italic from '@ckeditor/ckeditor5-basic-styles/src/italic.js';
import Link from '@ckeditor/ckeditor5-link/src/link.js';
import LinkImage from '@ckeditor/ckeditor5-link/src/linkimage.js';
import List from '@ckeditor/ckeditor5-list/src/list.js';
import ListProperties from '@ckeditor/ckeditor5-list/src/listproperties.js';
import Markdown from '@ckeditor/ckeditor5-markdown-gfm/src/markdown.js';
import MediaEmbed from '@ckeditor/ckeditor5-media-embed/src/mediaembed.js';
import MediaEmbedToolbar from '@ckeditor/ckeditor5-media-embed/src/mediaembedtoolbar.js';
import Mention from '@ckeditor/ckeditor5-mention/src/mention.js';
import PageBreak from '@ckeditor/ckeditor5-page-break/src/pagebreak.js';
import Paragraph from '@ckeditor/ckeditor5-paragraph/src/paragraph.js';
import PasteFromOffice from '@ckeditor/ckeditor5-paste-from-office/src/pastefromoffice.js';
import RemoveFormat from '@ckeditor/ckeditor5-remove-format/src/removeformat.js';
import SourceEditing from '@ckeditor/ckeditor5-source-editing/src/sourceediting.js';
import SpecialCharacters from '@ckeditor/ckeditor5-special-characters/src/specialcharacters.js';
import SpecialCharactersCurrency from '@ckeditor/ckeditor5-special-characters/src/specialcharacterscurrency.js';
import SpecialCharactersEssentials from '@ckeditor/ckeditor5-special-characters/src/specialcharactersessentials.js';
import SpecialCharactersLatin from '@ckeditor/ckeditor5-special-characters/src/specialcharacterslatin.js';
import SpecialCharactersMathematical from '@ckeditor/ckeditor5-special-characters/src/specialcharactersmathematical.js';
import SpecialCharactersText from '@ckeditor/ckeditor5-special-characters/src/specialcharacterstext.js';
import StandardEditingMode from '@ckeditor/ckeditor5-restricted-editing/src/standardeditingmode.js';
import Strikethrough from '@ckeditor/ckeditor5-basic-styles/src/strikethrough.js';
import Style from '@ckeditor/ckeditor5-style/src/style.js';
import Subscript from '@ckeditor/ckeditor5-basic-styles/src/subscript.js';
import Superscript from '@ckeditor/ckeditor5-basic-styles/src/superscript.js';
import Table from '@ckeditor/ckeditor5-table/src/table.js';
import TableCaption from '@ckeditor/ckeditor5-table/src/tablecaption.js';
import TableCellProperties from '@ckeditor/ckeditor5-table/src/tablecellproperties';
import TableColumnResize from '@ckeditor/ckeditor5-table/src/tablecolumnresize.js';
import TableProperties from '@ckeditor/ckeditor5-table/src/tableproperties';
import TableToolbar from '@ckeditor/ckeditor5-table/src/tabletoolbar.js';
import TextPartLanguage from '@ckeditor/ckeditor5-language/src/textpartlanguage.js';
import TextTransformation from '@ckeditor/ckeditor5-typing/src/texttransformation.js';
import Title from '@ckeditor/ckeditor5-heading/src/title.js';
import TodoList from '@ckeditor/ckeditor5-list/src/todolist';
import Underline from '@ckeditor/ckeditor5-basic-styles/src/underline.js';
import WordCount from '@ckeditor/ckeditor5-word-count/src/wordcount.js';
class Editor extends ClassicEditor {}
// Plugins to include in the build.
Editor.builtinPlugins = [
  Alignment,
  Autoformat,
  AutoImage,
  AutoLink,
  Autosave,
  BlockQuote,
  Bold,
  CKFinderUploadAdapter,
  CloudServices,
  Code,
  CodeBlock,
  DataFilter,
  DataSchema,
  Essentials,
  FindAndReplace,
  FontBackgroundColor,
  FontColor,
  FontFamily,
  FontSize,
  GeneralHtmlSupport,
  Heading,
  Highlight,
  HorizontalLine,
  HtmlComment,
  HtmlEmbed,
  Image,
  ImageCaption,
  ImageInsert,
  ImageResize,
  ImageStyle,
  ImageToolbar,
  ImageUpload,
  Indent,
  IndentBlock,
  Italic,
  Link,
  LinkImage,
  List,
  ListProperties,
  Markdown,
  MediaEmbed,
  MediaEmbedToolbar,
  Mention,
  PageBreak,
  Paragraph,
  PasteFromOffice,
  RemoveFormat,
  SourceEditing,
  SpecialCharacters,
  SpecialCharactersCurrency,
  SpecialCharactersEssentials,
  SpecialCharactersLatin,
  SpecialCharactersMathematical,
  SpecialCharactersText,
  StandardEditingMode,
  Strikethrough,
  Style,
  Subscript,
  Superscript,
  Table,
  TableCaption,
  TableCellProperties,
  TableColumnResize,
  TableProperties,
  TableToolbar,
  TextPartLanguage,
  TextTransformation,
  Title,
  TodoList,
  Underline,
  WordCount
];
// Editor configuration.
Editor.defaultConfig = {
  toolbar: {
    items: [
      'heading',
      '|',
      'style',
      '|',
      'textPartLanguage',
      '|',
      'bold',
      'italic',
      'link',
      '|',
      'bulletedList',
      'numberedList',
      '|',
      'outdent',
      'indent',
      '|',
      'imageUpload',
      'imageInsert',
      'mediaEmbed',
      '|',
      'codeBlock',
      'htmlEmbed',
      '|',
      'insertTable',
      'blockQuote',
      'undo',
      'redo',
      'alignment',
      'code',
      'findAndReplace',
      'fontBackgroundColor',
      'fontColor',
      'fontSize',
      'fontFamily',
      'highlight',
      'horizontalLine',
      'pageBreak',
      'removeFormat',
      'sourceEditing',
      'specialCharacters',
      'strikethrough',
      'restrictedEditingException',
      'subscript',
      'todoList',
      'underline'
    ]
  },
  language: 'ru',
  image: {
    toolbar: [
      'imageTextAlternative',
      'imageStyle:inline',
      'imageStyle:block',
      'imageStyle:side',
      'linkImage'
    ]
  },
  table: {
    contentToolbar: [
      'tableColumn',
      'tableRow',
      'mergeTableCells',
      'tableCellProperties',
      'tableProperties'
    ]
  }
};
export default Editor;
This is my first question on this platform, and I very much hope you can help me solve these problems with the editor. I won't be able to submit the application without it. I can add any extra information you need.

CSS file is applying on another react component even without importing

Hello, I'm using React to build a website and I want to keep my styles in .css files, so I'm using import './example.css' in my component file.
Example:
import React from 'react';
import 'Home.css';
const Home = () => {
  return (
    <div className="example">
      Hi
    </div>
  )
}
If I create another page and don't import this CSS file there, the styles are still applied to that page.
Other page:
import React from 'react';
const About = () => {
  return (
    <div className="example">
      Hi
    </div>
  )
}
What is the reason for this, and is there a solution?
When you import a CSS file the way you've done, it is injected into the whole project, not just into the component that imports it.
What you're looking for is CSS Modules (see the create-react-app documentation on adding CSS Modules). With create-react-app the file has to follow the [name].module.css naming convention, so Home.css becomes Home.module.css:
import React from 'react';
import styles from './Home.module.css';
const Home = () => {
  return (
    <div className={styles.example}>
      Hi
    </div>
  )
}
The reason is that you are using the same class in both of your components.
Doing import 'Home.css' does not scope the CSS to that component. All of the CSS gets bundled together, so rules for the same class name apply everywhere and can overwrite each other somewhere down the line.
For each component, you can put a unique className on its top-level element and use that class to style only that component.
.home-container .header { ... }
You can also keep one global .css file for the styles you want to use throughout the whole app.

how can I import multiple CSS files conditionally based on this.state data in react native

I need to import multiple CSS files conditionally in a React Native project, like this:
import React, { Component } from "react";
if(this.state.language = "he"){
  import styles from "./he";
}else{
  import styles from "./en";
}
But it's not working. I need two different style sheets, one for LTR and one for RTL, chosen by language.
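import statements cannot be placed inside an if block, so import both files and choose between them at runtime. Note the comparison with === rather than the assignment =:
import heStyles from "./he";
import enStyles from "./en";
// Evaluate this inside the component (for example in render()), where this.state is available.
const styles = this.state.language === "he" ? heStyles : enStyles;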
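The selection step can be isolated into a small helper, sketched below in plain JavaScript. The two style objects are hypothetical stand-ins for the default exports of "./he" and "./en" (their real contents are not shown in the question), and the name pickStyles is made up for this illustration.

```javascript
// Hypothetical stand-ins for the default exports of "./he" and "./en".
const heStyles = { container: { flexDirection: "row-reverse" }, text: { textAlign: "right" } };
const enStyles = { container: { flexDirection: "row" }, text: { textAlign: "left" } };

// Return the style set for a language code.
// Anything other than "he" falls back to the English (LTR) styles.
function pickStyles(language) {
  return language === "he" ? heStyles : enStyles;
}

// Inside a component this would be called from render(), for example:
//   const styles = pickStyles(this.state.language);
console.log(pickStyles("he").text.textAlign);
console.log(pickStyles("en").text.textAlign);
```

Because the choice is made on every render, changing this.state.language with setState would switch the styles without any further work.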

How to change first day of week in Kendo react UI calendar library

The default first day of the week in the KendoReact calendar library is Sunday.
I want the week to start on Monday.
The first day of the week is localized through the IntlProvider; see the documentation on the KendoReact site. In the example there, the calendar starts from Monday rather than Sunday because it uses the ES culture.
The IntlProvider provides the cultures to the DatePicker, including the first day of the week.
You can load the CLDR data exactly as it comes from the CLDR repository, or modify it first to match your needs and then load it. For example: weekData.supplemental.weekData.firstDay.US = 'mon';
Here is an example of such an override with the full code:
import * as React from 'react';
import * as ReactDOM from 'react-dom';
import { Calendar } from '@progress/kendo-react-dateinputs';
import { IntlProvider, load } from '@progress/kendo-react-intl';
import likelySubtags from 'cldr-core/supplemental/likelySubtags.json';
import currencyData from 'cldr-core/supplemental/currencyData.json';
import weekData from 'cldr-core/supplemental/weekData.json';
// Override the first day of the week for the US territory, then load the data.
weekData.supplemental.weekData.firstDay.US = 'mon';
load(likelySubtags, currencyData, weekData);
class App extends React.Component {
  render() {
    return (
      <IntlProvider locale={'en-US'}>
        <div className="example-wrapper row">
          <Calendar />
        </div>
      </IntlProvider>
    );
  }
}
ReactDOM.render(
  <App />,
  document.querySelector('my-app')
);
And here is live version of the above.

Can I have my apps share single redux store?

I have a complex analytics HTML page and have converted most of its elements into React components. Most of them are organized into two sections, top and bottom.
My setup is working, but I'm wondering whether this is a legal / correct way of setting things up.
import React from 'react';
import ReactDOM from 'react-dom';
import { Provider } from 'react-redux';
import { createStore, applyMiddleware } from 'redux';
import TopSection from './components/app';
import BottomSection from './components/app_content';
import reducers from './reducers';
const createStoreWithMiddleware = applyMiddleware()(createStore);
// Top Section
ReactDOM.render(
  <Provider store={createStoreWithMiddleware(reducers)}>
    <TopSection />
  </Provider>
  , document.querySelector('.top-section'));
// Bottom Section
ReactDOM.render(
  <Provider store={createStoreWithMiddleware(reducers)}>
    <BottomSection />
  </Provider>
  , document.querySelector('.bottom-section'));
You can have multiple stores if you really need them, but it is strongly recommended NOT to use a multi-store setup. A single store is the best choice because:
it's reliable
it's fast
debugging is easy
These links explain why a multiple-store setup is not recommended:
redux.js.org
stackoverflow.com
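To see what the difference means in practice, here is a minimal sketch in plain JavaScript. createTinyStore is a hand-rolled, hypothetical stand-in for Redux's createStore (only getState and dispatch), and the counter reducer is invented for the illustration. It shows that calling the factory twice, as the question's code does for each Provider, yields two independent states, while creating the store once and handing the same object to both sections gives them shared state.

```javascript
// Hand-rolled stand-in for Redux's createStore, reduced to getState and dispatch.
function createTinyStore(reducer) {
  let state = reducer(undefined, { type: "@@INIT" });
  return {
    getState: () => state,
    dispatch(action) {
      state = reducer(state, action);
      return action;
    },
  };
}

// Illustrative reducer: counts "increment" actions.
function counter(state = 0, action) {
  return action.type === "increment" ? state + 1 : state;
}

// Two factory calls: each section gets its own, independent state.
const topStore = createTinyStore(counter);
const bottomStore = createTinyStore(counter);
topStore.dispatch({ type: "increment" });
console.log(topStore.getState(), bottomStore.getState()); // 1 0

// One factory call: both sections read and update the same state.
const sharedStore = createTinyStore(counter);
const topView = sharedStore;
const bottomView = sharedStore;
topView.dispatch({ type: "increment" });
console.log(topView.getState(), bottomView.getState()); // 1 1
```

In the question's code, the analogous change would be to call createStoreWithMiddleware(reducers) once, keep the result in a variable, and pass that same variable as the store prop of both Provider elements.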
