Web scraping: Difference between revisions

Revision as of 08:42, 2 September 2022

Web scraping extracts data, such as text and images, from websites. In this example we will scrape data from the Project Gutenberg website.

The purpose of web scraping is to transform web content into usable data for other programs or analysis. In this case we transform the following web page into CSV data, which can be opened in Microsoft Excel or Apple Numbers.
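The transformation from web content to CSV can also be sketched in code. The following Python example is only an illustration of the idea and is not how the WebScraper.io extension works internally. It uses the standard library alone, and the `HeadingExtractor` class, the `headings_to_csv` function, the choice of the `h2` tag, and the `SAMPLE_HTML` fragment are all assumptions made up for this sketch.

```python
import csv
import io
from html.parser import HTMLParser


class HeadingExtractor(HTMLParser):
    """Collect the text of every <h2> element (the tag is an assumed example)."""

    def __init__(self):
        super().__init__()
        self._in_heading = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_heading = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_heading = False

    def handle_data(self, data):
        # Only keep text that appears inside an open <h2> element.
        if self._in_heading:
            self.headings[-1] += data


def headings_to_csv(html):
    """Return CSV text with one row per <h2> heading found in the HTML."""
    parser = HeadingExtractor()
    parser.feed(html)
    parser.close()

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["position", "heading"])
    for position, text in enumerate(parser.headings, start=1):
        writer.writerow([position, text.strip()])
    return out.getvalue()


# Hypothetical HTML fragment standing in for a downloaded page.
SAMPLE_HTML = (
    "<html><body>"
    "<h2>CHAPTER I. Down the Rabbit-Hole</h2><p>Some text.</p>"
    "<h2>CHAPTER II. The Pool of Tears</h2><p>More text.</p>"
    "</body></html>"
)

if __name__ == "__main__":
    print(headings_to_csv(SAMPLE_HTML))
```

The resulting text can be saved with a `.csv` extension and opened in a spreadsheet program. A real page would need a selector chosen by inspecting its actual markup, which is the step the browser extension handles interactively.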

We will use a browser extension called WebScraper.io. You can install the extension for Firefox or for Chrome.

@@ Line 4: / Line 4: @@
 {{Columns}}
 {{Column}}[[File:Alice Wonderland Gutenberg.png]]{{ColumnEnd}}
-{{Column}}[[File:Alice Wonderland Gutenberg.png]]{{ColumnEnd}}
+{{Column}}[[File:Alice Wonderland Scraped.png|thumb]]{{ColumnEnd}}
 {{ColumnsEnd}}
 We will use a browser extension called WebScraper.io. You can install the extension [https://addons.mozilla.org/en-US/firefox/addon/web-scraper/ for Firefox] or [https://addons.mozilla.org/en-US/firefox/addon/web-scraper/ for Chrome].
