Add weboob to Web Crawling section (!836) · Merge requests · Vinta Chen / awesome-python

Closed Administrator requested to merge github/fork/hydrargyrum/patch-1 into master Feb 21, 2017

Created by: hydrargyrum

What is this Python project?

It's a framework for scraping HTML sites and aggregating data from multiple sites of the same category (e.g. banking sites, news sites, video sites). There are ready-made modules for popular websites and ready-made applications to interact with them. Think of youtube-dl applied to domains other than video!

What's the difference between this Python project and similar ones?

- New websites can be scraped with declarative-style extraction rules
- It provides a standardized API for each category of site, with each category dedicated to a task (e.g. banking, web forums, video sites, news sites, music lyrics sites)
  - Scraped websites are grouped in those categories
- The project comes with many existing backends for real-life websites
- It has an internal upgrade system
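
To give a rough idea of what "declarative-style extraction rules" means in general, here is a minimal sketch using only the Python standard library. It is not weboob's actual API: `VIDEO_RULES`, `extract_items`, and the sample markup are hypothetical names invented for this illustration, and the input is assumed to be well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table (not weboob's API): field name -> (path, converter).
# The rules describe *what* to extract; the loop below decides *how*.
VIDEO_RULES = {
    "title": (".//h2[@class='title']", str.strip),
    "duration": (".//span[@class='duration']", int),
}


def extract_items(markup, item_path, rules):
    """Apply the same rule table to every element matching item_path."""
    root = ET.fromstring(markup.strip())
    items = []
    for node in root.findall(item_path):
        item = {}
        for field, (path, convert) in rules.items():
            found = node.find(path)
            # A missing or empty element yields None instead of raising.
            has_text = found is not None and found.text
            item[field] = convert(found.text) if has_text else None
        items.append(item)
    return items


# Assumed sample page; a real site would need an HTML-tolerant parser.
SAMPLE_PAGE = """
<html><body>
  <div class="video"><h2 class="title"> First clip </h2><span class="duration">95</span></div>
  <div class="video"><h2 class="title">Second clip</h2><span class="duration">210</span></div>
</body></html>
"""

videos = extract_items(SAMPLE_PAGE, ".//div[@class='video']", VIDEO_RULES)
print(videos)
```

Supporting a new site in this sketch means writing a new rule table rather than new parsing code, which is the general appeal of the declarative approach.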