Mirror of https://github.com/lorien/awesome-web-scraping.git (synced 2024-11-24 08:32:19 +02:00)
List of libraries, tools and APIs for web scraping and data processing.
Awesome Web Scraping
A list of tools, programming libraries, and web services used in web scraping.
Web scraping chats: @grablab (English) and @grablab_ru (Russian)
Programming Libraries
- Golang
- Java
- JavaScript
- Perl
- PHP
- Python
- Ruby
Other Things
- Web Scraping Services
- Console tools
- dhamaniasad / HeadlessBrowsers - a list of (almost) all headless web browsers in existence
- lorien / awesome-python-dev - a list of tools for debugging, profiling and analyzing Python programs
Captcha Solving Services
- 2captcha.com
- anti-gate.com
Proxy Services
- proxyrack.com - over 1,250,000 unique private IPs with instant setup
- webshare.io - 5000+ HTTP server proxies
- proxium.ru - extremely cheap small proxy lists
- scrapinghub.com/crawlera - smart proxy network that automatically rotates IPs
About awesome-web-scraping
Make this list better! Your contributions are always welcome! See the contributing how-to. To add a new language to the Programming Libraries section, use new_language_template as a starting point.
The list was initially based on data from these sources: awesome-python, awesome-php, awesome-ruby, ruby-nlp, and awesome-javascript.