1
0
mirror of https://github.com/lorien/awesome-web-scraping.git synced 2024-11-24 08:32:19 +02:00

Add arachnid2 to Ruby scrapers

This commit is contained in:
Sam Nissen 2020-06-25 16:52:15 +01:00
parent 3f44997732
commit 6b215d1ab8

View File

@ -49,6 +49,7 @@ This list contains ruby libraries related to web scraping and data processing
* [Anemone](https://github.com/chriskite/anemone) - web spider framework that can spider a domain and collect useful information about the pages it visits * [Anemone](https://github.com/chriskite/anemone) - web spider framework that can spider a domain and collect useful information about the pages it visits
* [Spidr](https://github.com/postmodern/spidr) - versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use. * [Spidr](https://github.com/postmodern/spidr) - versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
* [kimuraframework](https://github.com/vifreefly/kimuraframework) - Modern web scraping framework written in Ruby which works out of box with Headless Chromium/Firefox, PhantomJS, or simple HTTP requests and allows to scrape and interact with JavaScript rendered websites * [kimuraframework](https://github.com/vifreefly/kimuraframework) - Modern web scraping framework written in Ruby which works out of box with Headless Chromium/Firefox, PhantomJS, or simple HTTP requests and allows to scrape and interact with JavaScript rendered websites
* [arachnid2](https://github.com/samnissen/arachnid2) A simple, fast, framework-less crawler with sensible defaults and lots of options. Crawls the page and runs your code directly against either Typhoeus responses or a Watir browser.
## HTML/XML Parsing ## HTML/XML Parsing