mirror of https://github.com/lorien/awesome-web-scraping.git synced 2024-11-24 08:32:19 +02:00

Add arachnid2 to Ruby scrapers

Sam Nissen 2020-06-25 16:52:15 +01:00
parent 3f44997732
commit 6b215d1ab8

@@ -49,6 +49,7 @@ This list contains ruby libraries related to web scraping and data processing
* [Anemone](https://github.com/chriskite/anemone) - web spider framework that can spider a domain and collect useful information about the pages it visits
* [Spidr](https://github.com/postmodern/spidr) - versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
* [kimuraframework](https://github.com/vifreefly/kimuraframework) - Modern web scraping framework written in Ruby which works out of box with Headless Chromium/Firefox, PhantomJS, or simple HTTP requests and allows to scrape and interact with JavaScript rendered websites
* [arachnid2](https://github.com/samnissen/arachnid2) A simple, fast, framework-less crawler with sensible defaults and lots of options. Crawls the page and runs your code directly against either Typhoeus responses or a Watir browser.
## HTML/XML Parsing
@@ -206,9 +207,9 @@ This list contains ruby libraries related to web scraping and data processing
## Multiprocessing
* [Celluloid](https://github.com/celluloid/celluloid) - Actor-based concurrent object framework for Ruby
* [Parallel](https://github.com/grosser/parallel) - Run any code in parallel Processes (> use all CPUs) or Threads (> speedup blocking operations).
* [Concurrent Ruby](https://github.com/ruby-concurrency/concurrent-ruby) - Modern concurrency tools including agents, futures, promises, thread pools, supervisors, and more.
* [childprocess](https://github.com/jarib/childprocess) - Cross-platform ruby library for managing child processes.
* [forkoff](https://github.com/ahoward/forkoff) - brain-dead simple parallel processing for ruby.
* [posix-spawn](https://github.com/rtomayko/posix-spawn) - Fast Process::spawn for Rubys >= 1.8.7 based on the posix_spawn() system interfaces.
@@ -278,7 +279,7 @@ This list contains ruby libraries related to web scraping and data processing
* [geocoder](https://github.com/alexreisner/geocoder) - A complete geocoding solution for Ruby. With Rails it adds geocoding (by street or IP address), reverse geocoding (find street address based on given coordinates), and distance queries.
* [Geokit](https://github.com/geokit/geokit) - Geokit gem provides geocoding and distance/heading calculations.
* [geoip](https://github.com/cjheath/geoip) - Searches a GeoIP database for a given host or IP address, and returns information about the country where the IP address is allocated, and the city, ISP and other information.
## Other Ruby Lists
* [awesome-ruby](https://github.com/markets/awesome-ruby/blob/master/README.md) by markets