diff --git a/ruby.md b/ruby.md
index df8cc4b..c3be701 100644
--- a/ruby.md
+++ b/ruby.md
@@ -33,9 +33,10 @@ This list contains ruby libraries related to web scraping and data processing
 * [nestful](https://github.com/maccman/nestful) Simple Ruby HTTP/REST client with a sane API
 * [EM-HTTP-Request](https://github.com/igrigorik/em-http-request) - EventMachine based asynchronous HTTP client
+
 ## Web-Scraping Frameworks
- * TODO
+ * [upton](https://github.com/propublica/upton) - A batteries-included framework for easy web-scraping
 ## HTML/XML Parsing
@@ -53,17 +54,21 @@ This list contains ruby libraries related to web scraping and data processing
 *Libraries for parsing and manipulating specific text formats.*
-* Office
-  * [Yomu](https://github.com/Erol) - Read text and metadata from files and documents (.doc, .docx, .pages, .odt, .rtf, .pdf)
-  * [spreadsheet](https://github.com/zdavatz/spreadsheet) - The Spreadsheet Library is designed to read and write Spreadsheet Documents.
-  * [roo](https://github.com/Empact/roo) - Roo implements read access for all spreadsheet types and read/write access for Google spreadsheets.
-  * [google-spreadsheet-ruby](https://github.com/gimite/google-spreadsheet-ruby) - This is a library to read/write Google Spreadsheet.
-  * [rubyXL](https://github.com/weshatheleopard/rubyXL) - rubyXL is a gem which allows the parsing, creation, and manipulation of Microsoft Excel (.xlsx/.xlsm) Documents
-  * [remote_table](https://github.com/seamusabshere/remote_table) - Open local or remote XLSX, XLS, ODS, CSV (comma separated), TSV (tab separated), other delimited, fixed-width files, and Google Docs.
-  * [sheets](https://github.com/bspaulding/Sheets) - Work with spreadsheets easily in a native ruby format.
-  * [workbook](https://github.com/murb/workbook) - Workbook contains workbooks, as in a table, contains rows, contains cells, reads/writes excel, ods and csv and tab separated files...
-  * [oxcelix](https://github.com/gbiczo/oxcelix) - A fast Excel 2007/2010 (.xlsx) file parser that returns a collection of Matrix objects
-  * [wrap_excel](https://github.com/tomiacannondale/wrap_excel) - WrapExcel is to wrap the win32ole, and easy to use Excel operations with ruby. Detailed description please see the README.
+ * Office
+   * [Yomu](https://github.com/Erol) - Read text and metadata from files and documents (.doc, .docx, .pages, .odt, .rtf, .pdf)
+   * [spreadsheet](https://github.com/zdavatz/spreadsheet) - The Spreadsheet Library is designed to read and write Spreadsheet Documents.
+   * [roo](https://github.com/Empact/roo) - Roo implements read access for all spreadsheet types and read/write access for Google spreadsheets.
+   * [google-spreadsheet-ruby](https://github.com/gimite/google-spreadsheet-ruby) - This is a library to read/write Google Spreadsheet.
+   * [rubyXL](https://github.com/weshatheleopard/rubyXL) - rubyXL is a gem which allows the parsing, creation, and manipulation of Microsoft Excel (.xlsx/.xlsm) Documents
+   * [remote_table](https://github.com/seamusabshere/remote_table) - Open local or remote XLSX, XLS, ODS, CSV (comma separated), TSV (tab separated), other delimited, fixed-width files, and Google Docs.
+   * [sheets](https://github.com/bspaulding/Sheets) - Work with spreadsheets easily in a native ruby format.
+   * [workbook](https://github.com/murb/workbook) - Workbook contains workbooks, as in a table, contains rows, contains cells, reads/writes excel, ods and csv and tab separated files...
+   * [oxcelix](https://github.com/gbiczo/oxcelix) - A fast Excel 2007/2010 (.xlsx) file parser that returns a collection of Matrix objects
+   * [wrap_excel](https://github.com/tomiacannondale/wrap_excel) - WrapExcel is to wrap the win32ole, and easy to use Excel operations with ruby. Detailed description please see the README.
+ * libpcap
+   [PacketFul](https://github.com/packetfu/packetfu) - A library for reading and writing packets to an interface or to a libpcap-formatted file.
+ * JSON
+   * [JsonCompare](https://github.com/a2design-company/json-compare) - Returns the difference between two JSON files
 ## Natural Language Processing
@@ -97,6 +102,7 @@ This list contains ruby libraries related to web scraping and data processing
 * [Delayed::Job](https://github.com/tobi/delayed_job) — Database backed asynchronous priority queue.
 * [Qu](https://github.com/bkeepers/qu) A Ruby library for queuing and processing background jobs.
 * [Sidekiq](https://github.com/mperham/sidekiq) Simple, efficient background processing for Ruby
+ * [Sneakers](https://github.com/jondot/sneakers) - A fast background processing framework for Ruby and RabbitMQ
 ## Cloud Computing
 * TODO
@@ -117,7 +123,7 @@ This list contains ruby libraries related to web scraping and data processing
 *Libraries for extracting web contents.*
-* TODO
+* [Metainspector](https://github.com/jaimeiniesta/metainspector) - scrapes a given URL, and returns its title, meta description, meta keywords, an array with all the links, all the images in it, etc
 ## WebSocket