mirror of https://github.com/lorien/awesome-web-scraping.git synced 2024-11-24 08:32:19 +02:00

Update ruby.md

Gregory Petukhov 2015-08-16 23:17:38 +05:00
parent fbe1bc1d86
commit 670c6de632

14
ruby.md

@ -86,6 +86,8 @@ This list contains ruby libraries related to web scraping and data processing
*Libraries for parsing and manipulating specific text formats.*
* General
* [markup](https://github.com/github/markup) - GitHub library to convert Markdown, RST, Creole, etc. into HTML.
* Office
* [Yomu](https://github.com/Erol) - Read text and metadata from files and documents (.doc, .docx, .pages, .odt, .rtf, .pdf)
* [spreadsheet](https://github.com/zdavatz/spreadsheet) - The Spreadsheet Library is designed to read and write Spreadsheet Documents.
@ -101,6 +103,10 @@ This list contains ruby libraries related to web scraping and data processing
[PacketFu](https://github.com/packetfu/packetfu) - A library for reading and writing packets to an interface or to a libpcap-formatted file.
* JSON
* [JsonCompare](https://github.com/a2design-company/json-compare) - Returns the difference between two JSON files
* [JSON](https://github.com/flori/json) - JSON implementation for Ruby, available as both a pure-Ruby version and a C extension.
* [JSON::Stream](https://github.com/dgraham/json-stream) - a streaming JSON parser that generates SAX-like events.
* [YAJL](https://github.com/brianmario/yajl-ruby) - a streaming JSON parsing and encoding library for Ruby (C bindings to YAJL).
* [OJ](https://github.com/ohler55/oj) - Optimized JSON, a JSON parser and serializer written for speed. Its author reports that it is about 2 times faster than any other Ruby JSON parser and 3 or more times faster at serializing JSON.
* Markdown
* [kramdown](https://github.com/gettalong/kramdown) - Kramdown is yet-another-markdown-parser but fast, pure Ruby, using a strict syntax definition and supporting several common extensions.
* [Maruku](https://github.com/bhollis/maruku) - A pure-Ruby Markdown-superset interpreter.
@ -110,6 +116,12 @@ This list contains ruby libraries related to web scraping and data processing
* [Feedjira](https://github.com/feedjira/feedjira) - A feed fetching and parsing library.
* [Ratom](https://github.com/seangeo/ratom) - A fast, libxml based, Ruby Atom library.
* [Simple rss](https://github.com/cardmagic/simple-rss) - A simple, flexible, extensible, and liberal RSS and Atom reader.
* BSON
* [BSON](https://github.com/mongodb/bson-ruby) - Ruby implementation of the BSON specification (2.0.0+). See http://bsonspec.org
* MessagePack
* [MessagePack](https://github.com/msgpack/msgpack-ruby) - an efficient binary serialization format. Like JSON, it lets you exchange data among multiple languages, but it is faster and smaller. For example, small integers (such as flags or error codes) are encoded in a single byte, and typical short strings need only one extra byte in addition to the string itself. See http://msgpack.org
* Protobuf
* [Protobuf](https://github.com/localshred/protobuf) - Ruby implementation of Protocol Buffers.
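
The size claim in the MessagePack entry can be checked by hand. The following is a minimal sketch, assuming the "positive fixint" and "fixstr" layouts from the MessagePack format specification; it does not use the msgpack gem, and the helper names `pack_small_int` and `pack_short_string` are hypothetical, chosen only for illustration.

```ruby
# Hand-rolled sketch of two MessagePack encodings (positive fixint, fixstr).
# This is NOT the msgpack gem's API; both helper names are hypothetical.

# Integers 0..127 are stored as the value itself, in a single byte.
def pack_small_int(n)
  unless n.is_a?(Integer) && n.between?(0, 127)
    raise ArgumentError, "expected an Integer in 0..127"
  end
  [n].pack("C")
end

# Strings of at most 31 bytes get a one-byte header: 0xa0 OR-ed with
# the byte length, followed by the raw string bytes.
def pack_short_string(s)
  bytes = s.encode("UTF-8").b
  raise ArgumentError, "expected at most 31 bytes" if bytes.bytesize > 31
  [0xa0 | bytes.bytesize].pack("C") + bytes
end

flag = pack_small_int(1)
word = pack_short_string("ok")

puts flag.bytesize  # one byte for the integer
puts word.bytesize  # two bytes of text plus one header byte
```

Larger integers and longer strings use other, wider layouts that this sketch deliberately leaves out; for real use, the gem listed above handles every type.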
## Natural Language Processing
@ -136,6 +148,8 @@ This list contains ruby libraries related to web scraping and data processing
* [childprocess](https://github.com/jarib/childprocess) - Cross-platform ruby library for managing child processes.
* [forkoff](https://github.com/ahoward/forkoff) - brain-dead simple parallel processing for ruby.
* [posix-spawn](https://github.com/rtomayko/posix-spawn) - Fast Process::spawn for Ruby >= 1.8.7 based on the posix_spawn() system interfaces.
* [thread](https://github.com/meh/ruby-thread) - extensions to the thread library (includes thread pool).
* [Spawnling](https://github.com/dreikanter/ruby-bookmarks) - spawn gem for Rails to easily fork or thread long-running code blocks.
## Asynchronous