ripgrep

mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00

Author	SHA1	Message	Date
dana	86c890bcec	Improve detection of upper-case characters by smart-case feature. Fixes #717 (partially). The previous implementation of the smart-case feature was actually too smart, in that it inspected the final character ranges in the AST to determine if the pattern contained upper-case characters. This meant that patterns like `foo\w` would not be handled case-insensitively, since `\w` includes the range of upper-case characters A–Z. As a medium-term solution to this problem, we now inspect the input pattern itself for upper-case characters, ignoring any that immediately follow a `\`. This neatly handles all of the most basic cases like `\w`, `\S`, and `É`, though it still has problems with more complex features like `\p{Ll}`. Handling those correctly will require improvements to the AST.	2017-12-18 17:58:26 -05:00
Andrew Gallant	efa4de8126	cargo: bump to 0.7.0	2017-10-21 22:40:10 -04:00
Andrew Gallant	1267f01c24	deps: upgrade to memchr 2	2017-10-21 22:40:09 -04:00
Andrew Gallant	c648eadbaa	Bump and update deps.	2017-03-12 21:33:13 -04:00
Andrew Gallant	c1b841e934	Add license files to each crate. Fixes #381	2017-03-12 16:57:15 -04:00
Andrew Gallant	057ed6305a	0.4.0	2017-01-13 23:46:21 -05:00
Andrew Gallant	163e00677a	Update to regex 0.2.	2017-01-01 01:03:21 -05:00
Andrew Gallant	d58236fbdc	bump various versions	2016-12-30 15:44:08 -05:00
Andrew Gallant	b65bb37b14	Remove superfluous memmap dependency in `grep` crate. Fixes #295.	2016-12-27 15:46:40 -05:00
Andrew Gallant	0473df1ef5	Disable Unicode mode for literal regex. When ripgrep detects a literal, it emits them as raw hex escaped byte sequences to Regex::new. This permits literal optimizations for arbitrary byte sequences (i.e., possibly invalid UTF-8). The problem is that Regex::new interprets hex escaped byte sequences as Unicode codepoints by default, but we want them to actually stand for their raw byte values. Therefore, disable Unicode mode. This is OK, since the regex is composed entirely of literals and literal extraction does Unicode case folding. Fixes #251	2016-11-28 18:31:58 -05:00
Andrew Gallant	301a3fd71d	Detect more uppercase literals for --smart-case. This changes the uppercase literal detection for the "smart case" functionality. In particular, a character class is considered to have an uppercase literal if at least one of its ranges starts or stops with an uppercase literal. Fixes #229	2016-11-28 17:57:26 -05:00
Andrew Gallant	8baa0e56b7	grep-0.1.4	2016-11-06 15:35:17 -05:00
Andrew Gallant	0222e024fe	Fixes a bug with --smart-case. This was a subtle bug, but the big picture was that the smart case information wasn't being carried through to the literal extraction in some cases. When this happened, it was possible to get back an incomplete set of literals, which would therefore miss some valid matches. The fix to this is to actually parse the regex and determine whether smart case applies before doing anything else. It's a little extra work, but parsing is pretty fast. Fixes #199	2016-11-06 12:07:47 -05:00
Andrew Gallant	d79add341b	Move all gitignore matching to separate crate. This PR introduces a new sub-crate, `ignore`, which primarily provides a fast recursive directory iterator that respects ignore files like gitignore and other configurable filtering rules based on globs or even file types. This results in a substantial source of complexity moved out of ripgrep's core and into a reusable component that others can now (hopefully) benefit from. While much of the ignore code carried over from ripgrep's core, a substantial portion of it was rewritten with the following goals in mind: 1. Reuse matchers built from gitignore files across directory iteration. 2. Design the matcher data structure to be amenable for parallelizing directory iteration. (Indeed, writing the parallel iterator is the next step.) Fixes #9, #44, #45	2016-10-29 20:48:59 -04:00
Andrew Gallant	d3e118a786	Fix debug expression statement.	2016-10-10 21:48:34 -04:00
Andrew Gallant	b62195b33f	grep 0.1.3	2016-09-25 22:29:35 -04:00
Andrew Gallant	6a8051b258	Don't union inner literals of repetitions. If we do, this results in extracting `foofoofoo` from `(\wfoo){3}`, which is wrong. This does prevent us from extracting `foofoofoo` from `foo{3}`, which is unfortunate, but we miss plenty of other stuff too. Literal extracting needs a good rethink (all the way down into the regex engine). Fixes #93	2016-09-25 20:10:28 -04:00
Andrew Gallant	1595f0faf5	Add --smart-case. It does what it says on the tin. Closes #70.	2016-09-24 21:51:04 -04:00
Andrew Gallant	24e14a0341	grep 0.1.2	2016-09-21 19:14:12 -04:00
Andrew Gallant	2a2b1506d4	Fix a performance bug where using -w could result in very bad performance. The specific issue is that -w causes the regex to be wrapped in Unicode word boundaries. Regrettably, Unicode word boundaries are the one thing our regex engine can't handle well in the presence of non-ASCII text. We work around its slowness by stripping word boundaries in some circumstances, and using the resulting expression as a way to produce match candidates that are then verified by the full original regex. This doesn't fix all cases, but it should fix all cases where -w is used.	2016-09-21 19:12:07 -04:00
Andrew Gallant	4d6b3c727e	Bump regex version.	2016-09-21 19:05:15 -04:00
Andrew Gallant	bf5d873099	grep 0.1.1	2016-09-17 11:32:47 -04:00
Andrew Gallant	d22a3ca3e5	Improve the "bad literal" error message. Incidentally, this was done by using the Debug impl for `char` instead of the Display impl. Cute. Fixes #5.	2016-09-16 18:12:00 -04:00
Andrew Gallant	5fdfae2f15	add readme	2016-09-13 21:15:10 -04:00
Andrew Gallant	7057ee91de	update grep Cargo.toml	2016-09-13 21:13:33 -04:00
Andrew Gallant	954fbeb1d8	Update regex.	2016-09-11 18:52:42 -04:00
Andrew Gallant	98a48b44bc	Fix off-by-one bug in searcher.	2016-09-10 01:35:30 -04:00
Andrew Gallant	a744ec133d	Rename xrep to ripgrep.	2016-09-08 16:15:44 -04:00
Andrew Gallant	af3b56a623	Fix grep match iterator.	2016-09-06 21:45:41 -04:00
Andrew Gallant	fd3e5069b6	Fix required literal handling and add debug prints. In particular, if we had an inner literal and were doing a case insensitive search, then the literals are dropped because we previously only allowed a single inner literal to have an effect. Now we allow alternations of inner literals, but still don't quite take full advantage.	2016-09-06 19:33:03 -04:00
Andrew Gallant	2bda77c414	Fix deps so that others can build it.	2016-09-05 18:22:12 -04:00
Andrew Gallant	0bf278e72f	making search work (finally)	2016-09-03 21:48:23 -04:00
Andrew Gallant	d011cea053	The search code is a mess, but... we now support inverted matches and line numbers!	2016-08-29 22:44:15 -04:00
Andrew Gallant	1c8379f55a	Implementing core functionality. Initially experimenting with crossbeam to manage synchronization.	2016-08-28 01:37:12 -04:00
Andrew Gallant	957f90c898	docs and small polish	2016-08-24 18:33:35 -04:00
Andrew Gallant	61f49ba716	Remove the buffered reader. We really need functionality like this when memory maps aren't suitable, either because they're too slow or because they just aren't available (like for reading stdin). However, this particular approach was completely bunk. Namely, the interface was all wrong. The caller needs to maintain some kind of control over the search buffers for special output features (like contexts or inverted matching), but this interface as written doesn't support that kind of pattern at all. So... back to the drawing board.	2016-08-24 18:06:42 -04:00
Andrew Gallant	e97d75c024	Refactor buffered test.	2016-08-08 19:17:25 -04:00
Andrew Gallant	076eeff3ea	update	2016-08-05 00:10:58 -04:00
Andrew Gallant	a3f609222c	progress	2016-06-22 21:19:02 -04:00
Andrew Gallant	0163b39faa	refactor progress	2016-06-20 16:55:13 -04:00

40 Commits
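
The two smart-case heuristics described in commits 301a3fd71d and 86c890bcec can be contrasted with a small sketch. This is not ripgrep's actual code: the function names (`class_has_uppercase`, `has_unescaped_uppercase`, `smart_case_insensitive`) are hypothetical, and a character class is modeled simply as a slice of `(start, end)` ranges. It only illustrates the rules as the commit messages state them.

```rust
/// Older rule (commit 301a3fd71d, as described): a character class counts as
/// containing an upper-case literal if any of its ranges starts or stops with
/// an upper-case character. The `(char, char)` range model is an assumption
/// made for this sketch.
fn class_has_uppercase(ranges: &[(char, char)]) -> bool {
    ranges
        .iter()
        .any(|&(start, end)| start.is_uppercase() || end.is_uppercase())
}

/// Newer rule (commit 86c890bcec, as described): scan the raw pattern text
/// for upper-case characters, ignoring any character that immediately
/// follows a backslash.
fn has_unescaped_uppercase(pattern: &str) -> bool {
    let mut escaped = false;
    for c in pattern.chars() {
        if escaped {
            // This character immediately follows a `\`, so it is skipped.
            escaped = false;
            continue;
        }
        if c == '\\' {
            escaped = true;
        } else if c.is_uppercase() {
            return true;
        }
    }
    false
}

/// Smart case: search case-insensitively only when the pattern has no
/// upper-case character under the newer rule.
fn smart_case_insensitive(pattern: &str) -> bool {
    !has_unescaped_uppercase(pattern)
}
```

Under the older rule, an ASCII-only expansion of `\w` such as `[('0', '9'), ('A', 'Z'), ('_', '_'), ('a', 'z')]` counts as upper-case because of the `A`-`Z` range, which is why `foo\w` was searched case-sensitively. Under the newer rule, `smart_case_insensitive("foo\\w")` is true, while `smart_case_insensitive("\\p{Ll}")` is false because the `L` inside the braces does not directly follow the backslash. That is the remaining limitation the commit message mentions.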