1
0
mirror of https://github.com/BurntSushi/ripgrep.git synced 2024-12-12 19:18:24 +02:00
Commit Graph

609 Commits

Author SHA1 Message Date
Roman Proskuryakov
01deac9427 Add -0 shortcut for --null
Fixes #419
2017-03-28 18:37:40 -04:00
Andrew Gallant
b4bc3b6349 remove uninstall step 2017-03-28 12:14:15 -04:00
Andrew Gallant
685cc6c562 Add vim type.
It's the same as the vimscript type, but shorter and more obvious.

Fixes #415
2017-03-21 07:56:49 -04:00
Andrew Gallant
08c017330f bump termcolor dep 2017-03-15 07:15:39 -04:00
Andrew Gallant
2f3a8c7f69 termcolor-0.3.2 2017-03-15 06:58:09 -04:00
Andrew Gallant
3ac1b68e54 Add license info to termcolor crate.
Fixes #381
2017-03-15 06:57:54 -04:00
Andrew Gallant
0ebd5465b7 remove allow(dead_code) 2017-03-14 15:09:24 -04:00
Andrew Gallant
5cb4bb9ea0 bump ripgrep version in Cargo.lock 2017-03-14 15:09:24 -04:00
Leaf Garland
c8a179b4da Add powershell completions to build artifacts
Use a `ps:` (powershell) command to copy the completions file so that we
can use directory globbing to find the file.
2017-03-14 08:53:04 -04:00
Andrew Gallant
46f94826fd Update whirlwind tour with encoding info.
Fixes #1
2017-03-14 08:22:37 -04:00
Andrew Gallant
75f1855a91 Fix brew tap sha256 sum.
Fixes #407
2017-03-13 06:50:45 -04:00
Andrew Gallant
fd9870d668 Revert "Add _rg.ps1 to windows zip"
This reverts commit d570f78144.

This was reverted because it's blocking a 0.5.0 release. Windows is
foreign to me, and apparently globs are not allowed.

See:
https://ci.appveyor.com/project/BurntSushi/ripgrep/build/1.0.341/job/7o1jqicmtwm7oa00
2017-03-12 22:59:37 -04:00
Andrew Gallant
a3a2708067 update brew tap to 0.5.0 2017-03-12 22:55:59 -04:00
Andrew Gallant
78847b65c8 0.5.0 2017-03-12 22:32:43 -04:00
Andrew Gallant
e962eea1cc add date to CHANGELOG 2017-03-12 22:31:57 -04:00
Andrew Gallant
95bc678403 Fix interaction with clap.
Previously, `get_matches` would return even if --help or --version was
given, and we could check for them manually. That behavior seems to have
changed. Instead, we must use get_matches_safe to inspect the error to
determine what happened.

We can't use the same process for -V/--version since clap will
unconditionally print its own version info. Instead, we rename (internally)
the version flag so that clap doesn't interfere.
2017-03-12 22:30:54 -04:00
Andrew Gallant
68af3bbdc4 fix CHANGELOG link 2017-03-12 21:58:29 -04:00
Andrew Gallant
70b6bdb104 changelog 0.5.0 2017-03-12 21:57:50 -04:00
Andrew Gallant
c648eadbaa Bump and update deps. 2017-03-12 21:33:13 -04:00
Ralf Jung
d352b79294 Add new -M/--max-columns option.
This permits setting the maximum line width with respect to the number
of bytes in a line. Omitted lines (whether part of a match, replacement
or context) are replaced with a message stating that the line was
elided.

Fixes #129
2017-03-12 21:21:28 -04:00
Andrew Gallant
23aec58669 pin nightly 2017-03-12 20:52:28 -04:00
Andrew Gallant
ae863bc7aa Improve docs for --glob flag.
Fixes #345.
2017-03-12 20:31:09 -04:00
Andrew Gallant
f0d3cae569 Clarify -u/--unrestricted flags.
Fixes #340
2017-03-12 20:24:45 -04:00
Andrew Gallant
4ef4818130 No line numbers when searching only stdin.
This changes the default behavior of ripgrep to *not* show line numbers
when it is printing to a tty and is only searching stdin.

Fixes #380

[breaking-change]
2017-03-12 20:21:40 -04:00
Andrew Gallant
8db24e1353 Stop aggressive inlining.
It's not clear what exactly is happening here, but the Read implementation
for text decoding appears a bit sensitive. Small pertubations in the code
appear to have a nearly 100% impact on the overall speed of ripgrep when
searching UTF-16 files.

I haven't had the time to examine the generated code in detail, but
`perf stat` seems to think that the instruction cache is performing a lot
worse when the code slows down. This might mean that excessive inlining
causes a different code structure that leads to less-than-optimal icache
usage, but it's at best a guess.

Explicitly disabling the inline for the cold path seems to help the
optimizer figure out the right thing.
2017-03-12 20:21:22 -04:00
Andrew Gallant
8bbe58d623 Add support for additional text encodings.
This includes, but is not limited to, UTF-16, latin-1, GBK, EUC-JP and
Shift_JIS. (Courtesy of the `encoding_rs` crate.)

Specifically, this feature enables ripgrep to search files that are
encoded in an encoding other than UTF-8. The list of available encodings
is tied directly to what the `encoding_rs` crate supports, which is in
turn tied to the Encoding Standard. The full list of available encodings
can be found here: https://encoding.spec.whatwg.org/#concept-encoding-get

This pull request also introduces the notion that text encodings can be
automatically detected on a best effort basis. Currently, the only
support for this is checking for a UTF-16 bom. In all other cases, a
text encoding of `auto` (the default) implies a UTF-8 or ASCII
compatible source encoding. When a text encoding is otherwise specified,
it is unconditionally used for all files searched.

Since ripgrep's regex engine is fundamentally built on top of UTF-8,
this feature works by transcoding the files to be searched from their
source encoding to UTF-8. This transcoding only happens when:

1. `auto` is specified and a non-UTF-8 encoding is detected.
2. A specific encoding is given by end users (including UTF-8).

When transcoding occurs, errors are handled by automatically inserting
the Unicode replacement character. In this case, ripgrep's output is
guaranteed to be valid UTF-8 (excluding non-UTF-8 file paths, if they
are printed).

In all other cases, the source text is searched directly, which implies
an assumption that it is at least ASCII compatible, but where UTF-8 is
most useful. In this scenario, encoding errors are not detected. In this
case, ripgrep's output will match the input exactly, byte-for-byte.

This design may not be optimal in all cases, but it has some advantages:

1. In the happy path ("UTF-8 everywhere") remains happy. I have not been
   able to witness any performance regressions.
2. In the non-UTF-8 path, implementation complexity is kept relatively
   low. The cost here is transcoding itself. A potentially superior
   implementation might build decoding of any encoding into the regex
   engine itself. In particular, the fundamental problem with
   transcoding everything first is that literal optimizations are nearly
   negated.

Future work should entail improving the user experience. For example, we
might want to auto-detect more text encodings. A more elaborate UX
experience might permit end users to specify multiple text encodings,
although this seems hard to pull off in an ergonomic way.

Fixes #1
2017-03-12 19:54:48 -04:00
Joshua Horwitz
b3fd0df94b Fixes #394 - Added in svg to the types file 2017-03-12 19:52:01 -04:00
Andrew Gallant
c1b841e934 Add license files to each crate.
Fixes #381
2017-03-12 16:57:15 -04:00
Andrew Gallant
f5ede0e319 Add scss and ejs.
We add scss to the existing `css` file type and `ejs` to the existing
`html` file type.

Fixes #393
2017-03-12 16:51:55 -04:00
Andrew Gallant
6ecffec537 Fix test on Windows.
(This is what I get for directly pushing to master.)
2017-03-12 16:07:31 -04:00
Andrew Gallant
80e91a1f1d Fix leading slash bug when used with !.
When writing paths like `!/foo` in gitignore files (or when using the
-g/--glob flag), the presence of `!` would prevent the gitignore builder
from noticing the leading slash, which causes absolute path matching to
fail.

Fixes #405
2017-03-12 15:51:17 -04:00
Daniel Santa Cruz
d570f78144 Add _rg.ps1 to windows zip
Tested with local cargo build paths.
2017-03-09 09:45:28 -05:00
Andrew Gallant
7c37065911 update deps 2017-03-08 20:23:12 -05:00
Jean-Marie Comets
50f7a60a8d Add "Known issues" section in README.md
Also document that ctrl-c doesn't restore the termcolor.
Fixes #347.
2017-03-08 10:18:19 -05:00
Marc Tiehuis
33ec988d70 Remove regex build-dependency in Cargo.toml 2017-03-08 10:17:18 -05:00
Marc Tiehuis
adff43fbb4 Remove clap validator + add max-filesize integration tests 2017-03-08 10:17:18 -05:00
Marc Tiehuis
71585f6d47 Reduce unnecessary stat calls for max_filesize 2017-03-08 10:17:18 -05:00
tiehuis
714ae82241 Add --max-filesize option to cli
The --max-filesize option allows filtering files which are larger than
the specified limit. This is potentially useful if one is attempting to
search a number of large files without common file-types/suffixes.

See #369.
2017-03-08 10:17:18 -05:00
tiehuis
49fd668712 Add file size exclusion to walker
A maximum filesize can be specified as an argument to a `WalkBuilder`.
If a file exceeds the specified size it will be ignored as part of the
resulting file/directory set.

The filesize limit never applies to directories.
2017-03-08 10:17:18 -05:00
Marc Tiehuis
066f97d855 Add enclosing group to alternations in globs
Fixes #391.
2017-03-08 10:13:28 -05:00
David Salter
df1bf4a042 Added Chocolatey to the installation list 2017-03-01 06:41:52 -05:00
Andrew Gallant
4e8c0fc4ad bump clap to 2.20.5
Fixes #383
2017-02-25 18:43:13 -05:00
Igor Gnatenko
da1764dfd1 update env_logger to 0.4 2017-02-25 17:46:43 -05:00
Andrew Gallant
48a8a3a691 kick travis 2017-02-24 08:41:20 -05:00
deepy
796eaab0d7 Add .log as FileType 2017-02-23 11:41:32 -05:00
Andrew Gallant
bf49448e1e fix badges 2017-02-19 11:28:36 -05:00
Andrew Gallant
cffba53379 use termcolor 0.3, not 0.1 2017-02-19 11:27:41 -05:00
Andrew Gallant
79d40d0e20 Tweak how binary files are handled internally.
This commit fixes two issues. The first issue is that if a file contained
many NUL bytes without any LF bytes, then the InputBuffer would read the
entire file into memory. This is not typically a problem, but if you run
rg on /proc, then bad things can happen when reading virtual memory mapping
files. Arguably, such files should be ignored, but we should also try to
avoid exhausting memory too. We fix this by pushing the `-a/--text` flag
option down into InputBuffer, so that it knows to stop immediately if it
finds a NUL byte.

The other issue this fixes is that binary detection is now applied to every
buffer instead of just the first one. This helps avoid detecting too many
files as plain text if the first parts of a binary file happen to contain
no NUL bytes. This issue still persists somewhat in the memory map
searcher, since we probably don't want to search the entire file upfront
for NUL bytes before actually performing our search. Instead, we search the
first 10KB for now.

Fixes #52, Fixes #311
2017-02-18 16:20:21 -05:00
Andrew Gallant
525b278049 Don't parses regexes with --files.
When the --files flag is given, ripgrep would still try to parse some of
the positional arguments as regexes. Don't do that.

Fixes #326
2017-02-18 15:34:54 -05:00
Andrew Gallant
16de47920c Permit --heading to override --no-heading.
@kbknapp <3

Fixes #327
2017-02-18 15:25:08 -05:00