1
0
mirror of https://github.com/BurntSushi/ripgrep.git synced 2025-06-25 14:22:54 +02:00
Commit Graph

1389 Commits

Author SHA1 Message Date
23aec58669 pin nightly 2017-03-12 20:52:28 -04:00
ae863bc7aa Improve docs for --glob flag.
Fixes #345.
2017-03-12 20:31:09 -04:00
f0d3cae569 Clarify -u/--unrestricted flags.
Fixes #340
2017-03-12 20:24:45 -04:00
4ef4818130 No line numbers when searching only stdin.
This changes the default behavior of ripgrep to *not* show line numbers
when it is printing to a tty and is only searching stdin.

Fixes #380

[breaking-change]
2017-03-12 20:21:40 -04:00
8db24e1353 Stop aggressive inlining.
It's not clear what exactly is happening here, but the Read implementation
for text decoding appears a bit sensitive. Small pertubations in the code
appear to have a nearly 100% impact on the overall speed of ripgrep when
searching UTF-16 files.

I haven't had the time to examine the generated code in detail, but
`perf stat` seems to think that the instruction cache is performing a lot
worse when the code slows down. This might mean that excessive inlining
causes a different code structure that leads to less-than-optimal icache
usage, but it's at best a guess.

Explicitly disabling the inline for the cold path seems to help the
optimizer figure out the right thing.
2017-03-12 20:21:22 -04:00
8bbe58d623 Add support for additional text encodings.
This includes, but is not limited to, UTF-16, latin-1, GBK, EUC-JP and
Shift_JIS. (Courtesy of the `encoding_rs` crate.)

Specifically, this feature enables ripgrep to search files that are
encoded in an encoding other than UTF-8. The list of available encodings
is tied directly to what the `encoding_rs` crate supports, which is in
turn tied to the Encoding Standard. The full list of available encodings
can be found here: https://encoding.spec.whatwg.org/#concept-encoding-get

This pull request also introduces the notion that text encodings can be
automatically detected on a best effort basis. Currently, the only
support for this is checking for a UTF-16 bom. In all other cases, a
text encoding of `auto` (the default) implies a UTF-8 or ASCII
compatible source encoding. When a text encoding is otherwise specified,
it is unconditionally used for all files searched.

Since ripgrep's regex engine is fundamentally built on top of UTF-8,
this feature works by transcoding the files to be searched from their
source encoding to UTF-8. This transcoding only happens when:

1. `auto` is specified and a non-UTF-8 encoding is detected.
2. A specific encoding is given by end users (including UTF-8).

When transcoding occurs, errors are handled by automatically inserting
the Unicode replacement character. In this case, ripgrep's output is
guaranteed to be valid UTF-8 (excluding non-UTF-8 file paths, if they
are printed).

In all other cases, the source text is searched directly, which implies
an assumption that it is at least ASCII compatible, but where UTF-8 is
most useful. In this scenario, encoding errors are not detected. In this
case, ripgrep's output will match the input exactly, byte-for-byte.

This design may not be optimal in all cases, but it has some advantages:

1. In the happy path ("UTF-8 everywhere") remains happy. I have not been
   able to witness any performance regressions.
2. In the non-UTF-8 path, implementation complexity is kept relatively
   low. The cost here is transcoding itself. A potentially superior
   implementation might build decoding of any encoding into the regex
   engine itself. In particular, the fundamental problem with
   transcoding everything first is that literal optimizations are nearly
   negated.

Future work should entail improving the user experience. For example, we
might want to auto-detect more text encodings. A more elaborate UX
experience might permit end users to specify multiple text encodings,
although this seems hard to pull off in an ergonomic way.

Fixes #1
2017-03-12 19:54:48 -04:00
b3fd0df94b Fixes #394 - Added in svg to the types file 2017-03-12 19:52:01 -04:00
c1b841e934 Add license files to each crate.
Fixes #381
2017-03-12 16:57:15 -04:00
f5ede0e319 Add scss and ejs.
We add scss to the existing `css` file type and `ejs` to the existing
`html` file type.

Fixes #393
2017-03-12 16:51:55 -04:00
6ecffec537 Fix test on Windows.
(This is what I get for directly pushing to master.)
2017-03-12 16:07:31 -04:00
80e91a1f1d Fix leading slash bug when used with !.
When writing paths like `!/foo` in gitignore files (or when using the
-g/--glob flag), the presence of `!` would prevent the gitignore builder
from noticing the leading slash, which causes absolute path matching to
fail.

Fixes #405
2017-03-12 15:51:17 -04:00
d570f78144 Add _rg.ps1 to windows zip
Tested with local cargo build paths.
2017-03-09 09:45:28 -05:00
7c37065911 update deps 2017-03-08 20:23:12 -05:00
50f7a60a8d Add "Known issues" section in README.md
Also document that ctrl-c doesn't restore the termcolor.
Fixes #347.
2017-03-08 10:18:19 -05:00
33ec988d70 Remove regex build-dependency in Cargo.toml 2017-03-08 10:17:18 -05:00
adff43fbb4 Remove clap validator + add max-filesize integration tests 2017-03-08 10:17:18 -05:00
71585f6d47 Reduce unnecessary stat calls for max_filesize 2017-03-08 10:17:18 -05:00
714ae82241 Add --max-filesize option to cli
The --max-filesize option allows filtering files which are larger than
the specified limit. This is potentially useful if one is attempting to
search a number of large files without common file-types/suffixes.

See #369.
2017-03-08 10:17:18 -05:00
49fd668712 Add file size exclusion to walker
A maximum filesize can be specified as an argument to a `WalkBuilder`.
If a file exceeds the specified size it will be ignored as part of the
resulting file/directory set.

The filesize limit never applies to directories.
2017-03-08 10:17:18 -05:00
066f97d855 Add enclosing group to alternations in globs
Fixes #391.
2017-03-08 10:13:28 -05:00
df1bf4a042 Added Chocolatey to the installation list 2017-03-01 06:41:52 -05:00
4e8c0fc4ad bump clap to 2.20.5
Fixes #383
2017-02-25 18:43:13 -05:00
da1764dfd1 update env_logger to 0.4 2017-02-25 17:46:43 -05:00
48a8a3a691 kick travis 2017-02-24 08:41:20 -05:00
796eaab0d7 Add .log as FileType 2017-02-23 11:41:32 -05:00
bf49448e1e fix badges 2017-02-19 11:28:36 -05:00
cffba53379 use termcolor 0.3, not 0.1 2017-02-19 11:27:41 -05:00
79d40d0e20 Tweak how binary files are handled internally.
This commit fixes two issues. The first issue is that if a file contained
many NUL bytes without any LF bytes, then the InputBuffer would read the
entire file into memory. This is not typically a problem, but if you run
rg on /proc, then bad things can happen when reading virtual memory mapping
files. Arguably, such files should be ignored, but we should also try to
avoid exhausting memory too. We fix this by pushing the `-a/--text` flag
option down into InputBuffer, so that it knows to stop immediately if it
finds a NUL byte.

The other issue this fixes is that binary detection is now applied to every
buffer instead of just the first one. This helps avoid detecting too many
files as plain text if the first parts of a binary file happen to contain
no NUL bytes. This issue still persists somewhat in the memory map
searcher, since we probably don't want to search the entire file upfront
for NUL bytes before actually performing our search. Instead, we search the
first 10KB for now.

Fixes #52, Fixes #311
2017-02-18 16:20:21 -05:00
525b278049 Don't parses regexes with --files.
When the --files flag is given, ripgrep would still try to parse some of
the positional arguments as regexes. Don't do that.

Fixes #326
2017-02-18 15:34:54 -05:00
16de47920c Permit --heading to override --no-heading.
@kbknapp <3

Fixes #327
2017-02-18 15:25:08 -05:00
a114b86063 update termcolor dep 2017-02-18 15:09:25 -05:00
a5a16ebb27 termcolor-0.3.0 termcolor-0.3.0 2017-02-18 15:07:43 -05:00
8ac5bc0147 Remove Windows deps from ripgrep proper.
All Windows specific code has been (mostly) pushed out of ripgrep and
into its constituent libraries.
2017-02-18 15:06:20 -05:00
cf750a190f Implement Hash for Glob, and re-implement PartialEq using only non-redundant fields. 2017-02-18 11:46:03 -05:00
d825648b86 Remove lazy_static from globset 2017-02-12 15:37:50 -05:00
22cb644eb6 termcolor: add support for output to standard error
This is essentially a rename of the existing `Stdout` type to `StandardStream`
and a change of its constructor from a single `new()` function to have two
`stdout()` and `stderr()` functions.

Under the hood, we add add internal IoStandardStream{,Lock} enums that allow
us to abstract between Stdout and Stderr conveniently. The rest of the needed
changes then fall out fairly naturally.

Fixes #324.

[breaking-change]
2017-02-09 20:57:23 -05:00
e424f87487 Add .twig as a Filetype 2017-02-09 15:12:48 -05:00
f5b2c96b77 add '*.sass' to sass type 2017-01-31 07:26:25 -05:00
6e209b6fdb Look for global git/ignore in ~/.config/git, not ~/git
The documentation says:

> If `$XDG_CONFIG_HOME` is not set or is empty, then
> `$HOME/.config/git/ignore` is used instead.

This is the expected behavior, but the code looked at ~/git/ignore
instead.
2017-01-30 16:58:58 -05:00
72e3c54e0a Add Ceylon file type filtering 2017-01-24 07:16:53 -05:00
b67886264f Add 'text-processing' category. 2017-01-21 11:31:09 -05:00
e67ab459d3 Add categories to Cargo.toml 2017-01-20 14:20:28 -05:00
7a926d090d Fix homebrew formula 2017-01-19 08:33:54 -05:00
596f94aa7f Add shell completion files to ripgrep-bin.rb 2017-01-18 18:39:11 -05:00
de55d37bea File Types: Add .eex under Elixir
.eex is the default file ending for templates using Elixir's template engine EEx.
2017-01-18 18:38:53 -05:00
fecef10c1c update deps 2017-01-17 19:36:23 -05:00
79e5e6671f wincolor-0.1.2 wincolor-0.1.2 2017-01-17 19:34:48 -05:00
b04a68a782 Update 0.4.0 changelog.
It was missing a change about colors/styles.

Fixes #330
2017-01-17 19:34:18 -05:00
e573ab5c60 Add stderr support to wincolor. 2017-01-17 10:42:35 -05:00
f5a2d022ec Replace internal atty module with atty crate.
This removes all use of explicit unsafe in ripgrep proper except for
one: accessing the contents of a memory map. (Which may never go away.)
2017-01-15 16:32:30 -05:00