mirror of https://github.com/alecthomas/chroma.git synced 2025-03-25 21:39:02 +02:00

23 Commits

Author SHA1 Message Date
Gusted
4d11870090
Don't output extra whitespace in YAML multiline ()
This resolves a particular issue with parsing YAML multiline, for
example:
```yaml
a: |
  multiline literal
  line 2
```

The regex captured the amount of indentation in the third capture
group and then used that as a kind of "state" to determine which lines
are part of the indented multiline block. However, because it is a
capture group, it has to be assigned a token, which was `TextWhitespace`.
This meant that the indentation was output again after the multiline
block. Technically it should be a non-capturing group, but then it is no
longer possible to refer to it in the regex. Therefore I've gone with
the solution of adding a new token, Ignore, which is not emitted as a
token by the iterator, so capture groups can safely be used without
showing up in the output.
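
To illustrate the idea, here is a minimal sketch of an iterator wrapper that drops tokens marked as ignored. The `Token`, `TokenType`, `Iterator`, `Literal`, `SkipIgnored` and `Collect` definitions below are simplified stand-ins written for this example and are assumptions, not Chroma's actual implementation.

```go
package main

import "fmt"

// TokenType and Token are simplified stand-ins for illustration only.
type TokenType int

const (
	Ignore TokenType = iota // matched by the lexer, but never emitted
	TextWhitespace
	Text
)

type Token struct {
	Type  TokenType
	Value string
}

// Iterator returns the next token, or ok=false once exhausted.
type Iterator func() (tok Token, ok bool)

// Literal returns an Iterator over a fixed list of tokens.
func Literal(tokens ...Token) Iterator {
	i := 0
	return func() (Token, bool) {
		if i >= len(tokens) {
			return Token{}, false
		}
		t := tokens[i]
		i++
		return t, true
	}
}

// SkipIgnored wraps an Iterator and silently drops Ignore tokens.
func SkipIgnored(it Iterator) Iterator {
	return func() (Token, bool) {
		for {
			t, ok := it()
			if !ok || t.Type != Ignore {
				return t, ok
			}
		}
	}
}

// Collect drains an Iterator and returns the token values in order.
func Collect(it Iterator) []string {
	var out []string
	for {
		t, ok := it()
		if !ok {
			return out
		}
		out = append(out, t.Value)
	}
}

func main() {
	raw := Literal(
		Token{Text, "a: |\n"},
		Token{Ignore, "  "}, // indentation captured only for back-reference
		Token{Text, "multiline literal\n"},
	)
	fmt.Printf("%q\n", Collect(SkipIgnored(raw)))
}
```

The capture group can still be referenced by the regex, while the consumer of the iterator never sees the token assigned to it.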

## Before

![image](https://github.com/user-attachments/assets/c29353c5-9e15-4f14-a733-57a60fb51910)

## After

![image](https://github.com/user-attachments/assets/57b5d129-a9d3-4b84-ae1f-dc05182b9ad3)
2024-08-23 06:58:31 +10:00
Koki Fukuda
895a0488b5
Port Minecraft lexers from Pygments ()
Ported lexers for mcfunction and snbt from Pygments using the
`pygments2chroma_xml.py` script.

While doing so, I found that `LiteralNumberByte` was missing from
TokenType, so I've added the type and regenerated tokentype_enumer.go.
2024-08-23 06:04:41 +10:00
Alec Thomas
c263f6fa19
feat: XML style definitions ()
Fixes .
2022-11-01 21:23:14 -07:00
Alec Thomas
d0e811c0ef
fix: use a class with line links when requested ()
Fixes 
2022-10-31 23:28:30 -07:00
Alec Thomas
cc2dd5b8ad Version 2 of Chroma
This cleans up the API in general, removing a bunch of deprecated stuff,
cleaning up circular imports, etc.

But the biggest change is switching to an optional XML format for the
regex lexer.

Having lexers defined only in Go is not ideal for a couple of reasons.
Firstly, it impedes a significant portion of contributors who use Chroma
in Hugo, but don't know Go. Secondly, it bloats the binary size of any
project that imports Chroma.

Why XML? YAML is an abomination and JSON is not human editable. XML
also compresses very well (eg. Go template lexer XML compresses from
3239 bytes to 718).

Why a new syntax format? All major existing formats rely on the
Oniguruma regex engine, which is extremely complex and for which there
is no Go port.

Why not earlier? Prior to the existence of fs.FS this was not a viable
option.
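
As a rough sketch of why fs.FS matters here: lexer definitions can be read from any filesystem abstraction (embedded, on disk, or in memory) and decoded on demand. The XML layout, the `lexerDef` and `rule` types, and the `loadLexer` and `demoFS` helpers below are hypothetical and much simpler than a real lexer definition.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io/fs"
	"testing/fstest"
)

// lexerDef is a hypothetical, heavily simplified lexer schema.
type lexerDef struct {
	Name  string `xml:"config>name"`
	Rules []rule `xml:"rules>rule"`
}

type rule struct {
	Pattern string `xml:"pattern,attr"`
	Token   string `xml:"token,attr"`
}

// loadLexer reads and decodes one lexer definition from any fs.FS.
func loadLexer(fsys fs.FS, path string) (*lexerDef, error) {
	data, err := fs.ReadFile(fsys, path)
	if err != nil {
		return nil, err
	}
	def := &lexerDef{}
	if err := xml.Unmarshal(data, def); err != nil {
		return nil, err
	}
	return def, nil
}

// demoFS returns an in-memory filesystem standing in for embedded files.
func demoFS() fs.FS {
	return fstest.MapFS{
		"demo.xml": &fstest.MapFile{Data: []byte(`<lexer>
  <config><name>demo</name></config>
  <rules>
    <rule pattern="\s+" token="TextWhitespace"/>
    <rule pattern="\w+" token="Name"/>
  </rules>
</lexer>`)},
	}
}

func main() {
	def, err := loadLexer(demoFS(), "demo.xml")
	if err != nil {
		panic(err)
	}
	fmt.Println(def.Name, len(def.Rules))
}
```

Because `loadLexer` only depends on the `fs.FS` interface, the same code works with an `embed.FS` or `os.DirFS` in place of the in-memory filesystem used here.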

Benchmarks:

    $ hyperfine --warmup 3 \
        './chroma.master --version' \
        './chroma.xml-pre-opt --version' \
        './chroma.xml --version'
    Benchmark 1: ./chroma.master --version
      Time (mean ± σ):       5.3 ms ±   0.5 ms    [User: 3.6 ms, System: 1.4 ms]
      Range (min … max):     4.2 ms …   6.6 ms    233 runs

    Benchmark 2: ./chroma.xml-pre-opt --version
      Time (mean ± σ):      50.6 ms ±   0.5 ms    [User: 52.4 ms, System: 3.6 ms]
      Range (min … max):    49.2 ms …  51.5 ms    51 runs

    Benchmark 3: ./chroma.xml --version
      Time (mean ± σ):       6.9 ms ±   1.1 ms    [User: 5.1 ms, System: 1.5 ms]
      Range (min … max):     5.7 ms …  19.9 ms    196 runs

    Summary
      './chroma.master --version' ran
        1.30 ± 0.23 times faster than './chroma.xml --version'
        9.56 ± 0.83 times faster than './chroma.xml-pre-opt --version'

A slight increase in init time, but I think this is okay given the
increase in flexibility.

And binary size difference:

    $ du -h lexers.test*
    $ du -sh chroma*
    8.8M	chroma.master
    7.8M	chroma.xml
    7.8M	chroma.xml-pre-opt

Incompatible changes:

- (*RegexLexer).SetAnalyser: changed from func(func(text string) float32) *RegexLexer to func(func(text string) float32) Lexer
- (*TokenType).UnmarshalJSON: removed
- Lexer.AnalyseText: added
- Lexer.SetAnalyser: added
- Lexer.SetRegistry: added
- MustNewLazyLexer: removed
- MustNewLexer: changed from func(*Config, Rules) *RegexLexer to func(*Config, func() Rules) *RegexLexer
- Mutators: changed from func(...Mutator) MutatorFunc to func(...Mutator) Mutator
- NewLazyLexer: removed
- NewLexer: changed from func(*Config, Rules) (*RegexLexer, error) to func(*Config, func() Rules) (*RegexLexer, error)
- Pop: changed from func(int) MutatorFunc to func(int) Mutator
- Push: changed from func(...string) MutatorFunc to func(...string) Mutator
- TokenType.MarshalJSON: removed
- Using: changed from func(Lexer) Emitter to func(string) Emitter
- UsingByGroup: changed from func(func(string) Lexer, int, int, ...Emitter) Emitter to func(int, int, ...Emitter) Emitter
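
One plausible reading of the `NewLexer` and `MustNewLexer` signature change is that passing `func() Rules` instead of `Rules` lets rule construction be deferred until first use. The sketch below shows that pattern in isolation; `lazyLexer`, `newLazyLexer` and the simplified `Rules` type are assumptions made for this example, not Chroma's code.

```go
package main

import (
	"fmt"
	"sync"
)

// Rules is a simplified stand-in: state name -> regex patterns.
type Rules map[string][]string

// lazyLexer defers building its rules until they are first needed.
type lazyLexer struct {
	once  sync.Once
	build func() Rules
	rules Rules
}

func newLazyLexer(build func() Rules) *lazyLexer {
	return &lazyLexer{build: build}
}

// Rules builds the rule set at most once, on first call.
func (l *lazyLexer) Rules() Rules {
	l.once.Do(func() { l.rules = l.build() })
	return l.rules
}

func main() {
	builds := 0
	lexer := newLazyLexer(func() Rules {
		builds++
		return Rules{"root": {`\s+`, `\w+`}}
	})
	fmt.Println("builds after construction:", builds) // 0
	lexer.Rules()
	lexer.Rules()
	fmt.Println("builds after two uses:", builds) // 1
}
```

With this shape, registering many lexers at init time costs little, since the expensive part runs only for lexers that are actually used.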
2022-01-27 15:22:00 +11:00
Siavash Askari Nasr
5ed5054d73 Put lines in a div & code lines in a span, Add WrapLongLines Option 2021-11-09 19:14:33 +11:00
Siavash Askari Nasr
225e1862d3 Pass *LexerState as context to emitters
Useful for accessing named capture groups, context set by `mutators`,
and the other fields and methods that LexerState provides.
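
A minimal sketch of what this enables, using the standard library `regexp` package: an emitter that receives the lexer state can look up capture groups by name. The `LexerState`, `Emitter`, `match`, `keyValueEmitter` and `demo` definitions here are simplified assumptions for illustration and do not mirror Chroma's real types.

```go
package main

import (
	"fmt"
	"regexp"
)

// LexerState is a simplified stand-in carrying named capture groups.
type LexerState struct {
	NamedGroups map[string]string
}

// Emitter turns the current match, described by state, into output.
type Emitter func(state *LexerState) []string

// match runs re against text and records named groups in a state.
// It returns nil when the text does not match.
func match(re *regexp.Regexp, text string) *LexerState {
	m := re.FindStringSubmatch(text)
	if m == nil {
		return nil
	}
	state := &LexerState{NamedGroups: map[string]string{}}
	for i, name := range re.SubexpNames() {
		if name != "" {
			state.NamedGroups[name] = m[i]
		}
	}
	return state
}

// keyValueEmitter reads groups by name rather than by position.
func keyValueEmitter(state *LexerState) []string {
	return []string{
		"NameAttribute=" + state.NamedGroups["key"],
		"LiteralString=" + state.NamedGroups["value"],
	}
}

// demo matches a small input and applies the emitter to it.
func demo() []string {
	re := regexp.MustCompile(`(?P<key>\w+)=(?P<value>\w+)`)
	var emit Emitter = keyValueEmitter
	return emit(match(re, "a=1"))
}

func main() {
	fmt.Println(demo())
}
```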
2021-05-07 22:55:54 +10:00
Alec Thomas
022b6f4fc2 Fix --json.
Fixes .
2018-11-12 18:50:00 +11:00
Daniel Eloff
9c3abeae1d Tokens by value ()
This results in about an 8% improvement in speed.
2018-11-04 10:22:51 +11:00
Kaushal Modi
371820dad6 Assign .gl class to GenericUnderline; add CSS rules for the same
'l' in gl is for under(l)ine, as the "gu" class is taken by
GenericSubheading.

- Rules for GenericUnderline are added to all the styles
- Make "Underline" style insert "text-decoration: underline" in CSS.

Fixes https://github.com/alecthomas/chroma/issues/159.
2018-08-01 17:28:52 -04:00
Alec Thomas
e56590a815 Add data-driven test framework for lexers.
See .
2018-01-02 14:53:25 +11:00
Bjørn Erik Pedersen
27733ac753 Add table styled line numbers ()
Fixes 
2017-10-13 10:49:20 +11:00
Alec Thomas
bdc1124369 Switch to Pygments-style CSS class names.
Add GitHub theme + CSS to style importer.
2017-09-25 21:46:25 +10:00
Alec Thomas
cc0e4a59ab Switch to an Iterator interface.
This is to solve an issue where writers returned by the Formatter
were often stateful, but this fact was not obvious to the API consumer,
and failed in interesting ways.
2017-09-20 22:19:36 +10:00
Alec Thomas
a5637e60b2 Support for highlighting ranges of lines. 2017-09-20 14:24:49 +10:00
Alec Thomas
3f230ec717 Add support for line numbers. 2017-09-20 13:33:44 +10:00
Alec Thomas
c8636118d5 Remove unused dark/light style type. 2017-09-18 14:19:59 +10:00
Alec Thomas
a10fd0a23d Switch to github.com/dlclark/regexp2.
This makes translating Pygments lexers much much simpler (and possible).
2017-09-18 11:16:44 +10:00
Alec Thomas
d12529ae61 HTML formatter + import all Pygments styles. 2017-07-20 00:01:29 -07:00
Alec Thomas
1f47bd705c Use pointers to tokens + support regex flags in importer. 2017-06-07 19:47:59 +10:00
Alec Thomas
5dedc6e45b Add a bunch of automatically translated lexers. 2017-06-07 19:47:59 +10:00
Alec Thomas
b30de35ff1 Use a callback to emit tokens.
This is a) faster and b) supports streaming output.
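
As a rough illustration of the callback approach, the sketch below hands each token to an `emit` function as soon as it is recognised, so nothing has to be buffered before output starts. The `tokenise` and `isSpace` functions and the two token kinds are hypothetical and far simpler than a real lexer.

```go
package main

import "fmt"

// isSpace reports whether b is an ASCII space, tab, or newline.
func isSpace(b byte) bool {
	return b == ' ' || b == '\t' || b == '\n'
}

// tokenise splits text into alternating runs of whitespace and
// non-whitespace, calling emit for each run as soon as it is found.
func tokenise(text string, emit func(kind, value string)) {
	start := 0
	for start < len(text) {
		space := isSpace(text[start])
		end := start + 1
		for end < len(text) && isSpace(text[end]) == space {
			end++
		}
		kind := "Text"
		if space {
			kind = "Whitespace"
		}
		emit(kind, text[start:end])
		start = end
	}
}

func main() {
	// The callback can write straight to the output stream.
	tokenise("a  bc", func(kind, value string) {
		fmt.Printf("%s %q\n", kind, value)
	})
}
```

The caller decides what to do with each token: print it immediately, as here, or append it to a slice if the full list is needed.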
2017-06-07 19:47:59 +10:00
Alec Thomas
b2fb8edf77 Initial commit! Working! 2017-06-07 19:47:59 +10:00