Sublime Forum

[SOLVED] .tmLanguage match and "patterns"

#1

Is it possible to have match tokens be re-matched, like the “contains=” part of vim syntax files?

Right now, the only option for submatches is “captures”, which doesn’t work well with repeating patterns (it only catches the last repetition).

For example, I have a construct like this:

@div.class.toto#myid attr=hello another="hello"

The regexp is something like “@something(\s*(.classname|#idname|attr=stuff))*” (the real one is of course much more complicated and much less readable).

I can’t use a region because I have no end token (it ends when I can’t match anymore), and I can’t have submatches.

Would it be possible to add the option of having “patterns” on match constructs as well?

#2

I don’t get what you want. Did you see this tutorial? sublimetext.info/docs/en/extensi … xdefs.html

#3

Yes I have.

AFAIK, there are two types of syntax group:

  • one with a beginning, an end, patterns found inside, and optionally captures for the beginning and the end.
  • one with a single regexp for the whole match, with the possibility to color parenthesized subgroups based on their number.

Here is the thing: for the regexp kind, I would like to be able to have patterns inside what I matched as well, since the main problem with parenthesized groups is that each can only report ONE match.

For example:

{ 'match': '@div(\s*((\.a)|(#a)|(a=b)))*', captures: { '2': { 'name': 'whatever'}, '3': {'name': ......}

Everything would go right with this input: “@div.a#a a=b”, where each group matches only once and every part gets colored.
But if I wrote “@div.a.a.a#a a=b a=b a=b”, only the last “.a” and the last “a=b” would be colored. In short, “captures” that refer to groups repeated with * or + can only match the last repetition, because in every regexp library I know of, asking for the match of a parenthesized group returns only the last one that matched.

Which is why I suggest the following: the possibility for a “match” construct to have a “patterns” entry that looks for the patterns inside the match, just as if it were between a “begin” and an “end”.

vim does that by having a “contains” clause on “match” constructs, where you specify other rules that you want to look for in this particular match.
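
The “captures” limitation described above can be reproduced outside Sublime. This is only a minimal sketch, assuming Python's re module as a stand-in for the editor's regex engine; the simplified pattern and the names WHOLE and ITEM are made up for illustration and are not part of any grammar.

```python
import re

# Simplified stand-in for the real grammar pattern (an assumption for illustration).
WHOLE = re.compile(r"@div(\s*((\.a)|(#a)|(a=b)))*")
# The same alternatives, scanned one item at a time instead of inside a repeat.
ITEM = re.compile(r"(\.a)|(#a)|(a=b)")

line = "@div.a.a.a#a a=b a=b a=b"

whole = WHOLE.match(line)
# A group inside a repeat only remembers its final repetition.
last_class_span = whole.span(3)
last_attr_span = whole.span(5)

# Scanning item by item sees every repetition.
class_spans = [m.span() for m in ITEM.finditer(line) if m.group(1)]
attr_spans = [m.span() for m in ITEM.finditer(line) if m.group(3)]

print("captured .a :", last_class_span, "all .a :", class_spans)
print("captured a=b:", last_attr_span, "all a=b:", attr_spans)
```

The single match reports one span per group, while the item-by-item scan reports three spans for “.a” and three for “a=b”, which is the behavior a “patterns” entry on a match would give.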

#4

[quote=“Pwipwi”]AFAIK, I have two types of syntax group:

  • one with a beginning, an end, patterns found inside, and optionally captures for the beginning and the end.
    [/quote]

This colorizes every z and x after aa but before bb.

aazxzxxzzbbzxzx

{
  "name": "variable.complex.ssraw",
  "begin": "(aa)",
  "beginCaptures": { "1": { "name": "keyword.ssraw" } },
  "patterns": [
    { "name": "variable.parameter.function.python", "match": "z" },
    { "name": "entity.name.function.python", "match": "x" }
  ],
  "end": "bb"
}

Is that what you need?
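
To see which characters such a begin/end rule would reach, here is a rough approximation in Python. It is a sketch under assumptions, not how Sublime tokenizes: the region is found with a plain regex, and the names REGION, INNER and inner_tokens are hypothetical.

```python
import re

# "begin" = aa, "end" = bb; group 1 is the text between them.
REGION = re.compile(r"aa(.*?)bb")
# Stand-in for the two inner patterns that match z and x.
INNER = re.compile(r"[zx]")

def inner_tokens(text):
    """Return (position, char) for every z or x inside an aa...bb region."""
    tokens = []
    for region in REGION.finditer(text):
        start, end = region.span(1)
        # Only search between the begin and end markers.
        for m in INNER.finditer(text, start, end):
            tokens.append((m.start(), m.group()))
    return tokens

sample = "aazxzxxzzbbzxzx"
print(inner_tokens(sample))
```

Only the seven characters between aa and bb are reported; the trailing zxzx after bb is left alone.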

#5

As I said at the beginning : “I can’t use a region because I have no end token (it ends when I can’t match anymore)”, which would translate into “I have no ‘bb’ to mark the end of the block.”

I solved it though, because the hint was in “it ends when I can’t match anymore”: I use an “end” whose pattern is a negative lookahead “(?!..)” wrapping the same pattern I wanted to match, so the region ends as soon as that pattern can no longer match.

Although this works, it still feels inefficient, but well…
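
The idea of ending the region with a negative lookahead can be sketched in Python. This is an illustration under assumptions, not the actual grammar: the sub-patterns are simplified guesses at the real (unposted) ones, and HEAD, ITEM, END and region_end are hypothetical names.

```python
import re

# "begin": the @name that opens the construct.
HEAD = re.compile(r"@\w+")
# One item: .class, #id, or attr=value (quoted or bare), with optional leading spaces.
ITEM = re.compile(r'\s*(?:\.\w+|#\w+|\w+=(?:"[^"]*"|\S+))')
# "end": succeeds (consuming nothing) where no further item can start.
END = re.compile(r"(?!\s*(?:\.\w+|#\w+|\w+=))")

def region_end(line):
    """Return the index where the region stops, or None if there is no @name."""
    head = HEAD.match(line)
    if head is None:
        return None
    pos = head.end()
    while True:
        if END.match(line, pos):       # nothing more to match: region ends here
            return pos
        item = ITEM.match(line, pos)   # otherwise consume one item and continue
        if item is None:               # e.g. "attr=" with no value; stop safely
            return pos
        pos = item.end()

for line in ('@div.class.toto#myid attr=hello another="hello"',
             "@hello.myclass.mystuff SOME TEXT .something"):
    print(repr(line[:region_end(line)]))
```

In the second line the region stops right after “.mystuff”, so “.something” stays outside it, which is the case that “$” as an end pattern cannot handle.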

#6

[quote=“Pwipwi”]As I said at the beginning : “I can’t use a region because I have no end token (it ends when I can’t match anymore)”, which would translate into “I have no ‘bb’ to mark the end of the block.”
[/quote]

It will run until the end of the string or until “end” matches.

$ matches the end of the line (see geocities.jp/kosako3/oniguruma/doc/RE.txt)

#7

I know that $ is the end of the line.

But I don’t want to go until the end of the line.

In “@hello.myclass.mystuff SOME TEXT .something”, “.something” must NOT be colored, which means that $ is not an acceptable solution.

As I said, I used an “end” clause with a fairly sophisticated (?!..) negative lookahead that stops the region as soon as none of the patterns I want can match anymore.

#8

I’m working on a pull request for the sublime-text-2-ruby-test plugin and am having a problem similar to the one described here.

The RubyTest plugin pops up a widget and dumps the stdout from the ‘cucumber’ command into the widget. It’s then up to a tmLanguage file to style it. I’ve gotten really quite far, see here - github.com/maltize/sublime-text … ts/pull/50 but the thing I can’t seem to figure out is how to match a step before a failure or pending notice. For example, how would I target line 3 below generically? There can be any number of steps (given, when, then, and, but) before it.

  Scenario: Successful withdrawal from an account in credit # features/cash_withdrawl.feature:14
    Given I have deposited $100 in my account               # features/step_definitions/steps.rb:1
    When I request $20                                      # features/step_definitions/steps.rb:6
      TODO (Cucumber::Pending)
      ./features/step_definitions/steps.rb:8:in `/^I request \$(\d+)$/'
      features/cash_withdrawl.feature:16:in `When I request $20'
    Then $20 should be dispensed                            # features/step_definitions/steps.rb:11

I’m also having trouble determining how SublimeText2 handles regex matches that overlap. Which one wins? Can you make your ‘begin’ and ‘end’ captures overlap and match different things inside each with patterns? Perhaps I’m just speaking nonsense now :smile: Any help would be greatly appreciated! Thanks much!
-Jon

#9

Just from your last paragraph, it sounds like you might want to look into positive/negative lookahead assertions. For example, q(?=u) will find a q which is followed by a u, but will not consume the u within the match. Because the lookahead consumes nothing, the text it inspects is still available to the next pattern, and in this way your begin and end captures can overlap.
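
A quick way to check the lookahead behavior, assuming Python's re module as a stand-in (the lookahead syntax shown here is the same in the Oniguruma syntax linked earlier in this thread); the variable names are only for illustration.

```python
import re

LOOKAHEAD = re.compile(r"q(?=u)")

hit = LOOKAHEAD.search("quit")
# The match is just "q"; the "u" was inspected but not consumed.
matched_text = hit.group()
next_position = hit.end()

# A q that is not followed by u does not match at all.
miss = LOOKAHEAD.search("Iraq")

print(matched_text, next_position, miss)
```

Since the match ends before the u, another pattern can start matching at that same u.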

To match different things inside each pattern sounds like alternation (with grouping):

(bob|(ted)|(fred))

ted would be captured as group 2 (group 1 is the outer pair of parentheses). If you don’t want the outer group to be captured then you could use a non-capturing group:

(?:bob|(ted)|(fred))

in which case, ted is in group 1.
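
The group numbering can be verified with a short sketch, again assuming Python's re module as a stand-in for the editor's engine; the variable names are hypothetical.

```python
import re

# Outer parentheses capture too, so they take group 1.
capturing = re.fullmatch(r"(bob|(ted)|(fred))", "ted")
capturing_groups = capturing.groups()

# With (?:...) the outer group is not numbered, so ted moves to group 1.
non_capturing = re.fullmatch(r"(?:bob|(ted)|(fred))", "ted")
non_capturing_groups = non_capturing.groups()

# bob still matches; it just is not captured in any numbered group.
bob = re.fullmatch(r"(?:bob|(ted)|(fred))", "bob")
bob_whole, bob_groups = bob.group(0), bob.groups()

print(capturing_groups, non_capturing_groups, bob_whole, bob_groups)
```

Note that the non-capturing form changes only the numbering: bob is still part of the overall match.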

It might also be worth mentioning the ‘\G’ anchor, which “matches at the position where the previous match ended”, but I haven’t worked this one out myself as yet :wink:
