[SOLVED] .tmLanguage match and "patterns"

Postby Pwipwi on Tue Feb 07, 2012 10:45 am

Is it possible to have match tokens be re-matched, like the "contains=" part of vim syntax files?

Right now, "captures" is the only tool available for submatches, and it doesn't work well with repeating patterns (it only catches the last repetition).

For example, I have a construct like this:

@div.class.toto#myid attr=hello another="hello"


The regexp matches "@something(\s*(\.classname|#idname|attr=stuff))*" (it is of course much more complicated than that and much less readable)

I can't use a region because I have no end token (it ends when I can't match anymore), and a match can't have submatches.

Would it be possible to allow "patterns" on match constructs as well?
Last edited by Pwipwi on Wed Feb 08, 2012 10:56 am, edited 1 time in total.
Pwipwi
 
Posts: 11
Joined: Tue Feb 07, 2012 10:31 am

Re: .tmLanguage match and "patterns"

Postby Cjkjvfnby on Tue Feb 07, 2012 11:00 am

I don't understand what you want. Have you seen this tutorial? http://sublimetext.info/docs/en/extensi ... xdefs.html
Cjkjvfnby
 
Posts: 20
Joined: Wed Feb 01, 2012 11:35 am

Re: .tmLanguage match and "patterns"

Postby Pwipwi on Tue Feb 07, 2012 2:41 pm

Yes I have.

AFAIK, there are two types of syntax group:

* one with a beginning, an end, patterns found inside, and optionally captures for the beginning and the end.
* one with a single regexp for the whole match, with the possibility to color parenthesized subgroups based on their number.

Here is the thing: for the single-regexp kind, I would like to be able to look for patterns inside what I matched as well, since the main problem with the parenthesized groups is that *they can only hold ONE match*.

For example:

{ 'match': '@div(\s*((\.a)|(#a)|(a=b)))*', captures: { '2': { 'name': 'whatever'}, '3': {'name': ......}


Everything works with this input: "@div.a#a a=b"
BUT it breaks if I write "@div.a.a.a#a a=b a=b a=b". In short, "captures" that refer to groups repeated with * or + only ever see _the last one_: in virtually every regexp library, asking for the text matched by a parenthesized group gives you only its last iteration.
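
To make the limitation concrete, here is a minimal sketch using Python's re module as a stand-in. Sublime Text uses Oniguruma, so this only illustrates general regex behavior, not the editor itself. The pattern and input are the simplified ones from this post; the variable names are just for illustration.

```python
import re

# Simplified pattern from the post: group 2 sits inside a repeated group.
pattern = re.compile(r'@div(\s*((\.a)|(#a)|(a=b)))*')
text = '@div.a.a.a#a a=b a=b a=b'

m = pattern.match(text)
whole = m.group(0)        # the full match still spans every repetition
last_only = m.group(2)    # but the capture holds a single iteration only

# Re-scanning the matched text is one way to recover every token.
tokens = re.findall(r'\.a|#a|a=b', whole)

print(whole)
print(last_only)
print(tokens)
```

The re-scan at the end is roughly what a "patterns" entry on a "match" rule would do: apply further rules to the text that was already matched.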

Which is why I suggest the following: let a "match" construct have a "patterns" entry that looks for those patterns *inside the match*, just as if it were between a "begin" and an "end".

vim does that by having a "contains" clause on "match" constructs, where you specify other rules that you want to look for in this particular match.
Pwipwi
 
Posts: 11
Joined: Tue Feb 07, 2012 10:31 am

Re: .tmLanguage match and "patterns"

Postby Cjkjvfnby on Tue Feb 07, 2012 8:31 pm

Pwipwi wrote:AFAIK, I have two types of syntax group:
* one with a beginning, an end, patterns found inside, and optionnaly captures for the beginning and the end.


This colorizes every z and x after aa but before bb:

aazxzxxzzbbzxzx

{ "name": "variable.complex.ssraw",
   "begin": "(aa)",
   "beginCaptures": {
       "1": { "name": "keyword.ssraw" }
   },
   "patterns": [
       { "include": "z" },
       {  "name": "variable.parameter.function.python",
          "match": "z"
       },
       { "include": "x" },
       {  "name": "entity.name.function.python",
          "match": "x"
       }
   ],
   "end": "bb"
  }


Is that what you need?
Cjkjvfnby
 
Posts: 20
Joined: Wed Feb 01, 2012 11:35 am

Re: .tmLanguage match and "patterns"

Postby Pwipwi on Tue Feb 07, 2012 10:14 pm

As I said at the beginning: "I can't use a region because I have no end token (it ends when I can't match anymore)", which translates into "I have no 'bb' to mark the end of the block."

I solved it though; the hint was in "it ends when I can't match anymore". I now use an "end" whose pattern is a negative lookahead "(?!...)" wrapping the same pattern I wanted to match, so the region ends at the first position where that pattern can no longer match.

Although this works, it still feels inefficient, but well...
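
A rough sketch of what such a rule might look like, built as a Python dict and dumped in the JSON form used earlier in this thread. The item pattern and every scope name here are simplified assumptions, not the real grammar, and the rule has not been tested in Sublime Text.

```python
import json

# Hypothetical, simplified "start of an item" pattern; the real one is more involved.
ITEM = r'\s*(?:\.\w+|#\w+|\w+=)'

rule = {
    "name": "meta.tag.example",                  # hypothetical scope name
    "begin": r"@\w+",
    # Zero-width "end": succeeds at the first position where no item follows.
    "end": "(?!" + ITEM + ")",
    "patterns": [
        {"name": "entity.other.class.example", "match": r"\.\w+"},
        {"name": "entity.other.id.example", "match": r"#\w+"},
        # Consume the whole attribute, value included, so the region
        # does not end right at the "=" sign.
        {"name": "entity.other.attribute.example",
         "match": r'\w+=(?:"[^"]*"|\w+)'},
    ],
}

print(json.dumps(rule, indent=2))
```

The point of the design is that "end" consumes nothing: it only asserts that the next characters cannot start another item, so the inner patterns decide how far the region extends.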
Pwipwi
 
Posts: 11
Joined: Tue Feb 07, 2012 10:31 am

Re: .tmLanguage match and "patterns"

Postby Cjkjvfnby on Wed Feb 08, 2012 7:29 am

Pwipwi wrote:As I said at the beginning : "I can't use a region because I have no end token (it ends when I can't match anymore)", which would translate into "I have no 'bb' to mark the end of the block."

It will run until the end of the string or until "end" matches.

Pwipwi wrote:I solved it though, because the hint was in "it ends when I can't match anymore" ; I have a "end" with a pattern of "(?!...)" with the same pattern I wanted to match before until I can't anymore.


$ matches the end of the line (see http://www.geocities.jp/kosako3/oniguruma/doc/RE.txt)
Cjkjvfnby
 
Posts: 20
Joined: Wed Feb 01, 2012 11:35 am

Re: .tmLanguage match and "patterns"

Postby Pwipwi on Wed Feb 08, 2012 10:56 am

I know that $ is the end of the line.

But I don't want to go until the end of the line.

In "@hello.myclass.mystuff SOME TEXT .something", ".something" must NOT be colored, which means that $ is not an acceptable solution.

As I said, I used an "end" clause with a pretty sophisticated (?!...) group that stops the region as soon as the patterns I want can no longer match.
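
The effect of a lookahead "end" on that input can be sketched in plain Python. This only imitates a begin/end region with Python's re module (an assumption; Sublime Text uses Oniguruma), and BEGIN, ITEM and region_span are hypothetical, simplified names rather than the real grammar.

```python
import re

# Hypothetical, simplified stand-ins for the real (more complex) patterns.
BEGIN = re.compile(r'@\w+')
ITEM = r'\s*(?:\.\w+|#\w+|\w+=(?:"[^"]*"|\w+))'
INNER = re.compile(ITEM)
END = re.compile('(?!' + ITEM + ')')   # zero-width: matches where no item can start


def region_span(line):
    """Return (start, end) of the region found on this line, or None."""
    begin = BEGIN.search(line)
    if begin is None:
        return None
    pos = begin.end()
    # Keep consuming items until the negative lookahead "end" matches.
    while END.match(line, pos) is None:
        pos = INNER.match(line, pos).end()
    return begin.start(), pos


line = '@hello.myclass.mystuff SOME TEXT .something'
start, end = region_span(line)
print(line[start:end])   # the part that would be colored
print(line[end:])        # left alone, including ".something"
```

The loop cannot get stuck: whenever END fails at a position, ITEM is guaranteed to match there, and every item is at least two characters long.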
Pwipwi
 
Posts: 11
Joined: Tue Feb 07, 2012 10:31 am

Re: [SOLVED] .tmLanguage match and "patterns"

Postby j2fly on Sat Feb 25, 2012 10:53 pm

I'm working on a pull request for the sublime-text-2-ruby-test plugin and am having a problem similar to the one described here.

The RubyTest plugin pops up a widget and dumps the stdout from the 'cucumber' command into it. It's then up to a tmLanguage file to style it. I've gotten quite far (see https://github.com/maltize/sublime-text ... ts/pull/50), but the thing I can't figure out is how to match a step that comes right before a failure or pending notice. For example, how would I target line 3 below generically? There can be any number of steps (given, when, then, and, but) before it.

  Scenario: Successful withdrawal from an account in credit # features/cash_withdrawl.feature:14
    Given I have deposited $100 in my account               # features/step_definitions/steps.rb:1
    When I request $20                                      # features/step_definitions/steps.rb:6
      TODO (Cucumber::Pending)
      ./features/step_definitions/steps.rb:8:in `/^I request \$(\d+)$/'
      features/cash_withdrawl.feature:16:in `When I request $20'
    Then $20 should be dispensed                            # features/step_definitions/steps.rb:11


I'm also having trouble determining how Sublime Text 2 handles regex matches that overlap. Who wins? Can you make your 'begin' and 'end' captures overlap and match different things inside each with patterns? Perhaps I'm just speaking nonsense now :) Any help would be greatly appreciated! Thanks much!
-Jon
j2fly
 
Posts: 1
Joined: Sat Feb 25, 2012 10:13 pm

Re: [SOLVED] .tmLanguage match and "patterns"

Postby agibsonsw on Sat Feb 25, 2012 11:34 pm

Just from your last paragraph, it sounds like you might want to look into positive/negative lookahead assertions. For example, q(?=u) will find a q which is followed by a u, but will not consume u within the match. In this way, your begin and end captures can overlap.
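
A small sketch of the q(?=u) lookahead, using Python's re module as a stand-in for general regex behavior (Sublime Text itself uses Oniguruma). The sample words are made up for illustration.

```python
import re

# q(?=u): match a 'q' only when a 'u' follows, without consuming the 'u'.
m = re.search(r'q(?=u)', 'quit')
matched = m.group(0)      # just the 'q'
stopped_at = m.end()      # index 1: the 'u' is still unconsumed

# Because the 'u' was not consumed, a second pattern can start on it.
rest = re.compile(r'u\w*').match('quit', stopped_at).group(0)

# No 'u' after the 'q' means no match at all.
no_match = re.search(r'q(?=u)', 'Iraq')

print(matched, stopped_at, rest, no_match)
```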

To match different things inside each pattern sounds like alternation (with grouping):

(bob|(ted)|(fred))

ted would be captured as group 2. If you don't want the outer parentheses to count as a capture group (so bob is not captured at all), you could use a non-capturing group:

(?:bob|(ted)|(fred))

in which case, ted is in group 1.
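
The difference in group numbering can be checked with a short sketch, again in Python's re module as a stand-in for regex in general. Only the two patterns come from this post; the variable names are illustrative.

```python
import re

capturing = re.compile(r'(bob|(ted)|(fred))')
non_capturing = re.compile(r'(?:bob|(ted)|(fred))')

# With the outer group capturing, 'ted' shows up as group 1 and group 2.
ted_cap = capturing.fullmatch('ted').groups()

# With (?:...), numbering starts at the inner groups: 'ted' is group 1.
ted_non = non_capturing.fullmatch('ted').groups()

# 'bob' still matches as a whole, but no group captures it.
bob_non = non_capturing.fullmatch('bob')

print(capturing.groups, ted_cap)
print(non_capturing.groups, ted_non)
print(bob_non.group(0), bob_non.groups())
```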

It might also be worth mentioning the '\G' anchor, which "matches at the position where the previous match ended", but I haven't worked this one out myself yet ;)

[I should say that I'm discussing regex in general, rather than specifically within tmLanguage, but I believe it should apply.]
"I'm here to save your life. But if I'm going to do that, I'll need total uninanonynymity." Me Myself & Irene.
agibsonsw
 
Posts: 901
Joined: Fri Jan 27, 2012 9:11 pm

