Sublime Forum

Can't figure out regex replace

#1

I’m using Sublime Text 2 on OS X, which should be the current version. I have regex turned on, and it’s finding what I want, which is:

<td>Foo [linebreak]
 [not </td>]</td>

In other words, some of the HTML I’m editing has a cell, with text, and then additional text on the next line (after a break). I’m trying to remove the newline, move the 2nd line of text up to the first.

$\n[\s]*[a-zA-Z]+

This regex is correctly finding what I want to find. However, I don’t know how to KEEP the text but remove the line break. Here is an example:

<td>Single [break]
[tab]use bottle for Injection</td>

Should instead be

<td>Single use bottle for injection</td>

I don’t know what to put into the replace field to ‘maintain’ the ‘foo’ word that is found with the [A-Za-z]+ part of the expression. I’ve tried \2, $2, and &2. None of those work.

#2

I’m having trouble following your samples, but here’s the general approach:

xxx(.*)yyy(.*)zzz

matches “xxx” followed by any number of characters followed by “yyy” followed by any number of characters followed by “zzz”. The parentheses mark a “capture group”; the first set of parentheses is capture group number 1, and the second set of parentheses is capture group number 2. You can have pretty much however many capture groups you want, and you can have capture groups inside capture groups. The number of a group is determined by counting left parentheses from the beginning of the regular expression.

In your replacement text, you use $n to refer to the text captured by the nth capture group. So

[$2][$1]

would change “xxxalphayyybetazzz” to “[beta][alpha]”.
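For anyone who wants to try the same idea outside the editor, here is a minimal sketch in Python. It assumes Python's `re` module rather than Sublime's own regex engine, so the backreferences are written `\1` and `\2` instead of `$1` and `$2`; the pattern and sample string are the ones from the explanation above.

```python
import re

# Same pattern as above: two capture groups, numbered by their left parentheses.
pattern = r"xxx(.*)yyy(.*)zzz"

# Python writes backreferences as \1 and \2; Sublime's replace field uses $1 and $2.
swapped = re.sub(pattern, r"[\2][\1]", "xxxalphayyybetazzz")

print(swapped)  # [beta][alpha]
```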

#3

Isn’t \n[\s|\t]*(?=\w) what you’re looking for?

A linebreak, followed by either a space or tab (0 - unlimited times), up to the next word character (excluding it)?

If that’s the case, you only have to enter a single space in the Replace With field. No need for backreferences in the search term.

[quote]

Single
use bottle for Injection Single use bottle for Injection Single use bottle for Injection[/quote]

leads to:

[quote]

Single use bottle for Injection Single use bottle for Injection Single use bottle for Injection[/quote]
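A rough way to check the lookahead approach outside Sublime is the Python sketch below. It assumes Python's `re` module, which may differ in details from Sublime's engine, and the sample HTML is made up for illustration. The lookahead `(?=\w)` only tests the next character without consuming it, so a plain space is enough as the replacement.

```python
import re

# Hypothetical sample: one cell whose text wraps onto a tab-indented second line.
html = "<tr>\n  <td>Single\n\tuse bottle for Injection</td>\n</tr>"

# Match a newline plus any following whitespace, but only when a word
# character comes next. Newlines followed by a tag ("<") are left alone.
joined = re.sub(r"\n\s*(?=\w)", " ", html)

print(joined)
```

Only the break inside the cell is removed; the line breaks before `<td>` and `</tr>` stay, because `<` is not a word character.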

#4

I want to find:

  <td>A
  one piece plastic luer-lock syringe that connects to a valve</td>

but not:

  <td>ANSY</td>

or

 <tr>
  <td>Prod</td>

And for the found string, replace it/change it to:

[code]

A one piece plastic luer-lock syringe that connects to a valve[/code]

The $\n[\s]*[a-zA-Z]+ successfully finds only what I want it to find; I just don’t know how to replace it without losing it.

#5

Via backreferences.

E.g., the first group (everything matched inside the first pair of parentheses) can be retained via $1 in the Replace With field.
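As a sketch of how that could look for the table-cell case from this thread, here it is in Python. This assumes Python's `re` module, where the first group is written `\1` rather than `$1`, and the helper name `join_wrapped_cell` is made up for illustration. In Sublime the equivalent would presumably be a search for `\n\s*([a-zA-Z]+)` with a space followed by `$1` in the Replace With field.

```python
import re

def join_wrapped_cell(html: str) -> str:
    """Replace a line break plus indentation with one space, keeping the word after it."""
    # Group 1 captures the first word on the continuation line so it can be
    # written back by the replacement instead of being lost.
    return re.sub(r"\n\s*([a-zA-Z]+)", r" \1", html)

wrapped = "<td>A\n  one piece plastic luer-lock syringe that connects to a valve</td>"
print(join_wrapped_cell(wrapped))
```

Lines that start with a tag after the break, such as `<td>Prod</td>` following `<tr>`, are not touched, because `<` is not matched by `[a-zA-Z]`.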

#6

The square brackets mean “any of”, so you don’t need the ‘|’ in [\s|\t]. As written it will match any whitespace character, a literal ‘|’, or a tab (which \s already covers).
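A quick way to see this is the small Python check below. It assumes Python's `re` module, though character classes should behave the same way in Sublime for this case.

```python
import re

with_pipe = re.compile(r"[\s|\t]")  # the class as written in the earlier post
without_pipe = re.compile(r"[\s]")  # same class without the stray '|' and '\t'

# Inside square brackets '|' is an ordinary character, not alternation.
pipe_matches = with_pipe.fullmatch("|") is not None
pipe_rejected = without_pipe.fullmatch("|") is None

# \s alone already covers space, tab and newline.
whitespace_ok = all(without_pipe.fullmatch(ch) for ch in (" ", "\t", "\n"))

print(pipe_matches, pipe_rejected, whitespace_ok)  # True True True
```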
