
Can't figure out regex replace

Postby stormtitan on Fri Nov 09, 2012 4:10 pm

I'm using Sublime Text 2 on OS X, which should be the current version. I have regex turned on, and it's finding what I want, which is:

Code: Select all
<td>Foo [linebreak]
[not </td>]</td>

In other words, some of the HTML I'm editing has a cell with text, and then additional text on the next line (after a break). I'm trying to remove the newline and move the second line of text up to the first.

Code: Select all
$[\n\s]*[a-zA-Z]+

This regex is correctly finding what I want to find. However, I don't know how to KEEP the text but remove the line break. Here is an example:

Code: Select all
<td>Single [break]
[tab]use bottle for Injection</td>


Should instead be
Code: Select all
<td>Single use bottle for Injection</td>


I don't know what to put into the replace field to 'maintain' the 'foo' word that is found with the [A-Za-z]+ part of the expression. I've tried \\2, \$2, \&2, \2, $2, &2. None of those work.
stormtitan
 
Posts: 4
Joined: Tue Nov 06, 2012 4:04 pm

Re: Can't figure out regex replace

Postby pete340 on Fri Nov 09, 2012 4:51 pm

stormtitan wrote: I'm using Sublime Text 2 on OS X, which should be the current version. I have regex turned on, and it's finding what I want, which is:


I'm having trouble following your samples, but here's the general approach:

Code: Select all
xxx(.*)yyy(.*)zzz


matches "xxx" followed by any number of characters followed by "yyy" followed by any number of characters followed by "zzz". The parentheses mark a "capture group"; the first set of parentheses is capture group number 1, and the second set of parentheses is capture group number 2. You can have pretty much however many capture groups you want, and you can have capture groups inside capture groups. The number of a group is determined by counting left parentheses from the beginning of the regular expression.

In your replacement text, you use $n to refer to the text captured by the nth capture group. So

Code: Select all
[$2][$1]


would change "xxxalphayyybetazzz" to "[beta][alpha]".
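
For anyone who wants to try the same substitution outside the editor, here is a minimal sketch in Python. Note the assumptions: Python's `re` module writes group references as `\1` and `\2` rather than the `$1` and `$2` used in the editor's replace field, and `swap_groups` is just an illustrative name, not anything from Sublime Text.

```python
import re

# Same pattern as above: two capture groups, numbered by counting left parentheses.
PATTERN = re.compile(r"xxx(.*)yyy(.*)zzz")


def swap_groups(text: str) -> str:
    # \2 and \1 here play the role of $2 and $1 in the editor's replace field.
    return PATTERN.sub(r"[\2][\1]", text)


print(swap_groups("xxxalphayyybetazzz"))  # [beta][alpha]
```

Text that does not match the pattern is returned unchanged.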
pete340
 
Posts: 62
Joined: Mon Oct 10, 2011 9:45 pm

Re: Can't figure out regex replace

Postby highend on Fri Nov 09, 2012 4:58 pm

Isn't
Code: Select all
\n[\s|\t]*(?=\w)
what you're looking for?

A line break, followed by any amount of whitespace such as spaces or tabs (zero or more characters), up to but not including the next word character?

If that's the case, you only have to enter a single space in the Replace With field. No need for backreferences in the search term.

<td>Single
use bottle for Injection</td>

<td>Single
use bottle for Injection</td>

<td>Single
use bottle for Injection</td>


leads to:

<td>Single use bottle for Injection</td>

<td>Single use bottle for Injection</td>

<td>Single use bottle for Injection</td>
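
The same lookahead idea can be checked outside the editor. This is a rough Python sketch, not Sublime Text's own engine. I have written the character class as plain `\s`, which already covers spaces and tabs (that simplification is my assumption about what was intended), and `join_wrapped_cells` is a made-up helper name.

```python
import re

# A newline, any run of whitespace, then a lookahead that requires a word
# character without consuming it. Only the newline and indentation are matched.
JOIN_BREAK = re.compile(r"\n\s*(?=\w)")


def join_wrapped_cells(html: str) -> str:
    # Replace the matched line break and indentation with a single space.
    return JOIN_BREAK.sub(" ", html)


sample = "<tr>\n  <td>Single\n\tuse bottle for Injection</td>\n</tr>"
print(join_wrapped_cells(sample))
```

Line breaks that are followed by a tag (such as the one after `<tr>`) are left alone, because `<` is not a word character.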
highend
 
Posts: 93
Joined: Fri Jan 20, 2012 2:47 pm

Re: Can't figure out regex replace

Postby stormtitan on Fri Nov 09, 2012 5:53 pm

I want to find:
Code: Select all
  <td>A
  one piece plastic luer-lock syringe that connects to a valve</td>


but not:

Code: Select all
  <td>ANSY</td>


or

Code: Select all
<tr>
  <td>Prod</td>


And for the found string, replace it/change it to:

Code: Select all
  <td>A one piece plastic luer-lock syringe that connects to a valve</td>


The
Code: Select all
$[\n\s]*[a-zA-Z]+
successfully finds only what I want it to find; I just don't know how to replace it without losing it.
stormtitan
 
Posts: 4
Joined: Tue Nov 06, 2012 4:04 pm

Re: Can't figure out regex replace

Postby highend on Fri Nov 09, 2012 6:52 pm

Via backreferences.

E.g.:
$[\n\s]*([a-zA-Z]+)


The first group (everything that is found inside the first pair of parentheses) can be retained via $1 in the Replace With field.
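
Here is a hedged Python sketch of the same capture-group replacement, in case it helps to test it outside the editor. Several things are my assumptions: Python needs the `re.MULTILINE` flag for `$` to match at the end of each line, the group reference is written `\1` instead of `$1`, the replacement puts a space in front of the kept word, and `pull_up` is only an illustrative name.

```python
import re

# End of line, then the line break plus any indentation, then the first word
# of the next line captured as group 1. The letter range is written A-Z.
WRAPPED = re.compile(r"$[\n\s]*([a-zA-Z]+)", re.MULTILINE)


def pull_up(html: str) -> str:
    # Keep the captured word (group 1) and put one space in front of it.
    return WRAPPED.sub(r" \1", html)


wanted = "  <td>A\n  one piece plastic luer-lock syringe that connects to a valve</td>"
print(pull_up(wanted))
```

Lines whose break is followed by a tag rather than a letter, such as `<tr>` followed by `<td>Prod</td>`, are not matched and stay as they are.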
highend
 
Posts: 93
Joined: Fri Jan 20, 2012 2:47 pm

Re: Can't figure out regex replace

Postby pete340 on Fri Nov 09, 2012 7:46 pm

highend wrote:Isn't
Code: Select all
\n[\s|\t]*(?=\w)
what you're looking for?


The square brackets mean "any of", so you don't need the '|' in
Code: Select all
[\s|\t]
As written it will match any whitespace character, a '|', or a tab. Since \s already includes the tab, plain \s is enough.
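
A quick way to see what a character class accepts is to test single characters against it. The sketch below uses Python's `re` module rather than the editor's engine, so treat it as an approximation; the names `WITH_PIPE` and `PLAIN` are just for illustration.

```python
import re

WITH_PIPE = re.compile(r"[\s|\t]")  # the class as posted, including the '|'
PLAIN = re.compile(r"\s")           # whitespace only

# Print, for each character, whether each pattern accepts it.
for ch in [" ", "\t", "\n", "|"]:
    print(repr(ch), bool(WITH_PIPE.fullmatch(ch)), bool(PLAIN.fullmatch(ch)))
```

Both patterns accept the space, tab, and newline; only the class containing the '|' also accepts a literal pipe character.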
pete340
 
Posts: 62
Joined: Mon Oct 10, 2011 9:45 pm

