Saturday, 15 July 2017

regex - PHP Remove all paragraph tags inside header tags



I've been dwelling on this for a while.



I have this string (there is more content before and after the h2 tags):



...
<h2><p>Lorem Ipsum</p></h2>
...


What regex do I use to remove all the <p> and </p> tags inside those header tags?



I'm trying to do something like this, but the positive lookbehind one is not working:



// for the starting <p> tag
$str = preg_replace('/(?<=<h[1-6]{1}[^>]+>)\s*<p>/i', '', $str);
// for the ending </p> tag
$str = preg_replace('/<\/p>\s*(?=<\/h[1-6]{1}>\s*)/i', '', $str);


This also does not account for paragraph tags deeper inside the text within the header tag.
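
The lookbehind version most likely fails because PCRE, the engine behind preg_replace, requires a lookbehind assertion to have a bounded length, and `[^>]+` has none, so the pattern does not compile. One possible workaround is `\K`, which drops everything matched so far from the reported match. The sketch below is only an illustration under that assumption: the helper name `strip_edge_p` and the sample string are hypothetical, and `[^>]*` is used instead of `[^>]+` so that headers without attributes also match.

```php
<?php
// Hypothetical helper: removes a <p> right after an opening header tag
// and a </p> right before a closing header tag.
function strip_edge_p(string $html): string
{
    // \K resets the start of the reported match, so the header tag itself
    // is kept and only the following whitespace and <p> are replaced.
    $html = preg_replace('/<h[1-6][^>]*>\K\s*<p>/i', '', $html) ?? $html;

    // Lookaheads have no length restriction, so the closing side can
    // stay close to the original attempt.
    return preg_replace('/<\/p>\s*(?=<\/h[1-6]>)/i', '', $html) ?? $html;
}

// Sample input (an assumption, not taken from the original post).
$str = '<h2 class="title"> <p>Lorem Ipsum</p> </h2>';
echo strip_edge_p($str), "\n";
```

Like the original attempt, this only handles paragraph tags that sit directly at the edges of the header.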



[Update]



This is derived from one of PeeHaa's suggested links:



// for the starting <p> tag
$str = preg_replace("#(<h[1-6]>)<p>#", '$1', $str);
// for the ending </p> tag
$str = preg_replace("#<\/p>(<\/h[1-6]>)#", '$1', $str);

Answer



You shouldn't try to parse HTML with regexes. That said, since this is a subset of HTML and not a full document or nested layout, it is possible:



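// $1 = opening header tag, $3 = paragraph contents, $4 = closing header tag
$str = preg_replace('/(<h([1-6])[^>]*>)\s?<p>(.*)?<\/p>\s?(<\/h\2>)/', '$1$3$4', $str);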
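
A minimal sketch of how a pattern of this shape might be applied to a larger string. The sample input is an assumption; the question only shows "Lorem Ipsum" inside an h2.

```php
<?php
// Sample input (assumed): one header wrapping a paragraph, plus an
// ordinary paragraph that should be left alone.
$str = "<div>\n<h2 class=\"title\"><p>Lorem Ipsum</p></h2>\n<p>Untouched paragraph</p>\n</div>";

// Group 1: opening header tag, group 2: header level (1-6),
// group 3: paragraph contents, group 4: closing header tag of the same level.
$pattern = '/(<h([1-6])[^>]*>)\s?<p>(.*)?<\/p>\s?(<\/h\2>)/';

// Rebuild the header without the wrapping <p> and </p>.
$result = preg_replace($pattern, '$1$3$4', $str);
echo $result, "\n";
```

Because `.*` is greedy and does not cross newlines by default, two such headers on the same line would be matched as one span. Making the group lazy, `(.*?)`, is one possible adjustment for that case.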



Test case here:



http://codepad.org/oA2rtNP9
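
The question also mentions paragraph tags deeper inside the header text, which a single wrapping-paragraph pattern does not reach. One possible approach, sketched here with a hypothetical helper name and an assumed sample string, is to match each header element as a whole with preg_replace_callback and strip the paragraph tags only within that match. It assumes headers are not nested inside each other.

```php
<?php
// Hypothetical helper: removes every <p ...> and </p> tag found inside
// h1-h6 elements and leaves paragraphs elsewhere untouched.
function strip_p_inside_headers(string $html): string
{
    // Match a whole header element; \1 forces the closing tag to have
    // the same level as the opening one. The "s" flag lets "." span newlines.
    $headerPattern = '/<h([1-6])\b[^>]*>.*?<\/h\1>/is';

    $result = preg_replace_callback(
        $headerPattern,
        function (array $m): string {
            // Inside the matched header only, drop opening and closing
            // paragraph tags; \b keeps tags such as <pre> from matching.
            return preg_replace('/<\/?p\b[^>]*>/i', '', $m[0]) ?? $m[0];
        },
        $html
    );

    return $result ?? $html;
}

// Sample input (assumed): a nested paragraph inside the header and an
// ordinary paragraph after it.
$str = '<h2><p>Lorem <span><p>deep</p></span> Ipsum</p></h2><p>Keep me</p>';
echo strip_p_inside_headers($str), "\n";
```

For anything beyond small fragments like this, a DOM parser is likely the more robust choice, in line with the caveat at the top of the answer.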

