keiG Posted July 11, 2013
$preg='/<h2><a href="(.*)">([^>]*)<\/a><\/h2>/iu';
$from='<h2><a href="anything here">Ā b čc ļāļāā</a></h2>';
preg_match_all($preg, $from, $matches);
Without diacritics this worked great, but with them it returns no result. Any ideas?
daGrevis Posted July 11, 2013
Don't parse HTML with regexes.
Džei Posted July 11, 2013
http://simplehtmldom.sourceforge.net/ this will help you ;)
l27 Posted July 11, 2013
The correct version would be:
$preg='#<h2><a href="([^"]*)">([^>]*)</a></h2>#';
$from='<h2><a href="anything here">Ā b čc ļāļāā</a></h2>';
preg_match_all($preg, $from, $matches);
I haven't tried it with modifiers. I usually normalize all HTML tags and stray characters first, so I don't have to worry about upper and lower case. You can see the result at l2d.lv, where the legal acts from likumi.lv have been processed with regular expressions.
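One possible explanation for the empty result, sketched under an assumption the thread does not confirm: the subject string was not valid UTF-8 (for example, the script file was saved in a single-byte encoding). With the `u` modifier, PCRE rejects malformed UTF-8, so `preg_match_all()` returns `false` rather than zero matches, and `preg_last_error()` reports `PREG_BAD_UTF8_ERROR`. The helper name `extractHeadingLinks` and the `[^<]*` class for the link text are illustrative choices, not something posted above.

```php
<?php
// Hypothetical helper (not from the thread): extracts href and link text
// from <h2><a href="...">...</a></h2> fragments using a /u pattern.
function extractHeadingLinks(string $html): array
{
    // [^"]* keeps the href from running past its closing quote;
    // [^<]* stops the link text at the next tag.
    $pattern = '#<h2><a href="([^"]*)">([^<]*)</a></h2>#iu';

    $count = preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);
    if ($count === false) {
        // With /u, a subject that is not valid UTF-8 makes the call fail
        // instead of returning zero matches.
        $reason = preg_last_error() === PREG_BAD_UTF8_ERROR
            ? 'subject is not valid UTF-8'
            : 'PCRE error code ' . preg_last_error();
        throw new RuntimeException($reason);
    }

    $links = [];
    foreach ($matches as $m) {
        $links[] = ['href' => $m[1], 'text' => $m[2]];
    }
    return $links;
}

// Valid UTF-8 input (assumes this script file itself is saved as UTF-8).
$from = '<h2><a href="anything here">Ā b čc ļāļāā</a></h2>';
$links = extractHeadingLinks($from);

// Simulated single-byte input: "\xC2" followed by a space is malformed UTF-8.
$broken = "<h2><a href=\"x\">\xC2 b</a></h2>";
try {
    extractHeadingLinks($broken);
    $failure = null;
} catch (RuntimeException $e) {
    $failure = $e->getMessage();
}
```

If the error turns out to be `PREG_BAD_UTF8_ERROR`, converting the input to UTF-8 before matching (or saving the source file as UTF-8) is the usual remedy; checking the return value against `false` is what distinguishes "no match" from "match failed".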
draugz Posted July 12, 2013
"Don't parse HTML with regexes."
It all depends on the situation. If you know the page is generated by some XSLT-like template engine that will always produce its output as XHTML, then I see no problem with parsing HTML :)
v3rb0 Posted July 12, 2013
Does PHP still not support UTF-8 everywhere, so you have to juggle between Unicode and single-byte character sets yourself?
daGrevis Posted July 12, 2013
Unicode is a pain in the arse everywhere.
keiG (Author) Posted July 12, 2013
Thanks! I've already picked up PHP Simple HTML DOM Parser.
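For comparison, a minimal sketch of the DOM-based route the thread settles on. It uses PHP's bundled `DOMDocument` and `DOMXPath` (the DOM extension must be enabled) rather than PHP Simple HTML DOM Parser, so it runs without a third-party download. That swap, the helper name `headingLinksViaDom`, and the `<meta>` charset prefix are assumptions for illustration, not what keiG actually ran.

```php
<?php
// Hypothetical helper (not from the thread): collects href and text of
// every <a> that sits directly inside an <h2>, using the DOM extension.
function headingLinksViaDom(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting PHP warnings
    // for imperfect markup, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);

    // Assumption: the input is UTF-8. The meta tag tells libxml so;
    // without an encoding hint it may misread multi-byte characters.
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $links = [];
    foreach ($xpath->query('//h2/a[@href]') as $a) {
        $links[] = [
            'href' => $a->getAttribute('href'),
            'text' => $a->textContent,
        ];
    }
    return $links;
}

$from = '<h2><a href="anything here">Ā b čc ļāļāā</a></h2>';
$domLinks = headingLinksViaDom($from);
```

Unlike the regex, this does not depend on the exact spacing, attribute order, or letter case of the tags, which is the practical argument behind "don't parse HTML with regexes".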