php.lv forumi

Finding an img src with PHP on a lazy-loaded page


Pieduriens


Hello.

 

From a Tripadvisor profile page I am trying to scrape the review author's avatar image, but the site uses lazy loading: the real img src is kept not in the attribute but in the footer, and the images are only loaded on page load. As a result I am getting nowhere with PHP Simple HTML DOM. I cannot select anything, because there is nothing there to select.

 

The avatar div:

<div class="avatar profile_CF48B2B4A31B43EC96F0561F498CE6BF ">
    <a onclick="">
        <img id="lazyload_-247847544_0" height="74" width="74" class="avatar potentialFacebookAvatar avatarGUID:CF48B2B4A31B43EC96F0561F498CE6BF" src="http://media-cdn.tripadvisor.com/media/photo-l/05/f3/67/c3/lilrazzy.jpg" />
    </a>
</div>

And there is a JS array in the footer where each img#lazyload_-* is paired with the img src I am after:

var lazyImgs = [
{"id":"lazyload_-759354445_0","tagType":"img","scroll":true,"priority":100,"data":"http://media-cdn.tripadvisor.com/media/photo-l/05/f3/67/c3/lilrazzy.jpg"}
,   {"id":"lazyload_-759354445_1","tagType":"img","scroll":true,"priority":100,"data":"http://c1.tacdn.com/img2/icons/gray_flag.png"}
,   {"id":"lazyload_-759354445_2","tagType":"img","scroll":true,"priority":100,"data":"http://media-cdn.tripadvisor.com/media/photo-l/01/2a/fd/98/avatar.jpg"}
,   {"id":"lazyload_-759354445_3","tagType":"img","scroll":true,"priority":100,"data":"http://c1.tacdn.com/img2/icons/gray_flag.png"}
,   {"id":"lazyload_-759354445_4","tagType":"img","scroll":true,"priority":100,"data":"http://media-cdn.tripadvisor.com/media/photo-l/01/2e/70/5e/avatar036.jpg"}
,   {"id":"lazyload_-759354445_5","tagType":"img","scroll":false,"priority":100,"data":"http://c1.tacdn.com/img2/x.gif"}


And so on.

How do I extract the img src I am after from a remote URL?

Pieduriens.


var lazyImgs = [{"id":""....]

This is JSON. Use PHP to cut the array definition out of the page, and json_decode will give you a PHP array in which each image id is tied to the real avatar URL.

After that, match <img id="lazyload_-247847544_0" ... > against the id from the array. That is how you get the avatar URL you need.
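
A minimal sketch of that lookup, assuming the array text has already been cut out of the page. The $json string here is a shortened stand-in built from the first two entries quoted in the question, and $urlById is a hypothetical name:

```php
<?php
// Stand-in for the text cut out of the page (first two entries from the question).
$json = '[{"id":"lazyload_-759354445_0","tagType":"img","scroll":true,"priority":100,"data":"http://media-cdn.tripadvisor.com/media/photo-l/05/f3/67/c3/lilrazzy.jpg"}'
      . ',{"id":"lazyload_-759354445_1","tagType":"img","scroll":true,"priority":100,"data":"http://c1.tacdn.com/img2/icons/gray_flag.png"}]';

// Passing true makes json_decode return associative arrays instead of objects.
$entries = json_decode($json, true);

// Build a lookup table: img id => real image URL.
$urlById = array_column($entries, 'data', 'id');

echo $urlById['lazyload_-759354445_0'], "\n";
```

With the table in place, the id attribute of any placeholder img can be used directly as the key.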

Edited by Kasspars

// 'te_url' is a placeholder for the page URL
$c = file_get_contents('te_url');

// Find where the array definition starts
$p1 = strpos( $c, 'var lazyImgs =' ) + 14;

// I am assuming here that no other array is defined inside this array. At least in your example that seems to be the case.
// That means our array is closed by the ] character
$p2 = strpos( $c, ']', $p1 );

$images = json_decode( substr( $c, $p1, $p2 - $p1 ) );
var_dump($images);

Edited by Kasspars

Thanks for the answer, Kaspars. I really like this strategy for getting what I want, but..

From:

$url = 'http://www.tripadvisor.com/Hotel_Review-g274965-d952833-Reviews-Ezera_Maja-Liepaja_Kurzeme_Region.html#REVIEWS';
$html = file_get_contents($url);
$p1 = strpos( $html, 'var lazyImgs =' ) + 14;
$p2 = strpos( $html, ']', $p1 );
$images = json_decode( substr( $html, $p1, $p2 - $p1 ));
var_dump($images);

I get a bare NULL.
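
json_decode returns NULL whenever the input is not valid JSON, so one way to narrow this down is to ask PHP for the reason with json_last_error_msg() and to look at the exact string being decoded. A rough sketch, where decodeOrFail is a hypothetical helper and $page is a made-up stand-in for the downloaded page:

```php
<?php
// Hypothetical helper: decode a JSON array or explain why that failed.
function decodeOrFail(string $raw): array
{
    $value = json_decode($raw, true);
    if (json_last_error() !== JSON_ERROR_NONE || !is_array($value)) {
        throw new RuntimeException('json_decode failed: ' . json_last_error_msg());
    }
    return $value;
}

// Stand-in for the downloaded page (an assumption, not real Tripadvisor output).
$page = 'var lazyImgs = [{"id":"lazyload_1_0","data":"http://example.com/a.jpg"}];';
$p1 = strpos($page, 'var lazyImgs =') + 14;
$p2 = strpos($page, ']', $p1);

// Try the cut used above, then the same cut one character longer.
foreach ([$p2 - $p1, $p2 - $p1 + 1] as $length) {
    $raw = substr($page, $p1, $length);
    try {
        $images = decodeOrFail($raw);
        echo 'OK, ', count($images), " entries\n";
    } catch (RuntimeException $e) {
        echo $e->getMessage(), ' for: ', $raw, "\n";
    }
}
```

In this sketch the first cut stops just before the closing bracket and is reported as a syntax error, while the cut that is one character longer decodes.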


Do you seriously not have any environment available where you can run JS? Even plain node?

I am not even sure I understood this question correctly..

But back on topic:

Thanks to Kaspars. The JSON was not valid because the ']' at the very end was not being included.

There we go, I get the result:


$url = 'http://www.tripadvisor.com/Hotel_Review-g274965-d952833-Reviews-Ezera_Maja-Liepaja_Kurzeme_Region.html#REVIEWS';
$html = file_get_contents($url);
$p1 = strpos( $html, 'var lazyImgs =' ) + 14;
$p2 = strpos( $html, ']', $p1  );
$raw = substr( $html, $p1, $p2 - $p1 ) . ']';
$images = json_decode($raw);
echo '<pre>';
print_r($images);
echo '</pre>';
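
To go from the decoded array to the avatar URLs only, the ids of the img tags inside the avatar divs still have to be matched against it. A possible sketch using PHP's built-in DOMDocument and DOMXPath rather than PHP Simple HTML DOM; the markup and example.com URLs below are invented for illustration, and it assumes the ids in the markup and in lazyImgs come from the same page load:

```php
<?php
// Hypothetical page fragment; a real page would come from file_get_contents().
$html = <<<'HTML'
<div class="avatar profile_X">
    <a onclick=""><img id="lazyload_1_0" height="74" width="74" class="avatar" src="x.gif" /></a>
</div>
<img id="lazyload_1_1" class="flag" src="x.gif" />
<script>
var lazyImgs = [
{"id":"lazyload_1_0","tagType":"img","data":"http://example.com/avatar.jpg"}
,   {"id":"lazyload_1_1","tagType":"img","data":"http://example.com/flag.png"}
];
</script>
HTML;

// 1. Cut out and decode the lazyImgs array, keeping the closing bracket.
$p1 = strpos($html, 'var lazyImgs =') + 14;
$p2 = strpos($html, ']', $p1);
$urlById = array_column(json_decode(substr($html, $p1, $p2 - $p1 + 1), true), 'data', 'id');

// 2. Find every <img> inside a div whose class list contains "avatar".
libxml_use_internal_errors(true);   // real-world HTML is rarely valid
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);
$imgs = $xpath->query('//div[contains(concat(" ", normalize-space(@class), " "), " avatar ")]//img');

// 3. Resolve each placeholder img to its real URL via its id.
$avatars = [];
foreach ($imgs as $img) {
    $id = $img->getAttribute('id');
    if (isset($urlById[$id])) {
        $avatars[] = $urlById[$id];
    }
}
print_r($avatars);
```

Here only the img inside the avatar div is resolved; the flag icon is skipped because it sits outside such a div.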

Edited by Pieduriens

> Parsing or syntactic analysis is the process of analysing a string of symbols, either in natural language or in computer languages, according to the rules of a formal grammar.

 

Right now you are trying to do that in a very, very bad, low-level way that will break the first time the input changes even slightly. I recommend using a ready-made parser that parses JavaScript.
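
Short of a full JavaScript parser, a middle ground would be to look for the matching closing bracket while keeping track of nesting and string literals, so that a ']' inside a URL or a nested array does not end the cut early. This is only a sketch of that idea, not a JavaScript parser; extractJsArray is a hypothetical helper and it assumes the value is plain JSON with double-quoted strings:

```php
<?php
/**
 * Hypothetical helper: return the text of the array assigned in
 * `var <name> = [...]`, or null if it cannot be found.
 */
function extractJsArray(string $source, string $name): ?string
{
    $pattern = '/\bvar\s+' . preg_quote($name, '/') . '\s*=\s*\[/';
    if (!preg_match($pattern, $source, $m, PREG_OFFSET_CAPTURE)) {
        return null;
    }
    $start = $m[0][1] + strlen($m[0][0]) - 1;   // offset of the opening "["
    $depth = 0;
    $inString = false;
    for ($i = $start, $n = strlen($source); $i < $n; $i++) {
        $ch = $source[$i];
        if ($inString) {
            if ($ch === '\\') {
                $i++;                  // skip the escaped character
            } elseif ($ch === '"') {
                $inString = false;
            }
        } elseif ($ch === '"') {
            $inString = true;
        } elseif ($ch === '[') {
            $depth++;
        } elseif ($ch === ']' && --$depth === 0) {
            return substr($source, $start, $i - $start + 1);
        }
    }
    return null;   // brackets never balanced
}

// Made-up input with a "]" inside a string and a nested array.
$page = 'x; var lazyImgs = [{"id":"a","data":"http://example.com/a]b.jpg","tags":["x","y"]}]; var other = [];';
$images = json_decode(extractJsArray($page, 'lazyImgs'), true);
echo $images[0]['data'], "\n";
```

It still depends on the variable name and on the value being valid JSON, so it narrows the fragility rather than removing it.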


It is JSON, JavaScript Object Notation. By now it is as much of a standard as XML. There is nothing wrong with this approach; it is a very quick way to get the images you need.

Your node variant will also break as soon as Tripadvisor changes the name of the lazyImgs variable or the structure it outputs.
