emsy Posted April 3, 2010 (edited)
A few days ago I had an idea, and I can't tell whether it's a good or a lousy one: build a bot that walks through web pages on its own and collects symbols (characters), storing them in an SQL table. What I don't understand is how to make the bot crawl the web automatically, read the HTML it gets back, and analyse every symbol in it; saving the results is really the trivial part. Maybe someone can help or suggest something! =]
daGrevis Posted April 3, 2010
What's the point?
emsy (Author) Posted April 3, 2010
I'm sure a use for it could be found =]
m8t Posted April 3, 2010
There are certainly several ways to do this, but here is one. I won't claim it crawls pages and their links automatically; instead it goes through a fixed set of links taken from a .txt file. The idea is this: create a .txt file with several links. Have the bot read that file line by line and, for each line, fetch the content of the corresponding page (file_get_contents()). Once that's done you can, if you like, simply add the content to the database. If you want to pull out a particular keyword or something similar, use the strpos() function.
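The steps described in this post could be sketched roughly as follows. This is only a minimal illustration, not m8t's actual code: the helper names readUrlList(), containsKeyword() and scanUrls() are made up for the example, as is the file name links.txt in the usage comment. The fetcher is passed in as a parameter so the logic can be tried without network access; by default it falls back to file_get_contents(), which needs allow_url_fopen enabled to read URLs.

```php
<?php
// Hypothetical helper: read one URL per line, skipping blank lines.
function readUrlList(string $path): array
{
    if (!is_readable($path)) {
        return [];
    }
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return [];
    }
    // Trim stray whitespace and drop lines that end up empty.
    return array_values(array_filter(array_map('trim', $lines), 'strlen'));
}

// strpos() returns 0 for a match at the very start, so compare with false.
function containsKeyword(string $content, string $keyword): bool
{
    return strpos($content, $keyword) !== false;
}

// Hypothetical helper: fetch each URL and record whether the keyword occurs.
// $fetch is any callable taking a URL and returning the body or false.
function scanUrls(array $urls, string $keyword, ?callable $fetch = null): array
{
    $fetch = $fetch ?? 'file_get_contents';
    $hits = [];
    foreach ($urls as $url) {
        $content = @$fetch($url); // suppress warnings for unreachable pages
        if (!is_string($content)) {
            continue;
        }
        $hits[$url] = containsKeyword($content, $keyword);
    }
    return $hits;
}

// Usage (assumes a links.txt file next to the script):
// print_r(scanUrls(readUrlList('links.txt'), 'php'));
```

Note that strpos() is case-sensitive; stripos() is the case-insensitive variant if that matters for the keyword.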
briedis Posted April 3, 2010 (edited)
Use a regular expression to pull all the links in <a> tags out of the page, store them somewhere, then visit them. And so on... You can certainly find examples of such crawlers on Google. Here's one example:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.urlyourstart.com");
curl_setopt($ch, CURLOPT_TIMEOUT, 30); // time out after 30 seconds
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the page as a string instead of printing it
$result = curl_exec($ch);
curl_close($ch);

// Search the result fetched from the starting site
if ($result) {
    // Matches only http://www. links; change the pattern for your usage
    preg_match_all('/<a href="(http:\/\/www\.[^0-9].+?)"/', $result, $output, PREG_SET_ORDER);
    foreach ($output as $item) {
        // Each $item holds the full match and the captured URL
        print_r($item);
        // Now add the link to your database and loop so the engine never stops
    }
}
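To make the "store them somewhere, then visit them" part concrete, here is a hedged sketch of a queue-based crawl loop. It is not from the thread: extractLinks(), crawl() and fetchWithCurl() are hypothetical names, the regular expression is a simplified pattern that only catches absolute http(s) links in quoted href attributes, and the $maxPages limit is an assumption added so the loop terminates. An in-memory array stands in for the database.

```php
<?php
// Hypothetical helper: collect absolute http(s) links from <a href="..."> tags.
function extractLinks(string $html): array
{
    $pattern = '#<a\s[^>]*?href\s*=\s*["\'](https?://[^"\']+)["\']#i';
    if (!preg_match_all($pattern, $html, $matches)) {
        return [];
    }
    // Remove duplicates while keeping first-seen order.
    return array_values(array_unique($matches[1]));
}

// Hypothetical fetcher built on cURL; returns the body or false on failure.
function fetchWithCurl(string $url)
{
    $ch = curl_init($url);
    if ($ch === false) {
        return false;
    }
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 30);
    return curl_exec($ch);
}

// Breadth-first crawl: take a URL from the queue, fetch it, queue its links.
// $fetch is any callable taking a URL and returning HTML or false.
function crawl(string $startUrl, callable $fetch, int $maxPages = 10): array
{
    $queue = [$startUrl];
    $visited = [];
    while ($queue && count($visited) < $maxPages) {
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue; // already processed
        }
        $visited[$url] = true;
        $html = $fetch($url);
        if (!is_string($html)) {
            continue; // fetch failed, nothing to parse
        }
        foreach (extractLinks($html) as $link) {
            if (!isset($visited[$link])) {
                $queue[] = $link;
            }
        }
    }
    return array_keys($visited);
}

// Usage (requires the curl extension and network access):
// print_r(crawl('http://www.urlyourstart.com', 'fetchWithCurl', 5));
```

The visited set is what stops the bot from looping forever between pages that link to each other; in a real setup it would live in the SQL table rather than in memory.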
emsy (Author) Posted April 3, 2010 (edited)
Fatal error: Call to undefined function curl_int()
Yet extension=php_curl.dll appears to be enabled in php.ini :s
briedis Posted April 3, 2010
"Fatal error: Call to undefined function curl_int() Yet extension=php_curl.dll appears to be enabled in php.ini :s"
Did you restart Apache? Check whether there is another php.ini in some other folder. I use XAMPP, and I had to edit apache/bin/php.ini, not php/php.ini.
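One way to find out which php.ini is really in use, instead of searching folders by hand, is to ask PHP itself. The sketch below is an illustration rather than something posted in the thread; curlDiagnostics() is a hypothetical name, while the functions it calls (php_ini_loaded_file(), php_ini_scanned_files(), extension_loaded(), function_exists(), ini_get()) are standard PHP. It should be opened through the web server, since the command-line PHP and Apache can load different php.ini files.

```php
<?php
// Hypothetical helper: gather the facts needed to debug a missing curl_init().
function curlDiagnostics(): array
{
    return [
        // Full path of the php.ini actually loaded, or false if none was.
        'loaded_ini'       => php_ini_loaded_file(),
        // Additional .ini files that were parsed, or false if none.
        'scanned_ini'      => php_ini_scanned_files(),
        // Whether the curl extension is loaded in this PHP process.
        'curl_extension'   => extension_loaded('curl'),
        // Whether curl_init() is callable (note the spelling: init, not int).
        'curl_init_exists' => function_exists('curl_init'),
        // Directory PHP searches for extension files such as php_curl.dll.
        'extension_dir'    => ini_get('extension_dir'),
    ];
}

foreach (curlDiagnostics() as $name => $value) {
    echo $name, ': ', var_export($value, true), PHP_EOL;
}
```

If loaded_ini points at a different file than the one that was edited, the extension line has to be enabled in that file and Apache restarted afterwards.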
emsy (Author) Posted April 3, 2010
1) How can I restart Apache without restarting the PC?
2) I found one php.ini, but it was meant for sending e-mails.
waplet Posted April 3, 2010
Which server bundle are you using?
emsy (Author) Posted April 3, 2010
"Which server bundle are you using?"
AppServ
waplet Posted April 3, 2010
1) Start > All Programs > AppServ > Restart Apache
2) C:\windows\php.ini
emsy (Author) Posted April 3, 2010
I restarted it, no change :s
briedis Posted April 3, 2010
"I restarted it, no change :s"
Isn't there a php.ini in the AppServ folder? For example, C:\AppServ\php5\php.ini
waplet Posted April 3, 2010
No
zintis8789 Posted April 3, 2010
C:\WINDOWS\php.ini