How to Grab a Simple Website?

polywave 2006-12-15 12:57
Dear ManInNet,

I am trying to grab a website with a very simple structure. It has several pages, and every page address ends with page*.html, where * is a number starting from 1.

I started with the following code:

$page=$_GET['.-page'];
$requesturl = "http://www.funisland.com/Funisland_GameBoy_Adv.-page$page.html";

if ($fp = fopen("$requesturl","r")){
while(!feof($fp))
{
$line = $line.fgets($fp,256);
}
fclose($fp);
}

However, the page number is not passed as an ID, so I don't think the code should be written like this. I think I should use a variable, but I don't know how to write it. Could you help me with this?

Best regards,
polywave

maninnet 2006-12-15 14:28
Try this:
$page=$_GET['page'];
$requesturl = "http://www.funisland.com/Funisland_GameBoy_Adv.-page".$page.".html";
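
The concatenation above can also be wrapped in a small function that checks the page number first. This is only a sketch: build_page_url is a hypothetical name that does not appear in the thread, and falling back to page 1 for missing or invalid input is an assumption.

```php
<?php
// Hypothetical helper (not from the thread): build the remote page address
// from a page number, falling back to page 1 for missing or invalid input.
function build_page_url($rawPage)
{
    // Accept only whole numbers >= 1; filter_var returns false for anything else.
    $page = filter_var(
        $rawPage,
        FILTER_VALIDATE_INT,
        array('options' => array('min_range' => 1))
    );
    if ($page === false || $page === null) {
        $page = 1; // assumed default
    }
    return "http://www.funisland.com/Funisland_GameBoy_Adv.-page" . $page . ".html";
}

// In a web request, ?page=3 would select page 3.
$requesturl = build_page_url(isset($_GET['page']) ? $_GET['page'] : null);
```

Validating the number before building the address keeps arbitrary text out of the remote request.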

polywave 2006-12-15 15:26
Dear ManInNet,

Thanks for your reply. However, it is still not working, and I cannot get to the other pages.

I am posting the full code below. Please take a look.

<html><head><meta http-equiv="Content-Type" content="text/html; charset=big5"><title>Funisland Game Cheat</title>
</head>
<body>
Funisland Game Cheat<br>
<hr>
<?
$page=$_GET['page'];
$requesturl = "http://www.funisland.com/Funisland_GameBoy_Adv.-page".$page.".html";

if ($fp = fopen("$requesturl","r")){
while(!feof($fp))
{
$line = $line.fgets($fp,256);
}
fclose($fp);
}

eregi("<!-- Google ad code here ------>(.*)<!--// Google Adsense Banner Here //-->",$line,$matches);
$line=$matches[1];
$allowed_tags = "<a>,<hr>,<br>";
$line = strip_tags($line, $allowed_tags);

echo($line);
?>
<hr><br>
powered by ManInNet
</body></html>


Best regards,
Gabriel

never4get 2006-12-16 00:48
Try this:

<html><head><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" /><title>Funisland Game Cheat</title>
</head>
<body>
Funisland Game Cheat<br>
<hr>
<?
$page=$_GET['page'];
if (!$page) {
   $page = "1";
}
$requesturl = "http://www.funisland.com/Funisland_GameBoy_Adv.-page".$page.".html";
$line = file_get_contents($requesturl);
eregi("<!-- Google ad code here ------>(.*)<!--// Google Adsense Banner Here //-->",$line,$matches);
$line=$matches[1];
$line = eregi_replace("HREF=\"","HREF=\"http://www.funisland.com/",$line);
$allowed_tags = "<a>,<hr>,<br>";
$line = strip_tags($line, $allowed_tags);
echo($line);
?>
<hr><br>
powered by ManInNet
</body></html>
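
For anyone running this on a current PHP version: the ereg family (eregi, eregi_replace) was deprecated in PHP 5.3 and removed in PHP 7.0. The sketch below shows one possible equivalent of the extract-and-clean step, using preg_match and str_ireplace. The helper names extract_between and clean_fragment and the sample HTML are made up for illustration; only the two comment markers and the base address come from the code above.

```php
<?php
// Hypothetical helper: cut out the text between two literal markers.
// Returns an empty string when the markers are not found.
function extract_between($html, $startMarker, $endMarker)
{
    // preg_quote makes the markers literal; "i" ignores case, "s" lets "." span lines.
    $pattern = '/' . preg_quote($startMarker, '/') . '(.*)' . preg_quote($endMarker, '/') . '/is';
    if (preg_match($pattern, $html, $matches)) {
        return $matches[1];
    }
    return '';
}

// Hypothetical helper: prefix every HREF=" with a base address,
// then keep only the <a>, <hr> and <br> tags.
function clean_fragment($fragment, $baseUrl)
{
    $fragment = str_ireplace('href="', 'HREF="' . $baseUrl, $fragment);
    return strip_tags($fragment, '<a><hr><br>');
}

// Made-up sample standing in for the downloaded page.
$sample = '<p>menu</p><!-- Google ad code here ------>'
        . '<b>List</b><A HREF="Funisland_cheat1.html">Game 1</A><br>'
        . '<!--// Google Adsense Banner Here //-->footer';

$body = extract_between(
    $sample,
    '<!-- Google ad code here ------>',
    '<!--// Google Adsense Banner Here //-->'
);
echo clean_fragment($body, 'http://www.funisland.com/');
```

Because the helpers work on a string, they can be tried on saved HTML before pointing them at the live site.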

polywave 2006-12-16 11:08
Dear ManInNet,

Thanks for your reply. However, it does not work, because all the links point to the original website rather than to localhost. I tried to make them point to localhost, but it still does not work because I do not know how to add the parameter at the end. Could you help me further? There should be another PHP file to view the content. I am thinking of linking to that content PHP file, but I have the same problem: I don't know how to set the 1, 2, etc. in page*.html. Please advise if you can help.

Best regards,
polywave

polywave 2006-12-18 12:46
Dear ManInNet,

Is there no way to do that? Please advise.

Best regards,
polywave

polywave 2006-12-20 14:17
Dear ManInNet,

Is there any update? I have tried different approaches without success. If the website is not written in PHP, or its addresses contain no ?, = or &, and each page is just a very simple address linking to other simple addresses, how do I link main.php and view.php? I went through all the posts in the forum, but none of the examples is like that. Is there any way to extract the website content? Please advise.

Best regards,
polywave

maninnet 2006-12-20 15:00
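Actually, I don't quite understand what you want to do. Was there a misunderstanding from the start? Taking
http://www.funisland.com
as an example, is it divided into different categories?
Action Games, Casino/Card Games, Classic Games, Educational Games .......

If so, you need three files. The first is the top-level category file, which can be plain HTML; call it index.html. The second is the category listing; to keep things simple, reuse the name gamelist.php. Likewise, the third is gamedetail.php.

The simplest way to write the first file, index.html, is: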
<html><body>
<li><a href=gamelist.php?id=4>Action Games</a></li>
<li><a href=gamelist.php?id=4>Card Games</a></li>
.
.
</body></html>

gamelist.php then begins as follows:

$page=$_GET['id'];
$requesturl = "http://www.funisland.com/gamelist.php?id=$page";

if ($fp = fopen("$requesturl","r")){
while(!feof($fp))
{
$line = $line.fgets($fp,256);
}
fclose($fp);

gamedetail.php is handled the same way, and its opening part is the same too:
$page=$_GET['id'];
$requesturl = "http://www.funisland.com/gamedetail.php?id=$page";

if ($fp = fopen("$requesturl","r")){
while(!feof($fp))
{
$line = $line.fgets($fp,256);
}
fclose($fp);

polywave 2006-12-20 15:46
Dear ManInNet,

I am sorry that I did not make my question clear.

I fully understand your comments and script, and I understand that we need two or three separate PHP files for the index, the list, and the details. It is similar to the newspaper PHP examples posted elsewhere in this forum.

My question is about websites such as
http://www.funisland.com/Funisland_GameBoy_Adv.-page1.html
or
http://www.nintendocc.com/Content/Game_Boy_Advance_tips_and_codes

You will find that the addresses and link structure of these websites are very simple, and there is no variable to read the information from, as in $page=$_GET['id'];

Take
http://www.funisland.com/Funisland_GameBoy_Adv.-page1.html
as an example.

There are two kinds of links.
One is the page link, running from
http://www.funisland.com/Funisland_GameBoy_Adv.-page1.html
http://www.funisland.com/Funisland_GameBoy_Adv.-page2.html
.
.
http://www.funisland.com/Funisland_GameBoy_Adv.-page20.html
or even more pages.

I can solve this with an index file, index.html, as you suggest. However, if the site adds more pages, we have to change the index again, so I don't think it is the best way.

The other kind is the game detail link, e.g. http://www.funisland.com/Funisland_cheat1.html. I think we need another file, gamedetail.php, to get the details.

The problem then is how to link gamelist.php and gamedetail.php, as there is no $_GET['id'] in gamelist.php that I can pass on to gamedetail.php. Also, the address http://www.funisland.com/Funisland_cheat1.html is not of the form xxxx.php?id=. When I call gamedetail.php from gamelist.php, the href= has to start with gamedetail.php.

In http://www.nintendocc.com/Content/Game_Boy_Advance_tips_and_codes, all the links are even named after the games, and there is no pattern relating them. So how do we link the pages if the website is structured like this?

I hope you can understand my question. This kind of website is very simple, but it seems very difficult to grab.

Thanks for your help.

Best regards,
polywave

maninnet 2006-12-20 21:22
I understand now. In gamelist.php there are two different kinds of link: one links back to gamelist.php, and the other needs to link to gamedetail.php.

Your question is how to implement this.

maninnet 2006-12-20 21:41
I suggest a slight change.

Only two files are needed: index.html and gamelist.php.

<html>
<body>
<li><A HREF="gamelist.php?url=Funisland_letterNum-page1.html">#</A></li>
<li><A HREF="gamelist.php?url=Funisland_letterA-page1.html">A</A></li>
<li><A HREF="gamelist.php?url=Funisland_letterB-page1.html">B</A></li>
.
.
.
</body></html>

gamelist.php then begins as follows:

$url=$_GET['url'];
$requesturl = "http://www.funisland.com/".$url;

if ($fp = fopen("$requesturl","r")){
while(!feof($fp))
{
$line = $line.fgets($fp,256);
}
fclose($fp);

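This is the key point: the game list and the game detail pages have to be handled separately.

if (eregi("Funisland_cheat",$url))
{
// handle the detail page here
}
else
{
// handle the game list here
// Note: use eregi_replace to replace HREF=" with HREF="gamelist.php?url=
}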
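
The two steps described here (deciding which kind of page was requested, and sending every link back through gamelist.php) can be sketched for current PHP versions, where the ereg functions no longer exist. The helper names page_kind and route_links_through are hypothetical, and encoding the address with rawurlencode is an addition that the thread does not use.

```php
<?php
// Hypothetical helper: decide which kind of page a relative address refers to,
// using the same "Funisland_cheat" test as the post above (case-insensitive).
function page_kind($url)
{
    return stripos($url, 'Funisland_cheat') !== false ? 'detail' : 'list';
}

// Hypothetical helper: rewrite every HREF="..." so it points at the local
// script and carries the original address in the url parameter.
function route_links_through($html, $script)
{
    return preg_replace_callback(
        '/href="([^"]*)"/i',
        function ($m) use ($script) {
            // rawurlencode keeps characters such as & or ? inside the parameter value.
            return 'HREF="' . $script . '?url=' . rawurlencode($m[1]) . '"';
        },
        $html
    );
}

// Made-up fragment standing in for part of a downloaded list page.
$fragment = '<A HREF="Funisland_cheat1.html">Game 1</A>';
echo route_links_through($fragment, 'gamelist.php'), "\n";
echo page_kind('Funisland_cheat1.html'), "\n";
```

With the links rewritten this way, the target site needs no ?id= style addresses: the whole relative address becomes the parameter.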

polywave 2006-12-21 17:16
Dear ManInNet,

Thank you very much for your help. It is working.

Best regards,
polywave

maninnet 2006-12-21 17:54
I tidied things up quickly and did not polish the page layout.

There are two files, gb.php and gamelist.php.

gb.php contains:

<html>
<head><title>Cheat codes</title></head>
<body>
Browse Cheat Codes by Game Title
<hr>
<li><A HREF="gamelist.php?url=Funisland_letterNum-page1.html">#</A></li>
<?
$i = 65;
for ($i = 65;$i <=90; $i++)
{
  echo ("<li><A HREF=gamelist.php?url=Funisland_letter".chr($i)."-page1.html>".chr($i)."</A></li>");
}
?>
<hr>
</body></html>

chr(65) is A.
Adding 1 each time takes you up to Z (chr(90)); it is just a way to avoid typing every line.
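
The same list can also be produced with range('A', 'Z'), which avoids the character codes altogether. A small sketch, in which index_links is a hypothetical name and the attribute values are quoted, unlike in gb.php:

```php
<?php
// Hypothetical helper: build the index list items for "#" and for A to Z.
function index_links()
{
    $items = array(
        '<li><a href="gamelist.php?url=Funisland_letterNum-page1.html">#</a></li>'
    );
    // range('A', 'Z') yields the 26 capital letters in order.
    foreach (range('A', 'Z') as $letter) {
        $items[] = '<li><a href="gamelist.php?url=Funisland_letter' . $letter
                 . '-page1.html">' . $letter . '</a></li>';
    }
    return $items;
}

echo implode("\n", index_links()), "\n";
```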

gamelist.php contains:

<html>
<head><title>Cheat code</title></head>
<body>
<?

$url=$_GET['url'];

$requesturl = "http://www.funisland.com/".$url;

if ($fp = fopen("$requesturl","r")){
while(!feof($fp))
{
  $line = $line.fgets($fp,256);
}
fclose($fp);
if (eregi("Funisland_cheat",$url))
{
  eregi("<!--// Google Adsense Banner Here //-->(.*)<!--// Google Adsense Banner Here //-->",$line,$matches);
  $line=$matches[1];
  $allowed_tags = "<a>,<b>,<hr>,<br>,<h2>,<h3>,<h4>";
  $line = strip_tags($line, $allowed_tags);
  $line = eregi_replace("HREF=\"","HREF=\"gamelist.php?url=",$line);
  echo ("Please read the following information carefully: <hr>".$line);
}
else
{
  eregi("<!--// Google Adsense Banner Here //-->(.*)<!--// Google Adsense Banner Here //-->",$line,$matches);
  $line=$matches[1];
  $allowed_tags = "<a>,<hr>,<br>";
  $line = strip_tags($line, $allowed_tags);
  $line = eregi_replace("HREF=\"","HREF=\"gamelist.php?url=",$line);
  echo ("Please select your favorite: <hr>".$line);
}
}
?>
<br><a href=gb.php>return to index</a>
<hr>
</body>
</html>

The branch is chosen by checking whether the url contains the word "cheat".
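
One thing worth considering: since gamelist.php passes whatever arrives in url straight to the remote request, it may be safer to accept only addresses shaped like the ones seen in this thread. The check below is a sketch under that assumption. is_allowed_page is a hypothetical name, and both the pattern and the fallback address are guesses that should be adjusted to the real site.

```php
<?php
// Hypothetical check: accept only addresses that look like
// Funisland_letterA-page1.html or Funisland_cheat1.html.
// No slashes or colons are allowed, so another host or directory cannot be requested.
function is_allowed_page($url)
{
    return is_string($url)
        && preg_match('/^Funisland_[A-Za-z0-9_.\-]+\.html\z/', $url) === 1;
}

$url = isset($_GET['url']) ? $_GET['url'] : '';
if (!is_allowed_page($url)) {
    $url = 'Funisland_letterA-page1.html'; // assumed fallback page
}
$requesturl = "http://www.funisland.com/" . $url;
```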

polywave 2006-12-22 10:00
Thank you very much for your help. Now I understand more.

