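Backing up Bloglines feeds OPML information with PHP
Below is the bare-bones PHP script I currently use to fetch my Bloglines export data. Set the two variables at the top to match your own setup. The first one in particular, $cookieVars, is the personalized cookie information that has to be submitted to Bloglines; you can get the cookie contents from the communication log that a download tool such as FlashGet shows while downloading.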
<?php
// Supply your cookie information here.
// You can get it from the log of a download tool such as FlashGet.
$cookieVars = "BloglinesLang=0; clickedFoldersubtree=subid13152697%... ... ... BloglinesEmail=..; BloglinesTracker=MXk4...hEVw";
// The filename to save exported opml information.
$desFileName = "./bloglines_export.xml";
$hostname = "www.bloglines.com";
$httpRequest = "";
$httpRequest .= "GET /export HTTP/1.1\r\n";
$httpRequest .= "Host: ".$hostname."\r\n";
$httpRequest .= "Accept: */*\r\n";
$httpRequest .= "Cookie: ".$cookieVars."\r\n";
$httpRequest .= "User-Agent: Mozilla/4.0 (compatible; MSIE 5.00; Windows 98)\r\n";
$httpRequest .= "Pragma: no-cache\r\n";
$httpRequest .= "Cache-Control: no-cache\r\n";
$httpRequest .= "Connection: close\r\n"; // replaces a duplicated Accept header; asks the server to close the socket when done
$httpRequest .= "\r\n";
echo "connecting to the host ".$hostname."<br />";
flush();
$resString = "";
// Open socket to URL
// $errno and $errstr are filled in by fsockopen(); the 30-second timeout is an arbitrary choice
$sHnd = @fsockopen($hostname, 80, $errno, $errstr, 30);
if(!$sHnd){
$resString = "Connect to host ".$hostname." failed: ".$errstr."[errno=".$errno."]";
}
else{
echo "connected. send http request:<br />";
echo nl2br($httpRequest);
flush();
fputs ($sHnd, $httpRequest);
// Get source
$httpResponseHeader = "";
if(!feof($sHnd)){
$tmpLine = fgets($sHnd,4096);
$httpResponseHeader .= $tmpLine;
$tmpStatus = explode(" ",$tmpLine);
if(!isset($tmpStatus[1]) || $tmpStatus[1] != 200){
$resString = "Response Error:".$tmpLine;
}
else{
$bIsChunked = false;
$contentLen = 0;
while(!feof($sHnd)){
$tmpLine = fgets($sHnd,4096);
if($tmpLine === false || $tmpLine == "\r\n"){ // read error, or the blank line that ends the headers
break;
}
else{
$tmpArr = explode(":",$tmpLine,2); // split on the first colon only, so values containing ":" stay intact
if( count($tmpArr) == 2){
if( strtoupper(trim($tmpArr[0])) == "TRANSFER-ENCODING" &&
strtoupper(trim($tmpArr[1])) == "CHUNKED" ){
$bIsChunked = true;
}
else if(strtoupper(trim($tmpArr[0])) == "CONTENT-LENGTH"){
$contentLen = intval(trim($tmpArr[1]));
}
}
}
$httpResponseHeader .= $tmpLine;
}
echo nl2br($httpResponseHeader);
flush();
$opmlContent = "";
if(!feof($sHnd)){
if($contentLen > 0){
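// fread() may return fewer bytes than requested on a socket; stream_get_contents() keeps reading up to $contentLen
$opmlContent = stream_get_contents( $sHnd, $contentLen );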
}
else if($bIsChunked){
echo "response is chunked.<br />";
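// each chunk starts with its size in hex on a line of its own
$chunk_size = (integer)hexdec(trim(fgets( $sHnd, 4096 )));
while(!feof($sHnd) && $chunk_size > 0) {
// read the whole chunk; a single fread() may stop short on a socket
$opmlContent .= stream_get_contents( $sHnd, $chunk_size );
fgets( $sHnd, 4096 ); // skip the \r\n that ends the chunk
$chunk_size = (integer)hexdec(trim(fgets( $sHnd, 4096 )));
}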
}
}
fclose($sHnd);
echo "opml information OK.<br />";
//echo nl2br(htmlspecialchars($opmlContent));
flush();
$fd = fopen($desFileName,"w+");
if(!$fd){
$resString = "write OPML information to file ".$desFileName." failed.";
}
else{
@fputs($fd,$opmlContent);
@fclose($fd);
$resString = "OPML information saved successfully.";
}
}
}
}
echo "<font style='color:red;font-size:14px;'>".$resString."</font><br />";
?>
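The chunk-decoding loop in the script can also be pulled out into a standalone function that works on a string, which makes it easier to try without a live connection. The following is only a minimal sketch: the name decodeChunkedBody is hypothetical and not part of the script above, and it assumes the whole chunked body has already been read into memory.

```php
<?php
// Hypothetical helper (not part of the original script): decodes an
// HTTP/1.1 "chunked" body that has already been read into a string.
function decodeChunkedBody($raw)
{
    $decoded = "";
    $pos = 0;
    $len = strlen($raw);
    while ($pos < $len) {
        // Each chunk starts with its size in hex, terminated by CRLF.
        $lineEnd = strpos($raw, "\r\n", $pos);
        if ($lineEnd === false) {
            break; // truncated or malformed input
        }
        $sizeLine = substr($raw, $pos, $lineEnd - $pos);
        // Drop optional chunk extensions such as "1a;name=value".
        $parts = explode(";", $sizeLine, 2);
        $sizeHex = trim($parts[0]);
        if (!preg_match('/^[0-9a-fA-F]+$/', $sizeHex)) {
            break; // not a valid chunk-size line
        }
        $size = hexdec($sizeHex);
        if ($size == 0) {
            break; // a zero-length chunk marks the end of the body
        }
        $decoded .= substr($raw, $lineEnd + 2, $size);
        // Move past the size line's CRLF, the chunk data, and its trailing CRLF.
        $pos = $lineEnd + 2 + $size + 2;
    }
    return $decoded;
}
```

For example, decodeChunkedBody("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n") returns "Wikipedia".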
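Because Bloglines sends its HTTP response with chunked transfer encoding, the body needs an extra decoding step. You can also send the HTTP request with the more modular cURL, or with the Advanced HTTP Client Class I provide here; the result is the same.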
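As a rough illustration of the cURL route, here is a minimal sketch that assumes the PHP cURL extension is installed. The function name fetchOpmlWithCurl and its parameters are made up for this example; the 30-second timeout is an arbitrary choice. libcurl decodes chunked responses itself, so no manual chunk handling is needed.

```php
<?php
// Illustrative sketch only: fetchOpmlWithCurl is a hypothetical name,
// and the function requires the PHP cURL extension when it is called.
function fetchOpmlWithCurl($url, $cookieVars, $desFileName)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_COOKIE         => $cookieVars, // same cookie string as in the socket version
        CURLOPT_USERAGENT      => "Mozilla/4.0 (compatible; MSIE 5.00; Windows 98)",
        CURLOPT_FAILONERROR    => true,        // treat HTTP status >= 400 as a failure
        CURLOPT_TIMEOUT        => 30,          // assumption: 30 seconds is enough
    ));
    $body = curl_exec($ch);
    if ($body === false) {
        return "Request failed: " . curl_error($ch);
    }
    if (file_put_contents($desFileName, $body) === false) {
        return "write OPML information to file " . $desFileName . " failed.";
    }
    return "OPML information saved successfully.";
}
```

It could be called as echo fetchOpmlWithCurl("http://www.bloglines.com/export", $cookieVars, $desFileName); with the same two variables that are set at the top of the script.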
Resources: http://www.opml.org/, http://www.vchelp.net/itbookreview/view_paper.asp?paper_id=1464

