Using PHP cURL to read RSS feed XML


Feeds are popular, and PHP developers often need an easy way to fetch and parse them. Before a feed can be parsed it has to be downloaded, and a popular tool for handling HTTP connections is the cURL library. With it you can fetch a feed from any page on the Internet, even when the data is password protected or the request has to POST some data.

Basics

If you are using cURL on Windows, you just need to uncomment the cURL extension line in the "php.ini" file. On Linux, you need to compile PHP with --with-curl, or install the PHP cURL package that your distribution provides.
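Before going further, it can help to confirm that the extension is actually loaded. The following is a minimal sketch that uses only the core functions extension_loaded() and curl_version(); the helper name curlStatus() is made up for this example.

```php
<?php
// Hypothetical helper: report whether the cURL extension is available.
function curlStatus(): string
{
    if (!extension_loaded('curl')) {
        return 'cURL extension is not loaded; check php.ini or your PHP build.';
    }
    $info = curl_version(); // array describing the linked libcurl
    return 'cURL is available, libcurl version ' . $info['version'];
}

echo curlStatus(), PHP_EOL;
```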

Step-By-Step Guide

The first step is to connect to an RSS feed with cURL. To do that, initialize a cURL handle as follows:

$ch = curl_init("https://localhost/curl/rss.xml");

Next, set a few connection options with curl_setopt(). It takes three parameters: the cURL handle, the option key, and the option value. CURLOPT_RETURNTRANSFER makes curl_exec() return the response as a string instead of printing it, and CURLOPT_HEADER set to 0 keeps the response headers out of that output:

curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, 0);
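The introduction mentions password-protected pages and POST requests, and both are handled through the same option mechanism. The sketch below is one possible setup, not a recommended configuration: the helper name makeFeedHandle(), the timeout values, the credentials, and the POST field are all placeholders chosen for illustration.

```php
<?php
// Hypothetical helper: build a handle with a few commonly used options.
function makeFeedHandle(string $url)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_HEADER         => false, // keep response headers out of the body
        CURLOPT_FOLLOWLOCATION => true,  // follow HTTP redirects
        CURLOPT_CONNECTTIMEOUT => 5,     // seconds to wait for the connection
        CURLOPT_TIMEOUT        => 15,    // seconds allowed for the whole transfer
    ]);
    return $ch;
}

$ch = makeFeedHandle('https://localhost/curl/rss.xml');

// For a password-protected feed (placeholder credentials):
curl_setopt($ch, CURLOPT_USERPWD, 'user:secret');

// To send POST data instead of a plain GET (placeholder field):
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(['key' => 'value']));
```

curl_setopt_array() sets several options in one call; setting CURLOPT_POSTFIELDS is enough to turn the request into a POST.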

The next step may sound time-consuming, but it is not. Here we execute the request, wait for the response, and close the handle.

$data = curl_exec($ch);
curl_close($ch);
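curl_exec() returns false when the transfer fails, so it is worth checking the result before parsing it. The sketch below wraps the steps so far in a hypothetical helper named fetchFeed(); the choice to throw a RuntimeException and to treat HTTP status 400 and above as failures is an assumption made for this example.

```php
<?php
// Hypothetical helper: fetch a URL and fail loudly instead of returning false.
function fetchFeed(string $url): string
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5);

    $data   = curl_exec($ch);                        // false on failure
    $error  = curl_error($ch);                       // human-readable message
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE); // e.g. 200 or 404
    unset($ch);                                      // release the handle

    if ($data === false) {
        throw new RuntimeException('cURL error: ' . $error);
    }
    if ($status >= 400) {
        throw new RuntimeException('HTTP error: ' . $status);
    }
    return $data;
}

try {
    $data = fetchFeed('https://localhost/curl/rss.xml');
} catch (RuntimeException $e) {
    echo 'Could not fetch feed: ', $e->getMessage(), PHP_EOL;
}
```

The sketch releases the handle with unset() rather than curl_close(), because curl_close() has had no effect since PHP 8.0.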

Half of the work is now done: the page we connected to contains the RSS feed as XML, and that XML string is in the $data variable. All we need to do to parse it is the following:

$doc = new SimpleXmlElement($data, LIBXML_NOCDATA);
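The SimpleXmlElement constructor throws an exception when the string is not well-formed XML. If you would rather get a null back and log the reasons, one option is simplexml_load_string() together with libxml's internal error handling. This is a sketch under that assumption; the helper name parseFeedXml() and the sample XML are invented for the example.

```php
<?php
// Hypothetical helper: parse an XML string, returning null when it is malformed.
function parseFeedXml(string $data): ?SimpleXMLElement
{
    // Collect libxml errors instead of emitting PHP warnings
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($data, 'SimpleXMLElement', LIBXML_NOCDATA);
    if ($doc === false) {
        foreach (libxml_get_errors() as $err) {
            error_log('XML error: ' . trim($err->message));
        }
        libxml_clear_errors();
    }
    // Restore the previous error-handling mode
    libxml_use_internal_errors($previous);
    return $doc === false ? null : $doc;
}

// Made-up sample standing in for the string returned by curl_exec()
$sample = '<rss version="2.0"><channel><title>Demo</title></channel></rss>';
$doc = parseFeedXml($sample);
echo $doc === null ? 'Invalid XML' : (string) $doc->channel->title, PHP_EOL;
```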

$doc is now an instance of SimpleXmlElement. To inspect the contents of the object, use the following code:

print_r($doc);

Next comes checking that the document follows the RSS standard, which is quite easy. Keep in mind that RSS documents have a <channel> node. If a document contains this node, chances are that it is an RSS document. The following code checks for it:

if (isset($doc->channel))
{
    parseRSS($doc);
}

The parseRSS() function reads the data out of the SimpleXmlElement object and is written as follows:

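function parseRSS($xml)
{
    // Print the feed title
    echo $xml->channel->title . "\n";

    // Walk through every <item> in the channel
    $cnt = count($xml->channel->item);
    for ($i = 0; $i < $cnt; $i++)
    {
        $url   = $xml->channel->item[$i]->link;
        $title = $xml->channel->item[$i]->title;
        $desc  = $xml->channel->item[$i]->description;

        // Print the title, link and description as plain text
        echo $title . " (" . $url . ")\n" . $desc . "\n";
    }
}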
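If the feed data is going to be placed in an HTML page, it is safer to escape it first, since titles and descriptions come from a third party. The sketch below is a variation on parseRSS(), not a replacement for it: the helper name collectItems(), the array keys, and the sample feed are all made up for illustration.

```php
<?php
// Hypothetical variant of parseRSS(): return the items as plain arrays.
function collectItems(SimpleXMLElement $xml): array
{
    $items = [];
    foreach ($xml->channel->item as $item) {
        $items[] = [
            'title' => (string) $item->title,
            'link'  => (string) $item->link,
            'desc'  => (string) $item->description,
        ];
    }
    return $items;
}

// Made-up feed standing in for the downloaded XML
$xml = new SimpleXMLElement(
    '<rss version="2.0"><channel><title>Demo</title>'
    . '<item><title>First</title><link>https://example.com/1</link>'
    . '<description>One</description></item></channel></rss>'
);

foreach (collectItems($xml) as $item) {
    // Escape each value before placing it in HTML
    printf(
        "<a href=\"%s\">%s</a> %s\n",
        htmlspecialchars($item['link']),
        htmlspecialchars($item['title']),
        htmlspecialchars($item['desc'])
    );
}
```

Casting each node to a string keeps SimpleXMLElement objects out of the returned array, which makes the result easier to cache or encode as JSON.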