Level: Advanced
Jack D Herrington (jherr@pobox.com), Senior Software Engineer, Leverage Software Inc.
12 May 2006 (updated 03 Apr 2007)
Learn how to build a Really Simple Syndication (RSS) reader with Asynchronous JavaScript and XML (Ajax), as well as a Web component that you can place on any Web site to look at the articles in the RSS feeds.
The first thing I thought about doing when I read about requesting Extensible Markup Language (XML) from JavaScript code on a Web page was to get some RSS and display it. But I immediately ran into the security restriction of XMLHTTP (XML over Hypertext Transfer Protocol), where a page that comes from www.mysite.com can't request pages from anywhere other than www.mysite.com. My plans to build a generic RSS reader entirely in the page were dashed. But Web 2.0 is all about ingenuity, and solving the problem of how to create an RSS reader with XMLHTTP teaches a lot about how to program the 2.0 Web.
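To make the restriction concrete, here is a minimal sketch of the comparison a browser effectively performs before it allows an XMLHTTP request. The sameOrigin helper is a hypothetical name used only for illustration, not a browser API, and the sketch assumes a runtime that provides the URL class (current browsers and Node.js do; browsers from the time this article was written did not).

```javascript
// Sketch only: two URLs share an origin when scheme, host, and port all match.
// sameOrigin is a hypothetical helper, not part of any browser API.
function sameOrigin(pageUrl, targetUrl) {
  var page = new URL(pageUrl);
  // Relative targets such as 'list.php' are resolved against the page URL.
  var target = new URL(targetUrl, pageUrl);
  return page.origin === target.origin;
}

// A page on www.mysite.com asking for its own list.php versus a remote feed.
console.log(sameOrigin('http://www.mysite.com/rss/index.html', 'list.php'));
console.log(sameOrigin('http://www.mysite.com/rss/index.html',
                       'http://slashdot.org/slashdot.rdf'));
```

The first call reports a match and the second does not, which is why the reader cannot fetch remote feeds directly and needs PHP pages on its own server to fetch them.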
This article walks through the construction of an Ajax-based RSS reader using both XMLHTTP and <script> tags as the transport mechanisms.
Building the server side
The server side of the equation comes in two pieces. The first is the database, and the second is a set of PHP pages that allow you to add feeds, request the list of feeds, and get the articles associated with a particular feed. I'll start with the database.
The database
For this article, I use a MySQL database. Listing 1 shows the schema.
Listing 1. The schema for the database
CREATE TABLE rss_feeds (
rss_feed_id MEDIUMINT NOT NULL AUTO_INCREMENT,
url TEXT NOT NULL,
name TEXT NOT NULL,
last_update TIMESTAMP,
PRIMARY KEY ( rss_feed_id )
);
CREATE TABLE rss_articles (
rss_feed_id MEDIUMINT NOT NULL,
link TEXT NOT NULL,
title TEXT NOT NULL,
description TEXT NOT NULL
);
There are two tables. The rss_feeds table contains the list of feeds. And the rss_articles table contains the list of articles associated with each feed. When the system updates the articles, it deletes all of the current articles associated with the given rss_feed_id and then refreshes the table with the new set of articles.
The database wrapper
The next step is to wrap the database with a set of PHP classes that build the business logic for the application. This starts with the DatabaseConnection singleton that manages the connection to the database, as shown in Listing 2.
Listing 2. The DatabaseConnection singleton in rss_db.php
<?php
// Install the DB module using 'pear install DB'
require_once 'DB.php';
require_once 'XML/RSS.php';
class DatabaseConnection
{
public static function get()
{
static $db = null;
if ( $db == null )
$db = new DatabaseConnection();
return $db;
}
private $_handle = null;
private function __construct()
{
$dsn = 'mysql://root:password@localhost/rss';
$this->_handle =& DB::Connect( $dsn, array() );
}
public function handle()
{
return $this->_handle;
}
}
This is a standard PHP singleton pattern. It connects to the database and returns a handle through the handle method. The two require_once statements are another interesting part of this code. The first references the PHP Extension and Application Repository (PEAR) DB module that connects to the database. The second references the XML_RSS module that parses RSS feeds. I admit it; I used modules here because I'm far too lazy to worry about parsing all of the different forms of RSS. If you don't have these modules installed, use these on the command line:
% pear install DB
And:
% pear install XML_RSS
The DB module is commonly installed, but the XML_RSS module isn't.
The next step is to build a class that wraps the list of feeds so that you can add a feed, get a list of feeds, and so on. Listing 3 shows this class.
Listing 3. The FeedList class in rss_db.php
class FeedList {
public static function add( $url ) {
if ( FeedList::getFeedByUrl( $url ) != null ) return;
$db = DatabaseConnection::get()->handle();
$rss =& new XML_RSS( $url );
$rss->parse();
$info = $rss->getChannelInfo();
$isth = $db->prepare( "INSERT INTO rss_feeds VALUES( null, ?, ?, null )" );
$db->execute( $isth, array( $url, $info['title'] ) );
$info = FeedList::getFeedByUrl( $url );
Feed::update( $info['rss_feed_id'] );
}
public static function getAll( ) {
$db = DatabaseConnection::get()->handle();
$res = $db->query( "SELECT * FROM rss_feeds" );
$rows = array();
while( $res->fetchInto( $row, DB_FETCHMODE_ASSOC ) ) { $rows []= $row; }
return $rows;
}
public static function getFeedInfo( $id ) {
$db = DatabaseConnection::get()->handle();
$res = $db->query( "SELECT * FROM rss_feeds WHERE rss_feed_id=?",
array( $id ) );
while( $res->fetchInto( $row, DB_FETCHMODE_ASSOC ) ) { return $row; }
return null;
}
public static function getFeedByUrl( $url ) {
$db = DatabaseConnection::get()->handle();
$res = $db->query( "SELECT * FROM rss_feeds WHERE url=?", array( $url ) );
while( $res->fetchInto( $row, DB_FETCHMODE_ASSOC ) ) { return $row; }
return null;
}
public static function update() {
$db = DatabaseConnection::get()->handle();
$usth1 = $db->prepare( "UPDATE rss_feeds SET name='' WHERE rss_feed_id=?" );
$usth2 = $db->prepare( "UPDATE rss_feeds SET name=? WHERE rss_feed_id=?" );
$res = $db->query(
"SELECT rss_feed_id,name FROM rss_feeds WHERE last_update < NOW() - INTERVAL 10 MINUTE" );
while( $res->fetchInto( $row, DB_FETCHMODE_ASSOC ) ) {
Feed::update( $row['rss_feed_id'] );
$db->execute( $usth1, array( $row['rss_feed_id'] ) );
$db->execute( $usth2, array( $row['name'], $row['rss_feed_id'] ) );
}
}
}
The add method adds a feed to the list and updates the feed. The getAll method returns a list of all of the feeds. The getFeedInfo method returns the information for a given feed. The getFeedByUrl method does the same thing as the getFeedInfo method, but it uses the URL of the feed as the key. And the update method calls Feed::update on each feed that hasn't been updated in the last ten minutes, then rewrites the feed's name so that MySQL refreshes the last_update timestamp.
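The ten-minute staleness rule can be sketched outside the database as follows. This is an illustration of the rule only, not the article's implementation: the staleFeeds function and the { id, lastUpdate } objects are hypothetical stand-ins for rows in rss_feeds, with times held as millisecond timestamps.

```javascript
// Ten minutes in milliseconds, matching the interval described in the text.
var MAX_AGE_MS = 10 * 60 * 1000;

// Returns the feeds whose lastUpdate is more than ten minutes before `now`.
// The feed objects are a hypothetical in-memory stand-in for rss_feeds rows.
function staleFeeds(feeds, now) {
  return feeds.filter(function (feed) {
    return now - feed.lastUpdate > MAX_AGE_MS;
  });
}
```

Passing the current time in as a parameter, rather than reading the clock inside the function, keeps the rule easy to check with fixed values.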
Listing 4 shows the Feed class, which is the final class in the business logic classes. It has methods that deal with an individual feed.
Listing 4. The Feed class from rss_db.php
class Feed
{
public static function update( $id )
{
$db = DatabaseConnection::get()->handle();
$info = FeedList::getFeedInfo( $id );
$rss =& new XML_RSS( $info['url'] );
$rss->parse();
$dsth = $db->prepare( "DELETE FROM rss_articles WHERE rss_feed_id=?" );
$db->execute( $dsth, array( $id ) );
$isth = $db->prepare( "INSERT INTO rss_articles VALUES( ?, ?, ?, ? )" );
foreach ($rss->getItems() as $item) {
$db->execute( $isth, array( $id,
$item['link'], $item['title'],
$item['description'] ) );
}
}
public static function get( $id )
{
$db = DatabaseConnection::get()->handle();
$res = $db->query( "SELECT * FROM rss_articles WHERE rss_feed_id=?",
array( $id ) );
$rows = array();
while( $res->fetchInto( $row, DB_FETCHMODE_ASSOC ) )
{
$rows []= $row;
}
return $rows;
}
}
?>
The update method uses the RSS parser to get the feed and update the database. And the get method returns the current contents of the articles table for the given feed.
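The delete-then-insert behavior of the update method can be pictured with plain arrays. The sketch below is an in-memory analogy under stated assumptions, not the PHP code: replaceArticles is a hypothetical name, and the article objects carry a feedId field in place of the rss_feed_id column.

```javascript
// Returns a new article list in which every article for feedId has been
// replaced by the freshly parsed items; other feeds' articles are untouched.
// Hypothetical in-memory analogy for the DELETE + INSERT pair in Feed::update.
function replaceArticles(articles, feedId, items) {
  // Equivalent of: DELETE FROM rss_articles WHERE rss_feed_id=?
  var kept = articles.filter(function (article) {
    return article.feedId !== feedId;
  });
  // Equivalent of: one INSERT per parsed item
  var added = items.map(function (item) {
    return {
      feedId: feedId,
      link: item.link,
      title: item.title,
      description: item.description
    };
  });
  return kept.concat(added);
}
```

Replacing the whole set avoids having to work out which articles are new, at the cost of rewriting rows that have not changed.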
The PHP service pages
The first page you need to use is the add.php page, in Listing 5, to add feeds to the list.
Listing 5. add.php
<?php
require_once 'rss_db.php';
header( 'Content-type: text/xml' );
FeedList::add( $_GET['url'] );
?>
<done />
This is a very simple wrapper around the add method on the FeedList class. The <done /> tag at the bottom satisfies the need for this page to return some type of XML to the caller. As written, it is returned whether or not the feed was actually added, so it signals completion rather than success or failure.
The next page is the list.php page, in Listing 6, that returns the list of feeds in the database.
Listing 6. list.php
<?php
require_once 'rss_db.php';
$rows = FeedList::getAll();
$dom = new DomDocument();
$dom->formatOutput = true;
$root = $dom->createElement( 'feeds' );
$dom->appendChild( $root );
foreach( $rows as $row )
{
$an = $dom->createElement( 'feed' );
$an->setAttribute( 'id', $row['rss_feed_id'] );
$an->setAttribute( 'link', $row['url'] );
$an->setAttribute( 'name', $row['name'] );
$root->appendChild( $an );
}
header( "Content-type: text/xml" );
echo $dom->saveXML();
?>
To make it easier to write the XML properly, I use the Document Object Model (DOM) functions in the PHP core to create an XML DOM on the fly. Then I use the saveXML function to format it for output.
If I browse to this page using my Firefox® browser, I see the output in Figure 1.
Figure 1. The feed list XML page
Of course, this is after I have added eight feeds to the list.
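Part of what the DOM functions do in list.php is escape special characters in attribute values, which is easy to get wrong when concatenating strings by hand. The sketch below shows roughly what that escaping involves. The escapeXmlAttr and feedElement functions are hypothetical helpers for illustration and are not part of the article's code.

```javascript
// Escapes the characters that would break a double-quoted XML attribute.
// The ampersand must be replaced first so later entities are not re-escaped.
function escapeXmlAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Builds one <feed> element in the shape list.php produces.
function feedElement(feed) {
  return '<feed id="' + escapeXmlAttr(feed.id) +
         '" link="' + escapeXmlAttr(feed.link) +
         '" name="' + escapeXmlAttr(feed.name) + '"/>';
}
```

A feed URL with a query string such as ?a=1&b=2 is a common case: without escaping, the bare ampersand makes the whole document invalid XML and responseXML comes back empty on the client.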
The final page I need to build before getting into the client side of the system is the read.php page, in Listing 7, that returns the articles associated with a given feed ID.
Listing 7. read.php
<?php
require_once 'rss_db.php';
FeedList::update();
$rows = Feed::get( $_GET['id'] );
$dom = new DomDocument();
$dom->formatOutput = true;
$root = $dom->createElement( 'articles' );
$dom->appendChild( $root );
foreach( $rows as $row )
{
$an = $dom->createElement( 'article' );
$an->setAttribute( 'title', $row['title'] );
$an->setAttribute( 'link', $row['link'] );
$an->appendChild( $dom->createTextNode( $row['description'] ) );
$root->appendChild( $an );
}
header( "Content-type: text/xml" );
echo $dom->saveXML();
?>
This is very similar in form to the list.php page. I use the Feed class to get the list of articles. Then I use the XML DOM object to create the XML and output it. When I browse to this page in Firefox, I see the output in Figure 2.
Figure 2. The XML from the read.php page
That finishes the server side of the equation. Now I need to put together a Dynamic Hyper Text Markup Language (DHTML) page that uses Ajax to use these PHP pages.
Building the client
The next thing to do is build a client that uses the PHP pages. I'll build it in three phases so you can follow along. The first version, in Listing 8, displays a control that shows the list of feeds.
Listing 8. index2.html
<html> <head> <title>Ajax RSS Reader</title>
<style>
body { font-family: arial, verdana, sans-serif; }
</style>
<script>
var g_homeDirectory = 'http://localhost/rss/';
var req = null;
function processReqChange( handler ) {
if (req.readyState == 4 && req.status == 200 && req.responseXML ) {
handler( req.responseXML ); }
}
function loadXMLDoc( url, handler ) {
if(window.XMLHttpRequest) {
try { req = new XMLHttpRequest(); } catch(e) { req = false; }
}
else if(window.ActiveXObject)
{
try { req = new ActiveXObject("Msxml2.XMLHTTP"); } catch(e) {
try { req = new ActiveXObject("Microsoft.XMLHTTP"); } catch(e) { req = false; } }
}
if(req) {
req.onreadystatechange = function() { processReqChange( handler ); };
req.open("GET", url, true);
req.send("");
}
}
function parseFeedList( dom ) {
var elfl = document.getElementById( 'elFeedList' );
elfl.innerHTML = '';
var nl = dom.getElementsByTagName( 'feed' );
for( var i = 0; i < nl.length; i++ ) {
var nli = nl.item( i );
var id = nli.getAttribute( 'id' );
var link = nli.getAttribute( 'link' );
var name = nli.getAttribute( 'name' );
var elOption = document.createElement( 'option' );
elOption.value = id;
elOption.innerHTML = name;
elfl.appendChild( elOption );
}
}
function getFeedList()
{
loadXMLDoc( g_homeDirectory+'list.php', parseFeedList );
}
</script> </head> <body>
<select id="elFeedList"> </select>
<script> getFeedList(); </script>
</body> </html>
The page has one control on it, the <select> control. This control is filled by the getFeedList function, which requests the list.php page from the server. When the response arrives, the parseFeedList function adds the items to the <select> control.
When I browse to this in Firefox, I see the output in Figure 3.
Figure 3. The first version of the RSS reader
To get these first few feeds into the system, I use the MySQL interface to add them manually.
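The XML-reading half of parseFeedList can be separated from the page update, which makes it possible to exercise without a browser. The following is a sketch of that idea rather than code from the article: feedsFromDom is a hypothetical name, and it assumes only that the document passed in supports getElementsByTagName, item, and getAttribute, as an XMLHTTP responseXML document does.

```javascript
// Collects { id, link, name } objects from the <feed> elements of a
// list.php response. `dom` is the parsed XML document handed to the handler.
function feedsFromDom(dom) {
  var nodes = dom.getElementsByTagName('feed');
  var feeds = [];
  for (var i = 0; i < nodes.length; i++) {
    var node = nodes.item(i);
    feeds.push({
      id: node.getAttribute('id'),
      link: node.getAttribute('link'),
      name: node.getAttribute('name')
    });
  }
  return feeds;
}
```

A handler could then loop over the returned array to create the option elements, keeping the DOM-building code free of XML details.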
The next step is to display the content of the selected feed. Listing 9 shows the upgraded code.
Listing 9. index3.html
<html> <head> <title>Ajax RSS Reader</title>
<style>
body { font-family: arial, verdana, sans-serif; }
.title { font-size: 14pt; border-bottom: 1px solid black; }
.title a { text-decoration: none; }
.title a:hover { text-decoration: none; }
.title a:visited { text-decoration: none; }
.title a:active { text-decoration: none; }
.title a:link { text-decoration: none; }
.description { font-size: 9pt; margin-left: 20px; }
</style>
<script>
var g_homeDirectory = 'http://localhost/rss/';
var req = null;
function processReqChange( handler ) { ... }
function loadXMLDoc( url, handler ) { ... }
function parseFeed( dom ) {
var ela = document.getElementById( 'elArticles' );
ela.innerHTML = '';
var elTable = document.createElement( 'table' );
var elTBody = document.createElement( 'tbody' );
elTable.appendChild( elTBody );
var nl = dom.getElementsByTagName( 'article' );
for( var i = 0; i < nl.length; i++ ) {
var nli = nl.item( i );
var title = nli.getAttribute( 'title' );
var link = nli.getAttribute( 'link' );
var description = nli.firstChild.nodeValue;
var elTR = document.createElement( 'tr' );
elTBody.appendChild( elTR );
var elTD = document.createElement( 'td' );
elTR.appendChild( elTD );
var elTitle = document.createElement( 'h1' );
elTitle.className = 'title';
elTD.appendChild( elTitle );
var elTitleLink = document.createElement( 'a' );
elTitleLink.href = link;
elTitleLink.innerHTML = title;
elTitleLink.target = '_blank';
elTitle.appendChild( elTitleLink );
var elDescription = document.createElement( 'p' );
elDescription.className = 'description';
elDescription.innerHTML = description;
elTD.appendChild( elDescription );
}
ela.appendChild( elTable );
}
function parseFeedList( dom ) {
var elfl = document.getElementById( 'elFeedList' );
elfl.innerHTML = '';
var nl = dom.getElementsByTagName( 'feed' );
var firstId = null;
for( var i = 0; i < nl.length; i++ ) {
var nli = nl.item( i );
var id = nli.getAttribute( 'id' );
var link = nli.getAttribute( 'link' );
var name = nli.getAttribute( 'name' );
var elOption = document.createElement( 'option' );
elOption.value = id;
elOption.innerHTML = name;
elfl.appendChild( elOption );
if ( firstId == null ) firstId = id;
}
loadFeed( firstId );
}
function loadFeed( id ) { loadXMLDoc( g_homeDirectory+'read.php?id='+id, parseFeed ); }
function getFeedList() { loadXMLDoc( g_homeDirectory+'list.php', parseFeedList ); }
</script> </head> <body> <div style="width:600px;">
<select id="elFeedList"
onchange="loadFeed( this.options[this.selectedIndex].value )"> </select>
<div id="elArticles"> </div>
<script> getFeedList(); </script>
</div> </body> </html>
I omitted the processReqChange and loadXMLDoc functions because they are the same as before. The new code is in the loadFeed and parseFeed functions that request data from the read.php page, parse it, and add it to the page.
Figure 4 shows the output of this page in Firefox.
Figure 4. The upgraded page that shows the article list
The next step is to finish the page with the ability to add a feed to the list through the add.php page. This final code for the page is in Listing 10.
Listing 10. index.html
<html> <head> <title>Ajax RSS Reader</title>
<style>
...
</style>
<script>
var g_homeDirectory = 'http://localhost/rss/';
// The same transfer functions as before
function addFeed()
{
var url = prompt( "Url" );
loadXMLDoc( g_homeDirectory+'add.php?url='+encodeURIComponent( url ), parseAddReturn );
window.setTimeout( getFeedList, 1000 );
}
function loadFeed( id ) { loadXMLDoc( g_homeDirectory+'read.php?id='+id, parseFeed ); }
function getFeedList() { loadXMLDoc( g_homeDirectory+'list.php', parseFeedList ); }
</script> </head> <body> <div style="width:600px;">
<select id="elFeedList" onchange="loadFeed( this.options[this.selectedIndex].value )">
</select>
<input type="button" onclick="addFeed()" value="Add Feed..." />
<div id="elArticles"> </div>
<script> getFeedList(); </script>
</div> </body> </html>
Most of the code here is the same, but I have inserted a new Add Feed... button that opens a dialog box where you can enter a new URL to add to the feed list. To make it easy on myself, I have the browser wait for one second (the 1000-millisecond timeout) after sending the add request and then get the new feed list.
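A fixed delay is simple, but it can fire before a slow feed has finished being added. One alternative, sketched here under assumptions and not taken from the article's download, is to refresh the list from the add request's own completion handler. The makeAddFeed function is a hypothetical helper; it assumes loadXMLDoc( url, handler ) calls the handler once the response has arrived, as the article's version does.

```javascript
// Builds an addFeed function from its collaborators so that the logic can
// be exercised outside a browser. All three parameters are supplied by the
// caller; makeAddFeed itself is a hypothetical name, not article code.
function makeAddFeed(homeDirectory, loadXMLDoc, getFeedList) {
  return function addFeed(url) {
    if (!url) { return; } // prompt() returns null when the user cancels
    var requestUrl = homeDirectory + 'add.php?url=' + encodeURIComponent(url);
    // Refresh the feed list only after add.php has responded.
    loadXMLDoc(requestUrl, function () { getFeedList(); });
  };
}
```

In the page, this could be wired up as var addFeedUrl = makeAddFeed( g_homeDirectory, loadXMLDoc, getFeedList ); and called with the value returned by prompt. Because the article's loadXMLDoc keeps a single global request object, starting the list request from inside the handler also avoids two requests overlapping.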
Figure 5 shows the finished page.
Figure 5. The finished page
Now this is pretty cool. But I'm not satisfied because the XMLHTTP security prevents me from taking the JavaScript code from this page and copying it onto someone else's blog so that anyone can look at the feeds. To do that, I need to re-engineer the services to use the <script> tag and the JavaScript Object Notation (JSON) syntax.
Going from XML to JSON
For this article, I'm only going to allow the feeds to be viewed through the script syntax, although I really could go the whole way using script tags as the data transport mechanism. To get to the feeds, I first need the feed list encoded as JavaScript code. So I create a list_js.php page as shown in Listing 11.
Listing 11. list_js.php
<?php
require_once 'rss_db.php';
header( 'Content-type: text/javascript' );
$rows = FeedList::getAll();
$feeds = array();
foreach( $rows as $row )
{
$feed = "{ id:".$row['rss_feed_id'];
$feed .= ", link:'".$row['url']."'";
$feed .= ", name:'".$row['name']."' }";
$feeds []= $feed;
}
?>
setFeeds( [ <?php echo( join( ', ', $feeds ) ); ?> ] );
When I run this script on the command line, I see the output in Listing 12.
Listing 12. Output from list_js.php
setFeeds( [
{ id:1, link:'http://muttmansion.com/ds/index.xml', name:'Driving Sideways' },
{ id:2, link:'http://slashdot.org/slashdot.rdf', name:'Slashdot' },
{ id:3, link:'http://muttmansion.com/vl/index.xml', name:'Visible Light' },
{ id:4, link:'http://muttmansion.com/sor/index.xml', name:'Socks on a Rooster' },
{ id:5, link:'http://muttmansion.com/dd/index.xml', name:'Doxie Digest' },
{ id:6, link:'http://rss.cnn.com/rss/cnn_topstories.rss', name:'CNN.com' },
{ id:7, link:'http://rss.cnn.com/rss/cnn_world.rss', name:'CNN.com - World' },
{ id:8, link:'http://rss.cnn.com/rss/cnn_us.rss', name:'CNN.com - U.S.' } ] );
This output is well suited to loading through a <script> tag. When the browser loads it, the setFeeds function is called with the list of the feeds. That, in turn, sets up the <select> control and loads the first feed.
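Because list_js.php builds its output by string concatenation, a feed name containing an apostrophe would end the string literal early and break the script. The sketch below shows, in JavaScript rather than PHP, the general idea of letting a serializer do the quoting. The callbackScript function is a hypothetical helper, and using JSON.stringify here is an assumption of this sketch, not the technique the article's pages use.

```javascript
// Produces script text that calls functionName with data as its argument.
// JSON.stringify handles quoting and escaping of strings inside data.
// callbackScript is a hypothetical helper, not part of the article's code.
function callbackScript(functionName, data) {
  return functionName + '( ' + JSON.stringify(data) + ' );';
}

var script = callbackScript('setFeeds', [
  { id: 1, link: 'http://example.com/index.xml', name: "Jack's Feed" }
]);
console.log(script);
```

The printed call has the same shape as Listing 12, except that keys and strings are double-quoted, so the apostrophe in the name needs no special handling.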
I also need the equivalent of the read.php page that returns article data in JavaScript code instead of XML. Listing 13 shows the read_js.php page.
Listing 13. read_js.php
<?php
require_once 'rss_db.php';
function js_clean( $str )
{
$str = preg_replace( "/\'/", "", $str );
return $str;
}
FeedList::update();
$id = array_key_exists( 'id', $_GET ) ? $_GET['id'] : 1;
$rows = Feed::get( $id );
$items = array();
foreach( $rows as $row )
{
$js = "{ title:'".js_clean($row['title'])."'";
$js .= ", link:'".js_clean($row['link'])."'";
$js .= ", description:'".js_clean($row['description'])."' }";
$items []= $js;
}
?>
addFeed( <?php echo( $id ); ?>,
[ <?php echo( join( ', ', $items ) ); ?> ] );
Once again, after I run this script on the command line, I see the output in Listing 14.
Listing 14. Output from read_js.php
addFeed( 1,
[ { title:'War',
link:'http://www.muttmansion.com/ds/archives/002816.html',
description:'The...' }, ... ] );
I've truncated it here for brevity, but you get the point. The addFeed function is called with the ID of the feed and the article data encoded in JavaScript format.
With these new JavaScript-enabled services, I can now create a new page that uses the services. Listing 15 shows this new page.
Listing 15. script.html
<html> <head> <title>Script Component Test</title>
<style>
...
</style>
<script>
var g_homeDirectory = 'http://localhost/rss/';
function loadScript( url ) {
var elScript = document.createElement( 'script' );
elScript.src = url;
document.body.appendChild( elScript );
}
function addFeed( id, articles ) {
var ela = document.getElementById( 'elArticles' );
ela.innerHTML = '';
var elTable = document.createElement( 'table' );
var elTBody = document.createElement( 'tbody' );
elTable.appendChild( elTBody );
for( var a in articles ) {
var title = articles[a].title;
var link = articles[a].link;
var description = articles[a].description;
// Create elements as before...
}
ela.appendChild( elTable );
}
function setFeeds( feeds ) {
var elfl = document.getElementById( 'elFeedList' );
elfl.innerHTML = '';
var firstId = null;
for( var f in feeds ) {
var elOption = document.createElement( 'option' );
elOption.value = feeds[f].id;
elOption.innerHTML = feeds[f].name;
elfl.appendChild( elOption );
if ( firstId == null ) firstId = feeds[f].id;
}
loadFeed( firstId );
}
function loadFeed( id ) { loadScript( g_homeDirectory+'read_js.php?id='+id ); }
function getFeedList() { loadScript( g_homeDirectory+'list_js.php' ); }
</script> </head> <body>
<div class="rssControl"> <div class="rssControlTitle">
<select id="elFeedList"
onchange="loadFeed( this.options[this.selectedIndex].value )">
</select> </div> <div id="elArticles"> </div> </div>
<script> getFeedList(); </script>
</body> </html>
This page is similar to the original index.html page. However, instead of using the loadXMLDoc function, I use a new function, called loadScript, that creates a <script> tag dynamically. The <script> tag then loads the JavaScript code from the specified URL.
These script tags call the read_js.php and list_js.php pages. These pages, in turn, create JavaScript code that calls back to the setFeeds and addFeed functions in the host page.
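The callback flow can be simulated without a browser, which may help in seeing why the host page only needs to define two functions. This is a simulation for illustration only: runServiceScript is a hypothetical helper, and new Function stands in for what the browser does when it executes the body of a loaded <script> tag against the page's global functions.

```javascript
// Executes script text with setFeeds and addFeed in scope, approximating
// how a loaded <script> body finds those functions on the host page.
// runServiceScript is a hypothetical helper used only for this simulation.
function runServiceScript(scriptText, callbacks) {
  var run = new Function('setFeeds', 'addFeed', scriptText);
  run(callbacks.setFeeds, callbacks.addFeed);
}

// Text in the shape read_js.php returns (compare Listing 14).
var response = "addFeed( 1, [ { title:'War', " +
  "link:'http://www.muttmansion.com/ds/archives/002816.html', " +
  "description:'The...' } ] );";

runServiceScript(response, {
  setFeeds: function (feeds) { console.log('feeds:', feeds.length); },
  addFeed: function (id, articles) { console.log('feed', id, articles[0].title); }
});
```

The server decides which function is called and with what data; the page decides what those functions do, which is what lets the same services drive a component on any site.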
When I go to the page, my browser displays what is in Figure 6.
Figure 6. The RSS reader that uses <script> tags for the data
The big advantage of this code is that anyone can use the View Source command to view the script from the page and copy the code into their own pages. Then their pages will use the PHP services that return JavaScript code to update the page.
Conclusion
In this article, I demonstrated how to use two different techniques to access data dynamically from a Web page to create an RSS reader on the page. Hopefully, you can use the concepts and code provided here to enrich your own application without having to entirely retool your code. That's the real value of Ajax -- if you are familiar with Web technologies, it's a snap to upgrade the interactivity of your page with a few new services on the server side and a little code on the client side.
Download
| Description | Name | Size | Download method |
|---|---|---|---|
| Source code | x-ajaxrss-code.zip | 8KB | HTTP |