Creating your own database drivers in PHP

Have you ever come across a situation where you’re dying to make that API for access to the database, only to find out there are multiple database versions and syntaxes to support? Now ofcourse there’s the PDO library, but we don’t want to use that in our case. Why? Because we want to write our own drivers. This allows us to easily implement support for new databases like CouchDB and MongoDB. (We have to note that these two are not similar to classical relational databases, so we might want to write restrictions or fallbacks for them.)

How shall we proceed?

We are going to have a proper thought process on this. Rushing into programming rather abstract PHP isn’t something I can recommend.

    Database representation
  1. List the public methods / creating an abstract class
  2. Decide how to handle drivers
  3. Decide on a file structure
  4. Create a simple database with a test table
  5. Write protected methods
  6. Write an interface for the drivers
  7. Write the first driver
  8. Test the driver
  9. Design a proprietary database system
  10. Write proprietary driver
  11. Test the proprietary driver
  12. Review & refactor
  13. Conclusion

List the public methods / creating an abstract class

To make ourselves a really simple database layer, we’re going to stick with the following few public methods:

abstract class Db_Abstract{
 
	// Force Db class to define these methods
	abstract protected function _prepare($query);
	abstract protected function _query($query);
	abstract protected function _result(); // resource will be stored inside the driver object
	abstract protected function _isConnected();
	abstract protected function _connect($params);
	abstract protected function _affectedRows(); // resource will be stored inside the driver object
	abstract protected function _driver($driver);
 
	/** =Public methods
	 * These functions will define the behavior of our Database class
	 **/
 
	/*
	* @function database
	* @description Select the database
	* @param String $db
	* @return Bool True|False
	*/
	public function database($db){
		return $this->driver->_selectDb($db);
	}
 
	/*
	* @function select
	* @description Select table rows
	* @param Array $fields
	* @param String $table
	* @param String $arguments (optional)
	* @return Array $rows
	*/
	public function select($fields,$table,$arguments=""){
		return $this->driver->select($fields,$table,$arguments);
	}
 
	/*
	* @function insert
	* @description Insert new row(s) into the table
	* @param Array $keys
	* @param Array mixed (String $values | Array $values)
	* @param String $table
	* @return Int $affectedRows
	*/
	public function insert($keys,$values,$table){
		return $this->driver->insert($keys,$values,$table);
	}
 
	/*
	* @function update
	* @description Update row(s) in the table
	* @param Array $keys
	* @param Array mixed (String $values | Array $values)
	* @param String $table
	* @param String $arguments (optional)
	* @return Int $affectedRows
	*/
	public function update($keys,$values,$table,$arguments=""){
		return $this->driver->update($keys,$values,$table,$arguments);
	}
 
	/*
	* @function delete
	* @description Delete a row from the table
	* @param String $key
	* @param String $value
	* @param String $table
	* @return Int $affectedRows
	*/
	public function delete($key,$value,$table){
		return $this->driver->delete($key,$value,$table);
	}
}

As you can see, the Database Abstract purely gives us a new abstraction layer. It makes simple aliases for the functions within the driver. This allows us to have proper naming conventions across all the drivers.

Decide how to handle drivers

We must now decide how to work with the driver. What is handled by the driver and what is handled by the database class.

The driver must

  • Set up and maintain a connection with the database
  • Prepare queries
  • Execute queries
  • Store resources
  • Parse results
  • Insert, update and delete rows
  • Handle statistics about the database

The database class must

  • Provide naming conventions
  • Handle I/O between the database and the rest of your application

Decide on a file structure

Since we want to keep it as readable as possible, we’re going for a decent file structure that describes the contents of itself.

For now we’ll use the following naming conventions:

  • Database Abstract class and core class

    ./db.php

  • Database drivers

    ./drivers/{$driver}.php

We take for granted that ./ points to the __DIR__[1] magic constant of db.php.

Create a simple database with a test table

The first driver we will write will be one for MySQL, since I’m most familiar with the MySQL server. We’ll use the following simple test database:

SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
 
CREATE DATABASE `test` DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
USE `test`;
 
CREATE TABLE IF NOT EXISTS `test` (
  `key` VARCHAR(255) NOT NULL,
  `value` VARCHAR(255) NOT NULL,
  PRIMARY KEY  (`key`)
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;
 
INSERT INTO `test` (`key`, `value`) VALUES
('key1', 'value1'),
('key2', 'value2');

test

key value
key1 value1
key2 value2

Write protected methods

The next step is to provide the I/O layer by writing the protected methods. These methods cannot be called from outside the database class. They will parse input before it is handed over to the driver and prepare the output from the driver before it arrives at your application.

class Db extends Db_Abstract{
 
	var $driver;
 
	/*
	* @function __construct
	* @description Run on object creation
	* @param String $connection (input: 'driver://username:password@server:port/database')
	* @return void
	*/
	function __construct($connection=null){
		if(!defined('__DIR__')){
			define('__DIR__',dirname(__FILE__));
		}
		if(!empty($connection)){
			$connection = parse_url($connection);
			$this->_driver($connection['scheme']);
			$this->_connect($connection);
		}
	}
 
	/*
	* @function _escape
	* @description Emulate the behavior of mysql_real_escape_string() to prevent SQL Injection
	* @param String $query
	* @return String $query
	*/
	protected function _escape($query){
		$unsafe_tokens = array("\x00","\n","\r","\\","'",'"',"\x1a");
		$safe_tokens = array("\\\x00","\\\n","\\\r","\\\\","\\"."'","\\".'"',"\\\x1a");
		return str_replace($unsafe_tokens,$safe_tokens,$query);
	}
 
	/*
	* @function _prepare
	* @description Prepares the query before handing it over to the SQL client
	* @param String $query
	* @return String $query
	*/
	protected function _prepare($query){
		// In case the driver has it's own _prepare method
		if(method_exists($this->driver,'_prepare')){
			return $this->driver->_prepare($query);
		}
		return $this->_escape($query);
	}
 
	/*
	* @function _query
	* @description Runs an internal query request
	* @param String $query
	* @return Int $affectedRows
	*/
	protected function _query($query){
		$this->driver->query($query);
		return $this->_affectedRows();
	}
 
	/*
	* @function _result(void)
	* @description Fetch the result of a query
	* @return mixed $result
	*/
	protected function _result(){
		return $this->driver->_result();
	}
 
	/*
	* @function _isConnected(void)
	* @description Fetch the connection status
	* @return Bool True|False
	*/
	protected function _isConnected(){
		return (bool) is_object($this->driver)&&$this->driver->_isConnected();
	}
 
	/*
	* @function _connect
	* @description Connect to the database
	* @param Array $params (result of parse_url)
	* @return Bool True|False $isConnected
	*/
	protected function _connect($params){
		if(!empty($params['host'])&&!empty($params['user'])&&!empty($params['path'])){
			$host = $params['host'];
			$user = $params['user'];
			$pass = (!empty($params['pass'])) ? $params['pass'] : '';
			$port = (!empty($params['port'])) ? $params['port'] : 3306;
			$database = trim($params['path'],'/');
			$this->driver->_connect($host,$user,$pass,$port);
			$this->driver->_selectDb($database);
		}
		return $this->_isConnected();
	}
 
	/*
	* @function connect
	* @description Connect to the database
	* @param String $connection (input: 'driver://username:password@server:port/database')
	* @return Bool True|False $isConnected
	*/
	public function connect($connection){
		if(!$this->_isConnected()){
			$params = parse_url($connection);
			$this->_driver($params['scheme']);
			return $this->_connect($params);
		}
		return true;
	}
 
	/*
	* @function disconnect(void)
	* @description Disconnect from the database
	* @return void
	*/
	public function disconnect(){
		$this->driver->_disconnect();
	}
 
	/*
	* @function _affectedRows(void)
	* @description Fetch the number of rows affected by the last query
	* @return Int $affectedRows
	*/
	protected function _affectedRows(){
		return $this->driver->_affectedRows();
	}
 
	/*
	* @function _driver
	* @description Use the given driver
	* @param String $driver
	* @return void
	*/
	protected function _driver($driver){
		$driver_class = 'Db_Driver_'.ucfirst(strtolower($driver));
		$driver_file = __DIR__.'/drivers/'.strtolower($driver).'.php';
		if(!class_exists($driver_class)&&file_exists($driver_file)){
			require_once($driver_file);
		}
		if(!is_object($this->driver)||get_class($this->driver)!==$driver_class){
			$this->driver = new $driver_class;
		}
	}
 
}

Some functionality should be explained:

  • _escape($query)

    Normally, to prepare a request for the MySQL client, we use mysql_real_escape_string($query) to prevent SQL Injection. Since this function specifically only works when a MySQL connection is active, we can’t use this to escape queries for other database connections. Therefore, we emulate the same functionality with a manual function. This is specifically done for _prepare($query).

  • _prepare($query)

    This method first looks for a nested _prepare method inside the driver object. If it’s available, then it will execute that method. Otherwise, it will fall back to _escape.

  • _connect($params)

    This method uses the result of parse_url(). I specifically chose that structure, because a URL can store all information needed to select a driver, connect to the database server and select the database.

    parse_url return values

    • scheme (used for driver)
    • host
    • port
    • user
    • pass
    • path
    • query
    • fragment

    query and fragment are not used, but allow for future extending on the connection methods.

    The driver must store the connection status. The database class will not cache the connection status for obvious reasons.

  • _driver($driver)

    If the corresponding driver class does not exist, this will be loaded from the drivers directory.

    When there is no driver set or the driver is not the same as the current driver, a new driver object will be created.

Write an interface for the drivers

Writing an interface is quite similar to writing an abstract class. The difference is that an abstract class can contain methods that will be inherited. An interface can only define methods and variables. It is similar to a set of construction guidelines. It is the bare minimum of which a class should consist, without having predefined what happens inside the methods.

In essence, we want to write down whatever method can be called from within the database class. This way, a driver must consist of at least these methods.

interface Db_Driver_Interface {
 
	public function select($fields,$table,$arguments="");
	public function insert($keys,$values,$table);
	public function update($keys,$values,$table);
	public function delete($key,$value,$table);
 
	public function query($query);
	public function _result();
	public function _affectedRows();
 
	public function _isConnected();
	public function _connect($host,$user,$pass,$port);
	public function _selectDb($database);
	public function _disconnect();
 
}

Write the first driver

As We’re testing the database class on a MySQL server, the first driver is a MySQL driver. It will be stored in ./drivers/mysql.php and the class name will be Db_Driver_Mysql. The naming convention is Db_Driver_$class_name where $class_name = ucfirst(strtolower($driver)); (all lower case and capitalize first character).

class DB_Driver_Mysql implements Db_Driver_Interface {
 
	/*
	* @var $res
	* @description Used to store the connection and result resources
	*/
	var $res;
 
	/*
	* @function __construct(void)
	* @description Populate the resources array
	*/
	function __construct(){
		$this->res = array(
			'connection'	=> null,
			'result'		=> null
		);
	}
 
	/*
	* @function select
	* @description Select table rows
	* @param Array $fields
	* @param String $table
	* @param String $arguments (optional)
	* @return Array $rows
	*/
	public function select($fields,$table,$arguments=""){
		$query = 'SELECT `'.implode('`,`',$fields).'` FROM `'.$table.'`';
		$query .= (!empty($arguments)) ? ' '.$arguments : '';
		$this->query($query);
		return $this->_result();
	}
 
	/*
	* @function insert
	* @description Insert new row(s) into the table
	* @param Array $keys
	* @param Array mixed (String $values | Array $values)
	* @param String $table
	* @return Int $affectedRows
	*/
	public function insert($keys,$values,$table){
		$query = 'INSERT INTO `'.$table.'` (`'.implode('`, `',$keys).'`) VALUES ';
		if(isset($values[0])&&is_array($values[0])){ // There are multiple rows to be inserted
			$rows = array();
			foreach($values AS $row){
				$rows[] = '\''.implode('\', \'',$row).'\'';
			}
			$query .= '('.implode('), (',$rows).')';
		}
		else{
			$query .= '(\''.implode('\', \'',$values).'\')';
		}
		$query .= ';';
		$this->query($query);
		return $this->_affectedRows();
	}
 
	/*
	* @function update
	* @description Update row(s) in the table
	* @param Array $keys
	* @param Array mixed (String $values | Array $values)
	* @param String $table
	* @param String $arguments (optional)
	* @return Int $affectedRows
	*/
	public function update($keys,$values,$table,$arguments=""){
 
		$affectedRows = 0;
 
		if(!is_array($values[0])){ $values = array($values); } // Make it multi-dimensional
		for($i=0;$i<count($values);$i++){
			$vals = $values[$i];
			$query = 'UPDATE `'.$table.'` SET ';
			$updates = array();
			for($j=0;$j<count($keys);$j++){
				$val = (is_numeric($vals[$j])) ? $vals[$j] : "'".$this->_prepare($vals[$j])."'";
				$updates[] = '`'.$keys[$j].'` = '.$val;
			}
			$query .= implode(', ',$updates);
			$query .= (!empty($arguments)) ? ' '.$arguments : '';
			$this->query($query);
			$affectedRows += $this->_affectedRows();
		}
 
		return $affectedRows;
	}
 
	/*
	* @function delete
	* @description Delete a row from the table
	* @param String $key
	* @param String $value
	* @param String $table
	* @return Int $affectedRows
	*/
	public function delete($key,$value,$table){
		$value = (is_numeric($value)) ? $value : "'".$value."'";
		$query = 'DELETE FROM `'.$table.'` WHERE `'.$key.'` = '.$value;
		$this->query($query);
		return $this->_affectedRows();
	}
 
	/*
	* @function query
	* @description Perform a native query
	* @param String $query
	* @return void
	*/
	public function query($query){
		$this->res['result'] = mysql_query($query);
	}
 
	/*
	* @function _prepare
	* @description Prepares the query before handing it over to the SQL client
	* @param String $query
	* @return String $query
	*/
	public function _prepare($query){
		return mysql_real_escape_string($query,$this->res['connection']);
	}
 
	/*
	* @function _result(void)
	* @description Fetch the result of a query
	* @return mixed $result
	*/
	public function _result(){
		$result = array();
		while($row=mysql_fetch_assoc($this->res['result'])){
			$result[] = $row;
		}
		return $result;
	}
 
	/*
	* @function _affectedRows(void)
	* @description Fetch the number of rows affected by the last query
	* @return Int $affectedRows
	*/
	public function _affectedRows(){
		return mysql_affected_rows($this->res['connection']);
	}
 
	/*
	* @function _isConnected(void)
	* @description Fetch the connection status
	* @return Bool True|False
	*/
	public function _isConnected(){
		return (bool) is_resource($this->res['connection']);
	}
 
	/*
	* @function _connect
	* @description Connect to the database
	* @param String $host
	* @param String $user
	* @param String $pass
	* @param Int $port
	* @return void
	*/
	public function _connect($host,$user,$pass,$port){
		$this->_disconnect();
		$this->res['connection'] = mysql_connect($host.':'.$port,$user,$pass);
	}
 
	/*
	* @function _selectDb
	* @description Select the database
	* @param String $db
	* @return Bool True|False
	*/
	public function _selectDb($database){
		if($this->_isConnected()){
			return (bool) mysql_select_db($database,$this->res['connection']);
		}
		return false;
	}
 
	/*
	* @function _disconnect(void)
	* @description Disconnect from the database
	* @return void
	*/
	public function _disconnect(){
		if($this->_isConnected()){
			mysql_close($this->res['connection']);
			$this->res['connection'] = null;
		}
	}
 
	/*
	* @function __destruct(void)
	* @description Clean up any open connection
	*/
	function __destruct(){
		$this->_disconnect();
	}
 
}

It’s a whole lot of code, but a lot of it is very common and we’ll run through the methods real fast.

  • __construct()

    First we set up a local array in which we store the connection and result data.

  • select($fields,$table,$arguments="")

    We can use select to select fields from a table and we can use $arguments to provide restrictions to the query, like $arguments = 'WHERE key = \'key1\'';.

  • insert($keys,$values,$table)

    This method can insert values into the table. The values should be in the same order as the keys. Multiple rows can be entered at once. In that case, the structure of the values array is array( $row => array($val,$val), $row => array($val,$val) ).

  • update($keys,$values,$table,$arguments="")

    Update can do single or multiple updates at once. It uses the same structure as insert for the keys and values and can use $argument the same way as the select method can.

  • delete($key,$value,$table)

    This method can delete single or multiple rows, but simply uses 'DELETE FROM `'.$table.'` WHERE `'.$key.'` = '.$value;.

  • query($query)

    This function enables you to perform queries that are native to the system connected to the driver. The result can be retrieved using _result().

  • _prepare($query)

    This method is optional, but since MySQL provides a native function to prepare values, we make use of that.

  • _result()

    All the resulting query data will be put in an associative multi-dimensional array. The first dimension describes the rows and the second dimension consists of the key-value-pair.

  • _affectedRows()

    The number of rows, affected by the last query.

  • _isConnected()

    Tells if a connection with the database is active.

  • _connect($host,$user,$pass,$port)

    Connect to the database.

  • _selectDb($database)

    Select a different database.

  • _disconnect()

    Disconnect from the database.

  • __destruct()

    Clean up any open database connections.

Test the driver

To test our freshly baked driver, we simply have to take the following steps:

  1. Connect to the database
  2. Select all rows
  3. Insert a new row
  4. Select that row
  5. Update that row
  6. Select that row again
  7. Delete that row
  8. Select all rows

Connect to the database

require_once('db.php');
 
$db = new Db;
$db->connect('mysql://root:lamepass@localhost:3306/test');
/*
* Alternatively you could issue:
$db = new Db('mysql://root:lamepass@localhost:3306/test');
*/
 
echo ($db->driver->_isConnected()) ? 'Connection established' : 'Connection failed'; // returns 'Connection established'
Connection established

Select all rows

$s = $db->select(array('key','value'),'test');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key1
            [value] => value1
        )

    [1] => Array
        (
            [key] => key2
            [value] => value2
        )

)

Insert a new row

echo $db->insert(
	array('key','value'),
	array(
		array('key3','value3'),
		array('key4','value4')
	),
	'test'
); // returns affected rows
2

Select that row

$s = $db->select(array('key','value'),'test','WHERE `test`.`key` = \'key3\'');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key3
            [value] => value3
        )

)

Update that row

echo $db->update(
	array('value'),
	array('value3-2'),
	'test',
	'WHERE `test`.`key` = \'key3\''
); // returns affected rows
1

Select that row again

$s = $db->select(array('key','value'),'test','WHERE `test`.`key` = \'key3\'');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key3
            [value] => value3-2
        )

)

Delete that row

echo $db->delete('key','key3','test');
1

And after repeat:

0

This proves that it deleted the row in the first place and couldn’t find it the second time.

Select all rows

$s = $db->select(array('key','value'),'test');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key1
            [value] => value1
        )

    [1] => Array
        (
            [key] => key2
            [value] => value2
        )

    [2] => Array
        (
            [key] => key4
            [value] => value4
        )

)

Design a proprietary database system

Because we’re only going to use this database system once, I went for the name dummy.

I’m sorry for all the lists in this article, but bear with me as we make another one. The proprietary system will have no DBMS[2]. To enable the needed functionality, we will implement features of a DBMS inside our driver. This will have no great influence on how we write the driver. The proprietary system will have the following features:

  • Multiple databases
  • Multiple tables
  • Multiple records
  • A simple key-value-pair structure
  • Authentication
  • Rights per database
  • Rights per table
  • A simple storage system

auth

johmanx:lamepass
[test]:create,drop,flush,update
=>[test]:select,insert,update,delete

dummy:lamepass
[test]:
=>[test]:select

dbs

[test]
=>[test]:[key,value]
==>[key1,value1]
==>[key2,value2]

[test2]
=>[test]:[name,mail]
==>[Jan-Marten de Boer,lievheid@johmanx.com]

Write proprietary driver

class DB_Driver_Dummy implements Db_Driver_Interface {
 
	/*
	* @var $res
	* @description Used to store the connection and result resources
	*/
	var $res;
 
	/*
	* @function __construct(void)
	* @description Populate the resources array
	*/
	function __construct(){
		$this->res = array(
			'connection'	=> false,
			'credentials'	=> array(),
			'result'		=> null,
			'database'		=> array(),
			'rights'		=> array(),
			'affectedRows'	=> 0,
			'dbs'			=> array()
		);
	}
 
	/*
	* @function select
	* @description Select table rows
	* @param Array $fields
	* @param String $table
	* @param String $arguments (optional) ("key=key3,key=key4")
	* @return Array $rows
	*/
	public function select($fields,$table,$arguments=""){
		$t = $this->res['dbs'][$this->res['database']][$table];
		$result = array();
		$keys = $t['keys'];
		$records = $t['records'];
		$ks = $this->res['dbs'][$this->res['database']][$table]['keys'];
		$affectedRows = 0;
 
		if(empty($arguments)){
			foreach($records AS $row){
				$r = array();
				foreach($fields AS $field){
					if(in_array($field,$keys)){
						$r[$field] = $row[$field];
					}
				}
				$result[]=$r;
				$affectedRows++;
			}
		}
		else{
			if(strpos($arguments,',')!==false){
				$keywords = explode(',',$arguments);
			}
			else{
				$keywords = array(trim($arguments));
			}
			$kws = array();
			foreach($keywords AS $keyword){
				$keyword = explode('=',$keyword);
				$kws[$keyword[0]][]=$keyword[1];
			}
 
			for($i=0;$i<count($records);$i++){
				$rac = 0;
				for($j=0;$j<count($ks);$j++){
					$key = $ks[$j];
					$val = $records[$i][$key];
					if(isset($kws[$key])&&in_array($val,$kws[$key])){
						$rac++;
					}
				}				
 
				if($rac==count($kws)){
					$row = $records[$i];
					$r = array();
					foreach($fields AS $field){
						if(in_array($field,$keys)){
							$r[$field] = $row[$field];
						}
					}
					$result[]=$r;
					$affectedRows++;
				}	
			}			
		}
 
		$this->res['result'] = $result;
		$this->res['affectedRows'] = $affectedRows;
		return $this->_result();
	}
 
	/*
	* @function insert
	* @description Insert new row(s) into the table
	* @param Array $keys
	* @param Array mixed (String $values | Array $values)
	* @param String $table
	* @return Int $affectedRows
	*/
	public function insert($keys,$values,$table){
		if(!is_array($values[0])){
			$values = array($values);
		}
 
		foreach($values AS $record){
			$r = array();
			for($i=0;$i<count($record);$i++){
				$r[$keys[$i]]=$record[$i];
			}
			$this->res['dbs'][$this->res['database']][$table]['records'][] = $r;
		}
 
		$this->res['affectedRows'] = count($values);
		return $this->_affectedRows();
	}
 
	/*
	* @function update
	* @description Update row(s) in the table
	* @param Array $keys
	* @param Array mixed (String $values | Array $values)
	* @param String $table
	* @param String $arguments (optional) ("key=key3,key=key4")
	* @return Int $affectedRows
	*/
	public function update($keys,$values,$table,$arguments=""){
		$records = $this->res['dbs'][$this->res['database']][$table]['records'];
		$ks = $this->res['dbs'][$this->res['database']][$table]['keys'];
		$affectedRows = 0;
 
		if(empty($arguments)){
			for($i=0;$i<count($records);$i++){
				for($j=0;$j<count($keys);$j++){
					$records[$i][$keys[$j]] = $values[$j];
				}
				$affectedRows++;
			}
		}
		else{
			if(strpos($arguments,',')!==false){
				$keywords = explode(',',$arguments);
			}
			else{
				$keywords = array(trim($arguments));
			}
			$kws = array();
			foreach($keywords AS $keyword){
				$keyword = explode('=',$keyword);
				$kws[$keyword[0]][]=$keyword[1];
			}
 
			for($i=0;$i<count($records);$i++){
				$rac = 0;
				for($j=0;$j<count($ks);$j++){
					$key = $ks[$j];
					$val = $records[$i][$key];
					if(isset($kws[$key])&&in_array($val,$kws[$key])){
						$rac++;
					}
				}				
 
				if($rac==count($kws)){
					for($j=0;$j<count($keys);$j++){
						$records[$i][$keys[$j]] = $values[$j];
					}
					$affectedRows++;
				}				
			}
		}
 
		$this->res['dbs'][$this->res['database']][$table]['records'] = $records;
		$this->res['affectedRows'] = $affectedRows;
		return $this->_affectedRows();
	}
 
	/*
	* @function delete
	* @description Delete a row from the table
	* @param String $key
	* @param String $value
	* @param String $table
	* @return Int $affectedRows
	*/
	public function delete($key,$value,$table){
		$records = $this->res['dbs'][$this->res['database']][$table]['records'];
		$affectedRows=0;
		for($i=0;$i<count($records);$i++){
			if($records[$i][$key]==$value){
				unset($this->res['dbs'][$this->res['database']][$table]['records'][$i]);
				$affectedRows++;
			}
		}
		$this->res['affectedRows'] = $affectedRows;
		return $this->_affectedRows();
	}
 
	/*
	* @function query
	* @description Perform a native query
	* @param String $query
	* @return void
	*/
	public function query($query){
		// there is no native query support
	}
 
	/*
	* @function _result(void)
	* @description Fetch the result of a query
	* @return mixed $result
	*/
	public function _result(){
		return $this->res['result'];
	}
 
	/*
	* @function _affectedRows(void)
	* @description Fetch the number of rows affected by the last query
	* @return Int $affectedRows
	*/
	public function _affectedRows(){
		return (int) $this->res['affectedRows'];
	}
 
	/*
	* @function _isConnected(void)
	* @description Fetch the connection status
	* @return Bool True|False
	*/
	public function _isConnected(){
		return (bool) $this->res['connection'];
	}
 
	/*
	* @function _connect
	* @description Connect to the database
	* @param String $host
	* @param String $user
	* @param String $pass
	* @param Int $port
	* @return void
	*/
	public function _connect($host,$user,$pass,$port){
		$this->_disconnect();
 
		$this->res['credentials'] = array(
			'host'		=>	$host,
			'user'		=>	$user,
			'database'	=>	$port
		);
 
		$dummy_file = $host.'/.iamdummy';
		$auth_file = $host.'/auth';
		$dbs_file = $host.'/dbs';
 
		if(file_exists($dummy_file)&&file_exists($auth_file)&&file_exists($dbs_file)){
			$auth = trim(file_get_contents($auth_file));
			$auth = explode("\n\n",$auth);
 
			foreach($auth AS $usr){
				$usr = explode("\n",$usr);
				if($usr[0]==$user.':'.$pass){
					$rights = $usr;
					unset($rights[0]);
 
					$ra = array();
					$cur_db = '';
					foreach($rights AS $rs){
						$rs = explode(':',$rs);
						if(substr($rs[0],0,1)=='['){
							$cur_db = trim($rs[0],'[]');
							$cur_rights = explode(',',$rs[1]);
							$ra[$cur_db] = array(
								'rights'	=>	$cur_rights,
								'tables'	=>	array()
							);
						}
						else{
							$cur_table = substr($rs[0],3,-1);
							$cur_rights = explode(',',$rs[1]);
							$ra[$cur_db]['tables'][$cur_table]=$cur_rights;
						}
					}
					$this->res['rights'] = $ra;
					$this->res['connection'] = true;
				}
			}
 
			$dbs = trim(file_get_contents($dbs_file));
			$dbs = explode("\n\n",$dbs);
 
			$dba = array();
			$cur_db = '';
			foreach($dbs AS $db){
				$db = explode("\n",$db);
				$cur_db = trim(array_shift($db),'[]');
				$dba[$cur_db] = array(); // Database can be empty
				$cur_table = '';
				foreach($db AS $table){
					if(substr($table,0,2)=='=>'){
						$cur_table = substr(trim($table),3,-1);
						$cur_table = explode(']:[',$table);
						$keys = explode(',',rtrim($cur_table[1],']'));
						$cur_table = str_replace('=>[','',$cur_table[0]);
						$dba[$cur_db][$cur_table] = array(
							'keys' => $keys,
							'records' => array()
						);
					}
					else{
						$record = explode(',',substr($table,4,-1));
						$r = array();
						$ks = $dba[$cur_db][$cur_table]['keys'];
						for($i=0;$i<count($record);$i++){
							$r[$ks[$i]]=$record[$i];
						}
						$dba[$cur_db][$cur_table]['records'][]=$r;
					}
				}
			}
			$this->res['dbs']=$dba;
 
			$this->_selectDb($port);
		}
	}
 
	/*
	* @function _selectDb
	* @description Select the database
	* @param String $db
	* @return Bool True|False
	*/
	public function _selectDb($database){
		if(isset($this->res['rights'][$database])&&isset($this->res['dbs'][$database])){
			$this->res['database']=$database;
		}
	}
 
	/*
	* @function _disconnect(void)
	* @description Disconnect from the database
	* @return void
	*/
	public function _disconnect(){
		if($this->_isConnected()){
			$dbs_file = $this->res['credentials']['host'].'/dbs';
			$dbs_data = '';
			foreach($this->res['dbs'] AS $database=>$tables){
				$dbs_data .= '['.$database.']';
				foreach($tables AS $table=>$data){
					$keys = $data['keys'];
					$rows = $data['records'];
 
					$dbs_data .= "\n".'=>['.$table.']:['.implode(',',$keys).']';
					foreach($rows AS $row){
						$ro = array();
						foreach($keys AS $key){
							$ro[] = $row[$key];
						}
						$dbs_data .= "\n".'==>['.implode(',',$ro).']';
					}
				}
				$dbs_data .= "\n\n";
			}
			$dbs_data = rtrim($dbs_data);
			file_put_contents($dbs_file,$dbs_data);
		}
		$this->__construct(); // reset values
	}
 
	/*
	* @function __destruct(void)
	* @description Clean up any open connection
	*/
	function __destruct(){
		$this->_disconnect();
	}
 
}

Test the proprietary driver

And to test the proprietary driver, we simply repeat the tests

  1. Connect to the database
  2. Select all rows
  3. Insert a new row
  4. Select that row
  5. Update that row
  6. Select that row again
  7. Delete that row
  8. Select all rows

Connect to the database

require_once('db.php');
 
$db = new Db('dummy://johmanx:lamepass@dummy/test');
 
echo ($db->driver->_isConnected()) ? 'Connection established' : 'Connection failed'; // returns 'Connection established'
Connection established

Select all rows

$s = $db->select(array('key','value'),'test');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key1
            [value] => value1
        )

    [1] => Array
        (
            [key] => key2
            [value] => value2
        )

)

Insert a new row

echo $db->insert(
	array('key','value'),
	array(
		array('key3','value3'),
		array('key4','value4')
	),
	'test'
); // returns affected rows
2

Select that row

$s = $db->select(array('key','value'),'test','key=key3');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key3
            [value] => value3
        )

)

Update that row

echo $db->update(
	array('value'),
	array('value3-2'),
	'test',
	'key=key3'
); // returns affected rows
1

Select that row again

$s = $db->select(array('key','value'),'test','key=key3');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key3
            [value] => value3-2
        )

)

Delete that row

echo $db->delete('key','key3','test');
1

And after repeat:

0

This proves that it deleted the row in the first place and couldn’t find it the second time.

Select all rows

$s = $db->select(array('key','value'),'test');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key1
            [value] => value1
        )

    [1] => Array
        (
            [key] => key2
            [value] => value2
        )

    [2] => Array
        (
            [key] => key4
            [value] => value4
        )

)

Review & refactor

This database class has opened up a world of new possibilities concerning the implementation of databases. We can now add on database systems like MongoDB. It’s also easy to write drivers for systems that in essence are no database at all. I can see a driver for ftp, so you could use file trees as databases, their directories as tables and the files as records with the file name and other properties as keys for the table. Select could display a directory, insert could add files, update could modify files and delete could delete them.

What needs refactoring

I would change the arguments for the MySQL driver to correspond the arguments for the Dummy driver. SQL is too complex to be fully supported over all possible drivers and the key=value,key=value syntax is much more generic and could easily substitute the WHERE key = 'value' AND key = 'value' argument.

I have also stripped the rights implementation for the dummy driver. As I wrote the driver, I found that it was a really large amount of code for a simple example and thus for abstractions sake, I removed the rights checks. I created a simple method that checked if $action was inside the rights array. I then made a wrapper if-clause for the select, insert, update and delete methods. When the code would be refactored, that would be the first step to implementing a proper rights system.

On Forrst, I got the comment:

Shouldn’t the driver attribute be protected (you haven’t set the attribute visibility at all), and make an abstraction for the _isConnected method? So you don’t have to do $db->driver->_isConnected(). The end user shouldn’t call methods prefixed with _ IMO

– Horacio J. Peña

 

By the time I was writing the drivers, I detected this flaw, since you shouldn’t have to do this: $db->driver->_isConnected();. However, I found that coding for a tutorial may introduce some flaws, so we can all have a good look at it. Horacio is completely right. When refactoring, we should introduce a new abstraction and make all the methods preceded by _ private or protected.

Conclusion

As it is, the database class is still incomplete and immature, but it has great potential and great extendability (by nature, since it supports it’s own drivers). It could also fit well in a RESTful API where you make extensive use of the HTTP Methods GET, POST, PUT and DELETE, easily interchangeable with select, insert, update and delete.

This tutorial was a great learning process, even for myself. It has opened a number of insights in database systems and what my proprietary driver especially points out is the flaws in such a simple system. It’s hackable and breakable in so many ways, you can’t start counting them. Still, in a secluded and protected environment, the simplicity is a big win, since you can open up your fave text editor and alter the database in a breeze. Really useful when your code breaks the database.

If you find any of this useful or have some thoughts on how I made some horrible mistakes, please feel free to leave a comment. All feedback is appreciated and whenever possible I’ll improve the article correspondingly.

Notes

  1. ^ __DIR__ is a magic constant that defines the current directory in which the script resides. In PHP versions prior to 5.3.0, the same could be achieved using dirname(__FILE__), which extracts the directory name of the current file.
  2. ^ A database management system (DBMS) is a software package with computer programs that controls the creation, maintenance, and the use of a database. We combined the DBMS with the driver to prevent abstraction, in order to maintain a readable tutorial.

PHP return types: Beginners guide to debugging PHP

I can remember it so clearly. My first “Hello World!” soared through the skies and cried out a song of victory, like only an eagle could. It was only a minute later I encountered my first bug.
Note: This all was years ago at the age of eleven. Some portion of this might be exaggerated.

It’s a pain to understand what goes wrong when being a newbie

That’s for certain. When dealing with some of the most basic programming, you can still get errors like:
[Fri Jul 01 15:16:47 2011] [warn] mod_fcgid: stderr: PHP Catchable fatal error: Argument 1 passed to print_resource_type() must be an instance of resource, string given, called in /home/johmanx/domains/lievheid.nl/public_html/examples/debugging-php/index.php on line 10 and defined in /home/johmanx/domains/lievheid.nl/public_html/examples/debugging-php/index.php on line 5

 

And so you’re like:

Well, I guess it’s a catchable fatal error. That’s not being helpfull at all. Argument 1 passed to print_resource_type(). Hmm. So it’s something I put into that function? And apparently it must be an instance of resource, whatever that may be. A string was given in some file on line 10. Let’s have a look at that:

1
2
3
4
5
6
7
8
9
10
11
12
<?php
 
error_reporting(E_ALL);
 
function print_resource_type(resource $handle)
    {
        print get_resource_type($handle);
    }
 
print_resource_type("hello");
 
?>

So we see that PHP expects a resource to be given on line 5 as the first argument. If handle is anything but a resource, the whole thing trips and gives off a fatal error. A fatal error terminates the whole script.

Introducing PHP types

Or as used in most common languages

PHP uses different types to tell it’s processor what kind of data needs to be processed.

Type Description Example
(string) Data represented as a textual string. It can contain every character in a row. "My name is Jan-Marten"
(int) A whole number, represented as a number and not a string. 12 is an int and “12″ is a string, although PHP is a really forgiving language. More about that later on. Decimals are also stored in an int. 12
(float) The scientific notation of numbers. Floating points are really large numbers. Programming using floats can result in really unexpected errors and return values, so please stick to using int unless the values no longer fit in an int. 12.12553e-18
(array) A collection/stack of values. As PHP doesn’t support tuples or triples, all corresponding operations should be performed using an array. On a note: PHP extends the capabilities of a string with those of an array.. Sorta.. $string = "abc"; can return part of it by treating it as a string. echo $string[1]; // returns "b". array(“Monkey”=>”I am a primate!”,120,”Bananas per day”=>12,’Ooh ooh ooh aaaa!’);
(bool) A boolean is a representation of success or failure. It can either be TRUE or FALSE and is mostly used in an if statement or in loops. TRUE
(double) Double is an alias of Float. As there are more aliases of data types in PHP, I want to give this one special attention, since it’s not a notable alias to the non educated programmer. 9.88752e-22
(null) NULL is a representation of a value if it’s completely empty. Note: a string of “” is not considered NULL, but a string of zero characters long.
(object) An object is an object is an object … but only in PHP. (Ferengi rule of acquisition #17) The definition of an object is best described as a collection of methods and values. An object contains these methods and values which can or cannot be called from outside the object or within the object. Some are protected, others are private and most are public. The point of an object is to normalize a repeated task, to make it more abstract and enable it to be usefull in more scenarios. More information about objects is available on php.net object(Lievheid)#1 (1) { ["lief"]=> string(20) "This is a public var" }
(resource) A resource is a means of tracking a stream, whether it’s a MySQL link, a file or an XML document. It makes it able for PHP to track the progress of a specific stream. resource(2) of type (stream)

Awesome, but why do we need to know all of this? Well.. The first thing you want to do as a newbie programmer is to understand what you do wrong, so you don’t do it again and know how to fix it. To do this, you must first know what the error is actually telling you. And man, it’s telling a lot of useful goodness. You just need to learn how to read them.

Understanding PHP values

Even more so complex is the way PHP treats values when they’re being validated. Validating values is done on a number of occasions.

When PHP …

  • … stores a value into a variable or static. $n = mysql_num_rows();
  • … checks the conditions inside an if-statement. if($a-$b==$c-$d);
  • … checks the conditions of a for- or while-loop. Validation occurs each iteration (each time the loop restarts). for($i=0;$val;$val=$vals[$i];$i++)

In contradict, it only validates the values once when used inside a switch. This is why we prefer switches over if(){} elseif(){} else{} clauses.

Working with PHP values

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
<?php
 
$types = array(
    'boolean'   =>  TRUE,
    'integer'   =>  1,
    'double'    =>  1.,
    'string'    =>  'Hello',
    'array'     =>  array(1,2,3,'hello'),
    'object'    =>  new stdClass,
    'resource'  =>  fopen(__FILE__,'r'),
    'NULL'      =>  NULL
);
 
foreach($types AS $type=>$val)
    {
        echo    PHP_EOL. // Create a new line
                'Value: '.var_export($val,true). // Export the data into a readable format
                "\t\t". // Create two tabs
                'Expected: '.$type. // We expect this to be the type
                "\t\t". // Create another two tabs
                'Type: '.gettype($val).
                PHP_EOL.PHP_EOL;
    }
 
?>

And what we actually get in return is:

Value: true             Expected: boolean               Type: boolean

Value: 1                Expected: integer               Type: integer

Value: 1                Expected: double                Type: double

Value: 'Hello'          Expected: string                Type: string

Value: array (
  0 => 1,
  1 => 2,
  2 => 3,
  3 => 'hello',
)               Expected: array         Type: array

Value: stdClass::__set_state(array(
))              Expected: object                Type: object

Value: NULL             Expected: resource              Type: resource

Value: NULL             Expected: NULL          Type: NULL

Now, my whole point being is that, although PHP returns the types nicely, it doesn’t treat them as such. We can prove that:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
<?php
 
$vals = array(
 
    "",
    '',
    1,
    1.,
    "12",
    (bool) 1,
    (bool) 12,
    (bool) 0,
    (int) "12"
 
);
 
echo 'VALUE'."\t\t".'numeric'."\t".'int'."\t".'string'."\t".'bool';
 
foreach($vals AS $val)
    {
        echo    PHP_EOL.
                'Value: '.$val.
                "\t".is_numeric($val).
                "\t".is_int($val).
                "\t".is_string($val).
                "\t".is_bool($val)
                ;
    }
 
?>

And this returns us with:

VALUE           numeric int     string  bool
Value:                  1
Value:                  1
Value: 1        1       1
Value: 1        1
Value: 12       1               1
Value: 1                                1
Value: 1                                1
Value:                          1
Value: 12       1       1

Now, this is really interesting.

Validating PHP conditions

As we can see in the output above, there are a number of noteworthy returns. We’ll talk about them in order:

Input value Parsed value is_numeric is_int is_string is_bool Type hits
“” 1 1
1 1
1 1 1 1 2
1. 1 1 1
“12″ 12 1 1 2
TRUE 1 1 1
TRUE 1 1 1
FALSE 1 1
12 12 1 1 2

Now, what I haven’t tested for is_float, but essentially, the floating point would then also hit 2 types. What I did do is shed some light on the problem I’m trying to highlight since the beginning of this article. The problem being: PHP is too darn considerate!

PHP is a language that tries to accommodate new web developers. It tries to let code work in the most suited way it possibly can, to try not to generate all that much errors when first programming in the language. This, however, obviously brings about some optimization issues. It’s slow in validating values and that can be a real pain with static calculation scripts. It’s also weird to give PHP a value of false and get a parsed result of nothing, instead of 0, like the 1 for true. What’s also weird is that some values can be multiple types at the same time. While PHP actually knows what type it is, as we’ve seen in the first piece of script, it turns out that it’s not that simple in the second script.

Some values can be a string and numeric at the same time, while others can be numeric and an int at the same time. Not all that surprising, since numeric is there to check if the value is an int or a string, consisting of just an int. What is surprising is the first two inputs.

Somehow, PHP thinks that an empty string is considered an int and NOT a string. That is really odd.

How to take advantage of this behaviour?

Well, for starters, we should make use of how PHP crunches numbers. In essence, what PHP does when validating a condition is this:

$condition = mysql_num_rows($result);
$validation = (bool) $condition;

Now, if it returns a number greater than or equal to one, it returns true. If it’s a non-empty string, it returns true and if it’s an object that’s set succesfully, it’s true enough for PHP.
However, if it were to be 0 or less, an empty string or a failing object or function, it would return false.

So how do we make use of this? Well, when using MySQL, most PHP devs use the two following tricks:

1
2
3
4
5
6
7
8
9
10
11
12
13
<?php
 
$conn_id = mysql_connect('localhost','root','lamepass');
$result = mysql_query("SELECT * FROM secret_table");
 
if(mysql_num_rows($result)) // If there are more than 0 rows, it will default to true
    {
        while($row=mysql_fetch_assoc($result)) // Eventually, the $row cannot be set anymore and falls back to false
            {
                print_r($row);
            }
    }
?>

And it also applies when handling other streams in PHP, like fopen or the socket stream. Iterating through the results is made really easy by PHP, but please note that when using it with too high of numbers, it’s really gonna cost you CPU time, since, as explained before, PHP needs to validate the values each iteration.

Conclusion

PHP is a darn considerate language that tries to provide for both the newbie programmer and the abstract adventurer, but at a cost. It can be a slow language, when applied to the wrong tasks. Nevertheless it’s a really easygoing language and one that hopefully stays in my toolkit for a few more years.

So, what do you guys think? Any really awkward encounter with PHP and unexpected value parsing/validation? Please share in the comments!

Rating companies using Wufoo forms

One of my clients asked me to make them a simple site with a landing page and a Wufoo form. The landing page contains a brief story about the website, some local news and a local weather report. (hooever.nl)

The Wufoo form is just a simple one, asking questions about the experience customers had with their company. In this small survey, the visitor rates the experience using likert fields.

Now my customer wants to crunch the numbers from the survey and give the company a grade, using the Dutch grading system, seen in High school, rating from 1-10. I thought this was a simple request, but it turns out I really need to digest a number of structural elements using the Wufoo RESTfull API. I just wanted to share the magic with you guys, so we can all benefit from my research ;) In reality I store this result in a cache, based on the result of /api/v3/forms/hash/entries/count.json.

I’m really gratefull for @chriscoyier writing about Wufoo or else I’d have ended up making my own malicious platform for forms.

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
 
<?php
 
class wufoo{
 
    private $api_version = 'v3';
    private $user;
    private $pass = 'footastic';
    private $subdomain;
 
    function __construct($user,$subdomain,$api_version='v3')
        {
            $this->user = $user;
            $this->subdomain = $subdomain;
            $this->api_version = $api_version;
        }
 
    public function request($doc,$params=null)
        {
            $params = ($params!==null) ? ltrim($params,'&?') : '';
            $url = 'https://'.$this->subdomain.'.wufoo.com/api/'.trim($this->api_version,'/').'/'.$doc.'.json'.$params;
 
            $ch = curl_init($url);
            curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
            curl_setopt($ch,CURLOPT_USERPWD,$this->user.':'.$this->pass);
            curl_setopt($ch,CURLOPT_HTTPAUTH,CURLAUTH_ANY);
            curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false);
            curl_setopt($ch,CURLOPT_FOLLOWLOCATION,true);
            curl_setopt($ch,CURLOPT_USERAGENT,'Joh Man X: Wufoo API agent');
 
            $response = curl_exec($ch);
            $status = curl_getinfo($ch);
 
            if($status['http_code']==200){return json_decode($response);}
            return false;
        }
 
    public function form($form)
        {
            return $this->request('forms/'.$form);
        }
 
    public function form_fields($form)
        {
            return $this->request('forms/'.$form.'/fields');
        }
 
    public function form_entries($form)
        {
            return $this->request('forms/'.$form.'/entries');
        }
}
 
$form_hash = 'xxxxxx';
$wufoo = new wufoo('XXXX-XXXX-XXXX-XXXX','subdomain');
 
$fields = $wufoo->form_fields($form_hash);
$entries = $wufoo->form_entries($form_hash);
 
$field_a = array();
foreach($fields->Fields AS $field)
    {
        if($field->Type=='likert')
            {
                $choices = array();
                foreach($field->Choices AS $choice)
                    {
                        $choices[$choice->Label] = (int) $choice->Score;
                    }
                foreach($field->SubFields AS $subfield)
                    {
                        $field_a[$subfield->ID] = $choices;
                    }
            }
    }
 
$score = 0;
foreach($entries->Entries AS $entry)
    {
        $entryscore = 0;
        $usable = 0;
        foreach($entry AS $option=>$value)
            {
                if(isset($field_a[$option])&&!empty($value))
                    {
                        $oscore = $field_a[$option][$value];
                        $tscore = (in_array(0,$field_a[$option])) ? count($field_a[$option])-1 : count($field_a[$option]); // 0==not applicable
                        $oscore /= $tscore;
                        $entryscore += $oscore;
                        $usable++;
                    }
            }
        $entryscore /= $usable;
        $score += $entryscore;
    }
$score /= count($entries->Entries);
$score = round(($score*9)+1,1);
 
echo $score;
 
?>

Creating an HTTP proxy for Wikipedia

How often is it that we want to clarify our text by explaining our terms, so most of our readers actually understand what we’re saying and how often is it that we end up with an article that’s either bloated with explenational lyrics to a song called “The article I was supposedly writing” or an article far too in-depth for the common reader. A lot of people stop reading when they find out they have to read a Wiki article every five sentences, but others really love that and spend their late hours on browsing through all that extended information.

I myself came up with a solution. It’s not uncommon to link to Wiki articles and even so uncommon is the absense of a JavaScript library. Given the static link structure of Wiki articles, it’s not that hard to extract the part that says what article you link to. My solution involves jQuery, some raw JavaScript, some PHP, some configuration using .htaccess and either a subdomain or a folder on a working domain.

So, what’s the big idea?

We’re going to make a proxy for Wikipedia, using the TXT records, which we will use to generate balloons with a summary of the Wiki for that keyword. This way we solve three things:

  1. We don’t bloat our article, but do give extra info for difficult terms;
  2. We provide readers who don’t like to constantly be sent to Wikipedia with a good summary of the article;
  3. People whom still wish to read the full article can click on the link and still get all the info needed;

As an addition, it’s a great support for SEO, because linking to articles with extra information on your article is what helps you rank high on “the Google”.

Provide me with the goodies

Okay, so lets assume you either made a new subdomain or a folder on your website where we send the queries to. First we want to prepare our Apache settings:

RewriteEngine On
RewriteBase /home/wiki/public_html/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]

The RewriteBase should fall back to the current working directory by default, but sometimes this works and sometimes this doesn’t, so making this static prevents it from failing. The conditions are there to assure the file we requested doesn’t already exist (which we will use later on to make a cache) and the same for directories. Next up we make sure the request in a whole is given to index.php.

Now we want to populate index.php with some code:

<?php
$q=substr($_SERVER['REQUEST_URI'],1);
exec('dig +short txt '.$q.'.wp.dg.cx',$dig);
$dig = str_replace('" "','',implode("\n",$dig));
if(!file_exists($q) && !empty($dig)) { file_put_contents($q,$dig); }
die($dig);
?>

And what does this do? Well, maybe I should give an example of how we request the text first:

If I want to know something about Unix systems, I simply make a request for http://wiki.johmanx.com/unix.
Since I actually went for making a special subdomain, wiki, I can make requests like that.

Back to the code: In the first line we see the variable $q, which is simply the Request URI minus the first character, since in my example that would be /unix and we only want unix. Then comes a system command. Windows users should actually stop reading at this point, because dig is in no way available on a Windows server. No worries, because when I get to the Ajax part, there’s still a lot of fun stuff we can do. The dig command is a tool in Linux that allows us to get information about DNS records from a domain. In this case we don’t request information about the typical A record, but we do so for the TXT record. Now, normally we use that to store some information that clarifies what we do with a certain domain, but something.wp.dg.cx actually stores the wiki summary for the term something. By giving it the +short parameter, we make sure we don’t get any fancy data for that domain, but only the contents of the TXT record.

The only thing that remains in the code is a basic filtering for weird closing and opening quotes and to make sure that when it ever happens that a TXT record exists of multiple lines, we actually make sure all the lines are present. Then we check if we don’t overwrite a file by putting this in cache (god forbid I don’t do this and someone gets the bright idea to create and query the article index.php). Then finally we make sure to stop the script and end with only echoing the text we just queried.

But now we’re stuck with Cross Domain AJAX calls

Hmm. I have to give you that. Well, the reason I went for a subdomain approach is because I can provide Windows users with my proxy as where the Linux users can put this in a folder and get a more optimal aproach. So, provided we stick with the cross domain solution, we need to make something like XD.php for our AJAX. It should contain something like this:

<?php
echo file_get_contents('http://wiki.johmanx.com/'.$_SERVER['QUERY_STRING']);
?>

Now, if we request XD.php?waffles, we should get some text rambling about a “batter based cake”.
For a simple AJAX call, I tend to fall back to this simple piece of code:

var createXMLHttpRequest = function() {
    if (window.ActiveXObject) {
        return function() { xmlHttp = new ActiveXObject("Microsoft.XMLHTTP"); }
    } 
    else if (window.XMLHttpRequest) {
        return function() { xmlHttp = new XMLHttpRequest(); }
    }
}();
 
function GETRequest(queryString,cb) {
    createXMLHttpRequest();
 
    xmlHttp.onreadystatechange = cb;
    xmlHttp.open("GET", queryString, true);
    xmlHttp.send(null);
}
 
function POSTRequest(url,query,cb) {
    createXMLHttpRequest();
 
    xmlHttp.open("POST", url, true);
    xmlHttp.onreadystatechange = cb;
    xmlHttp.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");    
    xmlHttp.send(query);
}
 
function ajaxDone() {
    if(xmlHttp.readyState == 4 && xmlHttp.status == 200) {
        return true;
    }
    return false;
}

Now we can make a very jQuery like request for our article:

$("body")
	.delegate("a","hover",
		function(){
			var keyword = $(this).html();
			GETRequest('XD.php?'+keyword,
				function(){
					if(ajaxDone()){	
						$('#balloon-keyword-'+keyword).html(xmlHttp.responseText).show();
					}
				}
			);
		}
	)
	.delegate(".balloon","mouseout",
		function(){
			$(this).hide();
		}
	);

Provided each summary is given a link to the original article, we could add a filter to automatically make anchor links, but to keep this article as simple as possible for quick implementation, I provide the filter apart from that bit of jQuery:

function txt2html(txt) {
  var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
  return txt.replace(exp,"<a href='$1' target='_blank'>$1</a>");
}

That way you could apply a filter to the summary, which will automatically transform a URL to an anchor link.

Now, say you’d want to do this the proper way and put the wiki script in a folder called /wiki/, you could replace ‘XD.php?’ with ‘/wiki/’ and get the same result. This way you don’t have to make Cross Domain calls and you can make optimal use of the cache I provided.

Some other thoughts

Now we’re finished setting up our fine proxy, some thoughts come to mind.

This would make for a good HTTP proxy to apps for the iPhone, iPad and Android platforms. I could see a number of apps coming from this. On top of that list is an App for quickly getting extra information about the ingredients of your recipe. Somewhat lower I’d suggest a Trivia app that generates random questions and gets the answers via a similar proxy. In my opinion, there’s a whole number of apps that could bloom this way. Anyone up for the challenge?

Finding clients the easy way

This is my first post in English, so stay with me during this test as i progress and make some epic fails.

Finding clients the regular way

Normally i wake up, take a shower, have breakfast and check my mail. If i find myself in the position of not having enough work at the moment, i’ll start browsing job boards and search Twitter for relevant information.

The downside to this is the huge amount of time this takes. Not only that, but it involves browsing to different sites with different layouts and different means of filtering and searching for jobs. This can be a tiny pain in the ass!

Job Listings

Finding clients in under 5 minutes

So, provided i browse all these sites and bore myself to death by watching an endless Twitter stream, it could take me hours to come up with a list of potential clients. This just doesn’t cut it. It’s too consuming and too less rewarding.

Hell, i’m a web developer. Why can’t this be more like reading those nice job adverts in the news paper. Honestly? It’s not that hard to accomplish just that. Now, when i wake up i read my mail and browse through job.johmanx.com for some potential victims to pay me some moneyz.

Building a personal job board

Ofcourse i had to go around a really simple way too solve this “problem” of mine, because time is money and all this had to do was display an overview, so here goes.

First i assembled a list of job boards i wanted to follow:

$boards = array(
	'odesk'			=>	'http://www.odesk.com/jobs/rss?c1=Web+Development',
	'37 signals - design'		=>	'http://jobs.37signals.com/categories/1/jobs.rss',
	'37 signals - programming'	=>	'http://jobs.37signals.com/categories/2/jobs.rss',
	'freelanceswitch - design'	=>	'http://feeds.feedburner.com/FSJobsDesign',
	'freelanceswitch - devel'	=>	'http://feeds.feedburner.com/FSJobsProgramming',
	'freelanceswitch - misc'	=>	'http://feeds.feedburner.com/FSJobsMisc',
	'authentic jobs'		=>	'http://www.authenticjobs.com/rss/custom.php?terms=&amp;type=freelance&amp;cats=3,4,2,5,6,'
);

As you may have noticed, i sticked to specific categories with the most value to me.

I also wanted to follow a few search queries on twitter, so i could follow people in need of a developer:

$twitter_searches = array(
	'need web designer',
	'hire web designer',
	'need web developer',
	'hire web developer',
	'hire php'
);

Then comes the “chooka-chook” and we give those feeds some rails to ride on:

/***
 * @function	parse_universal(var $feed, int $retnr)
 * @description	parse a universal XML feed
 * @return	array array(var $feed, var $name)
 */
 
function parse_universal($feed,$name="")
	{
 
		$feed = simplexml_load_file($feed);
 
		$child = $feed->xpath("channel");
		$child = $child[0]->xpath("item");
		$feed = array();
		foreach($child AS $item)
			{
				$title = $item->xpath("title");
				$date = $item->xpath("pubDate");
				$timestamp = strtotime(substr($date[0][0],0));
				$guid = $item->xpath("link");
				$feed[] = array($timestamp,$name,substr($title[0][0],0),$guid[0][0]);
			}
		return $feed;
	}
 
/***
 * @function	socialdate(int $timestamp)
 * @description	Transform time in human readable difference in time
 * @return	var $val
 */
 
function socialdate($timestamp)
	{
		if($timestamp == 0) { return "?"; }
 
		$diff = time()-$timestamp;
 
		switch($diff)
			{
				case ($diff&lt;120):		$val = round($diff,0)." seconds";			break;
				case ($diff&lt;7200):		$val = round($diff/60,0)." minutes";			break;
				case ($diff&lt;48*3600):	$val = round($diff/3600,0)." hours";			break;
				case ($diff&lt;14*24*3600):	$val = round($diff/(3600*24),0)." days";			break;
				case ($diff&lt;60*24*3600):	$val = round($diff/(3600*24*7),0)." weeks";		break;
				case ($diff&lt;730*24*3600):	$val = round($diff/(3600*24*30),0)." months";		break; 				case ($diff>=365*24*3600):	$val = round($diff/(3600*24*365),0)." years";		break;
			}
 
		return '-'.$val;
	}
 
/***
 * @function	web2txt(var $txt)
 * @description	Transform URLs to Anchor links
 * @return	var $txt
 */
 
function web2txt($txt="")
	{
		$txt = eregi_replace('(((f|ht){1}tp://)[-a-zA-Z0-9@:%_+.~#?&amp;//=]+)', '<a href="\1" target="_blank">\1</a>', $txt);
		return $txt;
	}

web2text() is a simple function to look for URLs and transform them to clickable links.
socialdate() is a function to determine the distance between now and then in human readable text.
parse_universal() needs some explenating on my part.

Parsing an XML feed

While parsing an XML document is a fairly routine task, i made it a bit more complex to suit my needs. For instance, i wanted to order everything by date, including all the boards and twitter queries. This can be done simply by using the rsort() function of PHP. However, i’m using a multidimensional array which needs to be sorted in reverse order. By default, rsort tries to order everything as logically possible, so in case of an array with multiple dimensions, the first entry of an array within the array is sorted with the first entry of the next array within the array. Confusing? I’ll just give an example:

$array = array(
  array(4,6,'amber'),
  array(6,4,'mapel'),
  array(2,5,'chocolate')
);
rsort($array);
print_r($array);

The output:

Array
(
    [0] => Array
        (
            [0] => 6
            [1] => 4
            [2] => mapel
        )

    [1] => Array
        (
            [0] => 4
            [1] => 6
            [2] => amber
        )

    [2] => Array
        (
            [0] => 2
            [1] => 5
            [2] => chocolate
        )

)

As you can see and as i told before, the first entry of each array is used to compare and pick the order. In the function parse_universal() we make use of this by putting the UNIX timestamp as first entry in the array.

Next up we pass it a name, so we can later on determine what the source of the message actually was. This is especially handy for finding resources that just clutter your stream. After that we pass on the text of the message and the place of the original message.

Parsing the resources

Now we finally got everything ready to be processed into our own board, let’s get some magic happening:

$feed = array();
foreach($boards AS $board=>$url)
	{
		$board = @parse_universal($url,$board);
		$feed = @array_merge($feed,$board);
	}
sleep(2);
foreach($twitter_searches AS $query)
	{
		$search = @parse_universal('http://search.twitter.com/search.rss?q='.$query,"twitter: ".$query);
		$feed = @array_merge($feed,$search);
	}
rsort($feed);
 
foreach($feed AS $row)
	{
		print '
<div class="rounded">
<a href="'.$row[3].'" target="blank">'.web2txt($row[2]).'</a>
<div class="bottom"><small>'.socialdate($row[0]).'</small> - <small><em>'.$row[1].'</em></small></div>
</div>
';
	}

Unfortunately we need the array_merge() function, because we want to generate multiple arrays at first and there’s no other way to merge an array. Sounds fair enough. It’s just not an optmized way of handling arrays. The reason i put in a different foreach for the twitter functionality is just because i’m lazy and don’t want to copy and paste that feed url everytime i change my search queries. As you can see i pass in both the feed url and a name, which will be used to identify the messages later on.

When the final feed is processed, we use an array with processed data:

$row[(int)]

  • 0: The timestamp which will be transformed using socialdate()
  • 1: The name of the feed from which it originated
  • 2: The text of the current message, checked on possible links using web2txt()
  • 3: The URL of the original location of the message

Now we’re almost there. We already have the following garbage:
Raw job list

Giving it some style

As it is, this list is garbage. You can’t read it easily and it’s really hard to filter anything usefull. Luckily we can give it some spice, using CSS:

	.rounded{
		-webkit-border-radius: 16px;
		-moz-border-radius: 16px;
		border-radius: 16px;
		background:#eee;
		margin:3px;
		padding:8px;
		float:left;
		width:478px;
		height:90px;
		position:relative;
		font-family:Tahoma;
	}
	.bottom{
		position:absolute;
		bottom:8px;
	}

Looky here:

Styled job list

Now, that’s more like it.

Next up: Automating the shizzle

Ofcourse reading N amount of feeds and search queries takes some time. Time you don’t wanna spend waiting, because this removes the whole purpose of saving time using a personal board. Normally i would go with something like this:

*/15 * * * * php /path/to/parse.php > /home/job.johmanx.com/public_html/cache.html

However, this isn’t going to work, because sometimes a feed has an enormous hold up or inconsistent data. I have tried everything to extend my parsing time, but i always end up with a 500 – Internal server error. On this account you have to minimize your resources and build in a failsafe, like i did:

*/15 * * * * php /path/to/cacher.php

Cacher.php:

$content = file_get_contents('/path/to/parse.php');
if(strlen($content)>1200){file_put_contents('/home/job.johmanx.com/public_html/cache.html',$content);}

This way you don’t get an empty page when it failed, which can be really frustrating.

Index.php:

<!doctype html> 
<html> 
	<head> 
		<title>Job listings | Joh Man X - Clear webdevelopment</title> 
		<link rel="stylesheet" href="stylesheet/style.css" type="text/css" /> 
 
		<!--[if IE]>
			<script src="http://html5shiv.googlecode.com/svn/trunk/html5.js"></script>
		<![endif]-->
 
	</head> 
 
	<body> 
 
		<div id="pagewrap"> 
 
			<section> 
 
				<div class="wrapper">
					<?php print file_get_contents('cache.html'); ?>
					<br class="clear" /> 
				</div> 
 
			</section>
 
		</div> 
	</body> 
</html>

The final product

And there you have it. After some minor alterations, i had my job board.

job.johmanx.com