Creating your own database drivers in PHP

Have you ever come across a situation where you’re dying to make that API for access to the database, only to find out there are multiple database versions and syntaxes to support? Now ofcourse there’s the PDO library, but we don’t want to use that in our case. Why? Because we want to write our own drivers. This allows us to easily implement support for new databases like CouchDB and MongoDB. (We have to note that these two are not similar to classical relational databases, so we might want to write restrictions or fallbacks for them.)

How shall we proceed?

We are going to have a proper thought process on this. Rushing into programming rather abstract PHP isn’t something I can recommend.

    Database representation
  1. List the public methods / creating an abstract class
  2. Decide how to handle drivers
  3. Decide on a file structure
  4. Create a simple database with a test table
  5. Write protected methods
  6. Write an interface for the drivers
  7. Write the first driver
  8. Test the driver
  9. Design a proprietary database system
  10. Write proprietary driver
  11. Test the proprietary driver
  12. Review & refactor
  13. Conclusion

List the public methods / creating an abstract class

To make ourselves a really simple database layer, we’re going to stick with the following few public methods:

abstract class Db_Abstract{
 
	// Force Db class to define these methods
	abstract protected function _prepare($query);
	abstract protected function _query($query);
	abstract protected function _result(); // resource will be stored inside the driver object
	abstract protected function _isConnected();
	abstract protected function _connect($params);
	abstract protected function _affectedRows(); // resource will be stored inside the driver object
	abstract protected function _driver($driver);
 
	/** =Public methods
	 * These functions will define the behavior of our Database class
	 **/
 
	/*
	* @function database
	* @description Select the database
	* @param String $db
	* @return Bool True|False
	*/
	public function database($db){
		return $this->driver->_selectDb($db);
	}
 
	/*
	* @function select
	* @description Select table rows
	* @param Array $fields
	* @param String $table
	* @param String $arguments (optional)
	* @return Array $rows
	*/
	public function select($fields,$table,$arguments=""){
		return $this->driver->select($fields,$table,$arguments);
	}
 
	/*
	* @function insert
	* @description Insert new row(s) into the table
	* @param Array $keys
	* @param Array mixed (String $values | Array $values)
	* @param String $table
	* @return Int $affectedRows
	*/
	public function insert($keys,$values,$table){
		return $this->driver->insert($keys,$values,$table);
	}
 
	/*
	* @function update
	* @description Update row(s) in the table
	* @param Array $keys
	* @param Array mixed (String $values | Array $values)
	* @param String $table
	* @param String $arguments (optional)
	* @return Int $affectedRows
	*/
	public function update($keys,$values,$table,$arguments=""){
		return $this->driver->update($keys,$values,$table,$arguments);
	}
 
	/*
	* @function delete
	* @description Delete a row from the table
	* @param String $key
	* @param String $value
	* @param String $table
	* @return Int $affectedRows
	*/
	public function delete($key,$value,$table){
		return $this->driver->delete($key,$value,$table);
	}
}

As you can see, the Database Abstract purely gives us a new abstraction layer. It makes simple aliases for the functions within the driver. This allows us to have proper naming conventions across all the drivers.

Decide how to handle drivers

We must now decide how to work with the driver. What is handled by the driver and what is handled by the database class.

The driver must

  • Set up and maintain a connection with the database
  • Prepare queries
  • Execute queries
  • Store resources
  • Parse results
  • Insert, update and delete rows
  • Handle statistics about the database

The database class must

  • Provide naming conventions
  • Handle I/O between the database and the rest of your application

Decide on a file structure

Since we want to keep it as readable as possible, we’re going for a decent file structure that describes the contents of itself.

For now we’ll use the following naming conventions:

  • Database Abstract class and core class

    ./db.php

  • Database drivers

    ./drivers/{$driver}.php

We take for granted that ./ points to the __DIR__[1] magic constant of db.php.

Create a simple database with a test table

The first driver we will write will be one for MySQL, since I’m most familiar with the MySQL server. We’ll use the following simple test database:

SET SQL_MODE="NO_AUTO_VALUE_ON_ZERO";
 
CREATE DATABASE `test` DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
USE `test`;
 
CREATE TABLE IF NOT EXISTS `test` (
  `key` VARCHAR(255) NOT NULL,
  `value` VARCHAR(255) NOT NULL,
  PRIMARY KEY  (`key`)
) ENGINE = MYISAM CHARACTER SET utf8 COLLATE utf8_unicode_ci;
 
INSERT INTO `test` (`key`, `value`) VALUES
('key1', 'value1'),
('key2', 'value2');

test

key value
key1 value1
key2 value2

Write protected methods

The next step is to provide the I/O layer by writing the protected methods. These methods cannot be called from outside the database class. They will parse input before it is handed over to the driver and prepare the output from the driver before it arrives at your application.

class Db extends Db_Abstract{
 
	var $driver;
 
	/*
	* @function __construct
	* @description Run on object creation
	* @param String $connection (input: 'driver://username:password@server:port/database')
	* @return void
	*/
	function __construct($connection=null){
		if(!defined('__DIR__')){
			define('__DIR__',dirname(__FILE__));
		}
		if(!empty($connection)){
			$connection = parse_url($connection);
			$this->_driver($connection['scheme']);
			$this->_connect($connection);
		}
	}
 
	/*
	* @function _escape
	* @description Emulate the behavior of mysql_real_escape_string() to prevent SQL Injection
	* @param String $query
	* @return String $query
	*/
	protected function _escape($query){
		$unsafe_tokens = array("\x00","\n","\r","\\","'",'"',"\x1a");
		$safe_tokens = array("\\\x00","\\\n","\\\r","\\\\","\\"."'","\\".'"',"\\\x1a");
		return str_replace($unsafe_tokens,$safe_tokens,$query);
	}
 
	/*
	* @function _prepare
	* @description Prepares the query before handing it over to the SQL client
	* @param String $query
	* @return String $query
	*/
	protected function _prepare($query){
		// In case the driver has it's own _prepare method
		if(method_exists($this->driver,'_prepare')){
			return $this->driver->_prepare($query);
		}
		return $this->_escape($query);
	}
 
	/*
	* @function _query
	* @description Runs an internal query request
	* @param String $query
	* @return Int $affectedRows
	*/
	protected function _query($query){
		$this->driver->query($query);
		return $this->_affectedRows();
	}
 
	/*
	* @function _result(void)
	* @description Fetch the result of a query
	* @return mixed $result
	*/
	protected function _result(){
		return $this->driver->_result();
	}
 
	/*
	* @function _isConnected(void)
	* @description Fetch the connection status
	* @return Bool True|False
	*/
	protected function _isConnected(){
		return (bool) is_object($this->driver)&&$this->driver->_isConnected();
	}
 
	/*
	* @function _connect
	* @description Connect to the database
	* @param Array $params (result of parse_url)
	* @return Bool True|False $isConnected
	*/
	protected function _connect($params){
		if(!empty($params['host'])&&!empty($params['user'])&&!empty($params['path'])){
			$host = $params['host'];
			$user = $params['user'];
			$pass = (!empty($params['pass'])) ? $params['pass'] : '';
			$port = (!empty($params['port'])) ? $params['port'] : 3306;
			$database = trim($params['path'],'/');
			$this->driver->_connect($host,$user,$pass,$port);
			$this->driver->_selectDb($database);
		}
		return $this->_isConnected();
	}
 
	/*
	* @function connect
	* @description Connect to the database
	* @param String $connection (input: 'driver://username:password@server:port/database')
	* @return Bool True|False $isConnected
	*/
	public function connect($connection){
		if(!$this->_isConnected()){
			$params = parse_url($connection);
			$this->_driver($params['scheme']);
			return $this->_connect($params);
		}
		return true;
	}
 
	/*
	* @function disconnect(void)
	* @description Disconnect from the database
	* @return void
	*/
	public function disconnect(){
		$this->driver->_disconnect();
	}
 
	/*
	* @function _affectedRows(void)
	* @description Fetch the number of rows affected by the last query
	* @return Int $affectedRows
	*/
	protected function _affectedRows(){
		return $this->driver->_affectedRows();
	}
 
	/*
	* @function _driver
	* @description Use the given driver
	* @param String $driver
	* @return void
	*/
	protected function _driver($driver){
		$driver_class = 'Db_Driver_'.ucfirst(strtolower($driver));
		$driver_file = __DIR__.'/drivers/'.strtolower($driver).'.php';
		if(!class_exists($driver_class)&&file_exists($driver_file)){
			require_once($driver_file);
		}
		if(!is_object($this->driver)||get_class($this->driver)!==$driver_class){
			$this->driver = new $driver_class;
		}
	}
 
}

Some functionality should be explained:

  • _escape($query)

    Normally, to prepare a request for the MySQL client, we use mysql_real_escape_string($query) to prevent SQL Injection. Since this function specifically only works when a MySQL connection is active, we can’t use this to escape queries for other database connections. Therefore, we emulate the same functionality with a manual function. This is specifically done for _prepare($query).

  • _prepare($query)

    This method first looks for a nested _prepare method inside the driver object. If it’s available, then it will execute that method. Otherwise, it will fall back to _escape.

  • _connect($params)

    This method uses the result of parse_url(). I specifically chose that structure, because a URL can store all information needed to select a driver, connect to the database server and select the database.

    parse_url return values

    • scheme (used for driver)
    • host
    • port
    • user
    • pass
    • path
    • query
    • fragment

    query and fragment are not used, but allow for future extending on the connection methods.

    The driver must store the connection status. The database class will not cache the connection status for obvious reasons.

  • _driver($driver)

    If the corresponding driver class does not exist, this will be loaded from the drivers directory.

    When there is no driver set or the driver is not the same as the current driver, a new driver object will be created.

Write an interface for the drivers

Writing an interface is quite similar to writing an abstract class. The difference is that an abstract class can contain methods that will be inherited. An interface can only define methods and variables. It is similar to a set of construction guidelines. It is the bare minimum of which a class should consist, without having predefined what happens inside the methods.

In essence, we want to write down whatever method can be called from within the database class. This way, a driver must consist of at least these methods.

interface Db_Driver_Interface {
 
	public function select($fields,$table,$arguments="");
	public function insert($keys,$values,$table);
	public function update($keys,$values,$table);
	public function delete($key,$value,$table);
 
	public function query($query);
	public function _result();
	public function _affectedRows();
 
	public function _isConnected();
	public function _connect($host,$user,$pass,$port);
	public function _selectDb($database);
	public function _disconnect();
 
}

Write the first driver

As We’re testing the database class on a MySQL server, the first driver is a MySQL driver. It will be stored in ./drivers/mysql.php and the class name will be Db_Driver_Mysql. The naming convention is Db_Driver_$class_name where $class_name = ucfirst(strtolower($driver)); (all lower case and capitalize first character).

class DB_Driver_Mysql implements Db_Driver_Interface {
 
	/*
	* @var $res
	* @description Used to store the connection and result resources
	*/
	var $res;
 
	/*
	* @function __construct(void)
	* @description Populate the resources array
	*/
	function __construct(){
		$this->res = array(
			'connection'	=> null,
			'result'		=> null
		);
	}
 
	/*
	* @function select
	* @description Select table rows
	* @param Array $fields
	* @param String $table
	* @param String $arguments (optional)
	* @return Array $rows
	*/
	public function select($fields,$table,$arguments=""){
		$query = 'SELECT `'.implode('`,`',$fields).'` FROM `'.$table.'`';
		$query .= (!empty($arguments)) ? ' '.$arguments : '';
		$this->query($query);
		return $this->_result();
	}
 
	/*
	* @function insert
	* @description Insert new row(s) into the table
	* @param Array $keys
	* @param Array mixed (String $values | Array $values)
	* @param String $table
	* @return Int $affectedRows
	*/
	public function insert($keys,$values,$table){
		$query = 'INSERT INTO `'.$table.'` (`'.implode('`, `',$keys).'`) VALUES ';
		if(isset($values[0])&&is_array($values[0])){ // There are multiple rows to be inserted
			$rows = array();
			foreach($values AS $row){
				$rows[] = '\''.implode('\', \'',$row).'\'';
			}
			$query .= '('.implode('), (',$rows).')';
		}
		else{
			$query .= '(\''.implode('\', \'',$values).'\')';
		}
		$query .= ';';
		$this->query($query);
		return $this->_affectedRows();
	}
 
	/*
	* @function update
	* @description Update row(s) in the table
	* @param Array $keys
	* @param Array mixed (String $values | Array $values)
	* @param String $table
	* @param String $arguments (optional)
	* @return Int $affectedRows
	*/
	public function update($keys,$values,$table,$arguments=""){
 
		$affectedRows = 0;
 
		if(!is_array($values[0])){ $values = array($values); } // Make it multi-dimensional
		for($i=0;$i<count($values);$i++){
			$vals = $values[$i];
			$query = 'UPDATE `'.$table.'` SET ';
			$updates = array();
			for($j=0;$j<count($keys);$j++){
				$val = (is_numeric($vals[$j])) ? $vals[$j] : "'".$this->_prepare($vals[$j])."'";
				$updates[] = '`'.$keys[$j].'` = '.$val;
			}
			$query .= implode(', ',$updates);
			$query .= (!empty($arguments)) ? ' '.$arguments : '';
			$this->query($query);
			$affectedRows += $this->_affectedRows();
		}
 
		return $affectedRows;
	}
 
	/*
	* @function delete
	* @description Delete a row from the table
	* @param String $key
	* @param String $value
	* @param String $table
	* @return Int $affectedRows
	*/
	public function delete($key,$value,$table){
		$value = (is_numeric($value)) ? $value : "'".$value."'";
		$query = 'DELETE FROM `'.$table.'` WHERE `'.$key.'` = '.$value;
		$this->query($query);
		return $this->_affectedRows();
	}
 
	/*
	* @function query
	* @description Perform a native query
	* @param String $query
	* @return void
	*/
	public function query($query){
		$this->res['result'] = mysql_query($query);
	}
 
	/*
	* @function _prepare
	* @description Prepares the query before handing it over to the SQL client
	* @param String $query
	* @return String $query
	*/
	public function _prepare($query){
		return mysql_real_escape_string($query,$this->res['connection']);
	}
 
	/*
	* @function _result(void)
	* @description Fetch the result of a query
	* @return mixed $result
	*/
	public function _result(){
		$result = array();
		while($row=mysql_fetch_assoc($this->res['result'])){
			$result[] = $row;
		}
		return $result;
	}
 
	/*
	* @function _affectedRows(void)
	* @description Fetch the number of rows affected by the last query
	* @return Int $affectedRows
	*/
	public function _affectedRows(){
		return mysql_affected_rows($this->res['connection']);
	}
 
	/*
	* @function _isConnected(void)
	* @description Fetch the connection status
	* @return Bool True|False
	*/
	public function _isConnected(){
		return (bool) is_resource($this->res['connection']);
	}
 
	/*
	* @function _connect
	* @description Connect to the database
	* @param String $host
	* @param String $user
	* @param String $pass
	* @param Int $port
	* @return void
	*/
	public function _connect($host,$user,$pass,$port){
		$this->_disconnect();
		$this->res['connection'] = mysql_connect($host.':'.$port,$user,$pass);
	}
 
	/*
	* @function _selectDb
	* @description Select the database
	* @param String $db
	* @return Bool True|False
	*/
	public function _selectDb($database){
		if($this->_isConnected()){
			return (bool) mysql_select_db($database,$this->res['connection']);
		}
		return false;
	}
 
	/*
	* @function _disconnect(void)
	* @description Disconnect from the database
	* @return void
	*/
	public function _disconnect(){
		if($this->_isConnected()){
			mysql_close($this->res['connection']);
			$this->res['connection'] = null;
		}
	}
 
	/*
	* @function __destruct(void)
	* @description Clean up any open connection
	*/
	function __destruct(){
		$this->_disconnect();
	}
 
}

It’s a whole lot of code, but a lot of it is very common and we’ll run through the methods real fast.

  • __construct()

    First we set up a local array in which we store the connection and result data.

  • select($fields,$table,$arguments="")

    We can use select to select fields from a table and we can use $arguments to provide restrictions to the query, like $arguments = 'WHERE key = \'key1\'';.

  • insert($keys,$values,$table)

    This method can insert values into the table. The values should be in the same order as the keys. Multiple rows can be entered at once. In that case, the structure of the values array is array( $row => array($val,$val), $row => array($val,$val) ).

  • update($keys,$values,$table,$arguments="")

    Update can do single or multiple updates at once. It uses the same structure as insert for the keys and values and can use $argument the same way as the select method can.

  • delete($key,$value,$table)

    This method can delete single or multiple rows, but simply uses 'DELETE FROM `'.$table.'` WHERE `'.$key.'` = '.$value;.

  • query($query)

    This function enables you to perform queries that are native to the system connected to the driver. The result can be retrieved using _result().

  • _prepare($query)

    This method is optional, but since MySQL provides a native function to prepare values, we make use of that.

  • _result()

    All the resulting query data will be put in an associative multi-dimensional array. The first dimension describes the rows and the second dimension consists of the key-value-pair.

  • _affectedRows()

    The number of rows, affected by the last query.

  • _isConnected()

    Tells if a connection with the database is active.

  • _connect($host,$user,$pass,$port)

    Connect to the database.

  • _selectDb($database)

    Select a different database.

  • _disconnect()

    Disconnect from the database.

  • __destruct()

    Clean up any open database connections.

Test the driver

To test our freshly baked driver, we simply have to take the following steps:

  1. Connect to the database
  2. Select all rows
  3. Insert a new row
  4. Select that row
  5. Update that row
  6. Select that row again
  7. Delete that row
  8. Select all rows

Connect to the database

require_once('db.php');
 
$db = new Db;
$db->connect('mysql://root:lamepass@localhost:3306/test');
/*
* Alternatively you could issue:
$db = new Db('mysql://root:lamepass@localhost:3306/test');
*/
 
echo ($db->driver->_isConnected()) ? 'Connection established' : 'Connection failed'; // returns 'Connection established'
Connection established

Select all rows

$s = $db->select(array('key','value'),'test');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key1
            [value] => value1
        )

    [1] => Array
        (
            [key] => key2
            [value] => value2
        )

)

Insert a new row

echo $db->insert(
	array('key','value'),
	array(
		array('key3','value3'),
		array('key4','value4')
	),
	'test'
); // returns affected rows
2

Select that row

$s = $db->select(array('key','value'),'test','WHERE `test`.`key` = \'key3\'');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key3
            [value] => value3
        )

)

Update that row

echo $db->update(
	array('value'),
	array('value3-2'),
	'test',
	'WHERE `test`.`key` = \'key3\''
); // returns affected rows
1

Select that row again

$s = $db->select(array('key','value'),'test','WHERE `test`.`key` = \'key3\'');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key3
            [value] => value3-2
        )

)

Delete that row

echo $db->delete('key','key3','test');
1

And after repeat:

0

This proves that it deleted the row in the first place and couldn’t find it the second time.

Select all rows

$s = $db->select(array('key','value'),'test');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key1
            [value] => value1
        )

    [1] => Array
        (
            [key] => key2
            [value] => value2
        )

    [2] => Array
        (
            [key] => key4
            [value] => value4
        )

)

Design a proprietary database system

Because we’re only going to use this database system once, I went for the name dummy.

I’m sorry for all the lists in this article, but bear with me as we make another one. The proprietary system will have no DBMS[2]. To enable the needed functionality, we will implement features of a DBMS inside our driver. This will have no great influence on how we write the driver. The proprietary system will have the following features:

  • Multiple databases
  • Multiple tables
  • Multiple records
  • A simple key-value-pair structure
  • Authentication
  • Rights per database
  • Rights per table
  • A simple storage system

auth

johmanx:lamepass
[test]:create,drop,flush,update
=>[test]:select,insert,update,delete

dummy:lamepass
[test]:
=>[test]:select

dbs

[test]
=>[test]:[key,value]
==>[key1,value1]
==>[key2,value2]

[test2]
=>[test]:[name,mail]
==>[Jan-Marten de Boer,lievheid@johmanx.com]

Write proprietary driver

class DB_Driver_Dummy implements Db_Driver_Interface {
 
	/*
	* @var $res
	* @description Used to store the connection and result resources
	*/
	var $res;
 
	/*
	* @function __construct(void)
	* @description Populate the resources array
	*/
	function __construct(){
		$this->res = array(
			'connection'	=> false,
			'credentials'	=> array(),
			'result'		=> null,
			'database'		=> array(),
			'rights'		=> array(),
			'affectedRows'	=> 0,
			'dbs'			=> array()
		);
	}
 
	/*
	* @function select
	* @description Select table rows
	* @param Array $fields
	* @param String $table
	* @param String $arguments (optional) ("key=key3,key=key4")
	* @return Array $rows
	*/
	public function select($fields,$table,$arguments=""){
		$t = $this->res['dbs'][$this->res['database']][$table];
		$result = array();
		$keys = $t['keys'];
		$records = $t['records'];
		$ks = $this->res['dbs'][$this->res['database']][$table]['keys'];
		$affectedRows = 0;
 
		if(empty($arguments)){
			foreach($records AS $row){
				$r = array();
				foreach($fields AS $field){
					if(in_array($field,$keys)){
						$r[$field] = $row[$field];
					}
				}
				$result[]=$r;
				$affectedRows++;
			}
		}
		else{
			if(strpos($arguments,',')!==false){
				$keywords = explode(',',$arguments);
			}
			else{
				$keywords = array(trim($arguments));
			}
			$kws = array();
			foreach($keywords AS $keyword){
				$keyword = explode('=',$keyword);
				$kws[$keyword[0]][]=$keyword[1];
			}
 
			for($i=0;$i<count($records);$i++){
				$rac = 0;
				for($j=0;$j<count($ks);$j++){
					$key = $ks[$j];
					$val = $records[$i][$key];
					if(isset($kws[$key])&&in_array($val,$kws[$key])){
						$rac++;
					}
				}				
 
				if($rac==count($kws)){
					$row = $records[$i];
					$r = array();
					foreach($fields AS $field){
						if(in_array($field,$keys)){
							$r[$field] = $row[$field];
						}
					}
					$result[]=$r;
					$affectedRows++;
				}	
			}			
		}
 
		$this->res['result'] = $result;
		$this->res['affectedRows'] = $affectedRows;
		return $this->_result();
	}
 
	/*
	* @function insert
	* @description Insert new row(s) into the table
	* @param Array $keys
	* @param Array mixed (String $values | Array $values)
	* @param String $table
	* @return Int $affectedRows
	*/
	public function insert($keys,$values,$table){
		if(!is_array($values[0])){
			$values = array($values);
		}
 
		foreach($values AS $record){
			$r = array();
			for($i=0;$i<count($record);$i++){
				$r[$keys[$i]]=$record[$i];
			}
			$this->res['dbs'][$this->res['database']][$table]['records'][] = $r;
		}
 
		$this->res['affectedRows'] = count($values);
		return $this->_affectedRows();
	}
 
	/*
	* @function update
	* @description Update row(s) in the table
	* @param Array $keys
	* @param Array mixed (String $values | Array $values)
	* @param String $table
	* @param String $arguments (optional) ("key=key3,key=key4")
	* @return Int $affectedRows
	*/
	public function update($keys,$values,$table,$arguments=""){
		$records = $this->res['dbs'][$this->res['database']][$table]['records'];
		$ks = $this->res['dbs'][$this->res['database']][$table]['keys'];
		$affectedRows = 0;
 
		if(empty($arguments)){
			for($i=0;$i<count($records);$i++){
				for($j=0;$j<count($keys);$j++){
					$records[$i][$keys[$j]] = $values[$j];
				}
				$affectedRows++;
			}
		}
		else{
			if(strpos($arguments,',')!==false){
				$keywords = explode(',',$arguments);
			}
			else{
				$keywords = array(trim($arguments));
			}
			$kws = array();
			foreach($keywords AS $keyword){
				$keyword = explode('=',$keyword);
				$kws[$keyword[0]][]=$keyword[1];
			}
 
			for($i=0;$i<count($records);$i++){
				$rac = 0;
				for($j=0;$j<count($ks);$j++){
					$key = $ks[$j];
					$val = $records[$i][$key];
					if(isset($kws[$key])&&in_array($val,$kws[$key])){
						$rac++;
					}
				}				
 
				if($rac==count($kws)){
					for($j=0;$j<count($keys);$j++){
						$records[$i][$keys[$j]] = $values[$j];
					}
					$affectedRows++;
				}				
			}
		}
 
		$this->res['dbs'][$this->res['database']][$table]['records'] = $records;
		$this->res['affectedRows'] = $affectedRows;
		return $this->_affectedRows();
	}
 
	/*
	* @function delete
	* @description Delete a row from the table
	* @param String $key
	* @param String $value
	* @param String $table
	* @return Int $affectedRows
	*/
	public function delete($key,$value,$table){
		$records = $this->res['dbs'][$this->res['database']][$table]['records'];
		$affectedRows=0;
		for($i=0;$i<count($records);$i++){
			if($records[$i][$key]==$value){
				unset($this->res['dbs'][$this->res['database']][$table]['records'][$i]);
				$affectedRows++;
			}
		}
		$this->res['affectedRows'] = $affectedRows;
		return $this->_affectedRows();
	}
 
	/*
	* @function query
	* @description Perform a native query
	* @param String $query
	* @return void
	*/
	public function query($query){
		// there is no native query support
	}
 
	/*
	* @function _result(void)
	* @description Fetch the result of a query
	* @return mixed $result
	*/
	public function _result(){
		return $this->res['result'];
	}
 
	/*
	* @function _affectedRows(void)
	* @description Fetch the number of rows affected by the last query
	* @return Int $affectedRows
	*/
	public function _affectedRows(){
		return (int) $this->res['affectedRows'];
	}
 
	/*
	* @function _isConnected(void)
	* @description Fetch the connection status
	* @return Bool True|False
	*/
	public function _isConnected(){
		return (bool) $this->res['connection'];
	}
 
	/*
	* @function _connect
	* @description Connect to the database
	* @param String $host
	* @param String $user
	* @param String $pass
	* @param Int $port
	* @return void
	*/
	public function _connect($host,$user,$pass,$port){
		$this->_disconnect();
 
		$this->res['credentials'] = array(
			'host'		=>	$host,
			'user'		=>	$user,
			'database'	=>	$port
		);
 
		$dummy_file = $host.'/.iamdummy';
		$auth_file = $host.'/auth';
		$dbs_file = $host.'/dbs';
 
		if(file_exists($dummy_file)&&file_exists($auth_file)&&file_exists($dbs_file)){
			$auth = trim(file_get_contents($auth_file));
			$auth = explode("\n\n",$auth);
 
			foreach($auth AS $usr){
				$usr = explode("\n",$usr);
				if($usr[0]==$user.':'.$pass){
					$rights = $usr;
					unset($rights[0]);
 
					$ra = array();
					$cur_db = '';
					foreach($rights AS $rs){
						$rs = explode(':',$rs);
						if(substr($rs[0],0,1)=='['){
							$cur_db = trim($rs[0],'[]');
							$cur_rights = explode(',',$rs[1]);
							$ra[$cur_db] = array(
								'rights'	=>	$cur_rights,
								'tables'	=>	array()
							);
						}
						else{
							$cur_table = substr($rs[0],3,-1);
							$cur_rights = explode(',',$rs[1]);
							$ra[$cur_db]['tables'][$cur_table]=$cur_rights;
						}
					}
					$this->res['rights'] = $ra;
					$this->res['connection'] = true;
				}
			}
 
			$dbs = trim(file_get_contents($dbs_file));
			$dbs = explode("\n\n",$dbs);
 
			$dba = array();
			$cur_db = '';
			foreach($dbs AS $db){
				$db = explode("\n",$db);
				$cur_db = trim(array_shift($db),'[]');
				$dba[$cur_db] = array(); // Database can be empty
				$cur_table = '';
				foreach($db AS $table){
					if(substr($table,0,2)=='=>'){
						$cur_table = substr(trim($table),3,-1);
						$cur_table = explode(']:[',$table);
						$keys = explode(',',rtrim($cur_table[1],']'));
						$cur_table = str_replace('=>[','',$cur_table[0]);
						$dba[$cur_db][$cur_table] = array(
							'keys' => $keys,
							'records' => array()
						);
					}
					else{
						$record = explode(',',substr($table,4,-1));
						$r = array();
						$ks = $dba[$cur_db][$cur_table]['keys'];
						for($i=0;$i<count($record);$i++){
							$r[$ks[$i]]=$record[$i];
						}
						$dba[$cur_db][$cur_table]['records'][]=$r;
					}
				}
			}
			$this->res['dbs']=$dba;
 
			$this->_selectDb($port);
		}
	}
 
	/*
	* @function _selectDb
	* @description Select the database
	* @param String $db
	* @return Bool True|False
	*/
	public function _selectDb($database){
		if(isset($this->res['rights'][$database])&&isset($this->res['dbs'][$database])){
			$this->res['database']=$database;
		}
	}
 
	/*
	* @function _disconnect(void)
	* @description Disconnect from the database
	* @return void
	*/
	public function _disconnect(){
		if($this->_isConnected()){
			$dbs_file = $this->res['credentials']['host'].'/dbs';
			$dbs_data = '';
			foreach($this->res['dbs'] AS $database=>$tables){
				$dbs_data .= '['.$database.']';
				foreach($tables AS $table=>$data){
					$keys = $data['keys'];
					$rows = $data['records'];
 
					$dbs_data .= "\n".'=>['.$table.']:['.implode(',',$keys).']';
					foreach($rows AS $row){
						$ro = array();
						foreach($keys AS $key){
							$ro[] = $row[$key];
						}
						$dbs_data .= "\n".'==>['.implode(',',$ro).']';
					}
				}
				$dbs_data .= "\n\n";
			}
			$dbs_data = rtrim($dbs_data);
			file_put_contents($dbs_file,$dbs_data);
		}
		$this->__construct(); // reset values
	}
 
	/*
	* @function __destruct(void)
	* @description Clean up any open connection
	*/
	function __destruct(){
		$this->_disconnect();
	}
 
}

Test the proprietary driver

And to test the proprietary driver, we simply repeat the tests

  1. Connect to the database
  2. Select all rows
  3. Insert a new row
  4. Select that row
  5. Update that row
  6. Select that row again
  7. Delete that row
  8. Select all rows

Connect to the database

require_once('db.php');
 
$db = new Db('dummy://johmanx:lamepass@dummy/test');
 
echo ($db->driver->_isConnected()) ? 'Connection established' : 'Connection failed'; // returns 'Connection established'
Connection established

Select all rows

$s = $db->select(array('key','value'),'test');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key1
            [value] => value1
        )

    [1] => Array
        (
            [key] => key2
            [value] => value2
        )

)

Insert a new row

echo $db->insert(
	array('key','value'),
	array(
		array('key3','value3'),
		array('key4','value4')
	),
	'test'
); // returns affected rows
2

Select that row

$s = $db->select(array('key','value'),'test','key=key3');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key3
            [value] => value3
        )

)

Update that row

echo $db->update(
	array('value'),
	array('value3-2'),
	'test',
	'key=key3'
); // returns affected rows
1

Select that row again

$s = $db->select(array('key','value'),'test','key=key3');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key3
            [value] => value3-2
        )

)

Delete that row

echo $db->delete('key','key3','test');
1

And after repeat:

0

This proves that it deleted the row in the first place and couldn’t find it the second time.

Select all rows

$s = $db->select(array('key','value'),'test');
print_r($s);
Array
(
    [0] => Array
        (
            [key] => key1
            [value] => value1
        )

    [1] => Array
        (
            [key] => key2
            [value] => value2
        )

    [2] => Array
        (
            [key] => key4
            [value] => value4
        )

)

Review & refactor

This database class has opened up a world of new possibilities concerning the implementation of databases. We can now add on database systems like MongoDB. It’s also easy to write drivers for systems that in essence are no database at all. I can see a driver for ftp, so you could use file trees as databases, their directories as tables and the files as records with the file name and other properties as keys for the table. Select could display a directory, insert could add files, update could modify files and delete could delete them.

What needs refactoring

I would change the arguments for the MySQL driver to correspond the arguments for the Dummy driver. SQL is too complex to be fully supported over all possible drivers and the key=value,key=value syntax is much more generic and could easily substitute the WHERE key = 'value' AND key = 'value' argument.

I have also stripped the rights implementation for the dummy driver. As I wrote the driver, I found that it was a really large amount of code for a simple example and thus for abstractions sake, I removed the rights checks. I created a simple method that checked if $action was inside the rights array. I then made a wrapper if-clause for the select, insert, update and delete methods. When the code would be refactored, that would be the first step to implementing a proper rights system.

On Forrst, I got the comment:

Shouldn’t the driver attribute be protected (you haven’t set the attribute visibility at all), and make an abstraction for the _isConnected method? So you don’t have to do $db->driver->_isConnected(). The end user shouldn’t call methods prefixed with _ IMO

– Horacio J. Peña

 

By the time I was writing the drivers, I detected this flaw, since you shouldn’t have to do this: $db->driver->_isConnected();. However, I found that coding for a tutorial may introduce some flaws, so we can all have a good look at it. Horacio is completely right. When refactoring, we should introduce a new abstraction and make all the methods preceded by _ private or protected.

Conclusion

As it is, the database class is still incomplete and immature, but it has great potential and great extendability (by nature, since it supports it’s own drivers). It could also fit well in a RESTful API where you make extensive use of the HTTP Methods GET, POST, PUT and DELETE, easily interchangeable with select, insert, update and delete.

This tutorial was a great learning process, even for myself. It has opened a number of insights in database systems and what my proprietary driver especially points out is the flaws in such a simple system. It’s hackable and breakable in so many ways, you can’t start counting them. Still, in a secluded and protected environment, the simplicity is a big win, since you can open up your fave text editor and alter the database in a breeze. Really useful when your code breaks the database.

If you find any of this useful or have some thoughts on how I made some horrible mistakes, please feel free to leave a comment. All feedback is appreciated and whenever possible I’ll improve the article correspondingly.

Notes

  1. ^ __DIR__ is a magic constant that defines the current directory in which the script resides. In PHP versions prior to 5.3.0, the same could be achieved using dirname(__FILE__), which extracts the directory name of the current file.
  2. ^ A database management system (DBMS) is a software package with computer programs that controls the creation, maintenance, and the use of a database. We combined the DBMS with the driver to prevent abstraction, in order to maintain a readable tutorial.

Pitfalls of the newbie JavaScripter

In today’s tutorial, I’m going to educate you on the pitfalls of JavaScript, so we can be a better JavaScripter together. We’re going to talk about:

  1. Global variables
  2. Immediately self-invoking anonymous functions
  3. Function scope
  4. this keyword
  5. Hoisting
  6. Programming style
  7. Prototyping
  8. Chaining

In this article, I will assume you have a basic knowledge of programming and have fiddled with JavaScript before.

When we think of JavaScript, we mostly think of development for the web browser. There are other applications of JavaScript, but to keep it simple, we’ll stick to behavior that’s native to web browsers.

Global variables

When we talk about global variables, we talk about variables that get inserted into the root object of the current environment. In the browser, that is the window object. This means that whenever you try to access some global variable, you can do so through window.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
myVar = 'Hello there, dear visitor';
 
console.log(myVar); // 'Hello there, dear visitor'
 
var myFunction = function(){
    console.log(myVar);        // 'Hello there, dear visitor'
    console.log(window.myVar); // 'Hello there, dear visitor'
};
 
myFunction();
 
var mySecondFunction = function(){
    var myVar = 'This differs';
    console.log(myVar);        // 'This differs'
    console.log(window.myVar); // 'Helo there, dear visitor'
};
 
mySecondFunction();

Right at the start, we’ve tackled some behavior unique to window. Everything that can be accessed through window, can be accessed directly in every scope, unless it’s been redeclared, in wich case the locally declared variable will be used.

Lines 1,3,6,7 and 15 use a global reference.
Line 14 used a local reference, defined at line 13 in the local scope of mySecondFunction.

To make sure we don’t pollute the global scope of our application, we always precede our local variables with a var declaration.

var myThirdFunction = function(){
    var localVar = 'This one is local';
    someVar = 'Hello World!';
}
 
console.log(someVar); // 'Hello World!'
console.log(localVar); // undefined

The benefit is that these local variables don’t overwrite existing variables.

Let’s get a real world example.

File: calculator.js

kilo = 1000;
var Calculator = function(i){
    this.value = i || 0;
    this.k = function(v){ var v=v||this.value; return kilo*v; }; 
};

File: translate-metrics.js

ton = 'Ton';
kilo = 'Kilo';
milli = 'Milli';

Example: implementing calculator.js

c = new Calculator(12);
 
// Let's calculate what a thousandfold of these values looks like
m = c.k();   // 12000
n = c.k(13); // 13000
c.value = 14;
o = c.k();   // 14000

But when we combine it with translate-metrics.js, we set the global variable kilo to a string and thus our Calculator.k method will return NaN, which tells us that it tried to calculate something that was Not a Number.

Immediately self-invoking anonymous functions

So, how do we prevent JavaScript from leaking information between libraries? We do this using self-invoking functions or even anonymous self-invoking functions.

JavaScript has a great way of parsing itself, before handing over the result to the next function. This means we can do a lot of cool stuff like chaining and prototyping, but I’ll talk about those later on. What this also enables us to do is directly invoking a function after it’s been declared.

var myFunc = function(){
    console.log('I\'m running wild here!');
}(); // 'I\'m running wild here!'

As you can see, I just declared a function that outputs a line to the console and directly after declaration, I invoke it using (), rather than issuing myFunc(); separately.
The thing is, we could also do that with an anonymous function, like so:

function(){ console.log('Hello'); }(); // 'Hello'

Because JavaScript allows us to place ( ) around pretty much everything, we make use of that to create a function without giving it a name. Thus, the immediately self-invoking anonymous function.

(function(){ console.log('Hello'); })(); // 'Hello'

And finally, this enables us to declare variables that don’t pollute the global scope.

(function(){
    var myLocalVar = 'Hello';
    console.log(myLocalVar); // 'Hello'
})();
 
console.log(myLocalVar); // undefined

Yet inside the anonymous function, we can still make use of the global variables and even declare functions to the global scope. That’s what the window object is all about.

Function scope

Now comes the tricky part. If you’re used to a language like Java or PHP, you’re working on a block scope base. JavaScript, however, uses a function scope.

PHP: Block scope

<?php
$variable = 'Hello';
 
function myFunc(){
    echo $variable; // undefined
}
 
function myFunc2(){
    global $variable;
    echo $variable; // 'Hello'
}
 
myFunc();
myFunc2();
?>

JavaScript: Function scope

var variable = 'Hello';
var myFunc = function(){
    var internal = 'Hiya';
 
    console.log(variable); // 'Hello'
 
    var internalFunc = function(){
        var deepInternal = 'Howdy';
        console.log(deepInternal); // 'Howdy'
        console.log(internal);     // 'Hiya'
        console.log(variable);     // 'Hello'
    }();
 
    console.log(deepInternal); // undefined
}();

As you can see, all the children of the current scope can access what’s in the local scope and in the parent scope, up until the root scope. That, however, works in one direction, as the parent cannot access a variable inside the child scope, unless it gets specifically returned by return.

What you also need to know is that if you close a scope, all the variables that belong to that scope are no longer accessible.

(function(){
    var someVar = 12;
    window.myFunc = function(){
        console.log(someVar);
    };
    myFunc(); // 12
})();
 
myFunc(); // gives a reference error, stating that window.someVar was not declared

The reason why this happens is because someVar was accessible in the anonymous function and the first time we invoke the function, it was still within that scope. Now, when we invoked it later on, since it was declared to the window object, it no longer sees that variable in the current scope and thus references to something that no longer exists.

this keyword

The behavior above can be explained by the behavior of the this keyword. You could rewrite someVar as this.someVar. Now, what is this? this refers to the current scope of the function or object you’re working with. That is really useful when you don’t know the context in which your code will be applied. It is intensely used when one creates an Object in JavaScript.

var Meeting = function(host,guest){
    this.host = host || new Person();
    this.guest = guest || new Person();
    this.handShake = function(cb){
        console.log(this.host.shakeHand()+' I am the host of this meeting.');
        console.log(this.guest.shakeHand()+' I am the guest of this meeting.');
        if(typeof cb == 'function'){
            cb(); // Call this function whenever the handshake is done.
        }
    };
};
 
var Person = function(name,age){
    this.name = name || 'John Doe';
    this.age = age || 0;
    this.shakeHand = function(){
        return 'Hello, my name is '+this.name+' and I\'m '+this.age+' years old.';
    };
};
 
var bob = new Person('Bob',40);
var amy = new Person('Amy',38);
 
var meeting = new Meeting(bob,amy);
meeting.handShake();

This simply outputs:

Hello, my name is Bob and I'm 40 years old. I am the host of this meeting.
Hello, my name is Amy and I'm 38 years old. I am the guest of this meeting.

But what if we play around with this nifty callback function we implemented, so we can run things when the handshake was done?

First, let’s set up a couple of meetings.

meetings = [
    {
        h:{
            n:'Bob',
            a:40
        },
        g:{
            n:'Amy',
            a:38
        }
    },
    {
        h:{
            n:'Janet',
            a:42
        },
        g:{
            n:'Peter',
            a:26
        }
    },
    {
        h:{
            n:'Marlin',
            a:32
        },
        g:{
            n:'John',
            a:38
        }
    }
];
 
for(var i=0;i<meetings.length;i++){
    var m = meetings[i];
    var host = new Person(m.h.n,m.h.a);
    var guest = new Person(m.g.n,m.g.a);
    var meeting = new Meeting(host,guest);
    meeting.handShake();
}

Output:

Hello, my name is Bob and I'm 40 years old. I am the host of this meeting.
Hello, my name is Amy and I'm 38 years old. I am the guest of this meeting.
Hello, my name is Janet and I'm 42 years old. I am the host of this meeting.
Hello, my name is Peter and I'm 26 years old. I am the guest of this meeting.
Hello, my name is Marlin and I'm 32 years old. I am the host of this meeting.
Hello, my name is John and I'm 38 years old. I am the guest of this meeting.
(function(){
for(var i=0;i<meetings.length;i++){
    var m = meetings[i];
    var host = new Person(m.h.n,m.h.a);
    var guest = new Person(m.g.n,m.g.a);
    var meeting = new Meeting(host,guest);
    meeting.handShake(function(){
        console.log(this.i); // undefined
        this.handShake(); // repeats the handShake, without callback
    });
})();

So, that’s rather annoying. When you have a good look at it, you’ll notice that the callback function is actually declared in another scope then where it’s executed, but that’s something we can do by mistake really easily in this case. So, how do we fix this?

(function(){
for(var i=0;i<meetings.length;i++){
    var parent = this;
    var m = meetings[i];
    var host = new Person(m.h.n,m.h.a);
    var guest = new Person(m.g.n,m.g.a);
    var meeting = new Meeting(host,guest);
    meeting.handShake(function(){
        console.log(parent.i); // 0 || 1 || 2
    });
})();

This simply puts the this object into a variable and passes this variable on to the function in the callback. One of the implementations of myself that had this issue, just recently:

...
 
Storage.prototype.script = function(k,f){
    if(!this.scriptNodeExists(k)&&!this.injectScriptNode(k)&&Xhr!==undefined){
        var x = new Xhr,t=this;
        x.parentScope=this;
        x.callback = function(){
            t.setItemEncoded(k,x.response);
            t.injectScriptNode(k);
        };
        x.get(f);
    }
}
 
...

The problem was that setItemEncoded and injectScriptNode are methods I made for the Storage object and not for the Xhr object. I am, however, running a callback function that needs to execute functions of this very same object, whenever the Xhr object finishes getting the file from the server.

Thanks to great support from Pepkin88 in the comments, here’s a simple problem that displays the behavior of this. I made some minor improvements, but props to Pepkin88.

[jsfiddle url="http://jsfiddle.net/johmanx10/hmn76/12/" height="770px" include="js,result"]

Hoisting

You’ve probably read or heard about good programming habits concerning JavaScript. It was mentioned that it’s a good habit to declare variables at the top of a function, but why exactly? Well, that’s because JavaScript loves function and variable hoisting.

No, we’re not talking about some pirate that’s hoisting your goods. JavaScript has this behavior of declaring variables at the top of the scope. Now, what implications does this have?

var someVar = 'Hello';
 
(function(){
    console.log(someVar); // Hello
})();
 
(function(){
    var someVar = 'Hiya';
    console.log(someVar); // Hiya
})();
 
(function(){
    console.log(someVar); // undefined
    var someVar = 'Howdy';
})();

Huh? So, how does that work? Well, JavaScript has some behavior in the background that hoists var declarations to the top of the scope. That is why you can invoke functions you declare later on in the code, as long as you don’t run the function that invokes them before they are declared. This, however, on the background renders the following JavaScript for the last function:

(function(){
    var someVar;
    console.log(someVar); // undefined
    someVar = 'Howdy';
})();

And now you can clearly see that the value of someVar has been reset and not yet defined.

Programming style

It has been a holy war between programmers. Some people like their brackets left, others right. Some people like indentation with tabs, others with spaces. Some like 2 spaces, others like 8 spaces. I like programming with my brackets right and 4 spaces indentation. That has been really different in the past, when I was only focusing on PHP. I program with 2 monitors of 24″ side-by-side, so my indentation used to be really optimized for readability. It was:

function()
    {
        $var = 'hello';
        return $var;
    }

This got some critique really early on by fellow programmers, but I kept on using it like this, until the unexplainable bugs started happening. JavaScript has a .. ahum .. really neat engine that automatically fixes your broken code, where you forgot your semicolon. Really great for starting programmers. Not so great for professional development. So, what happens?

(function(){
    return
        {
            a: 12,
            b: 13
        };
})(); // undefined

How can this be undefined? It’s the only thing in the fracking function, right? Well, what happens is the engine puts a semicolon behind the return, which in turn returns not quite as much as we would like. The object that get’s declared afterwards will be skipped, because return breaks the function.

So, it’s good behavior to put your brackets on the right, so the engine won’t break your code for you.

Prototyping

Now here comes the power of JavaScript. Everything in this beautiful language is extendable. Even the objects and functions that come packed with your environment. So, what is prototyping?

Everything, except for values and expressions, in JavaScript has a prototype. And although a value has no prototype, the Object it originates from does. The prototype can redefine how an Object behaves and can add functionality to the Object. Even the prototype itself has a prototype, which has a prototype, and so on, until the root is reached.

var arr = [1,2,4,5,33]; // This originates from the object Array
Array.prototype.sum = function(){
    var sum = 0;
    for(var i=0;i<this.length;i++){
        sum += this[i];
    }
    return sum;
}
 
console.log(arr.sum()); // 45

Since arr originated from the object Array, we can add functionality to it through changing the prototype of Array. Now we have a function to calculate the total sum of the numbers inside the array. Prototypes can also invoke methods that were declared inside the prototype as such:

Array.prototype.avg = function(){
    return this.sum() / this.length;
};
console.log(arr.avg()); // 9

That’s great for calculating your average income and expenses.

Prototyping has effect on all the variables that originated from that object, even if they were declared before the prototype changed.

n = new Number(12);
console.log(n.valueOf()); // 12
 
// Only works with positive numbers. Just for demo purposes
Number.prototype.power = function(n){var n=n||2,r=this.valueOf();n--;for(var i=0;i<n;i++){r*=this.valueOf();};return r;};
n.power();  // 144
n.power(2); // 144
n.power(3); // 1728

And now we can do some basic math function without issuing the Math object.

Let’s note that simply prototyping may brick third party code, since it may have unexpected behavior. Libraries are almost always built with the underlying thought that the prototype of core functionality stays untouched. In that respect, you should always check if the object exists and if the method you try to prototype wasn’t already declared.

var Proto = function(){
    this.extendable = function(o,m){
        return (window[o]!==undefined&&window[o][m]===undefined&&window[o].prototype[m]===undefined);
    }
    this.exists = function(o,m){
        return (window[o]!==undefined&&(window[o][m]!==undefined||window[o].prototype[m]!==undefined));
    }
    this.extend = function(o,m,f,d){
        if(this.extendable(o,m)&&(d===undefined||this.exists(o,d))){
            window[o].prototype[m]=f;
        }
    }
    return this;
};
 
// If Array exists and has no method sum
Proto().extend('Array','sum',function(){
    var sum = 0;
    for(var i=0;i<this.length;i++){
        sum += this[i];
    }
    return sum;
});
 
// If Array exists, has no method avg and has method sum
Proto().extend('Array','avg',function(){
    return this.sum() / this.length;
},'sum');

At first we made an Object called Proto. This checks if the prototype is extendable and if so, extends upon the prototype. Later on, we create the method Array.prototype.sum which enables us to do the sum function on arrays. Finally we create the Array.prototype.avg method, which depends on Array.prototype.sum. If the dependency exists, it gets declared and because it does, we now also have the avg method to use on arrays.

We could improve this a lot. For instance, we could allow inter-Object dependencies and check multiple dependencies instead of one, but for the sake of the tutorial, I’m not going to build that extra functionality in here.

Chaining

The Proto object we just made was also an example of a short chain. What we did was execute a function, return this and move on with the result of that. The reason we can chain in JavaScript is because it executes and parses functions on the fly, so you can select the result directly in the same line of code. Two that are used a lot in general web development are:

document.getElementByTagName('script')[0];

and

var e = function(e){return document.getElementById(e);};

As you can see in the first example, we first call the document object, ask it to run the method getElementByTagName with the parameter script. This gives us an array of Nodes that fit the tagname of script. From that array, we take the first item by issuing [0] (zero indexing).

A library that makes really good use of this behavior is jQuery.

jQuery example:

$('ul').children('li').each(function(){
    $(this).hide().fadeIn('slow').addClass('fade-in');
});

And it’s near endless how far jQuery let’s you chain up functions. Now, how is this possible?

jQuery stores the selected item that was defined using $() inside the corresponding object. It then passes that item on to the method we called, children, which in turn returns that same item, or the result of that item. In any case, it returns a jQuery object. And that is the interesting part, since the return value can be called again using JavaScript, due to it’s behavior. So now when the method children finishes, it returns a jQuery object. That jQuery object has all the same methods the previous jQuery object had, so all methods are available yet again.

$('body')               //returns jQuery object
.find('#head')          //returns jQuery object
.addClass('js-enabled') // returns jQuery object
.html('Hello');         // returns jQuery object || String object

But we can do that too and on a really small scale as well.

Combining previous code:

[12,3,0].avg().power(3); // 125

First we have an array. That originates from Array. Array has the prototype method avg. That in turn returns a number, which originates from Number. Number has the prototype method power and with the parameter of 3, it does Number^3.

The sum of the array is 15. The length of the array is 3. The average is 5. 5^3 = 5 * 5 * 5 = 125

Conclusion

JavaScript is a really useful and powerful language, that has been misunderstood by many, due to it’s alternative behavior that leads to a lot of bugs and frustrations. Nonetheless, JavaScript is a real mature language that helps us developers out on many occasions. With the ever growing number of stable and helpful libraries like jQuery, MooTools and Dojo, we see a real growth in support and enthusiasm for the language.

JavaScript has a somewhat high learning curve, even for seasoned programmers, but can be a really easy-going language once one is familiar with the quirks. I hope this article helps overcome those quirks and lowers the border for programmers that are new to JavaScript or just haven’t used the language to it’s full potential.

PHP return types: Beginners guide to debugging PHP

I can remember it so clearly. My first “Hello World!” soared through the skies and cried out a song of victory, like only an eagle could. It was only a minute later I encountered my first bug.
Note: This all was years ago at the age of eleven. Some portion of this might be exaggerated.

It’s a pain to understand what goes wrong when being a newbie

That’s for certain. When dealing with some of the most basic programming, you can still get errors like:
[Fri Jul 01 15:16:47 2011] [warn] mod_fcgid: stderr: PHP Catchable fatal error: Argument 1 passed to print_resource_type() must be an instance of resource, string given, called in /home/johmanx/domains/lievheid.nl/public_html/examples/debugging-php/index.php on line 10 and defined in /home/johmanx/domains/lievheid.nl/public_html/examples/debugging-php/index.php on line 5

 

And so you’re like:

Well, I guess it’s a catchable fatal error. That’s not being helpfull at all. Argument 1 passed to print_resource_type(). Hmm. So it’s something I put into that function? And apparently it must be an instance of resource, whatever that may be. A string was given in some file on line 10. Let’s have a look at that:

1
2
3
4
5
6
7
8
9
10
11
12
<?php
 
error_reporting(E_ALL);
 
function print_resource_type(resource $handle)
    {
        print get_resource_type($handle);
    }
 
print_resource_type("hello");
 
?>

So we see that PHP expects a resource to be given on line 5 as the first argument. If handle is anything but a resource, the whole thing trips and gives off a fatal error. A fatal error terminates the whole script.

Introducing PHP types

Or as used in most common languages

PHP uses different types to tell it’s processor what kind of data needs to be processed.

Type Description Example
(string) Data represented as a textual string. It can contain every character in a row. "My name is Jan-Marten"
(int) A whole number, represented as a number and not a string. 12 is an int and “12″ is a string, although PHP is a really forgiving language. More about that later on. Decimals are also stored in an int. 12
(float) The scientific notation of numbers. Floating points are really large numbers. Programming using floats can result in really unexpected errors and return values, so please stick to using int unless the values no longer fit in an int. 12.12553e-18
(array) A collection/stack of values. As PHP doesn’t support tuples or triples, all corresponding operations should be performed using an array. On a note: PHP extends the capabilities of a string with those of an array.. Sorta.. $string = "abc"; can return part of it by treating it as a string. echo $string[1]; // returns "b". array(“Monkey”=>”I am a primate!”,120,”Bananas per day”=>12,’Ooh ooh ooh aaaa!’);
(bool) A boolean is a representation of success or failure. It can either be TRUE or FALSE and is mostly used in an if statement or in loops. TRUE
(double) Double is an alias of Float. As there are more aliases of data types in PHP, I want to give this one special attention, since it’s not a notable alias to the non educated programmer. 9.88752e-22
(null) NULL is a representation of a value if it’s completely empty. Note: a string of “” is not considered NULL, but a string of zero characters long.
(object) An object is an object is an object … but only in PHP. (Ferengi rule of acquisition #17) The definition of an object is best described as a collection of methods and values. An object contains these methods and values which can or cannot be called from outside the object or within the object. Some are protected, others are private and most are public. The point of an object is to normalize a repeated task, to make it more abstract and enable it to be usefull in more scenarios. More information about objects is available on php.net object(Lievheid)#1 (1) { ["lief"]=> string(20) "This is a public var" }
(resource) A resource is a means of tracking a stream, whether it’s a MySQL link, a file or an XML document. It makes it able for PHP to track the progress of a specific stream. resource(2) of type (stream)

Awesome, but why do we need to know all of this? Well.. The first thing you want to do as a newbie programmer is to understand what you do wrong, so you don’t do it again and know how to fix it. To do this, you must first know what the error is actually telling you. And man, it’s telling a lot of useful goodness. You just need to learn how to read them.

Understanding PHP values

Even more so complex is the way PHP treats values when they’re being validated. Validating values is done on a number of occasions.

When PHP …

  • … stores a value into a variable or static. $n = mysql_num_rows();
  • … checks the conditions inside an if-statement. if($a-$b==$c-$d);
  • … checks the conditions of a for- or while-loop. Validation occurs each iteration (each time the loop restarts). for($i=0;$val;$val=$vals[$i];$i++)

In contradict, it only validates the values once when used inside a switch. This is why we prefer switches over if(){} elseif(){} else{} clauses.

Working with PHP values

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
<?php
 
$types = array(
    'boolean'   =>  TRUE,
    'integer'   =>  1,
    'double'    =>  1.,
    'string'    =>  'Hello',
    'array'     =>  array(1,2,3,'hello'),
    'object'    =>  new stdClass,
    'resource'  =>  fopen(__FILE__,'r'),
    'NULL'      =>  NULL
);
 
foreach($types AS $type=>$val)
    {
        echo    PHP_EOL. // Create a new line
                'Value: '.var_export($val,true). // Export the data into a readable format
                "\t\t". // Create two tabs
                'Expected: '.$type. // We expect this to be the type
                "\t\t". // Create another two tabs
                'Type: '.gettype($val).
                PHP_EOL.PHP_EOL;
    }
 
?>

And what we actually get in return is:

Value: true             Expected: boolean               Type: boolean

Value: 1                Expected: integer               Type: integer

Value: 1                Expected: double                Type: double

Value: 'Hello'          Expected: string                Type: string

Value: array (
  0 => 1,
  1 => 2,
  2 => 3,
  3 => 'hello',
)               Expected: array         Type: array

Value: stdClass::__set_state(array(
))              Expected: object                Type: object

Value: NULL             Expected: resource              Type: resource

Value: NULL             Expected: NULL          Type: NULL

Now, my whole point being is that, although PHP returns the types nicely, it doesn’t treat them as such. We can prove that:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
<?php
 
$vals = array(
 
    "",
    '',
    1,
    1.,
    "12",
    (bool) 1,
    (bool) 12,
    (bool) 0,
    (int) "12"
 
);
 
echo 'VALUE'."\t\t".'numeric'."\t".'int'."\t".'string'."\t".'bool';
 
foreach($vals AS $val)
    {
        echo    PHP_EOL.
                'Value: '.$val.
                "\t".is_numeric($val).
                "\t".is_int($val).
                "\t".is_string($val).
                "\t".is_bool($val)
                ;
    }
 
?>

And this returns us with:

VALUE           numeric int     string  bool
Value:                  1
Value:                  1
Value: 1        1       1
Value: 1        1
Value: 12       1               1
Value: 1                                1
Value: 1                                1
Value:                          1
Value: 12       1       1

Now, this is really interesting.

Validating PHP conditions

As we can see in the output above, there are a number of noteworthy returns. We’ll talk about them in order:

Input value Parsed value is_numeric is_int is_string is_bool Type hits
“” 1 1
1 1
1 1 1 1 2
1. 1 1 1
“12″ 12 1 1 2
TRUE 1 1 1
TRUE 1 1 1
FALSE 1 1
12 12 1 1 2

Now, what I haven’t tested for is_float, but essentially, the floating point would then also hit 2 types. What I did do is shed some light on the problem I’m trying to highlight since the beginning of this article. The problem being: PHP is too darn considerate!

PHP is a language that tries to accommodate new web developers. It tries to let code work in the most suited way it possibly can, to try not to generate all that much errors when first programming in the language. This, however, obviously brings about some optimization issues. It’s slow in validating values and that can be a real pain with static calculation scripts. It’s also weird to give PHP a value of false and get a parsed result of nothing, instead of 0, like the 1 for true. What’s also weird is that some values can be multiple types at the same time. While PHP actually knows what type it is, as we’ve seen in the first piece of script, it turns out that it’s not that simple in the second script.

Some values can be a string and numeric at the same time, while others can be numeric and an int at the same time. Not all that surprising, since numeric is there to check if the value is an int or a string, consisting of just an int. What is surprising is the first two inputs.

Somehow, PHP thinks that an empty string is considered an int and NOT a string. That is really odd.

How to take advantage of this behaviour?

Well, for starters, we should make use of how PHP crunches numbers. In essence, what PHP does when validating a condition is this:

$condition = mysql_num_rows($result);
$validation = (bool) $condition;

Now, if it returns a number greater than or equal to one, it returns true. If it’s a non-empty string, it returns true and if it’s an object that’s set succesfully, it’s true enough for PHP.
However, if it were to be 0 or less, an empty string or a failing object or function, it would return false.

So how do we make use of this? Well, when using MySQL, most PHP devs use the two following tricks:

1
2
3
4
5
6
7
8
9
10
11
12
13
<?php
 
$conn_id = mysql_connect('localhost','root','lamepass');
$result = mysql_query("SELECT * FROM secret_table");
 
if(mysql_num_rows($result)) // If there are more than 0 rows, it will default to true
    {
        while($row=mysql_fetch_assoc($result)) // Eventually, the $row cannot be set anymore and falls back to false
            {
                print_r($row);
            }
    }
?>

And it also applies when handling other streams in PHP, like fopen or the socket stream. Iterating through the results is made really easy by PHP, but please note that when using it with too high of numbers, it’s really gonna cost you CPU time, since, as explained before, PHP needs to validate the values each iteration.

Conclusion

PHP is a darn considerate language that tries to provide for both the newbie programmer and the abstract adventurer, but at a cost. It can be a slow language, when applied to the wrong tasks. Nevertheless it’s a really easygoing language and one that hopefully stays in my toolkit for a few more years.

So, what do you guys think? Any really awkward encounter with PHP and unexpected value parsing/validation? Please share in the comments!

Rating companies using Wufoo forms

One of my clients asked me to make them a simple site with a landing page and a Wufoo form. The landing page contains a brief story about the website, some local news and a local weather report. (hooever.nl)

The Wufoo form is just a simple one, asking questions about the experience customers had with their company. In this small survey, the visitor rates the experience using likert fields.

Now my customer wants to crunch the numbers from the survey and give the company a grade, using the Dutch grading system, seen in High school, rating from 1-10. I thought this was a simple request, but it turns out I really need to digest a number of structural elements using the Wufoo RESTfull API. I just wanted to share the magic with you guys, so we can all benefit from my research ;) In reality I store this result in a cache, based on the result of /api/v3/forms/hash/entries/count.json.

I’m really gratefull for @chriscoyier writing about Wufoo or else I’d have ended up making my own malicious platform for forms.

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
 
<?php
 
class wufoo{
 
    private $api_version = 'v3';
    private $user;
    private $pass = 'footastic';
    private $subdomain;
 
    function __construct($user,$subdomain,$api_version='v3')
        {
            $this->user = $user;
            $this->subdomain = $subdomain;
            $this->api_version = $api_version;
        }
 
    public function request($doc,$params=null)
        {
            $params = ($params!==null) ? ltrim($params,'&?') : '';
            $url = 'https://'.$this->subdomain.'.wufoo.com/api/'.trim($this->api_version,'/').'/'.$doc.'.json'.$params;
 
            $ch = curl_init($url);
            curl_setopt($ch,CURLOPT_RETURNTRANSFER,1);
            curl_setopt($ch,CURLOPT_USERPWD,$this->user.':'.$this->pass);
            curl_setopt($ch,CURLOPT_HTTPAUTH,CURLAUTH_ANY);
            curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false);
            curl_setopt($ch,CURLOPT_FOLLOWLOCATION,true);
            curl_setopt($ch,CURLOPT_USERAGENT,'Joh Man X: Wufoo API agent');
 
            $response = curl_exec($ch);
            $status = curl_getinfo($ch);
 
            if($status['http_code']==200){return json_decode($response);}
            return false;
        }
 
    public function form($form)
        {
            return $this->request('forms/'.$form);
        }
 
    public function form_fields($form)
        {
            return $this->request('forms/'.$form.'/fields');
        }
 
    public function form_entries($form)
        {
            return $this->request('forms/'.$form.'/entries');
        }
}
 
$form_hash = 'xxxxxx';
$wufoo = new wufoo('XXXX-XXXX-XXXX-XXXX','subdomain');
 
$fields = $wufoo->form_fields($form_hash);
$entries = $wufoo->form_entries($form_hash);
 
$field_a = array();
foreach($fields->Fields AS $field)
    {
        if($field->Type=='likert')
            {
                $choices = array();
                foreach($field->Choices AS $choice)
                    {
                        $choices[$choice->Label] = (int) $choice->Score;
                    }
                foreach($field->SubFields AS $subfield)
                    {
                        $field_a[$subfield->ID] = $choices;
                    }
            }
    }
 
$score = 0;
foreach($entries->Entries AS $entry)
    {
        $entryscore = 0;
        $usable = 0;
        foreach($entry AS $option=>$value)
            {
                if(isset($field_a[$option])&&!empty($value))
                    {
                        $oscore = $field_a[$option][$value];
                        $tscore = (in_array(0,$field_a[$option])) ? count($field_a[$option])-1 : count($field_a[$option]); // 0==not applicable
                        $oscore /= $tscore;
                        $entryscore += $oscore;
                        $usable++;
                    }
            }
        $entryscore /= $usable;
        $score += $entryscore;
    }
$score /= count($entries->Entries);
$score = round(($score*9)+1,1);
 
echo $score;
 
?>

Creating an HTTP proxy for Wikipedia

How often is it that we want to clarify our text by explaining our terms, so most of our readers actually understand what we’re saying and how often is it that we end up with an article that’s either bloated with explenational lyrics to a song called “The article I was supposedly writing” or an article far too in-depth for the common reader. A lot of people stop reading when they find out they have to read a Wiki article every five sentences, but others really love that and spend their late hours on browsing through all that extended information.

I myself came up with a solution. It’s not uncommon to link to Wiki articles and even so uncommon is the absense of a JavaScript library. Given the static link structure of Wiki articles, it’s not that hard to extract the part that says what article you link to. My solution involves jQuery, some raw JavaScript, some PHP, some configuration using .htaccess and either a subdomain or a folder on a working domain.

So, what’s the big idea?

We’re going to make a proxy for Wikipedia, using the TXT records, which we will use to generate balloons with a summary of the Wiki for that keyword. This way we solve three things:

  1. We don’t bloat our article, but do give extra info for difficult terms;
  2. We provide readers who don’t like to constantly be sent to Wikipedia with a good summary of the article;
  3. People whom still wish to read the full article can click on the link and still get all the info needed;

As an addition, it’s a great support for SEO, because linking to articles with extra information on your article is what helps you rank high on “the Google”.

Provide me with the goodies

Okay, so lets assume you either made a new subdomain or a folder on your website where we send the queries to. First we want to prepare our Apache settings:

RewriteEngine On
RewriteBase /home/wiki/public_html/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]

The RewriteBase should fall back to the current working directory by default, but sometimes this works and sometimes this doesn’t, so making this static prevents it from failing. The conditions are there to assure the file we requested doesn’t already exist (which we will use later on to make a cache) and the same for directories. Next up we make sure the request in a whole is given to index.php.

Now we want to populate index.php with some code:

<?php
$q=substr($_SERVER['REQUEST_URI'],1);
exec('dig +short txt '.$q.'.wp.dg.cx',$dig);
$dig = str_replace('" "','',implode("\n",$dig));
if(!file_exists($q) && !empty($dig)) { file_put_contents($q,$dig); }
die($dig);
?>

And what does this do? Well, maybe I should give an example of how we request the text first:

If I want to know something about Unix systems, I simply make a request for http://wiki.johmanx.com/unix.
Since I actually went for making a special subdomain, wiki, I can make requests like that.

Back to the code: In the first line we see the variable $q, which is simply the Request URI minus the first character, since in my example that would be /unix and we only want unix. Then comes a system command. Windows users should actually stop reading at this point, because dig is in no way available on a Windows server. No worries, because when I get to the Ajax part, there’s still a lot of fun stuff we can do. The dig command is a tool in Linux that allows us to get information about DNS records from a domain. In this case we don’t request information about the typical A record, but we do so for the TXT record. Now, normally we use that to store some information that clarifies what we do with a certain domain, but something.wp.dg.cx actually stores the wiki summary for the term something. By giving it the +short parameter, we make sure we don’t get any fancy data for that domain, but only the contents of the TXT record.

The only thing that remains in the code is a basic filtering for weird closing and opening quotes and to make sure that when it ever happens that a TXT record exists of multiple lines, we actually make sure all the lines are present. Then we check if we don’t overwrite a file by putting this in cache (god forbid I don’t do this and someone gets the bright idea to create and query the article index.php). Then finally we make sure to stop the script and end with only echoing the text we just queried.

But now we’re stuck with Cross Domain AJAX calls

Hmm. I have to give you that. Well, the reason I went for a subdomain approach is because I can provide Windows users with my proxy as where the Linux users can put this in a folder and get a more optimal aproach. So, provided we stick with the cross domain solution, we need to make something like XD.php for our AJAX. It should contain something like this:

<?php
echo file_get_contents('http://wiki.johmanx.com/'.$_SERVER['QUERY_STRING']);
?>

Now, if we request XD.php?waffles, we should get some text rambling about a “batter based cake”.
For a simple AJAX call, I tend to fall back to this simple piece of code:

var createXMLHttpRequest = function() {
    if (window.ActiveXObject) {
        return function() { xmlHttp = new ActiveXObject("Microsoft.XMLHTTP"); }
    } 
    else if (window.XMLHttpRequest) {
        return function() { xmlHttp = new XMLHttpRequest(); }
    }
}();
 
function GETRequest(queryString,cb) {
    createXMLHttpRequest();
 
    xmlHttp.onreadystatechange = cb;
    xmlHttp.open("GET", queryString, true);
    xmlHttp.send(null);
}
 
function POSTRequest(url,query,cb) {
    createXMLHttpRequest();
 
    xmlHttp.open("POST", url, true);
    xmlHttp.onreadystatechange = cb;
    xmlHttp.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");    
    xmlHttp.send(query);
}
 
function ajaxDone() {
    if(xmlHttp.readyState == 4 && xmlHttp.status == 200) {
        return true;
    }
    return false;
}

Now we can make a very jQuery like request for our article:

$("body")
	.delegate("a","hover",
		function(){
			var keyword = $(this).html();
			GETRequest('XD.php?'+keyword,
				function(){
					if(ajaxDone()){	
						$('#balloon-keyword-'+keyword).html(xmlHttp.responseText).show();
					}
				}
			);
		}
	)
	.delegate(".balloon","mouseout",
		function(){
			$(this).hide();
		}
	);

Provided each summary is given a link to the original article, we could add a filter to automatically make anchor links, but to keep this article as simple as possible for quick implementation, I provide the filter apart from that bit of jQuery:

function txt2html(txt) {
  var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
  return txt.replace(exp,"<a href='$1' target='_blank'>$1</a>");
}

That way you could apply a filter to the summary, which will automatically transform a URL to an anchor link.

Now, say you’d want to do this the proper way and put the wiki script in a folder called /wiki/, you could replace ‘XD.php?’ with ‘/wiki/’ and get the same result. This way you don’t have to make Cross Domain calls and you can make optimal use of the cache I provided.

Some other thoughts

Now we’re finished setting up our fine proxy, some thoughts come to mind.

This would make for a good HTTP proxy to apps for the iPhone, iPad and Android platforms. I could see a number of apps coming from this. On top of that list is an App for quickly getting extra information about the ingredients of your recipe. Somewhat lower I’d suggest a Trivia app that generates random questions and gets the answers via a similar proxy. In my opinion, there’s a whole number of apps that could bloom this way. Anyone up for the challenge?