What is PHP?

PHP: Hypertext Preprocessor (PHP for short) is a server-side scripting language designed primarily for web development, although it can also be used as a general-purpose programming language. It was originally created by Rasmus Lerdorf in 1994, and its reference implementation is now maintained by The PHP Group. “PHP” originally stood for Personal Home Page, but it has since become a recursive acronym for PHP: Hypertext Preprocessor.

PHP code can be embedded in Hypertext Markup Language (HTML) documents or used in combination with various web templating systems, web content management systems (WCMS), and web frameworks. The code is typically processed by a PHP interpreter implemented as a module in the web server, as a Common Gateway Interface (CGI) executable, or as a command-line interface (CLI). The language can also be used to implement standalone graphical applications.

The standard PHP interpreter, powered by the Zend Engine, is free software released under the PHP License. As a result, the language has been widely ported and can be deployed on most web servers and on almost every operating system and platform.

Different Versions

In its early life, PHP was used by its creator, Rasmus Lerdorf, to maintain his personal homepage. He wrote several CGI programs in C and later extended them to work with web forms and communicate with databases. This implementation was called Personal Home Page/Forms Interpreter (PHP/FI). Lerdorf announced it publicly on a Usenet discussion group in 1995 to speed up bug reporting and code improvement; this version already had Perl-like variables, form handling, and the ability to embed HTML. Over time, the number of contributors grew and a development team took shape, which released PHP/FI 2 in November 1997.

In the same year as PHP/FI 2’s release, Zeev Suraski and Andi Gutmans rewrote PHP’s parser, which formed the basis for PHP 3. At the same time, the language’s name changed to its present recursive acronym. The pair then began rewriting PHP’s core, which produced the Zend Engine. PHP 4, powered by Zend Engine 1.0, was released in May 2000. The next major version did not arrive until 2004, when PHP 5 shipped with Zend Engine II and new features such as improved support for object-oriented programming, the PHP Data Objects (PDO) extension, and numerous performance improvements. Late static binding was added in a later PHP 5 release (5.3), and in 2008 PHP 5 became the only stable version under development. Many developers moved to PHP 5 thanks in part to the GoPHP5 initiative, and over time PHP interpreters became available for both 32-bit and 64-bit operating systems.

In 2005, Andrei Zmievski spearheaded a project to embed the International Components for Unicode (ICU) library into PHP, because the language was often criticized for lacking native Unicode support. This large project was planned as the next major version and referred to as PHP 6, since it came with many other planned features and internal changes, such as strings being handled internally as UTF-16. Ultimately, however, the project was never officially released, owing among other things to problems arising from conversion to and from UTF-16. The next major version was developed during 2014 and 2015 and released as PHP 7 in late 2015, amid considerable debate over its name because PHP 6 had never been officially released. PHP 7 brought a large increase in performance, as its goal was the optimization and refactoring of the Zend Engine; the result was Zend Engine 3, the successor to PHP 5’s Zend Engine II. It also added new features such as return-type declarations for functions, a welcome complement to the existing parameter-type declarations.

Syntax

PHP code is marked by an opening tag (<?php) and a matching closing tag (?>). These tags (or delimiters) identify the PHP code within an HTML document, and only the code between them is processed by the PHP interpreter. Every variable name carries a dollar-sign prefix, and a variable’s type does not need to be specified in advance or at initialization. PHP 5 introduced type declarations (type hinting), which allow a function to require that its parameters be objects of a specific class, arrays, interfaces, or callback functions; PHP 7 extended them to scalar types such as strings and integers.
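To make the points above concrete, here is a minimal sketch of a pure-PHP file. The function name repeatGreeting and its arguments are made up for illustration. The closing tag is left out, which is common practice when a file contains only PHP.

```php
<?php
// Variables carry a "$" prefix and need no declared type;
// the same variable may hold values of different types over time.
$value = 42;           // an integer
$value = "forty-two";  // now a string

// Hypothetical function with parameter and return type declarations (PHP 7+).
function repeatGreeting(string $name, int $times): string
{
    return str_repeat("Hello, {$name}! ", $times);
}

echo repeatGreeting("Ada", 2), PHP_EOL;

// A non-numeric string cannot be coerced to int, so PHP 7+ throws a TypeError.
try {
    repeatGreeting("Ada", "two");
} catch (TypeError $e) {
    echo "TypeError caught", PHP_EOL;
}
```

The declarations are optional: removing "string", "int", and ": string" leaves a valid function that accepts any argument types.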

The language’s syntax bears plenty of similarities to C’s and, just like C, requires a semicolon to terminate each statement. Some of its data types also resemble C’s: integers, for example, are stored as a platform-dependent 64-bit or 32-bit signed value. Unlike C, however, PHP has no unsigned integer type, so unsigned values are converted to signed ones in certain situations, which differs from the behavior of many other languages.
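The integer width on a given installation can be checked with the built-in constants PHP_INT_SIZE and PHP_INT_MAX. The short sketch below also shows what happens at the upper limit; the variable name $overflowed is only illustrative.

```php
<?php
// PHP_INT_SIZE is 8 on 64-bit builds and 4 on 32-bit builds.
echo "Integer size in bytes: ", PHP_INT_SIZE, PHP_EOL;
echo "Largest integer: ", PHP_INT_MAX, PHP_EOL;

// Integers are signed. Going past PHP_INT_MAX does not wrap around;
// the result is converted to a float instead.
$overflowed = PHP_INT_MAX + 1;
var_dump(is_int(PHP_INT_MAX));   // bool(true)
var_dump(is_float($overflowed)); // bool(true)
```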

Basic object-oriented programming support was added in PHP 3 and improved in PHP 4, giving developers more flexibility and ease of use. Object handling was then rewritten entirely in PHP 5 for better performance and a larger feature set. That release introduced private and protected member variables and methods, along with a standard way of declaring constructors and destructors, exception handling, and abstract and final classes and methods.
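The features listed above can be seen together in a small sketch. The Shape and Rectangle classes are hypothetical and exist only to show the keywords in context.

```php
<?php
// Hypothetical classes illustrating the object features available since PHP 5.
abstract class Shape
{
    protected $name; // visible to Shape and its subclasses

    public function __construct(string $name)
    {
        $this->name = $name;
    }

    public function __destruct()
    {
        // Runs when the object is destroyed; nothing to release in this sketch.
    }

    // Subclasses must provide an implementation.
    abstract public function area(): float;

    // Subclasses cannot override a final method.
    final public function describe(): string
    {
        return sprintf("%s with area %.2f", $this->name, $this->area());
    }
}

// A final class cannot be extended.
final class Rectangle extends Shape
{
    private $width;  // visible only inside Rectangle
    private $height;

    public function __construct(float $width, float $height)
    {
        if ($width <= 0 || $height <= 0) {
            throw new InvalidArgumentException("Sides must be positive");
        }
        parent::__construct("Rectangle");
        $this->width = $width;
        $this->height = $height;
    }

    public function area(): float
    {
        return $this->width * $this->height;
    }
}

$rect = new Rectangle(2.0, 3.5);
echo $rect->describe(), PHP_EOL; // Rectangle with area 7.00

try {
    new Rectangle(-1.0, 2.0);
} catch (InvalidArgumentException $e) {
    echo "Rejected: ", $e->getMessage(), PHP_EOL;
}
```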

Variations and Uses

Over time, developers have created several alternative implementations of PHP for specific purposes. The original implementation, sometimes called Zend PHP to distinguish it, remains by far the most widely used thanks to the power of the Zend Engine, but other projects have built their own interpreters and compilers, and with them their own variations of the language.

HipHop Virtual Machine (HHVM) was developed by Facebook. It converts PHP code into a high-level bytecode that is then processed by a just-in-time (JIT) compiler, which can yield a significant increase in performance. HipHop, also developed by Facebook, converts PHP scripts into C++ code before compilation, which reduced server load significantly.

Parrot is a virtual machine designed to run dynamic languages efficiently; PHP source code is translated into Parrot’s own bytecode before execution.

Phalanger compiles PHP code into Common Intermediate Language (CIL) bytecode, which is then executed by the .NET runtime rather than by the Zend Engine.

Security

The language’s security is not perfect, but it is one of its redeeming features and part of the reason PHP is so popular. Flaws in the language itself or in its core libraries are comparatively rare, which makes it that much more attractive. Even so, features such as taint checking, which helps verify that user input is validated before it is used, are still being developed for PHP. Oddly enough, proposals to include such a feature in a release have been rejected several times in the past.

Although hardening patches such as Suhosin and Hardening-Patch exist, some notable security risks have stemmed from PHP’s own features. Among these, register_globals turned URL parameters into PHP variables, which gave attackers the opportunity to set the values of uninitialized global variables and potentially interfere with the execution of a script. Both register_globals and magic_quotes_gpc, which introduced security flaws of its own, were deprecated in PHP 5.3 and removed in PHP 5.4.
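As a rough illustration of the register_globals problem described above, the sketch below imitates the old behaviour with extract(), since the setting itself no longer exists. The function names, the password, and the $query array (standing in for URL parameters) are all hypothetical.

```php
<?php
// Imitation only: extract() turns array keys into local variables, much as
// register_globals once turned URL parameters into global variables.
function isAuthorizedUnsafe(array $query, string $password): bool
{
    extract($query); // attacker-controlled keys become variables

    if ($password === "s3cret") {
        $authorized = true;
    }

    // $authorized was never initialized, so a "?authorized=1" parameter
    // supplies its value when the password check fails.
    return !empty($authorized);
}

function isAuthorizedSafe(string $password): bool
{
    $authorized = false; // explicit initialization closes the hole

    if ($password === "s3cret") {
        $authorized = true;
    }

    return $authorized;
}

var_dump(isAuthorizedUnsafe(["authorized" => "1"], "wrong")); // bool(true)
var_dump(isAuthorizedSafe("wrong"));                          // bool(false)
```

The same lesson applies in modern PHP: initialize every variable before use and read request data only from explicit sources such as $_GET or $_POST.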

Type conversions that produce a value the programmer did not intend can lead to security issues as well. For example, the comparison "0e1234" == "0" evaluates to true, because strings that can be parsed as numbers are converted to numbers before a loose comparison. Here the left-hand value is read as scientific notation, 0 × 10^1234, which works out to zero. The programmer, however, meant to compare two strings, not two numbers, and this behavior resulted in authentication vulnerabilities in Typo3 and phpBB when MD5 password hashes were compared. To avoid the problem, use the strcmp function or the identity operator (===) for such comparisons instead: "0e1234" === "0" evaluates to false, and strcmp returns a non-zero value for the same pair.
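The difference between the loose and strict comparisons can be checked with a few lines. The variable names $stored and $supplied are illustrative only.

```php
<?php
// Two different strings that both parse as the number zero.
$stored   = "0e1234"; // read as 0 x 10^1234 in a numeric context
$supplied = "0";

var_dump($stored == $supplied);             // bool(true): compared as numbers
var_dump($stored === $supplied);            // bool(false): compared as strings
var_dump(strcmp($stored, $supplied) === 0); // bool(false): the strings differ
```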
