PHP Using DomDocument To Work With HTML



In this code snippet, we’ll see how to work with HTML using DomDocument in PHP.

If you need to change HTML in code or extract some data from it (web scraping), you can use the DomDocument class. You create a new instance, load the HTML into it, and then use its methods to access the DOM.
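As a rough sketch of the scraping case, the snippet below loads a small HTML string and collects the link targets from it. The function name extractLinks() and the sample markup are made up for illustration. libxml_use_internal_errors() is used on the assumption that real-world pages are often not perfectly valid and would otherwise raise parser warnings.

```php
<?php
// Hypothetical helper (not part of DomDocument): returns the href of every <a> in $html.
function extractLinks(string $html): array
{
    // Collect parser warnings internally instead of printing them,
    // and remember the previous setting so it can be restored afterwards.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        // Skip anchors that have no href attribute.
        if ($anchor->hasAttribute('href')) {
            $links[] = $anchor->getAttribute('href');
        }
    }
    return $links;
}

// Made-up sample markup.
$sample = "<p><a href='/first'>First</a> and <a href='/second'>Second</a></p>";
$links = extractLinks($sample);
print_r($links);
```

For a real page, the string passed to extractLinks() could come from file_get_contents() instead of a literal.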

The official DomDocument documentation is available in the PHP manual.

Let’s see the example below.



<?php
//For this demo I just made an HTML string.
//But you could read it from an .html file or get it from a webpage with file_get_contents().
$html = "
    <div id='wrapperDiv'>
        <div id='test' class='contentDiv'>
            <p class='text'>Some text</p>
        </div>
        <div class='contentDiv'>
            <ul id='FirstListOfItems' class='itemList'>
                <li>Item 1</li>
                <li>Item 2</li>
                <li>Item 3</li>
            </ul>
        </div>
    </div>
";

//Make DomDocument from HTML/////////////////////////////

//Make a new DomDocument object.
$dom = new DomDocument;
//Ask the parser to discard redundant white space. Set this before loading.
$dom->preserveWhiteSpace = false;
//Load the html into the object.
$dom->loadHTML($html);

//Get the element by id. ///////////////////////////////
$list = $dom->getElementById("FirstListOfItems");
//Get its children.
$listItems = $list->childNodes;
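//Iterate over all the <li> child elements and print their values.
//childNodes can also contain text nodes (white space), so skip anything that is not an element.
foreach($listItems as $item)
    if($item->nodeType === XML_ELEMENT_NODE)
        echo $item->nodeValue . " ";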

echo "<br>";

//Make XPath from DomDocument.//////////////////////////
$xpath = new DOMXPath($dom);
//We can use xpath->query() to get elements as well.
$paragraph = $xpath->query("//*[@id='wrapperDiv']/div/p");

//getAttribute() can get any attribute value (class in this case).
echo "Get class attribute: " . $paragraph->item(0)->getAttribute('class');
echo "<br>";

//Get elements by class with xpath./////////////////////
$contentDivs = $xpath->query("//*[@class='contentDiv']");
//List content in all divs.
foreach($contentDivs as $contentDiv)
    echo $contentDiv->nodeValue;

echo "<br>";

//Get elements by tag and replace their value.//////////
$titles = $dom->getElementsByTagName("h2");
//For this example let's replace the text.
//(The demo HTML above has no <h2> elements, so this loop changes nothing here.)
foreach($titles as $title)
    $title->nodeValue = "New Title";

//Creating a new element./////////////////////////////// 

//Get list.
$parent = $dom->getElementById("FirstListOfItems");
//Create a document fragment to hold the new element.
$newElement = $dom->createDocumentFragment();
//Add content.
$newElement->appendXML('<li>Item 4</li>');
//Append the new element to the existing list.
$parent->appendChild($newElement);

//Print out the whole DOM.//////////////////////////////

//Get and print the DomDocument data.
echo $dom->saveHTML();
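
One thing to keep in mind about the class query in the example: @class='contentDiv' only matches when the attribute is exactly that string, so an element with class='contentDiv wide' would be skipped. A common XPath 1.0 workaround is to pad the attribute with spaces and search for the padded token. The sketch below uses made-up markup and a hypothetical helper name, queryByClass(), and it assumes the class name contains no quote characters.

```php
<?php
// Hypothetical helper: finds elements whose class attribute contains $class as a whole token.
function queryByClass(DOMXPath $xpath, string $class): DOMNodeList
{
    // Assumption: $class has no quote characters, because it is placed inside an XPath string literal.
    $expression = "//*[contains(concat(' ', normalize-space(@class), ' '), ' " . $class . " ')]";
    return $xpath->query($expression);
}

// Made-up markup: one exact class, one multi-class element, one look-alike class name.
$dom = new DOMDocument();
$dom->loadHTML("
    <div class='contentDiv'>One</div>
    <div class='contentDiv wide'>Two</div>
    <div class='contentDivider'>Three</div>
");
$xpath = new DOMXPath($dom);

// Exact comparison of the whole attribute value.
$exact = $xpath->query("//*[@class='contentDiv']");
// Token comparison using the helper.
$token = queryByClass($xpath, 'contentDiv');

echo "Exact match: " . $exact->length . "\n";
echo "Token match: " . $token->length . "\n";
```

With this markup the exact query finds one element and the token query finds two. The look-alike class contentDivider is not matched by either.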


