About
In this code snippet, we’ll see how to work with HTML using DomDocument in PHP.
If you need to change HTML in code or extract some data from it (web scraping), you can use the DomDocument class to help you with that. You simply create a new instance of it, load the HTML, and then use its methods to access the DOM.
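As a rough sketch of the scraping use case, the snippet below loads an HTML string and collects the link targets it contains. The sample markup and variable names are made up for illustration. The call to libxml_use_internal_errors() is there on the assumption that real-world pages are often malformed and would otherwise trigger parser warnings.

```php
<?php
// Hypothetical sample markup; in practice this could come from file_get_contents().
$html = "<p>See <a href='/docs'>the docs</a> and <a href='/blog'>the blog</a>.</p>";

// Collect parser warnings internally instead of printing them (handy for messy pages).
libxml_use_internal_errors(true);

$dom = new DomDocument();
$dom->loadHTML($html);

// Throw away whatever warnings were collected while parsing.
libxml_clear_errors();

// Gather the href attribute of every <a> element.
$links = [];
foreach ($dom->getElementsByTagName('a') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}

print_r($links);
```

The same loop works for any tag and attribute pair, for example img elements and their src values.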
Official DomDocument documentation here.
Let’s see the example below.
Code:
index.php
<?php
// For this demo I just made an HTML string.
// But you could read it from an .html file or get it from a webpage with file_get_contents().
$html = "
<div id='wrapperDiv'>
    <div id='test' class='contentDiv'>
        <h2>Paragraph</h2>
        <p class='text'>Some text</p>
    </div>
    <div class='contentDiv'>
        <h2>List</h2>
        <ul id='FirstListOfItems' class='itemList'>
            <li>Item 1</li>
            <li>Item 2</li>
            <li>Item 3</li>
        </ul>
    </div>
</div>
";

// Make DomDocument from HTML. /////////////////////////////
// Make a new DomDocument object.
$dom = new DomDocument();
// Discard white space. Set this before loading, since it is a parser option.
$dom->preserveWhiteSpace = false;
// Load the HTML into the object.
$dom->loadHTML($html);

// Get the element by id. //////////////////////////////////
$list = $dom->getElementById("FirstListOfItems");
// Get its children.
$listItems = $list->childNodes;
// Iterate over the child nodes (the <li> elements, plus any white space
// text nodes between them) and print their values.
foreach ($listItems as $item) {
    echo $item->nodeValue . " ";
}
echo "<br>";

// Make XPath from DomDocument. ////////////////////////////
$xpath = new DOMXPath($dom);
// We can use $xpath->query() to get elements as well.
$paragraph = $xpath->query("//*[@id='wrapperDiv']/div/p");
// getAttribute() can get any attribute value (class in this case).
echo "Get class attribute: " . $paragraph->item(0)->getAttribute('class');
echo "<br>";

// Get elements by class with XPath. ///////////////////////
$contentDivs = $xpath->query("//*[@class='contentDiv']");
// List the content of all matching divs.
foreach ($contentDivs as $contentDiv) {
    echo $contentDiv->nodeValue;
}
echo "<br>";

// Get elements by tag and replace their value. ////////////
$titles = $dom->getElementsByTagName("h2");
// For this example let's replace the text.
foreach ($titles as $title) {
    $title->nodeValue = "New Title";
}

// Creating a new element. /////////////////////////////////
// Get the list.
$parent = $dom->getElementById("FirstListOfItems");
// Create a new document fragment.
$newElement = $dom->createDocumentFragment();
// Add content.
$newElement->appendXML('<li>Item 4</li>');
// Append the new element to the existing list.
$parent->appendChild($newElement);

// Print out the whole DOM. ////////////////////////////////
// Get and print the DomDocument data.
echo $dom->saveHTML();
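One caveat with the class query in the example: @class='contentDiv' only matches when the attribute is exactly that string, so an element with class='contentDiv wide' would be missed. A common XPath 1.0 workaround, sketched here with made-up markup, pads the class attribute with spaces and searches for the class name as a whole token.

```php
<?php
// Hypothetical markup: the second div carries two classes,
// the third has a class name that merely starts with "contentDiv".
$html = "<div class='contentDiv'>A</div>"
      . "<div class='contentDiv wide'>B</div>"
      . "<div class='contentDivider'>C</div>";

$dom = new DomDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Exact string comparison: only the first div matches.
$exact = $xpath->query("//*[@class='contentDiv']");

// Token match: pad the normalized class list with spaces and look for ' contentDiv '.
$token = $xpath->query(
    "//*[contains(concat(' ', normalize-space(@class), ' '), ' contentDiv ')]"
);

echo $exact->length . " exact match(es), " . $token->length . " token match(es)";
```

The padding is what keeps contentDivider from matching, which a plain contains(@class, 'contentDiv') would not prevent.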