Represents the document as a tree of nodes
Loads the entire document into memory at once
Works for HTML too
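
A minimal sketch of the points above, assuming these notes describe a DOM-style parser. It uses Python's standard xml.dom.minidom as a stand-in; the sample XML and the outline helper are made up for illustration. The whole string is parsed into a tree first, and only then is the tree walked.

```python
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
XML_TEXT = (
    "<library>"
    "<book id='1'><title>Dune</title></book>"
    "<book id='2'><title>Emma</title></book>"
    "</library>"
)


def outline(node, depth=0):
    """Return one indented line per element node, in document order."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
        depth += 1
    # Text nodes and the document node are skipped, but their children are visited.
    for child in node.childNodes:
        lines.extend(outline(child, depth))
    return lines


# parseString builds the complete tree in memory before it returns.
doc = minidom.parseString(XML_TEXT)

tree_lines = outline(doc)
titles = [t.firstChild.data for t in doc.getElementsByTagName("title")]

print("\n".join(tree_lines))
print(titles)
```

Because the full tree is held in memory, any node can be reached in any order after parsing, at the cost of memory that grows with document size. Note that minidom itself only accepts well-formed XML; for HTML the same tree model applies, but a dedicated HTML parser would be needed.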