We can easily find out how many URLs there will be when we finish parsing, since they're all in memory.
Single-threaded by nature: there is no benefit to multiple threads, since no data is available until the entire document has been read and parsed.
Only the character data of each element that holds a URL needs to be read. Everything else can be ignored.
The getElementsByTagName() method in Document gives us a quick list of all the matching elements.
The XML parsing is so straightforward that it can be done inside one method. No extra class is required.
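As a rough sketch of what that single method might look like, assuming the standard JAXP DOM parser: the helper below parses the whole document into memory, asks Document for every element with a given tag name, and collects the character data of each. The class name UrlExtractor, the method name extractUrls, and the choice to pass the XML as a String are all illustrative assumptions, not part of the original design. The tag name is left as a parameter because the element name is not given here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical holder class; in practice the method could live wherever the URLs are consumed.
class UrlExtractor {

    // Parses the entire XML text into a DOM tree, then returns the trimmed
    // character data of every element named tagName, in document order.
    static List<String> extractUrls(String xml, String tagName) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            // All matching elements are already in memory, so the final count
            // is known up front via getLength().
            NodeList nodes = doc.getElementsByTagName(tagName);
            List<String> urls = new ArrayList<>(nodes.getLength());
            for (int i = 0; i < nodes.getLength(); i++) {
                urls.add(nodes.item(i).getTextContent().trim());
            }
            return urls;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Wrapped as unchecked purely to keep the sketch's call sites short.
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }
}
```

A call such as UrlExtractor.extractUrls(xmlText, "loc") would return the text of every loc element, where "loc" is only an assumed tag name. The size of the returned list is the total URL count, available as soon as parsing finishes.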