I would like to parse an HTML file to collect information from a website as efficiently as possible. What is the best way to parse HTML from a file or, preferably, from a String? Either way, I would like to be able to strip the tags or access the data contained within them through the DOM.
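
To make the goal concrete, here is a minimal sketch of the two operations I mean: stripping tags from an HTML string, and getting at the text inside a given tag. It assumes Python and its standard-library `html.parser` module, which is only one possible choice since the question does not fix a language. The names `TextExtractor`, `extract`, and `strip_tags` are hypothetical, and this is an event-based parser that tracks open tags, not a full DOM.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed
# onto the open-tag stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TextExtractor(HTMLParser):
    """Hypothetical helper: records each text run with its enclosing tag."""

    def __init__(self):
        super().__init__()
        self._stack = []   # tags that are currently open
        self.chunks = []   # (enclosing tag or None, text) pairs

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self._stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags.
        if tag in self._stack:
            while self._stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            parent = self._stack[-1] if self._stack else None
            self.chunks.append((parent, text))


def extract(html):
    """Return a list of (enclosing tag, text) pairs for an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.chunks


def strip_tags(html):
    """Return the text of an HTML string with all tags removed."""
    return " ".join(text for _, text in extract(html))


if __name__ == "__main__":
    sample = "<h1>Title</h1><p>Hello <b>world</b></p>"
    print(strip_tags(sample))
    print([text for tag, text in extract(sample) if tag == "b"])
```

The same functions work on a file by reading it into a string first. If real tree navigation is needed (parents, siblings, queries), a DOM-building library would be a better fit than this sketch.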