There may be cases where you want to download the whole audio file before loading it into the HTML5 Audio widget.
In my case, it is easy to just set the source of the HTML5 Audio widget to the URL, but that does not work if the URL is cross-domain.
Downloading the whole binary was the only solution that I came up with.
JSoup is a Java library for working with real-world HTML. It provides a very convenient API for extracting and manipulating data, using the best of DOM, CSS, and jQuery-like methods.
I had been using the HTMLParser library for many years, even though its development had stopped long ago, because it met my needs. But when it came time to parse a particular site that I wanted to check, I found the HTML hard to parse because the web developers had done a poor job of marking up the data.
That was when I decided to try JSoup, and the results were very good!
JSoup implements the WHATWG HTML5 specification and parses HTML to the same DOM as modern browsers do. Other features include:
scrape and parse HTML from a URL, file, or string
find and extract data, using DOM traversal or CSS selectors
manipulate the HTML elements, attributes, and text
clean user-submitted content against a safe white-list, to prevent XSS attacks
output tidy HTML
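To make a few of these features concrete, here is a minimal sketch of parsing a string, extracting data, manipulating attributes, and cleaning untrusted input. The class and method names (JsoupFeatureSketch, firstHeading, markLinksNoFollow, sanitize) are made up for illustration, and the sketch assumes a recent JSoup release on the classpath, where the safe-list class is called Safelist (older releases named it Whitelist).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.safety.Safelist;

public class JsoupFeatureSketch {

    // Parse an HTML string and return the text of the first <h1>, or "" if there is none.
    public static String firstHeading(String html) {
        Document doc = Jsoup.parse(html);
        Element h1 = doc.selectFirst("h1");
        return h1 == null ? "" : h1.text();
    }

    // Set rel="nofollow" on every link that has an href, then return the body HTML.
    public static String markLinksNoFollow(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("a[href]").attr("rel", "nofollow");
        return doc.body().html();
    }

    // Keep only the tags and attributes allowed by JSoup's basic safe-list.
    // Assumption: Safelist is available (older JSoup versions use Whitelist instead).
    public static String sanitize(String untrustedHtml) {
        return Jsoup.clean(untrustedHtml, Safelist.basic());
    }

    public static void main(String[] args) {
        String html = "<h1>Sample page</h1><p>See <a href=\"/docs\">the docs</a>.</p>";
        System.out.println(firstHeading(html));
        System.out.println(markLinksNoFollow(html));
        System.out.println(sanitize("<p>Hi<script>alert(1)</script></p>"));
    }
}
```

Parsing from a URL or a file works along the same lines, through Jsoup.connect(url).get() and the Jsoup.parse overloads that take a File.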
JSoup is very easy to use too. Its jQuery-like selector style makes searching for specific portions of the HTML very convenient.
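As a rough illustration of that selector style, the sketch below pulls one column out of a table and collects links to PDF files. The markup it expects (a table with class "prices"), the class name JsoupSelectorSketch, and both method names are assumptions made up for this example, not anything from the site I was parsing.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class JsoupSelectorSketch {

    // Collect the text of the second cell of every row inside <table class="prices">.
    public static List<String> secondColumn(String html) {
        Document doc = Jsoup.parse(html);
        List<String> cells = new ArrayList<>();
        for (Element cell : doc.select("table.prices tr > td:nth-child(2)")) {
            cells.add(cell.text());
        }
        return cells;
    }

    // Collect absolute URLs of all links whose href ends in ".pdf".
    // baseUri is used to resolve relative hrefs.
    public static List<String> pdfLinks(String html, String baseUri) {
        Document doc = Jsoup.parse(html, baseUri);
        List<String> urls = new ArrayList<>();
        for (Element link : doc.select("a[href$=.pdf]")) {
            urls.add(link.absUrl("href"));
        }
        return urls;
    }
}
```

The selector strings read much like the ones you would pass to jQuery, so most of the work is figuring out the right selector in the browser first and then pasting it into select().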
Goodbye old library. JSoup is what I will be using from now on!