It is used by, among others, Element#absUrl(), so that you can retrieve the (intended) absolute URL of an <a href>, <img src>, <link href>, <script src>, and so on.
for (Element link : document.select("a")) {
    System.out.println(link.absUrl("href"));
}
This is very useful if you also want to download and/or parse the linked resources.
In the 2nd parse() version, what does “resolve relative URLs to absolute URLs, that occur before the HTML declares a <base href> tag” mean? What if a <base href> tag never occurs in the page?
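Some (poorly written) websites declare a <link> or <script> with a relative URL before the <base> tag. And if there is no <base> tag at all, then the given baseUri is simply used to resolve the relative URLs of the entire document.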
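The resolution order described above can be sketched with java.net.URI from the standard library. This is only an illustration of the idea, not Jsoup's own code: the class BaseUriSketch, the helper resolve(), and the example.com URLs are all hypothetical.

```java
import java.net.URI;

public class BaseUriSketch {
    // Hypothetical helper: picks the effective base, then resolves the relative URL.
    // baseHref stands for the value of a <base href> tag, or null if none has been seen.
    static String resolve(String baseUri, String baseHref, String relative) {
        URI base = URI.create(baseUri);
        if (baseHref != null) {
            // A <base href> may itself be relative, so resolve it against baseUri first.
            base = base.resolve(baseHref);
        }
        return base.resolve(relative).toString();
    }

    public static void main(String[] args) {
        String page = "http://example.com/docs/index.html";
        // No <base> tag (yet): the given baseUri decides.
        System.out.println(resolve(page, null, "style.css"));
        // After <base href="/static/">: the declared base decides.
        System.out.println(resolve(page, "/static/", "style.css"));
    }
}
```

With these inputs the first call yields http://example.com/docs/style.css and the second yields http://example.com/static/style.css.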
What is the purpose of absolute URL detection? Why does Jsoup need to find the absolute URL?
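In order to return the right URL from Element#absUrl(). This is purely for the end user's convenience. Jsoup does not need it in order to parse the HTML successfully.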
Lastly, but most importantly: Is baseUri the full URL of HTML page (as phrased in original documentation) or is it the base URL of the HTML page?
The former. If it were the latter, then the documentation would be lying. The baseUri must not be confused with <base href>.
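Passing the full page URL is enough because standard relative-URL resolution drops the last path segment of the base anyway. Here is a small sketch using java.net.URI rather than Jsoup itself; the class PageUrlAsBase, the helper resolveAgainst(), and the URLs are made up for illustration.

```java
import java.net.URI;

public class PageUrlAsBase {
    // Hypothetical helper: resolves a relative reference against a base URL.
    static String resolveAgainst(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String fullPageUrl = "http://example.com/docs/index.html";
        String directoryUrl = "http://example.com/docs/";

        // Both bases give the same result for a path-relative link.
        System.out.println(resolveAgainst(fullPageUrl, "img/logo.png"));
        System.out.println(resolveAgainst(directoryUrl, "img/logo.png"));

        // Parent-directory references are normalized as well.
        System.out.println(resolveAgainst(fullPageUrl, "../about.html"));
    }
}
```

The first two lines print http://example.com/docs/img/logo.png and the third prints http://example.com/about.html, so there is no need to trim the page's file name before handing the URL over as baseUri.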