public class HtmlParserUtil
extends Object
| Constructor and Description |
|---|
| HtmlParserUtil() |
| Modifier and Type | Method and Description |
|---|---|
| static String | extractText(String html): Extracts the raw text from the HTML input, compressing its whitespace and removing all attributes, scripts, and styles. |
| static String | findAttributeValue(Predicate<Function<String,String>> findValuePredicate, Function<Function<String,String>,String> returnValueFunction, String html, String startTagName) |
| static String | render(String html): Renders the HTML content into text. |
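public static String extractText(String html)
Extracts the raw text from the HTML input, compressing its whitespace and removing all attributes, scripts, and styles. For example, raw text returned by this method can be stored in a search index.
Parameters: html - the HTML text
Returns: the extracted text, or null if the HTML input is null
public static String findAttributeValue(Predicate<Function<String,String>> findValuePredicate,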
Function<Function<String,String>,String> returnValueFunction,
String html,
String startTagName)
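No description is given for this method, so the following is only a sketch of how its two functional arguments might fit together. It assumes that the inner Function<String,String> maps an attribute name to that attribute's value for one candidate start tag. The class FindAttributeValueSketch and the helper matchTag are hypothetical stand-ins for illustration and are not part of HtmlParserUtil.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

public class FindAttributeValueSketch {

    // Hypothetical stand-in for the per-tag step: given the attributes of one
    // tag, test the predicate and, on a match, compute the return value.
    static String matchTag(
            Predicate<Function<String, String>> findValuePredicate,
            Function<Function<String, String>, String> returnValueFunction,
            Map<String, String> attributes) {

        // Assumption: attributes are exposed as a name -> value lookup function.
        Function<String, String> attributeLookup = attributes::get;

        if (findValuePredicate.test(attributeLookup)) {
            return returnValueFunction.apply(attributeLookup);
        }

        return null;
    }

    static String demo() {
        // Attributes of an illustrative <meta property="og:title" content="Hello"> tag.
        Map<String, String> attributes =
            Map.of("property", "og:title", "content", "Hello");

        return matchTag(
            lookup -> "og:title".equals(lookup.apply("property")), // which tag to pick
            lookup -> lookup.apply("content"),                     // which value to return
            attributes);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Under that assumption, a call to the real method would pass the same two lambdas together with the HTML string and a start tag name such as "meta".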
public static String render(String html)
Renders the HTML content into text. Using the default settings, the output complies with the Text/Plain; Format=Flowed (DelSp=No) protocol described in RFC 3676.
Parameters: html - the HTML text
Returns: the rendered text, or null if the HTML text is null