Android - How To Get Plain HTML Using EvaluateJavascript From Webview? JSOUP Not Able To Parse The Result HTML

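The value passed to onReceiveValue is JSON-encoded, so the HTML arrives as a quoted, escaped string. You should use JsonReader to parse the value: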

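    // Needs: android.util.JsonReader, android.util.JsonToken, android.webkit.ValueCallback,
    //        java.io.IOException, java.io.StringReader
    webView.evaluateJavascript(
            "(function() { return document.getElementsByTagName('html')[0].outerHTML; })();",
            new ValueCallback<String>() {
                @Override
                public void onReceiveValue(final String value) {
                    // value is a JSON string literal; lenient mode accepts a bare top-level string
                    JsonReader reader = new JsonReader(new StringReader(value));
                    reader.setLenient(true);
                    try {
                        if (reader.peek() == JsonToken.STRING) {
                            String domStr = reader.nextString();
                            if (domStr != null) {
                                // domStr is now the unescaped HTML
                                handleResponseSuccessByBody(domStr);
                            }
                        }
                    } catch (IOException e) {
                        // handle exception
                    } finally {
                        // IoUtil is a helper that is not shown here; presumably it closes the reader quietly
                        IoUtil.close(reader);
                    }
                }
            });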


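To remove the Unicode escape sequences (\uXXXX), use this function:

    // Needs: java.util.regex.Matcher, java.util.regex.Pattern
    public static StringBuffer removeUTFCharacters(String data) {
        // Matches a literal backslash-u followed by exactly four hex digits
        Pattern p = Pattern.compile("\\\\u(\\p{XDigit}{4})");
        Matcher m = p.matcher(data);
        StringBuffer buf = new StringBuffer(data.length());
        while (m.find()) {
            // group(1) holds the four hex digits; convert them to the character they encode
            String ch = String.valueOf((char) Integer.parseInt(m.group(1), 16));
            m.appendReplacement(buf, Matcher.quoteReplacement(ch));
        }
        m.appendTail(buf);
        return buf;
    }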

Then call it inside onReceiveValue(String html) like this:

    @Override
    public void onReceiveValue(String html) {
        String result = removeUTFCharacters(html).toString();
    }

You will get a string with clean HTML.
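
To see what this function does and does not handle, here is a small standalone check in plain Java (no Android classes needed). The input is a made-up example of what the callback might receive, and the class name RemoveUtfDemo is only for illustration; the regex is the same as above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RemoveUtfDemo {
    // Same regex approach as above: replace each backslash-u escape with its character.
    static String removeUTFCharacters(String data) {
        Pattern p = Pattern.compile("\\\\u(\\p{XDigit}{4})");
        Matcher m = p.matcher(data);
        StringBuffer buf = new StringBuffer(data.length());
        while (m.find()) {
            String ch = String.valueOf((char) Integer.parseInt(m.group(1), 16));
            m.appendReplacement(buf, Matcher.quoteReplacement(ch));
        }
        m.appendTail(buf);
        return buf.toString();
    }

    public static void main(String[] args) {
        // Hypothetical raw callback value: outer quotes, an escaped '<', and escaped inner quotes.
        String raw = "\"\\u003Chtml lang=\\\"en\\\">\\u003C/html>\"";
        System.out.println(removeUTFCharacters(raw));
    }
}
```

For this input the program prints `"<html lang=\"en\"></html>"`: the `<` characters are restored, but the surrounding quotes and the backslashes before the inner quotes are left untouched.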

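unescapeJavaScript is from Apache commons-lang (StringEscapeUtils.unescapeJavaScript). In commons-lang3 the equivalent method is named unescapeEcmaScript.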

The removeUTFCharacters method from the previous answer does not clean the string completely. Escapes such as \" still remain.
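
If neither JsonReader nor commons-lang is an option, the remaining escapes can be decoded by hand. The following is only a rough sketch in plain Java, assuming the callback value is a double-quoted JSON string literal; the class JsStringDecoder and its decode method are hypothetical names, not part of any library.

```java
class JsStringDecoder {
    // Decodes a quoted JSON/JavaScript string literal into the text it represents.
    static String decode(String literal) {
        String s = literal;
        // Strip the surrounding double quotes, if present.
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            s = s.substring(1, s.length() - 1);
        }
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\\' || i + 1 >= s.length()) {
                out.append(c); // ordinary character, or a trailing lone backslash
                continue;
            }
            char next = s.charAt(++i);
            switch (next) {
                case 'n': out.append('\n'); break;
                case 't': out.append('\t'); break;
                case 'r': out.append('\r'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                case 'u':
                    if (i + 4 < s.length() && isHex4(s, i + 1)) {
                        // Four hex digits follow: append the character they encode.
                        out.append((char) Integer.parseInt(s.substring(i + 1, i + 5), 16));
                        i += 4;
                    } else {
                        out.append("\\u"); // malformed escape: keep it as-is
                    }
                    break;
                default:
                    out.append(next); // covers \" \\ \/ and \'
            }
        }
        return out.toString();
    }

    private static boolean isHex4(String s, int from) {
        for (int k = from; k < from + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical raw callback value with an escaped '<', escaped quotes, and a newline escape.
        String raw = "\"\\u003Cp class=\\\"x\\\">a\\nb\\u003C/p>\"";
        System.out.println(decode(raw));
    }
}
```

A single pass over the string handles every escape in order, so a sequence like `\\u003C` (an escaped backslash followed by plain text) is not misread as a Unicode escape, which a regex applied to the whole string could do.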


