The issue was that the * notation did not match across line breaks. This is fixed in V5.6.5.
sourceExtract |<html*</html>
works now.
The * can now match \n (line breaks) on Chrome/Edge, but not on Firefox.
Firefox workaround: use a regular expression explicitly, something like regex=/<html(.|\n)*</html>/g
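
To see why the (.|\n) part matters, here is a minimal sketch in Python's re module. It is not how sourceExtract is implemented; it only illustrates the general regex behavior the workaround relies on. The sample page source and the extract_html helper are made up for illustration.

```python
import re

# Hypothetical multi-line page source (an assumption, not from the original post).
page_source = '<!DOCTYPE html>\n<html lang="en">\n<body>\nHello\n</body>\n</html>\n'

# By default '.' does not match a line break, so this finds nothing
# when the opening and closing tags are on different lines.
dot_only = re.search(r"<html.*</html>", page_source)

# '(.|\n)' explicitly allows line breaks, mirroring the workaround regex.
dot_or_newline = re.search(r"<html(.|\n)*</html>", page_source)


def extract_html(source):
    """Return the <html ... </html> span, or None if it is not found."""
    match = re.search(r"<html(.|\n)*</html>", source)
    return match.group(0) if match else None


if __name__ == "__main__":
    print("dot only matched:", dot_only is not None)
    print("dot or newline matched:", dot_or_newline is not None)
    print(extract_html(page_source))
```

The first pattern fails only because the tags sit on different lines, which is the same limitation the * notation had before the fix.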