sourceExtract HTML mode

uncommon · April 24, 2020, 4:01am

I have been looking at the command SourceExtract
https://ui.vision/rpa/docs/selenium-ide/sourceextract-sourcesearch
From the documentation and my testing, it basically looks for the source string between two plain text strings. What I do not get is why it does not search the actual HTML source instead of the plain text.
[Screenshot: 2020-04-23_2359]

Or is this possible and I missed something?

Note I used a picture of the code because the Post editor said I was posting with too many links…

Plankton · April 24, 2020, 11:02am

Or is this possible and I missed something?

Yeah, it should work exactly like you want it to work.

For example, sourceSearch | Tea $*</li> => finds the text between Tea $ and </li>
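
One way to picture the * wildcard is as shorthand for a regular expression: the literal text on either side is matched as-is, and * stands for "anything in between". The JavaScript sketch below only illustrates that idea. wildcardToRegex is a hypothetical helper, not UI.Vision's actual implementation, and the sample page source is made up.

```javascript
// Hypothetical helper: turn a pattern such as "Tea $*</li>" into a RegExp.
// This is an illustration, not UI.Vision's actual code.
function wildcardToRegex(pattern) {
  const escaped = pattern
    .split("*")
    // Escape regex metacharacters in the literal parts (e.g. "$").
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    // "*" becomes a lazy "match anything, including line breaks".
    .join("[\\s\\S]*?");
  return new RegExp(escaped);
}

// Made-up page source for the example.
const source = "<ul><li>Coffee $3.50</li><li>Tea $2.25</li></ul>";

const match = source.match(wildcardToRegex("Tea $*</li>"));
const found = match ? match[0] : null;
console.log(found); // Tea $2.25</li>
```

In this sketch the match includes the two surrounding strings themselves, not only the text between them.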

uncommon · April 25, 2020, 9:01pm

Thank you for your answer, but why does a search like this <html* turn up no results?

uncommon · April 28, 2020, 1:28pm

I mean like this <html*</html

admin · April 28, 2020, 3:23pm

Good question => investigating…

admin · May 14, 2020, 11:53am

The issue was that the * notation did not support line breaks. This is fixed with V5.6.5.

sourceExtract | <html*</html> works now.

* can match \n on Chrome/Edge now, but not Firefox. Firefox workaround: Use regular expression explicitly, something like regex=/<html(.|\n)*</html>/g
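
The line-break behaviour can be reproduced in plain JavaScript. This is a minimal sketch with a made-up page source. It shows only how JavaScript regular expressions treat "." and "\n", not how UI.Vision parses the regex= value. In a JavaScript regex literal the slash in </html> has to be escaped.

```javascript
// Made-up page source containing line breaks.
const page = "<html>\n<body>Hello</body>\n</html>";

// "." does not match "\n" by default, so this finds nothing.
const dotOnly = /<html.*<\/html>/.exec(page);

// (.|\n) matches any character or a line break, as in the workaround above.
const withNewlines = /<html(.|\n)*<\/html>/.exec(page);

console.log(dotOnly);                  // null
console.log(withNewlines[0] === page); // true
```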

kolor_blind · October 2, 2021, 2:30am

Dear @admin

If I only want to get the * part, what should I modify?
For example, if the source is:
<html url="https://www.google.com" </html>

when I use this:

sourceExtract | <html*</html> | i
echo | ${i} | green

it echoes the whole thing, which is <html url="https://www.google.com" </html>

I only want to echo https://www.google.com, what should I modify, or what command should I add?

Thank you.

jeff_lee · February 22, 2022, 1:10am

Have you solved it? Could you share how you did it? Thanks, @kolor_blind

kolor_blind · March 5, 2022, 5:24pm

@jeff_lee

Yes, I solved it. It's not what I first had in mind, but it works.
You need an executeScript_Sandbox command. In that command, you put some JavaScript to chop off the rest, so you are left with just the link. You can learn how to use executeScript_Sandbox here.

For example, in my case, I would use a substring or slice call to chop off the <html and </html> parts. It depends on what sourceExtract pulls out; you then decide which JavaScript to use.
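
A minimal sketch of that chopping step in plain JavaScript, assuming the extracted string is exactly the one from the example earlier in the thread. The variable names are made up for illustration, and how the string is passed into and returned from executeScript_Sandbox is not shown here.

```javascript
// The string that sourceExtract returned in the example earlier in the thread.
const extracted = '<html url="https://www.google.com" </html>';

// Option 1: slice between the first and last double quote.
const start = extracted.indexOf('"') + 1;
const end = extracted.lastIndexOf('"');
const bySlice = extracted.slice(start, end);

// Option 2: capture the quoted value with a regular expression.
const m = extracted.match(/url="([^"]*)"/);
const byRegex = m ? m[1] : "";

console.log(bySlice); // https://www.google.com
console.log(byRegex); // https://www.google.com
```

The slice version relies on the link being the only quoted value in the string; the regular expression is less fragile if other attributes are present.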

You can learn a lot about JavaScript at https://www.w3schools.com/

jeff_lee · March 12, 2022, 11:18am

Thanks a lot!