Getting the last element of a split string

Sometimes during web scraping or screen scraping (OCR) you get a text string as the result and need only the last part of it after splitting. This could be the last part of an address (e.g. the ZIP code) or the last entry in a list of extracted numbers.

Here is how to do it. The key part is input.substring(input.lastIndexOf('-') + 1). In this example, - is the separating character between the elements in the string, and the + 1 skips the separator itself.

{
  "Command": "store",
  "Target": "111-222-333",
  "Value": "a",
  "Description": ""
},
{
  "Command": "executeScript_Sandbox",
  "Target": "var input = ${a}; var last = input.substring(input.lastIndexOf('-') + 1); return last;",
  "Value": "last_part",
  "Description": ""
},
{
  "Command": "echo",
  "Target": "${last_part}",
  "Value": "green",
  "Description": ""
}
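
The string logic from the macro above can also be tried outside of a macro. The following is only a minimal sketch in plain JavaScript: the helper names lastPart and lastPartBySplit are made up for illustration and are not UI.Vision commands. It shows the substring/lastIndexOf approach next to the equivalent split approach, and what happens when the separator is missing.

```javascript
// Hypothetical helper (not part of UI.Vision): text after the last separator.
function lastPart(input, separator) {
  var text = String(input);
  var index = text.lastIndexOf(separator);
  // lastIndexOf returns -1 when the separator does not occur at all;
  // in that case the whole string is the "last part".
  if (index === -1) {
    return text;
  }
  return text.substring(index + separator.length);
}

// Hypothetical alternative: split into an array and take the last element.
function lastPartBySplit(input, separator) {
  return String(input).split(separator).pop();
}

console.log(lastPart("111-222-333", "-"));        // 333
console.log(lastPart("333", "-"));                // 333
console.log(lastPartBySplit("111-222-333", "-")); // 333
```

Both variants give the same result here. I used the separator length instead of a fixed + 1 so the sketch also works with separators longer than one character.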

Example use case:

For a certain Citrix automation task, OCRExtractRelative returns a string with the numbers of a (changing) table of contents. It is unknown how many lines there are, but you always need the last entry. In this example that is "3".

In the OCR result \n is the separating character. You can see this by looking at the Variables tab:

image

OCRExtractRelative input image:

toc_dpi_120.png image:

image

Macro code:

{
  "Name": "ocr1",
  "CreationDate": "2021-8-26",
  "Commands": [
    {
      "Command": "open",
      "Target": "https://i.stack.imgur.com/QDbpv.png",
      "Value": "",
      "Description": ""
    },
    {
      "Command": "store",
      "Target": "2",
      "Value": "!ocrengine",
      "Description": "Engine 2 is better for Western characters"
    },
    {
      "Command": "OCRExtractRelative",
      "Target": "toc_dpi_120.png",
      "Value": "a",
      "Description": "grab TOC numbers and OCR them"
    },
    {
      "Command": "executeScript_Sandbox",
      "Target": "var input = ${a};\n\nvar last = input.substring(input.lastIndexOf('\\n') + 1); \n\nreturn last;",
      "Value": "last_part",
      "Description": "split string and take last part"
    },
    {
      "Command": "echo",
      "Target": "${last_part}",
      "Value": "green",
      "Description": ""
    }
  ]
}
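
One thing to watch with OCR text: if the result ends with a line break or uses \r\n instead of \n, taking everything after the last \n can return an empty string or leave a stray \r. The following is a sketch of a more tolerant variant in plain JavaScript. The function name lastNonEmptyLine is hypothetical, and I have not tested it inside a macro, so treat it as a starting point only.

```javascript
// Hypothetical helper (not a UI.Vision command): last non-empty line of a text.
function lastNonEmptyLine(ocrText) {
  var lines = String(ocrText).split(/\r?\n/); // accept both \n and \r\n
  // Walk backwards so trailing blank lines are skipped.
  for (var i = lines.length - 1; i >= 0; i--) {
    var line = lines[i].trim();
    if (line.length > 0) {
      return line;
    }
  }
  return ""; // nothing but whitespace in the input
}

console.log(lastNonEmptyLine("1\n2\n3"));         // 3
console.log(lastNonEmptyLine("1\r\n2\r\n3\r\n")); // 3
```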

See also Selenium IDE string operations with executeScript_Sandbox

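Another solution: with !OCRTableExtraction enabled, count the line breaks (\r\n) in the OCR result to get the number of rows: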

    {
      "Command": "store",
      "Target": "true",
      "Value": "!OCRTableExtraction",
      "Description": ""
    },
    {
      "Command": "OCRExtractRelative",
      "Target": "toc_dpi_120.png",
      "Value": "row_count",
      "Description": ""
    },
    {
      "Command": "echo",
      "Target": "OCR Extract: ${row_count}",
      "Value": "",
      "Description": ""
    },
    {
      "Command": "executeScript_Sandbox",
      "Target": "var count = (${row_count}.match(/\\r\\n/g) || []).length; return Number(count)",
      "Value": "row_count",
      "Description": ""
    },
    {
      "Command": "echo",
      "Target": "${row_count}",
      "Value": "green",
      "Description": ""
    }
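
The counting step can be checked outside of a macro as well. This is a minimal sketch in plain JavaScript with made-up function names (countLineBreaks, countNonEmptyRows). It assumes the OCR text may use either \n or \r\n, which is my assumption and not something confirmed above. Note that the number of line breaks equals the number of rows only when the text ends with a line break.

```javascript
// Hypothetical helper: number of line breaks (\n or \r\n) in the text.
function countLineBreaks(text) {
  var matches = String(text).match(/\r?\n/g);
  return matches ? matches.length : 0; // match() returns null when nothing is found
}

// Hypothetical helper: number of rows that contain something besides whitespace.
function countNonEmptyRows(text) {
  var lines = String(text).split(/\r?\n/);
  var count = 0;
  for (var i = 0; i < lines.length; i++) {
    if (lines[i].trim().length > 0) {
      count++;
    }
  }
  return count;
}

console.log(countLineBreaks("1\r\n2\r\n3\r\n")); // 3
console.log(countLineBreaks("1\n2\n3"));         // 2
console.log(countNonEmptyRows("1\n2\n3"));       // 3
```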