
Browser Automation

browser_automation

Module to automate configuration via a browser interface.

Browser automations are typically used as a fallback when neither a REST API nor LLConfig can be used.

This module uses Playwright (https://playwright.dev) for browser-based automation and testing.

Core Playwright data types and their relationships:

Playwright
└── BrowserType (chromium / firefox / webkit)
    └── Browser
        └── BrowserContext
            └── Page
                ├── Frame
                ├── Locator (preferred for element interactions)
                ├── ElementHandle (lower-level DOM reference)
                └── JSHandle (handle to any JS object)

| Type | Description |
| --- | --- |
| Playwright | Entry point via sync_playwright() or async_playwright(). Gives access to browser types. |
| BrowserType | Represents Chromium, Firefox, or WebKit. Used to launch a Browser. |
| Browser | A running browser instance. Use .new_context() to create sessions. |
| BrowserContext | Isolated incognito-like browser profile. Contains pages. |
| Page | A single browser tab. Main interface for navigation and interaction. |
| Frame | Represents a <frame> or <iframe>. Like a mini Page. |
| Locator | Lazily-evaluated, auto-waiting reference to one or more elements. Preferred for interacting with elements. |
| ElementHandle | Static reference to a single DOM element. Useful for special interactions or JS execution. |
| JSHandle | Handle to any JavaScript object, not just DOM nodes. Returned by evaluate_handle(). |
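
As a minimal sketch of how these types chain together (not taken from this module), the hypothetical helper `fetch_title` below walks from the Playwright entry point down to a Page using the sync API. The helper name, its parameters, and the choice of "load" as the wait strategy are assumptions for illustration. Running it requires Playwright and the selected browser to be installed.

```python
def fetch_title(url: str, browser_name: str = "chromium") -> str:
    """Open a URL in a fresh browser context and return the page title.

    Hypothetical helper for illustration only; it is not part of this module.
    """
    # Imported lazily so that merely defining the helper does not require Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:                 # Playwright (entry point)
        browser_type = getattr(playwright, browser_name)  # BrowserType: chromium / firefox / webkit
        browser = browser_type.launch(headless=True)      # Browser (running instance)
        try:
            context = browser.new_context()               # BrowserContext (isolated profile)
            page = context.new_page()                     # Page (single tab)
            page.goto(url, wait_until="load")
            return page.title()
        finally:
            # Closing the browser also closes its contexts and pages.
            browser.close()
```

Calling `fetch_title("https://playwright.dev")` would launch a headless browser, load the page, and return its title. The browser is closed even if the navigation fails.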

Here are a few examples of typical element matches using the different selector types:

| Element to Match | CSS | XPath | Playwright get_by_* Method |
| --- | --- | --- | --- |
| Element with ID | #myId | //*[@id='myId'] | Not available directly; use locator() |
| Element with class | .myClass | //*[@class='myClass'] | Not available directly; use locator() |
| Button with exact text | button:text-is("Submit") | //button[text()='Submit'] | get_by_role("button", name="Submit", exact=True) |
| Button with partial text | button:has-text("Sub") | //button[contains(text(), 'Sub')] | get_by_text("Sub") |
| Input with name | input[name="email"] | //input[@name='email'] | Not available directly; use locator() |
| Link by text | a:has-text("Home") | //a[text()='Home'] | get_by_role("link", name="Home") |
| Element with title | [title="Info"] | //*[@title='Info'] | get_by_title("Info") |
| Placeholder text | input[placeholder="Search"] | //input[@placeholder='Search'] | get_by_placeholder("Search") |
| Label text (form input) | label:has-text("Email") | //label[text()='Email'] | get_by_label("Email") |
| Alt text (image) | img[alt="Logo"] | //img[@alt='Logo'] | get_by_alt_text("Logo") |
| Role and name (ARIA) | [role="button"][name="Save"] | //*[@role='button' and @name='Save'] | get_by_role("button", name="Save") |
| Visible text anywhere | :text("Welcome") | //*[contains(text(), "Welcome")] | get_by_text("Welcome") |
| nth element in a list | ul > li:nth-child(2) | (//ul/li)[2] | locator("ul > li").nth(1) |
| Element with attribute | [data-test-id="main"] | //*[@data-test-id='main'] | Not available directly; use locator() |
| Nested element | .container .button | //div[@class='container']//div[@class='button'] | locator(".container .button") |
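
For the rows marked "Not available directly; use locator()", the selector string has to be built by hand. The sketch below shows one way to do that with plain string templates. `SELECTOR_TEMPLATES` and `build_selector` are hypothetical names used for illustration, and the templates assume the value contains no quote characters that would need escaping.

```python
# Hypothetical mapping from a simple selector type to a selector string template.
SELECTOR_TEMPLATES = {
    "id": "#{}",             # element with ID
    "name": "[name='{}']",   # element with a name attribute
    "class_name": ".{}",     # element with class
    "xpath": "xpath={}",     # explicit XPath engine prefix
    "css": "{}",             # raw CSS selector, passed through unchanged
}


def build_selector(selector_type: str, value: str) -> str:
    """Return a selector string suitable for page.locator()."""
    try:
        template = SELECTOR_TEMPLATES[selector_type]
    except KeyError:
        raise ValueError(f"Unsupported selector type -> '{selector_type}'") from None
    return template.format(value)


print(build_selector("id", "myId"))        # #myId
print(build_selector("name", "email"))     # [name='email']
print(build_selector("xpath", "//ul/li"))  # xpath=//ul/li
```

The returned string can then be passed to `page.locator(...)`. Text, role, label, and similar matches are better served by the `get_by_*` methods listed in the table.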

BrowserAutomation

Class to automate settings via a browser interface.

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
class BrowserAutomation:
    """Class to automate settings via a browser interface."""

    logger: logging.Logger = default_logger

    def __init__(
        self,
        base_url: str = "",
        user_name: str = "",
        user_password: str = "",
        download_directory: str | None = None,
        take_screenshots: bool = False,
        automation_name: str = "",
        headless: bool = True,
        logger: logging.Logger = default_logger,
        wait_until: str | None = None,
        browser: str | None = None,
    ) -> None:
        """Initialize the object.

        Args:
            base_url (str, optional):
                The base URL of the website to automate. Defaults to "".
            user_name (str, optional):
                If an authentication at the web site is required, this is the user name.
                Defaults to "".
            user_password (str, optional):
                If an authentication at the web site is required, this is the user password.
                Defaults to "".
            download_directory (str | None, optional):
                A download directory used for download links. If None,
                a temporary directory is automatically used.
            take_screenshots (bool, optional):
                For debugging purposes, screenshots can be taken.
                Defaults to False.
            automation_name (str, optional):
                The name of the automation. Defaults to "".
            headless (bool, optional):
                If True, the browser will be started in headless mode. Defaults to True.
            wait_until (str | None, optional):
                Wait until a certain condition. Options are:
                * "commit" - returns as soon as the response is received and the document starts loading
                * "load" - waits for the load event (after all resources like images/scripts load)
                * "networkidle" - waits until there are no network connections for at least 500 ms.
                * "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed,
                  but subresources may still load).
            logger (logging.Logger, optional):
                The logging object to use for all log messages. Defaults to default_logger.
            browser (str | None, optional):
                The browser to use. If None, the value of the environment variable
                "BROWSER" is used, falling back to "webkit".

        """

        if not download_directory:
            download_directory = os.path.join(
                tempfile.gettempdir(),
                "browser_automations",
                self.sanitize_filename(filename=automation_name),
                "downloads",
            )

        if logger != default_logger:
            self.logger = logger.getChild("browserautomation")
            for logfilter in logger.filters:
                self.logger.addFilter(logfilter)

        self.base_url = base_url
        self.user_name = user_name
        self.user_password = user_password
        self.logged_in = False
        self.download_directory = download_directory
        self.headless = headless

        # Screenshot configurations:
        self.take_screenshots = take_screenshots
        self.screenshot_names = self.sanitize_filename(filename=automation_name)
        self.screenshot_counter = 1
        self.screenshot_full_page = True

        self.wait_until = wait_until if wait_until else DEFAULT_WAIT_UNTIL_STRATEGY

        self.screenshot_directory = os.path.join(
            tempfile.gettempdir(),
            "browser_automations",
            self.screenshot_names,
            "screenshots",
        )
        # Only log and create the directory if screenshots are requested and it does not exist yet:
        if self.take_screenshots and not os.path.exists(self.screenshot_directory):
            self.logger.debug("Creating screenshot directory... -> %s", self.screenshot_directory)
            os.makedirs(self.screenshot_directory)

        self.proxy = None
        if os.getenv("HTTP_PROXY"):
            self.proxy = {
                "server": os.getenv("HTTP_PROXY"),
            }
            self.logger.info("Using HTTP proxy -> %s", os.getenv("HTTP_PROXY"))

        browser = browser or os.getenv("BROWSER", "webkit")
        self.logger.info("Using browser -> '%s'...", browser)

        if not self.setup_playwright(browser=browser):
            self.logger.error("Failed to initialize Playwright browser automation!")
            return

        self.logger.info("Creating browser context...")
        self.context: BrowserContext = self.browser.new_context(
            accept_downloads=True,
        )

        self.logger.info("Creating page...")
        self.page: Page = self.context.new_page()
        self.main_page = self.page
        self.logger.info("Browser automation initialized.")

    # end method definition

    def setup_playwright(self, browser: str) -> bool:
        """Initialize Playwright browser automation.

        Args:
            browser (str):
                Name of the browser engine. Supported:
                * chromium
                * chrome
                * msedge
                * webkit
                * firefox

        Returns:
            bool:
                True = Success, False = Error.

        """

        try:
            self.logger.debug("Creating Playwright instance...")
            self.playwright = sync_playwright().start()
        except Exception:
            self.logger.error("Failed to start Playwright!")
            return False

        result = True

        # Install and launch the selected browser in Playwright:
        match browser:
            case "chromium":
                try:
                    self.browser: Browser = self.playwright.chromium.launch(
                        headless=self.headless, slow_mo=100 if not self.headless else None, proxy=self.proxy
                    )
                except Exception:
                    result = self.install_browser(browser=browser)
                    if result:
                        self.browser: Browser = self.playwright.chromium.launch(
                            headless=self.headless, slow_mo=100 if not self.headless else None, proxy=self.proxy
                        )

            case "chrome":
                try:
                    self.browser: Browser = self.playwright.chromium.launch(
                        channel="chrome",
                        headless=self.headless,
                        slow_mo=100 if not self.headless else None,
                        proxy=self.proxy,
                    )
                except Exception:
                    result = self.install_browser(browser=browser)
                    if result:
                        self.browser: Browser = self.playwright.chromium.launch(
                            channel="chrome",
                            headless=self.headless,
                            slow_mo=100 if not self.headless else None,
                            proxy=self.proxy,
                        )

            case "msedge":
                try:
                    self.browser: Browser = self.playwright.chromium.launch(
                        channel="msedge",
                        headless=self.headless,
                        slow_mo=100 if not self.headless else None,
                        proxy=self.proxy,
                    )
                except Exception:
                    result = self.install_browser(browser=browser)
                    if result:
                        self.browser: Browser = self.playwright.chromium.launch(
                            channel="msedge",
                            headless=self.headless,
                            slow_mo=100 if not self.headless else None,
                            proxy=self.proxy,
                        )

            case "webkit":
                try:
                    self.browser: Browser = self.playwright.webkit.launch(
                        headless=self.headless, slow_mo=100 if not self.headless else None, proxy=self.proxy
                    )
                except Exception:
                    result = self.install_browser(browser=browser)
                    if result:
                        self.browser: Browser = self.playwright.webkit.launch(
                            headless=self.headless, slow_mo=100 if not self.headless else None, proxy=self.proxy
                        )

            case "firefox":
                try:
                    self.browser: Browser = self.playwright.firefox.launch(
                        headless=self.headless, slow_mo=100 if not self.headless else None, proxy=self.proxy
                    )
                except Exception:
                    result = self.install_browser(browser=browser)
                    if result:
                        self.browser: Browser = self.playwright.firefox.launch(
                            headless=self.headless, slow_mo=100 if not self.headless else None, proxy=self.proxy
                        )
            case _:
                self.logger.error("Unknown browser -> '%s'. Cannot install and launch it.", browser)
                result = False

        return result

    # end method definition

    def install_browser(self, browser: str) -> bool:
        """Install a browser with a provided name in Playwright.

        Args:
            browser (str):
                Name of the browser to be installed.

        Returns:
            bool: True = installation successful, False = installation failed.

        """

        self.logger.info("Installing Browser -> '%s'...", browser)
        process = subprocess.Popen(
            ["playwright", "install", browser],  # noqa: S607
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            shell=False,
        )
        output, error = process.communicate()
        if process.returncode == 0:  # 0 = success
            self.logger.info("Successfully completed installation of browser -> '%s'.", browser)
            self.logger.debug(output.decode())
        else:
            self.logger.error("Installation of browser -> '%s' failed! Error -> %s", browser, error.decode())
            self.logger.error(output.decode())
            return False

        return True

    # end method definition

    def sanitize_filename(self, filename: str) -> str:
        """Sanitize a string to be safe for use as a filename.

        - Converts to lowercase
        - Replaces spaces with underscores
        - Removes unsafe characters
        - Removes trailing dots
        - Falls back to "untitled" if the result is empty

        Args:
            filename (str):
                The filename to sanitize.

        Returns:
            str:
                The sanitized filename.

        """

        filename = filename.lower()
        filename = filename.replace(" ", "_")
        filename = re.sub(r'[<>:"/\\|?*\x00-\x1F]', "", filename)  # Remove unsafe chars
        filename = re.sub(r"\.+$", "", filename)  # Remove trailing dots
        filename = filename.strip()
        if not filename:
            filename = "untitled"

        return filename

    # end method definition

    def take_screenshot(self) -> bool:
        """Take a screenshot of the current browser window and save it as PNG file.

        Returns:
            bool:
                True if successful, False otherwise

        """

        screenshot_file = "{}/{}-{:02d}.png".format(
            self.screenshot_directory,
            self.screenshot_names,
            self.screenshot_counter,
        )
        self.logger.debug("Save browser screenshot to -> %s", screenshot_file)

        try:
            self.page.screenshot(path=screenshot_file, full_page=self.screenshot_full_page)
            self.screenshot_counter += 1
        except Exception as e:
            self.logger.error("Failed to take screenshot; error -> %s", e)
            return False

        return True

    # end method definition

    def get_page(self, url: str = "", wait_until: str | None = None) -> bool:
        """Load a page into the browser based on a given URL.

        Args:
            url (str):
                URL to load. If empty just the base URL will be used.
            wait_until (str | None, optional):
                Wait until a certain condition. Options are:
                * "commit" - returns as soon as the response is received and the document starts loading
                * "load" - waits for the load event (after all resources like images/scripts load)
                  This is the safest strategy for pages that keep loading content in the background
                  like Salesforce.
                * "networkidle" - waits until there are no network connections for at least 500 ms.
                  This seems to be the safest one for OpenText Content Server.
                * "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed,
                  but subresources may still load).

        Returns:
            bool:
                True if successful, False otherwise.

        """

        # If no specific wait until strategy is provided in the
        # parameter, we take the one from the browser automation class:
        if wait_until is None:
            wait_until = self.wait_until

        page_url = self.base_url + url

        try:
            self.logger.debug("Load page -> %s (wait until -> '%s')", page_url, wait_until)

            # The Playwright Response object is different from the requests.response object!
            response = self.page.goto(page_url, wait_until=wait_until)
            if response is None:
                self.logger.warning("Loading of page -> %s completed but no response object was returned.", page_url)
            elif not response.ok:
                # Try to get standard phrase, fall back if unknown
                try:
                    phrase = HTTPStatus(response.status).phrase
                except ValueError:
                    phrase = "Unknown Status"
                self.logger.error(
                    "Response for page -> %s is not OK. Status -> %s/%s",
                    page_url,
                    response.status,
                    phrase,
                )
                return False

        except PlaywrightError as e:
            self.logger.error("Navigation to page -> %s has failed; error -> %s", page_url, str(e))
            return False

        if self.take_screenshots:
            self.take_screenshot()

        return True

    # end method definition

    def get_title(
        self,
        wait_until: str | None = None,
    ) -> str | None:
        """Get the browser title.

        This is handy to validate that a certain page is loaded after get_page().

        This is a retry-safe way to get the page title, even if there's an in-flight navigation.

        Args:
            wait_until (str | None, optional):
                Wait until a certain condition. Options are:
                * "commit" - returns as soon as the response is received and the document starts loading
                * "load" - waits for the load event (after all resources like images/scripts load)
                  This is the safest strategy for pages that keep loading content in the background
                  like Salesforce.
                * "networkidle" - waits until there are no network connections for at least 500 ms.
                  This seems to be the safest one for OpenText Content Server.
                * "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed,
                  but subresources may still load).

        Returns:
            str | None:
                The title of the browser page, or None if it could not be determined.

        """

        for attempt in range(REQUEST_MAX_RETRIES):
            try:
                if wait_until:
                    self.page.wait_for_load_state(state=wait_until, timeout=REQUEST_TIMEOUT)
                title = self.page.title()
                if title:
                    return title
                time.sleep(REQUEST_RETRY_DELAY)
                self.logger.info("Retry attempt %d/%d", attempt + 1, REQUEST_MAX_RETRIES)
            except Exception as e:
                if "Execution context was destroyed" in str(e):
                    self.logger.info(
                        "Execution context was destroyed, retrying after %s seconds...", REQUEST_RETRY_DELAY
                    )
                    time.sleep(REQUEST_RETRY_DELAY)
                    self.logger.info("Retry attempt %d/%d", attempt + 1, REQUEST_MAX_RETRIES)
                    continue
                self.logger.error("Could not get page title; error -> %s", str(e))
                break

        return None

    # end method definition

    def scroll_to_element(self, element: Locator) -> None:
        """Scroll an element into view to make it clickable.

        Args:
            element (Locator):
                Web element that has been identified before.

        """

        if not element:
            self.logger.error("Undefined element! Cannot scroll to it.")
            return

        try:
            element.scroll_into_view_if_needed()
        except PlaywrightError as e:
            self.logger.warning("Cannot scroll element -> %s into view; error -> %s", str(element), str(e))

    # end method definition

    def get_locator(
        self,
        selector: str,
        selector_type: str,
        role_type: str | None = None,
        exact_match: bool | None = None,
        iframe: str | None = None,
        regex: bool = False,
        filter_has_text: str | None = None,
        filter_has: Locator | None = None,
        filter_has_not_text: str | None = None,
        filter_has_not: Locator | None = None,
    ) -> Locator | None:
        """Determine the locator for the given selector type and (optional) role type.

        Args:
            selector (str):
                The selector to find the element on the page.
            selector_type (str):
                One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
                "label", "placeholder", "alt".
                When using css, the selector becomes a raw CSS selector, and you can skip attribute
                and value filtering entirely if your selector already narrows it down.
                Examples for CSS:
                * selector="img" - find all img tags (images)
                * selector="img[title]" - find all img tags (images) that have a title attribute - independent of its value
                * selector="img[title*='Microsoft Teams']" - find all images with a title that contains "Microsoft Teams"
                * selector=".toolbar button" - find all buttons inside a .toolbar class
            role_type (str | None, optional):
                ARIA role when using selector_type="role", e.g., "button", "textbox".
                If irrelevant then None should be passed for role_type.
            exact_match (bool | None, optional):
                Controls whether the text or name must match exactly.
                Default is None (not set, i.e. using Playwright's default).
            iframe (str | None, optional):
                The name of the iframe that contains the element, if any.
                Only applied for the selector types "css", "text", and "role".
            regex (bool, optional):
                Should the name be interpreted as a regular expression?
            filter_has_text (str | None, optional):
                Applies `locator.filter(has_text=...)` to narrow the selection based on text content.
            filter_has (Locator | None, optional):
                Applies `locator.filter(has=...)` to match elements containing a descendant matching the given Locator.
            filter_has_not_text (str | None, optional):
                Applies `locator.filter(has_not_text=...)` to exclude elements with matching text content.
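            filter_has_not (Locator | None, optional):
                Applies `locator.filter(has_not=...)` to exclude elements containing a matching descendant.

        Returns:
            Locator | None:
                The locator for the selector, or None in case of an error
                (e.g. unsupported selector type or missing role type).

        """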

        try:
            name_or_text = re.compile(selector) if regex else selector

            match selector_type:
                case "id":
                    locator = self.page.locator("#{}".format(selector))
                case "name":
                    locator = self.page.locator("[name='{}']".format(selector))
                case "class_name":
                    locator = self.page.locator(".{}".format(selector))
                case "xpath":
                    locator = self.page.locator("xpath={}".format(selector))
                case "css":
                    if iframe is None:
                        locator = self.page.locator(selector)
                    else:
                        locator = self.page.locator("iframe[name='{}']".format(iframe)).content_frame.locator(selector)
                case "text":
                    if iframe is None:
                        locator = self.page.get_by_text(text=name_or_text)
                    else:
                        locator = self.page.locator("iframe[name='{}']".format(iframe)).content_frame.get_by_text(
                            name_or_text
                        )
                case "title":
                    locator = self.page.get_by_title(text=name_or_text)
                case "label":
                    locator = self.page.get_by_label(text=name_or_text)
                case "placeholder":
                    locator = self.page.get_by_placeholder(text=name_or_text)
                case "alt":
                    locator = self.page.get_by_alt_text(text=name_or_text)
                case "role":
                    if not role_type:
                        self.logger.error("Role type must be specified when using selector type 'role'!")
                        return None
                    if iframe is None:
                        if regex:
                            locator = self.page.get_by_role(role=role_type, name=name_or_text)
                        else:
                            locator = self.page.get_by_role(role=role_type, name=selector, exact=exact_match)
                    else:
                        content_frame = self.page.locator("iframe[name='{}']".format(iframe)).content_frame
                        if regex:
                            locator = content_frame.get_by_role(role=role_type, name=name_or_text)
                        else:
                            locator = content_frame.get_by_role(role=role_type, name=selector, exact=exact_match)
                case _:
                    self.logger.error("Unsupported selector type -> '%s'", selector_type)
                    return None

            # Apply filter if needed
            if any([filter_has_text, filter_has, filter_has_not_text, filter_has_not]):
                locator = locator.filter(
                    has_text=filter_has_text, has=filter_has, has_not_text=filter_has_not_text, has_not=filter_has_not
                )

        except PlaywrightError as e:
            self.logger.error("Failure to determine page locator; error -> %s", str(e))
            return None

        return locator

    # end method definition

    def find_elem(
        self,
        selector: str,
        selector_type: str = "id",
        role_type: str | None = None,
        wait_state: str = "visible",
        exact_match: bool | None = None,
        regex: bool = False,
        occurrence: int = 1,
        iframe: str | None = None,
        repeat_reload: int | None = None,
        repeat_reload_delay: float = 60.0,
        show_error: bool = True,
    ) -> Locator | None:
        """Find a page element.

        Args:
            selector (str):
                The name of the page element or accessible name (for role).
            selector_type (str, optional):
                One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
                "label", "placeholder", "alt".
            role_type (str | None, optional):
                ARIA role when using selector_type="role", e.g., "button", "textbox".
                If irrelevant then None should be passed for role_type.
            wait_state (str, optional):
                The state to wait for: "attached" (element is part of the DOM) or
                "visible" (element is attached, displayed, and has a non-zero size).
                Default is "visible".
            exact_match (bool | None, optional):
                Whether an exact match is required. Default is None (not set).
            regex (bool, optional):
                Should the selector be interpreted as a regular expression?
            occurrence (int, optional):
                If multiple elements match the selector, this defines which one to return.
                Default is 1 (the first one).
            iframe (str | None, optional):
                If the element is inside an iframe, the name of that iframe.
            repeat_reload (int | None, optional):
                Number of page reloads to attempt if the element is not found. Use this for
                pages that are not updated dynamically and need a reload to show changes.
                Default is None (no reloads).
            repeat_reload_delay (float, optional):
                Number of seconds to wait before each page reload.
            show_error (bool, optional):
                Show an error if not found or not visible.


        Returns:
            Locator | None:
                The locator of the page element, or None if it was not found or an error occurred.
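
        Example:
            A minimal usage sketch, not a definitive recipe. It assumes `browser` is an
            initialized instance of this class; the helper name `find_second_row` and the
            class name "result-row" are hypothetical.

            ```python
            def find_second_row(browser):
                # `browser` is assumed to be an instance of this class.
                return browser.find_elem(
                    selector="result-row",  # hypothetical CSS class name
                    selector_type="class_name",
                    occurrence=2,  # 1-based: return the second match
                    repeat_reload=3,  # reload the page up to 3 times if not found
                    repeat_reload_delay=10.0,  # wait 10 seconds before each reload
                )
            ```

            The result is a Locator, or None if the element never reached the wait state.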

        """

        failure_message = "Cannot find page element with selector -> '{}' ({}){}{}{}".format(
            selector,
            selector_type,
            " and role type -> '{}'".format(role_type) if role_type else "",
            " in iframe -> '{}'".format(iframe) if iframe else "",
            ", occurrence -> {}".format(occurrence) if occurrence > 1 else "",
        )
        success_message = "Found page element with selector -> '{}' ('{}'){}{}{}".format(
            selector,
            selector_type,
            " and role type -> '{}'".format(role_type) if role_type else "",
            " in iframe -> '{}'".format(iframe) if iframe else "",
            ", occurrence -> {}".format(occurrence) if occurrence > 1 else "",
        )

        def do_find() -> Locator | None:
            # Determine the locator for the element:
            locator = self.get_locator(
                selector=selector,
                selector_type=selector_type,
                role_type=role_type,
                exact_match=exact_match,
                iframe=iframe,
                regex=regex,
            )
            if not locator:
                if show_error:
                    self.logger.error(failure_message)
                else:
                    self.logger.warning(failure_message)
                return None

            # Wait for the element to reach the desired state. Don't use logic like
            # locator.count() here, as it does not wait but fails immediately if the
            # elements are not yet loaded:

            try:
                index = occurrence - 1  # convert to 0-based index
                if index < 0:  # basic validation
                    self.logger.error("Occurrence must be >= 1")
                    return None
                self.logger.debug(
                    "Wait for locator to find %selement with selector -> '%s' (%s%s%s) and state -> '%s'%s...",
                    "occurrence #{} of ".format(occurrence) if occurrence > 1 else "",
                    selector,
                    "selector type -> '{}'".format(selector_type),
                    ", role type -> '{}'".format(role_type) if role_type else "",
                    ", using regular expression" if regex else "",
                    wait_state,
                    " in iframe -> '{}'".format(iframe) if iframe else "",
                )

                locator = locator.first if occurrence == 1 else locator.nth(index)
                # Wait for the element to be in the desired state:
                locator.wait_for(state=wait_state)
            except PlaywrightError as pe:
                if show_error and repeat_reload is None:
                    self.logger.error("%s (%s)", failure_message, str(pe))
                else:
                    self.logger.warning("%s", failure_message)
                return None
            else:
                self.logger.debug(success_message)

            return locator

        # end def do_find():

        locator = do_find()

        # Retry logic for pages that are not updated dynamically:
        if locator is None and repeat_reload is not None:
            for i in range(repeat_reload):
                self.logger.warning(
                    "Wait %.1f seconds before reloading page -> %s to retrieve updates from server...",
                    repeat_reload_delay,
                    self.page.url,
                )
                time.sleep(repeat_reload_delay)
                self.logger.warning(
                    "Reloading page -> %s (retry %d) to retrieve updates from server...", self.page.url, i + 1
                )
                self.page.reload()
                locator = do_find()
                if locator:
                    break
            else:
                self.logger.error(failure_message)

        return locator

    # end method definition

    def find_elem_and_click(
        self,
        selector: str,
        selector_type: str = "id",
        role_type: str | None = None,
        occurrence: int = 1,
        scroll_to_element: bool = True,
        desired_checkbox_state: bool | None = None,
        is_navigation_trigger: bool = False,
        is_popup_trigger: bool = False,
        is_page_close_trigger: bool = False,
        wait_until: str | None = None,
        wait_time: float = 0.0,
        exact_match: bool | None = None,
        regex: bool = False,
        hover_only: bool = False,
        iframe: str | None = None,
        force: bool | None = None,
        click_button: str | None = None,
        click_count: int | None = None,
        click_modifiers: list | None = None,
        repeat_reload: int | None = None,
        repeat_reload_delay: float = 60.0,
        show_error: bool = True,
    ) -> bool:
        """Find a page element and click it.

        Args:
            selector (str):
                The selector of the page element.
            selector_type (str, optional):
                One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
                "label", "placeholder", "alt".
            role_type (str | None, optional):
                ARIA role when using selector_type="role", e.g., "button", "textbox".
                If irrelevant then None should be passed for role_type.
            occurrence (int, optional):
                If multiple elements match the selector, this defines which one to return.
                Default is 1 (the first one).
            scroll_to_element (bool, optional):
                Scroll the element into view.
            desired_checkbox_state (bool | None, optional):
                If True/False, ensures checkbox matches state.
                If None then click it in any case.
            is_navigation_trigger (bool, optional):
                Is the click causing a navigation? Default is False.
            is_popup_trigger (bool, optional):
                Is the click causing a new browser window to open?
            is_page_close_trigger (bool, optional):
                Is the click causing the page to close?
            wait_until (str | None, optional):
                Wait until a certain condition. Options are:
                * "commit" - does not wait at all - commit the request and continue
                * "load" - waits for the load event (after all resources like images/scripts load)
                  This is the safest strategy for pages that keep loading content in the background
                  like Salesforce.
                * "networkidle" - waits until there are no network connections for at least 500 ms.
                  This seems to be the safest one for OpenText Content Server.
                * "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed,
                  but subresources may still load).
            wait_time (float, optional):
                Time in seconds to wait before looking for the element and clicking it.
                Default is 0.0 (no waiting).
            exact_match (bool | None, optional):
                Whether an exact match is required. Default is None (not set).
            regex (bool, optional):
                Should the name be interpreted as a regular expression?
            hover_only (bool, optional):
                Should we only hover over the element and not click it? Helpful for
                menus that are opening on hovering.
            iframe (str | None, optional):
                If the element is inside an iframe, the name of that iframe.
            force (bool | None, optional):
                Set to True to bypass Playwright's visibility checks if you are sure the element
                is visible (even partly) and can be interacted with. Default is None (not set,
                i.e. the Playwright default of False is used).
            click_button (Literal['left', 'middle', 'right'] | None, optional):
                Which mouse button to use for the click. If None is passed, the Playwright
                default ("left") is used.
            click_count (int | None, optional):
                Number of clicks. E.g. 2 for a "double-click".
            click_modifiers (list | None, optional):
                Key pressed together with the mouse click.
                Possible values:'Alt', 'Control', 'ControlOrMeta', 'Meta', 'Shift'.
                Default is None = no key pressed.
            repeat_reload (int | None, optional):
                Number of page reloads to attempt if the element is not found. Use this for
                pages that are not updated dynamically and need a reload to show changes.
                Default is None (no reloads).
            repeat_reload_delay (float, optional):
                Number of seconds to wait before each page reload.
            show_error (bool, optional):
                Show an error if the element is not found or not clickable.

        Returns:
            bool:
                True if click is successful (or checkbox already in desired state),
                False otherwise.
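
        Example:
            A minimal usage sketch, assuming `browser` is an initialized instance of this
            class. The helper name `open_settings_page` and the button name "Settings" are
            hypothetical.

            ```python
            def open_settings_page(browser):
                # `browser` is assumed to be an instance of this class.
                return browser.find_elem_and_click(
                    selector="Settings",  # hypothetical accessible name of the button
                    selector_type="role",
                    role_type="button",
                    exact_match=True,
                    is_navigation_trigger=True,  # the click loads a new page
                    wait_until="networkidle",  # wait until the network is idle
                )
            ```

            The helper returns the boolean result of the click, so a caller can stop early
            if the navigation could not be triggered.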

        """

        success = True  # Final return value

        # If no specific wait until strategy is provided in the
        # parameter, we take the one from the browser automation class:
        if wait_until is None:
            wait_until = self.wait_until

        # Some operations that are done server-side and dynamically update
        # the page may require a waiting time:
        if wait_time > 0.0:
            self.logger.info("Wait for %d milliseconds before clicking...", wait_time * 1000)
            self.page.wait_for_timeout(wait_time * 1000)

        if not selector:
            failure_message = "Missing element selector! Cannot find page element!"
            if show_error:
                self.logger.error(failure_message)
            else:
                self.logger.warning(failure_message)
            return False

        elem = self.find_elem(
            selector=selector,
            selector_type=selector_type,
            role_type=role_type,
            exact_match=exact_match,
            regex=regex,
            occurrence=occurrence,
            iframe=iframe,
            repeat_reload=repeat_reload,
            repeat_reload_delay=repeat_reload_delay,
            show_error=show_error,
        )
        if not elem:
            return not show_error

        try:
            if scroll_to_element:
                self.scroll_to_element(elem)

            # Handle checkboxes if requested:
            if desired_checkbox_state is not None and elem.get_attribute("type") == "checkbox":
                # Let Playwright handle checkbox state:
                elem.set_checked(desired_checkbox_state)
                self.logger.debug("Set checkbox -> '%s' to value -> %s.", selector, desired_checkbox_state)
            # Handle non-checkboxes:
            else:
                # Will this click trigger a navigation?
                if is_navigation_trigger:
                    self.logger.debug(
                        "Clicking on navigation-triggering element -> '%s' (%s%s) and wait until -> '%s'...",
                        selector,
                        "selector type -> '{}'".format(selector_type),
                        ", role type -> '{}'".format(role_type) if role_type else "",
                        wait_until,
                    )
                    with self.page.expect_navigation(wait_until=wait_until):
                        elem.click(force=force, button=click_button, click_count=click_count, modifiers=click_modifiers)
                # Will this click trigger a new popup window?
                elif is_popup_trigger:
                    with self.page.expect_popup() as popup_info:
                        elem.click(force=force, button=click_button, click_count=click_count, modifiers=click_modifiers)
                    if not popup_info or not popup_info.value:
                        self.logger.warning("Popup window did not open as expected!")
                        success = False
                    else:
                        self.page = popup_info.value
                        self.logger.info("Move browser automation to popup window -> %s...", self.page.url)
                elif hover_only:
                    self.logger.debug(
                        "Hovering over element -> '%s' (%s%s)...",
                        selector,
                        "selector type -> '{}'".format(selector_type),
                        ", role type -> '{}'".format(role_type) if role_type else "",
                    )
                    elem.hover()
                else:
                    self.logger.debug(
                        "Clicking on non-navigating element -> '%s' (%s%s)...",
                        selector,
                        "selector type -> '{}'".format(selector_type),
                        ", role type -> '{}'".format(role_type) if role_type else "",
                    )
                    elem.click(force=force, button=click_button, click_count=click_count, modifiers=click_modifiers)
                    time.sleep(1)
                if success:
                    self.logger.debug(
                        "Successfully %s element -> '%s' (%s%s)",
                        "clicked" if not hover_only else "hovered over",
                        selector,
                        "selector type -> '{}'".format(selector_type),
                        ", role type -> '{}'".format(role_type) if role_type else "",
                    )

        except PlaywrightError as e:
            if show_error:
                self.logger.error(
                    "Cannot click page element -> '%s' (%s); error -> %s", selector, selector_type, str(e)
                )
            else:
                self.logger.warning(
                    "Cannot click page element -> '%s' (%s); warning -> %s", selector, selector_type, str(e)
                )
            success = not show_error

        if is_page_close_trigger:
            if self.page == self.main_page:
                self.logger.error("Unexpected try to close main page! Popup page not active! This is not supported!")
                success = False
            else:
                self.page = self.main_page
                self.logger.info("Move browser automation back to main window -> %s...", self.page.url)

        if self.take_screenshots:
            self.take_screenshot()

        return success

    # end method definition

    def find_elem_and_set(
        self,
        selector: str,
        value: str | bool,
        selector_type: str = "id",
        role_type: str | None = None,
        occurrence: int = 1,
        is_sensitive: bool = False,
        press_enter: bool = False,
        exact_match: bool | None = None,
        regex: bool = False,
        iframe: str | None = None,
        typing: bool = False,
        show_error: bool = True,
    ) -> bool:
        """Find a page element and set it to a new value.

        Args:
            selector (str):
                The name of the page element.
            value (str | bool):
                The new value for the page element: a text string for input fields,
                the visible option label for drop-down lists, or a boolean for checkboxes.
            selector_type (str, optional):
                One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
                "label", "placeholder", "alt".
            role_type (str | None, optional):
                ARIA role when using selector_type="role", e.g., "button", "textbox".
                If irrelevant then None should be passed for role_type.
            occurrence (int, optional):
                If multiple elements match the selector, this defines which one to return.
                Default is 1 (the first one).
            is_sensitive (bool, optional):
                True for suppressing sensitive information in logging.
            press_enter (bool, optional):
                Whether or not to press "Enter" after entering the value.
            exact_match (bool | None, optional):
                Whether an exact match is required. Default is None (not set).
            regex (bool, optional):
                Should the selector be interpreted as a regular expression?
            iframe (str | None, optional):
                If the element is inside an iframe, the name of that iframe.
            typing (bool, optional):
                Simulate real typing instead of just setting the value of the element.
                This is required for fields that react to each keystroke ("typeahead").
            show_error (bool, optional):
                Show an error if the element is not found or cannot be set.

        Returns:
            bool:
                True if successful, False otherwise
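
        Example:
            A minimal usage sketch, assuming `browser` is an initialized instance of this
            class. The helper name `fill_login_form` and the element IDs "username" and
            "password" are hypothetical.

            ```python
            def fill_login_form(browser, user, password):
                # `browser` is assumed to be an instance of this class.
                # The default selector type is "id", so only the ID is passed:
                if not browser.find_elem_and_set(selector="username", value=user):
                    return False
                return browser.find_elem_and_set(
                    selector="password",
                    value=password,
                    is_sensitive=True,  # do not write the value to the info log
                    press_enter=True,  # press "Enter" after filling the field
                )
            ```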

        """

        success = False  # Final return value

        elem = self.find_elem(
            selector=selector,
            selector_type=selector_type,
            role_type=role_type,
            exact_match=exact_match,
            regex=regex,
            occurrence=occurrence,
            iframe=iframe,
            show_error=show_error,
        )
        if not elem:
            return not show_error

        is_enabled = elem.is_enabled()
        if not is_enabled:
            message = "Cannot set elem -> '{}' ({}) to value -> '{}'. It is not enabled!".format(
                selector, selector_type, value if not is_sensitive else "<sensitive>"
            )
            if show_error:
                self.logger.error(message)
            else:
                self.logger.warning(message)

            if self.take_screenshots:
                self.take_screenshot()

            return False

        self.logger.info(
            "Set element -> '%s' to value -> '%s'...", selector, value if not is_sensitive else "<sensitive>"
        )

        try:
            # HTML '<select>' can only be identified based on its tag name:
            tag_name = elem.evaluate("el => el.tagName.toLowerCase()")
            # Checkboxes have tag name '<input type="checkbox">':
            input_type = elem.get_attribute("type")

            if tag_name == "select":
                options = elem.locator("option")
                options_count = options.count()
                option_values = [options.nth(i).inner_text().strip().replace("\n", "") for i in range(options_count)]

                if value not in option_values:
                    self.logger.warning(
                        "Provided value -> '%s' is not in available drop-down options -> %s. Cannot set it!",
                        value,
                        option_values,
                    )
                else:
                    # We set the value over the (visible) label:
                    elem.select_option(label=value)
                    success = True
            elif tag_name == "input" and input_type == "checkbox":
                # Handle checkbox
                if not isinstance(value, bool):
                    self.logger.error("Checkbox value must be a boolean!")
                else:
                    retry = 0
                    while elem.is_checked() != value and retry < 5:
                        try:
                            elem.set_checked(checked=value)
                        except Exception:
                            self.logger.warning("Cannot set checkbox to value -> '%s'. (retry %s).", value, retry)
                        finally:
                            retry += 1

                    success = elem.is_checked() == value  # True if the checkbox reached the desired state
            else:
                if typing:
                    elem.press_sequentially(value, delay=50)
                else:
                    elem.fill(value)
                if press_enter:
                    self.page.keyboard.press("Enter")
                success = True
        except PlaywrightError as e:
            message = "Cannot set page element selected by -> '{}' ({}) to value -> '{}'; error -> {}".format(
                selector, selector_type, value if not is_sensitive else "<sensitive>", str(e)
            )
            if show_error:
                self.logger.error(message)
            else:
                self.logger.warning(message)
            success = not show_error

        if self.take_screenshots:
            self.take_screenshot()

        return success

    # end method definition

    def find_element_and_download(
        self,
        selector: str,
        selector_type: str = "id",
        role_type: str | None = None,
        exact_match: bool | None = None,
        regex: bool = False,
        iframe: str | None = None,
        download_time: int = 30,
    ) -> str | None:
        """Click a page element to initiate a download.

        Args:
            selector (str):
                The page element to click for download.
            selector_type (str, optional):
                One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
                "label", "placeholder", "alt".
            role_type (str | None, optional):
                ARIA role when using selector_type="role", e.g., "button", "textbox".
                If irrelevant then None should be passed for role_type.
            exact_match (bool | None, optional):
                If an exact matching is required. Default is None (not set).
            regex (bool, optional):
                Should the name be interpreted as a regular expression?
            iframe (str | None):
                Is the element in an iFrame? Then provide the name of the iframe with this parameter.
            download_time (int, optional):
                Time in seconds to wait for the download to complete.

        Returns:
            str | None:
                The full file path of the downloaded file, or None if the download failed.
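
        Example:
            A minimal usage sketch, assuming `browser` is an initialized instance of this
            class. The helper name `download_report` and the link name "Download report"
            are hypothetical.

            ```python
            def download_report(browser):
                # `browser` is assumed to be an instance of this class.
                path = browser.find_element_and_download(
                    selector="Download report",  # hypothetical accessible name of the link
                    selector_type="role",
                    role_type="link",
                    download_time=60,  # allow up to 60 seconds for the download
                )
                if path is None:
                    # The element was not found or the download did not complete.
                    return ""
                return path
            ```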

        """

        try:
            with self.page.expect_download(timeout=download_time * 1000) as download_info:
                clicked = self.find_elem_and_click(
                    selector=selector,
                    selector_type=selector_type,
                    role_type=role_type,
                    exact_match=exact_match,
                    regex=regex,
                    iframe=iframe,
                )
                if not clicked:
                    self.logger.error("Element not found to initiate download.")
                    return None

            download = download_info.value
            filename = download.suggested_filename
            save_path = os.path.join(self.download_directory, filename)
            download.save_as(save_path)
        except Exception as e:
            self.logger.error("Download failed; error -> %s", str(e))
            return None

        self.logger.info("Downloaded file to -> %s", save_path)

        return save_path

    # end method definition

    def check_elems_exist(
        self,
        selector: str,
        selector_type: str = "id",
        role_type: str | None = None,
        value: str | None = None,
        exact_match: bool | None = None,
        attribute: str | None = None,
        substring: bool = True,
        iframe: str | None = None,
        min_count: int = 1,
        wait_time: float = 0.0,
        wait_state: str = "visible",
        show_error: bool = True,
    ) -> tuple[bool | None, int]:
        """Check if (multiple) elements with defined attributes exist on page and return the number.

        Args:
            selector (str):
                The selector to find the element on the page.
            selector_type (str):
                One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
                "label", "placeholder", "alt".
                When using css, the selector becomes a raw CSS selector, and you can skip attribute
                and value filtering entirely if your selector already narrows it down.
                Examples for CSS:
                * selector="img" - find all img tags (images)
                * selector="img[title]" - find all img tags (images) that have a title attribute - independent of its value
                * selector="img[title*='Microsoft Teams']" - find all images with a title that contains "Microsoft Teams"
                * selector=".toolbar button" - find all buttons inside a .toolbar class
            role_type (str | None, optional):
                ARIA role when using selector_type="role", e.g., "button", "textbox".
                If irrelevant then None should be passed for role_type.
            value (str, optional):
                Value to match in attribute or element content.
            exact_match (bool | None, optional):
                Whether an exact match is required. Default is None (not set).
            attribute (str, optional):
                Attribute name to inspect. If None, uses element's text.
            substring (bool):
                If True, allow partial match.
            iframe (str | None, optional):
                If the elements are inside an iframe, the name of that iframe.
            min_count (int):
                Minimum number of required matches (# elements on page).
            wait_time (float):
                Additional time in seconds to wait after the first matching element
                has been found and before the matches are counted.
            wait_state (str, optional):
                Defines if we wait for attached (element is part of DOM) or
                if we wait for elem to be visible (attached, displayed, and has non-zero size).
            show_error (bool):
                Whether to log warnings/errors.

        Returns:
            bool | None:
                True if sufficient elements exist. False otherwise.
                None if an error occurs.
            int:
                Number of matched elements.
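
        Example:
            A minimal usage sketch, assuming `browser` is an initialized instance of this
            class. The helper name `has_enough_images` is hypothetical; the CSS selector
            is taken from the examples above.

            ```python
            def has_enough_images(browser, minimum=2):
                # `browser` is assumed to be an instance of this class.
                result, count = browser.check_elems_exist(
                    selector="img[title]",  # all images that have a title attribute
                    selector_type="css",
                    min_count=minimum,
                    show_error=False,  # missing elements are not logged as errors
                )
                # result is None if the check itself failed (e.g. a timeout),
                # so it is mapped to False here:
                return bool(result), count
            ```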

        """

        failure_message = "No matching page element found with selector -> '{}' ({}){}{}".format(
            selector,
            selector_type,
            " and role type -> '{}'".format(role_type) if role_type else "",
            " in iframe -> '{}'".format(iframe) if iframe else "",
        )

        # Determine the locator for the elements:
        locator = self.get_locator(
            selector=selector,
            selector_type=selector_type,
            role_type=role_type,
            exact_match=exact_match,
            iframe=iframe,
        )
        if not locator:
            if show_error:
                self.logger.error(
                    "Failed to check if elements -> '%s' (%s) exist! Locator is undefined.", selector, selector_type
                )
            return (None, 0)

        self.logger.info(
            "Check if at least %d element%s found by selector -> '%s' (%s%s)%s%s%s...",
            min_count,
            "s are" if min_count > 1 else " is",
            selector,
            "selector type -> '{}'".format(selector_type),
            ", role type -> {}".format(role_type) if role_type else "",
            " with value -> '{}'".format(value) if value else "",
            " in attribute -> '{}'".format(attribute) if attribute and value else "",
            " in iframe -> '{}'".format(iframe) if iframe else "",
        )

        # Wait for the first element to reach the desired state. Don't immediately use
        # logic like locator.count(), as it does not wait but fails immediately:
        try:
            self.logger.info(
                "Wait for locator to find first matching element with selector -> '%s' (%s%s) and state -> '%s'%s...",
                selector,
                "selector type -> '{}'".format(selector_type),
                ", role type -> {}".format(role_type) if role_type else "",
                wait_state,
                " in iframe -> '{}'".format(iframe) if iframe else "",
            )
            self.logger.info("Locator count before waiting: %d", locator.count())

            # IMPORTANT: We wait for the FIRST element. Otherwise we get errors like
            # 'Locator.wait_for: Error: strict mode violation'.
            # IMPORTANT: If the first match does not comply with the
            # wait_state, this will block and then time out. Check your
            # selector to make sure it delivers a visible first element!
            locator.first.wait_for(state=wait_state)
        except PlaywrightError as e:
            # This is typically a timeout error indicating the element does not exist
            # in the defined timeout period.
            if show_error:
                self.logger.error("%s (timeout); error -> %s", failure_message, str(e))
            else:
                self.logger.warning("%s (timeout)", failure_message)
            return (None, 0)

        # Some server-side operations update the page dynamically with additional
        # matching elements, so an extra waiting time may be required:
        if wait_time > 0.0:
            self.logger.info("Wait additional %d milliseconds before checking...", wait_time * 1000)
            self.page.wait_for_timeout(wait_time * 1000)

        count = locator.count()
        if count == 0:
            if show_error:
                self.logger.error("No elements found using selector -> '%s' ('%s')", selector, selector_type)

            if self.take_screenshots:
                self.take_screenshot()

            return (None, 0)

        self.logger.info(
            "Found %s elements matching selector -> '%s' (%s%s).",
            count,
            selector,
            "selector type -> '{}'".format(selector_type),
            ", role type -> '{}'".format(role_type) if role_type else "",
        )

        if value:
            self.logger.info(
                "Checking if their %s %s -> '%s'...",
                "attribute -> '{}'".format(attribute) if attribute else "content",
                "has value" if not substring else "contains",
                value,
            )

        matching_elems = []

        # Iterate over all elements found by the locator and check if
        # they comply with the additional value conditions (if provided).
        # We collect all matching elements in a list:
        for i in range(count):
            elem = locator.nth(i)
            if not elem:
                continue

            if value is None:
                # If value is None we do no filtering, accept all elements:
                matching_elems.append(elem)
                continue

            # Get attribute or text content
            attr_value = elem.get_attribute(attribute) if attribute else elem.text_content()

            if not attr_value:
                # Nothing to compare with - continue:
                continue

            # If substring is True we check with "in" otherwise we use the equal operator (==):
            if (substring and value in attr_value) or (not substring and value == attr_value):
                matching_elems.append(elem)

        matching_elements_count = len(matching_elems)

        if matching_elements_count < min_count:
            success = False
            if show_error:
                self.logger.error(
                    "%s matching element%s found, expected at least %d",
                    "Only {}".format(matching_elements_count) if matching_elems else "No",
                    "s" if matching_elements_count > 1 else "",
                    min_count,
                )
        else:
            success = True
            self.logger.info(
                "Found %d matching elements.%s",
                matching_elements_count,
                " This is {} the minimum {} element{} probed for.".format(
                    "exactly" if matching_elements_count == min_count else "more than",
                    min_count,
                    "s" if min_count > 1 else "",
                ),
            )

        if self.take_screenshots:
            self.take_screenshot()

        return (success, matching_elements_count)

    # end method definition

    def run_login(
        self,
        user_field: str = "otds_username",
        password_field: str = "otds_password",
        login_button: str = "loginbutton",
        page: str = "",
        wait_until: str | None = None,
        selector_type: str = "id",
    ) -> bool:
        """Login to target system via the browser.

        Args:
            user_field (str, optional):
                The selector of the HTML field to enter the user name. Defaults to "otds_username".
            password_field (str, optional):
                The selector of the HTML field to enter the password. Defaults to "otds_password".
            login_button (str, optional):
                The selector of the HTML login button. Defaults to "loginbutton".
            page (str, optional):
                The URL to the login page. Defaults to "".
            wait_until (str | None, optional):
                Wait until a certain condition. Options are:
                * "commit" - does not wait at all - commit the request and continue
                * "load" - waits for the load event (after all resources like images/scripts load)
                  This is the safest strategy for pages that keep loading content in the background
                  like Salesforce.
                * "networkidle" - waits until there are no network connections for at least 500 ms.
                  This seems to be the safest one for OpenText Content Server.
                * "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed,
                  but subresources may still load).
            selector_type (str, optional):
                One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
                "label", "placeholder", "alt".
                Default is "id".

        Returns:
            bool:
                True = success, False = error.

        """

        # If no specific wait until strategy is provided in the
        # parameter, we take the one from the browser automation class:
        if wait_until is None:
            wait_until = self.wait_until

        self.logged_in = False

        if (
            not self.get_page(url=page, wait_until=wait_until)
            or not self.find_elem_and_set(selector=user_field, selector_type=selector_type, value=self.user_name)
            or not self.find_elem_and_set(
                selector=password_field, selector_type=selector_type, value=self.user_password, is_sensitive=True
            )
            or not self.find_elem_and_click(
                selector=login_button, selector_type=selector_type, is_navigation_trigger=True, wait_until=wait_until
            )
        ):
            self.logger.error(
                "Cannot log into target system using URL -> %s and user -> '%s'!",
                self.base_url,
                self.user_name,
            )
            return False

        self.logger.debug("Wait for -> '%s' to assure login is completed and target page is loaded.", wait_until)
        self.page.wait_for_load_state(wait_until)

        title = self.get_title()
        if not title:
            self.logger.error(
                "Cannot read page title after login - you may have the wrong 'wait until' strategy configured! Strategy used -> '%s'.",
                wait_until,
            )
            return False

        if "Verify" in title:
            self.logger.error("Site is asking for a verification token. You may need to whitelist your IP!")
            return False
        if "Login" in title:
            self.logger.error("Authentication failed. You may have given the wrong password!")
            return False

        self.logger.info("Login completed successfully! Page title -> '%s'", title)
        self.logged_in = True

        return True

    # end method definition

    def set_timeout(self, wait_time: float) -> None:
        """Set the default timeout for browser tasks (e.g. fully loading a page).

        This setting is valid for the whole browser session and not just
        for a single command.

        Args:
            wait_time (float):
                The time in seconds to wait.

        """

        self.logger.debug("Setting default timeout to -> %s seconds...", str(wait_time))
        self.page.set_default_timeout(wait_time * 1000)
        self.logger.debug("Setting navigation timeout to -> %s seconds...", str(wait_time))
        self.page.set_default_navigation_timeout(wait_time * 1000)

    # end method definition

    def end_session(self) -> None:
        """End the browser session and close the browser."""

        self.logger.info("Close browser page...")
        self.page.close()
        self.logger.info("Close browser context...")
        self.context.close()
        self.logger.info("Close browser...")
        self.browser.close()
        self.logged_in = False
        self.logger.info("Stop Playwright instance...")
        self.playwright.stop()
        self.logger.info("Browser automation has ended.")

    # end method definition

    def __enter__(self) -> object:
        """Enable use with 'with' statement (context manager block)."""

        return self

    # end method definition

    def __exit__(
        self, exc_type: type[BaseException] | None, exc_value: BaseException | None, traceback_obj: TracebackType | None
    ) -> None:
        """Handle cleanup when exiting a context manager block ('with' statement).

        Ensures all browser-related resources are released. If an unhandled exception
        occurs within the context block, it will be logged before cleanup.

        Args:
            exc_type (type[BaseException] | None):
                The class of the raised exception, if any.
            exc_value (BaseException | None):
                The exception instance raised, if any.
            traceback_obj (TracebackType | None):
                The traceback object associated with the exception, if any.

        """

        if exc_type is not None:
            self.logger.error(
                "Unhandled exception in browser automation context -> %s",
                "".join(traceback.format_exception(exc_type, exc_value, traceback_obj)),
            )

        self.end_session()
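
The title checks that close run_login in the class listing above can be read as a small decision function. The following is a minimal sketch of that order of checks only; `classify_login_title` and its return messages are hypothetical names introduced for illustration and are not part of the module.

```python
from __future__ import annotations


def classify_login_title(title: str | None) -> tuple[bool, str]:
    """Apply the same checks, in the same order, as the end of run_login."""
    if not title:
        # No title could be read: the 'wait until' strategy may be wrong.
        return (False, "no page title")
    if "Verify" in title:
        # The site asks for a verification token.
        return (False, "verification token requested")
    if "Login" in title:
        # Still on the login page, so authentication failed.
        return (False, "authentication failed")
    return (True, "login completed")
```

Note that the substring checks are case-sensitive, so a title containing only "login" in lower case would count as a successful login in this sketch.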

__enter__()

Enable use with 'with' statement (context manager block).

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def __enter__(self) -> object:
    """Enable use with 'with' statement (context manager block)."""

    return self

__exit__(exc_type, exc_value, traceback_obj)

Handle cleanup when exiting a context manager block ('with' statement).

Ensures all browser-related resources are released. If an unhandled exception occurs within the context block, it will be logged before cleanup.

Parameters:

Name Type Description Default
exc_type type[BaseException] | None

The class of the raised exception, if any.

required
exc_value BaseException | None

The exception instance raised, if any.

required
traceback_obj TracebackType | None

The traceback object associated with the exception, if any.

required
Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def __exit__(
    self, exc_type: type[BaseException] | None, exc_value: BaseException | None, traceback_obj: TracebackType | None
) -> None:
    """Handle cleanup when exiting a context manager block ('with' statement).

    Ensures all browser-related resources are released. If an unhandled exception
    occurs within the context block, it will be logged before cleanup.

    Args:
        exc_type (type[BaseException] | None):
            The class of the raised exception, if any.
        exc_value (BaseException | None):
            The exception instance raised, if any.
        traceback_obj (TracebackType | None):
            The traceback object associated with the exception, if any.

    """

    if exc_type is not None:
        self.logger.error(
            "Unhandled exception in browser automation context -> %s",
            "".join(traceback.format_exception(exc_type, exc_value, traceback_obj)),
        )

    self.end_session()
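
A short sketch of how the two context manager methods interact may help here. `FakeAutomation` is a hypothetical stand-in that only records calls instead of driving a browser; it copies the shape of `__enter__` and `__exit__` shown above. Because `__exit__` returns None, the exception is logged, cleanup runs, and the exception still propagates to the caller.

```python
import logging
import traceback

logger = logging.getLogger("context_manager_sketch")


class FakeAutomation:
    """Hypothetical stand-in that records calls; no browser is involved."""

    def __init__(self) -> None:
        self.events = []

    def end_session(self) -> None:
        self.events.append("end_session")

    def __enter__(self) -> "FakeAutomation":
        return self

    def __exit__(self, exc_type, exc_value, traceback_obj) -> None:
        if exc_type is not None:
            self.events.append("logged " + exc_type.__name__)
            logger.error(
                "Unhandled exception -> %s",
                "".join(traceback.format_exception(exc_type, exc_value, traceback_obj)),
            )
        # Cleanup always runs, with or without an exception.
        self.end_session()
        # Returning None (falsy) lets the exception propagate.


def run_failing_block():
    """Raise inside the 'with' block and report what happened."""
    automation = FakeAutomation()
    propagated = False
    try:
        with automation:
            raise RuntimeError("simulated failure inside the block")
    except RuntimeError:
        propagated = True
    return (automation.events, propagated)
```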

__init__(base_url='', user_name='', user_password='', download_directory=None, take_screenshots=False, automation_name='', headless=True, logger=default_logger, wait_until=None, browser=None)

Initialize the object.

Parameters:

Name Type Description Default
base_url str

The base URL of the website to automate. Defaults to "".

''
user_name str

If an authentication at the web site is required, this is the user name. Defaults to "".

''
user_password str

If an authentication at the web site is required, this is the user password. Defaults to "".

''
download_directory str | None

A download directory used for download links. If None, a temporary directory is automatically used.

None
take_screenshots bool

For debugging purposes, screenshots can be taken. Defaults to False.

False
automation_name str

The name of the automation. Defaults to "".

''
headless bool

If True, the browser will be started in headless mode. Defaults to True.

True
wait_until str | None

Wait until a certain condition. Options are:
* "commit" - does not wait at all - commit the request and continue
* "load" - waits for the load event (after all resources like images/scripts load)
* "networkidle" - waits until there are no network connections for at least 500 ms.
* "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed, but subresources may still load).

None
logger Logger

The logging object to use for all log messages. Defaults to default_logger.

default_logger
browser str | None

The browser to use. Defaults to None, which takes the global default or from the ENV "BROWSER".

None
Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def __init__(
    self,
    base_url: str = "",
    user_name: str = "",
    user_password: str = "",
    download_directory: str | None = None,
    take_screenshots: bool = False,
    automation_name: str = "",
    headless: bool = True,
    logger: logging.Logger = default_logger,
    wait_until: str | None = None,
    browser: str | None = None,
) -> None:
    """Initialize the object.

    Args:
        base_url (str, optional):
            The base URL of the website to automate. Defaults to "".
        user_name (str, optional):
            If an authentication at the web site is required, this is the user name.
            Defaults to "".
        user_password (str, optional):
            If an authentication at the web site is required, this is the user password.
            Defaults to "".
        download_directory (str | None, optional):
            A download directory used for download links. If None,
            a temporary directory is automatically used.
        take_screenshots (bool, optional):
            For debugging purposes, screenshots can be taken.
            Defaults to False.
        automation_name (str, optional):
            The name of the automation. Defaults to "".
        headless (bool, optional):
            If True, the browser will be started in headless mode. Defaults to True.
        wait_until (str | None, optional):
            Wait until a certain condition. Options are:
            * "commit" - does not wait at all - commit the request and continue
            * "load" - waits for the load event (after all resources like images/scripts load)
            * "networkidle" - waits until there are no network connections for at least 500 ms.
            * "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed,
              but subresources may still load).
        logger (logging.Logger, optional):
            The logging object to use for all log messages. Defaults to default_logger.
        browser (str | None, optional):
            The browser to use. Defaults to None, which takes the global default or from the ENV "BROWSER".

    """

    if not download_directory:
        download_directory = os.path.join(
            tempfile.gettempdir(),
            "browser_automations",
            self.sanitize_filename(filename=automation_name),
            "downloads",
        )

    if logger != default_logger:
        self.logger = logger.getChild("browserautomation")
        for logfilter in logger.filters:
            self.logger.addFilter(logfilter)

    self.base_url = base_url
    self.user_name = user_name
    self.user_password = user_password
    self.logged_in = False
    self.download_directory = download_directory
    self.headless = headless

    # Screenshot configurations:
    self.take_screenshots = take_screenshots
    self.screenshot_names = self.sanitize_filename(filename=automation_name)
    self.screenshot_counter = 1
    self.screenshot_full_page = True

    self.wait_until = wait_until if wait_until else DEFAULT_WAIT_UNTIL_STRATEGY

    self.screenshot_directory = os.path.join(
        tempfile.gettempdir(),
        "browser_automations",
        self.screenshot_names,
        "screenshots",
    )
    self.logger.debug("Creating screenshot directory... -> %s", self.screenshot_directory)
    if self.take_screenshots and not os.path.exists(self.screenshot_directory):
        os.makedirs(self.screenshot_directory)

    self.proxy = None
    if os.getenv("HTTP_PROXY"):
        self.proxy = {
            "server": os.getenv("HTTP_PROXY"),
        }
        self.logger.info("Using HTTP proxy -> %s", os.getenv("HTTP_PROXY"))

    browser = browser or os.getenv("BROWSER", "webkit")
    self.logger.info("Using browser -> '%s'...", browser)

    if not self.setup_playwright(browser=browser):
        self.logger.error("Failed to initialize Playwright browser automation!")
        return

    self.logger.info("Creating browser context...")
    self.context: BrowserContext = self.browser.new_context(
        accept_downloads=True,
    )

    self.logger.info("Creating page...")
    self.page: Page = self.context.new_page()
    self.main_page = self.page
    self.logger.info("Browser automation initialized.")
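
The defaults resolved in the constructor can be summarized in a few lines. This is a sketch under stated assumptions: `resolve_browser` and `default_download_directory` are hypothetical helper names, the environment is passed in as a plain dict to keep the example side-effect free, and the automation name is assumed to be already sanitized (the module uses its own `sanitize_filename` for that).

```python
from __future__ import annotations

import os
import tempfile


def resolve_browser(browser: str | None, env: dict[str, str]) -> str:
    """Explicit argument first, then the BROWSER variable, then 'webkit'."""
    return browser or env.get("BROWSER", "webkit")


def default_download_directory(automation_name: str) -> str:
    """Build the fallback download path below the system temp directory."""
    return os.path.join(
        tempfile.gettempdir(),
        "browser_automations",
        automation_name,
        "downloads",
    )
```

The screenshot directory follows the same pattern with "screenshots" as the last path component.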

check_elems_exist(selector, selector_type='id', role_type=None, value=None, exact_match=None, attribute=None, substring=True, iframe=None, min_count=1, wait_time=0.0, wait_state='visible', show_error=True)

Check if (multiple) elements with defined attributes exist on page and return the number.

Parameters:

Name Type Description Default
selector str

The selector to find the element on the page.

required
selector_type str

One of "id", "name", "class_name", "xpath", "css", "role", "text", "title", "label", "placeholder", "alt". When using css, the selector becomes a raw CSS selector, and you can skip attribute and value filtering entirely if your selector already narrows it down. Examples for CSS:
* selector="img" - find all img tags (images)
* selector="img[title]" - find all img tags (images) that have a title attribute - independent of its value
* selector="img[title*='Microsoft Teams']" - find all images with a title that contains "Microsoft Teams"
* selector=".toolbar button" - find all buttons inside a .toolbar class

'id'
role_type str | None

ARIA role when using selector_type="role", e.g., "button", "textbox". If irrelevant then None should be passed for role_type.

None
value str

Value to match in attribute or element content.

None
exact_match bool | None

If an exact matching is required. Default is None (not set).

None
attribute str

Attribute name to inspect. If None, uses element's text.

None
substring bool

If True, allow partial match.

True
iframe str | None

Is the element in an iFrame? Then provide the name of the iframe with this parameter.

None
min_count int

Minimum number of required matches (# elements on page).

1
wait_time float

Additional time in seconds to wait after the first element has been found, before the matches are counted.

0.0
wait_state str

Defines if we wait for attached (element is part of DOM) or if we wait for elem to be visible (attached, displayed, and has non-zero size).

'visible'
show_error bool

If True, failures are logged as errors. If False, a timeout is logged as a warning and other failures are not logged.

True

Returns:

Name Type Description
bool | None

True if sufficient elements exist, False otherwise. None if the locator is undefined, the wait times out, or the selector matches no element.

int int

Number of matched elements.
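
Because the first tuple element has three states, a plain truthiness test would merge "too few matches" (False) with "check could not be completed" (None). The sketch below shows one way a caller might tell them apart; `describe_check_result` is a hypothetical helper, not part of the module.

```python
from __future__ import annotations


def describe_check_result(result: tuple[bool | None, int], min_count: int = 1) -> str:
    """Turn the (success, count) tuple into a readable status line."""
    success, count = result
    if success is None:
        # Locator undefined, wait timed out, or selector matched nothing.
        return "check not completed"
    if success:
        return f"ok: {count} found, {min_count} required"
    return f"too few: {count} found, {min_count} required"
```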

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def check_elems_exist(
    self,
    selector: str,
    selector_type: str = "id",
    role_type: str | None = None,
    value: str | None = None,
    exact_match: bool | None = None,
    attribute: str | None = None,
    substring: bool = True,
    iframe: str | None = None,
    min_count: int = 1,
    wait_time: float = 0.0,
    wait_state: str = "visible",
    show_error: bool = True,
) -> tuple[bool | None, int]:
    """Check if (multiple) elements with defined attributes exist on page and return the number.

    Args:
        selector (str):
            The selector to find the element on the page.
        selector_type (str):
            One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
            "label", "placeholder", "alt".
            When using css, the selector becomes a raw CSS selector, and you can skip attribute
            and value filtering entirely if your selector already narrows it down.
            Examples for CSS:
            * selector="img" - find all img tags (images)
            * selector="img[title]" - find all img tags (images) that have a title attribute - independent of its value
            * selector="img[title*='Microsoft Teams']" - find all images with a title that contains "Microsoft Teams"
            * selector=".toolbar button" - find all buttons inside a .toolbar class
        role_type (str | None, optional):
            ARIA role when using selector_type="role", e.g., "button", "textbox".
            If irrelevant then None should be passed for role_type.
        value (str, optional):
            Value to match in attribute or element content.
        exact_match (bool | None, optional):
            If an exact matching is required. Default is None (not set).
        attribute (str, optional):
            Attribute name to inspect. If None, uses element's text.
        substring (bool):
            If True, allow partial match.
        iframe (str | None):
            Is the element in an iFrame? Then provide the name of the iframe with this parameter.
        min_count (int):
            Minimum number of required matches (# elements on page).
        wait_time (float):
            Additional time in seconds to wait after the first element has been found,
            before the matches are counted.
        wait_state (str, optional):
            Defines if we wait for attached (element is part of DOM) or
            if we wait for elem to be visible (attached, displayed, and has non-zero size).
        show_error (bool):
            If True, failures are logged as errors. If False, a timeout is logged
            as a warning and other failures are not logged.

    Returns:
        bool | None:
            True if sufficient elements exist. False otherwise.
            None if the locator is undefined, the wait times out,
            or the selector matches no element.
        int:
            Number of matched elements.

    """

    failure_message = "No matching page element found with selector -> '{}' ({}){}{}".format(
        selector,
        selector_type,
        " and role type -> '{}'".format(role_type) if role_type else "",
        " in iframe -> '{}'".format(iframe) if iframe else "",
    )

    # Determine the locator for the elements:
    locator = self.get_locator(
        selector=selector,
        selector_type=selector_type,
        role_type=role_type,
        exact_match=exact_match,
        iframe=iframe,
    )
    if not locator:
        if show_error:
            self.logger.error(
                "Failed to check if elements -> '%s' (%s) exist! Locator is undefined.", selector, selector_type
            )
        return (None, 0)

    self.logger.info(
        "Check if at least %d element%s found by selector -> '%s' (%s%s)%s%s%s...",
        min_count,
        "s are" if min_count > 1 else " is",
        selector,
        "selector type -> '{}'".format(selector_type),
        ", role type -> {}".format(role_type) if role_type else "",
        " with value -> '{}'".format(value) if value else "",
        " in attribute -> '{}'".format(attribute) if attribute and value else "",
        " in iframe -> '{}'".format(iframe) if iframe else "",
    )

    # Wait for the element to be visible - don't immediately use logic like
    # locator.count() as this does not wait but fails immediately
    try:
        self.logger.info(
            "Wait for locator to find first matching element with selector -> '%s' (%s%s) and state -> '%s'%s...",
            selector,
            "selector type -> '{}'".format(selector_type),
            ", role type -> {}".format(role_type) if role_type else "",
            wait_state,
            " in iframe -> '{}'".format(iframe) if iframe else "",
        )
        self.logger.info("Locator count before waiting: %d", locator.count())

        # IMPORTANT: We wait for the FIRST element. otherwise we get errors like
        # 'Locator.wait_for: Error: strict mode violation'.
        # IMPORTANT: if the first match does not comply to the
        # wait_state this will block and then timeout. Check your
        # selector to make sure it delivers a visible first element!
        locator.first.wait_for(state=wait_state)
    except PlaywrightError as e:
        # This is typically a timeout error indicating the element does not exist
        # in the defined timeout period.
        if show_error:
            self.logger.error("%s (timeout); error -> %s", failure_message, str(e))
        else:
            self.logger.warning("%s (timeout)", failure_message)
        return (None, 0)

    # Some server-side operations update the page dynamically with additional
    # matching elements, so an extra waiting time may be required:
    if wait_time > 0.0:
        self.logger.info("Wait additional %d milliseconds before checking...", wait_time * 1000)
        self.page.wait_for_timeout(wait_time * 1000)

    count = locator.count()
    if count == 0:
        if show_error:
            self.logger.error("No elements found using selector -> '%s' ('%s')", selector, selector_type)

        if self.take_screenshots:
            self.take_screenshot()

        return (None, 0)

    self.logger.info(
        "Found %s elements matching selector -> '%s' (%s%s).",
        count,
        selector,
        "selector type -> '{}'".format(selector_type),
        ", role type -> '{}'".format(role_type) if role_type else "",
    )

    if value:
        self.logger.info(
            "Checking if their %s %s -> '%s'...",
            "attribute -> '{}'".format(attribute) if attribute else "content",
            "has value" if not substring else "contains",
            value,
        )

    matching_elems = []

    # Iterate over all elements found by the locator and check if
    # they comply with the additional value conditions (if provided).
    # We collect all matching elements in a list:
    for i in range(count):
        elem = locator.nth(i)
        if not elem:
            continue

        if value is None:
            # If value is None we do no filtering, accept all elements:
            matching_elems.append(elem)
            continue

        # Get attribute or text content
        attr_value = elem.get_attribute(attribute) if attribute else elem.text_content()

        if not attr_value:
            # Nothing to compare with - continue:
            continue

        # If substring is True we check with "in" otherwise we use the equal operator (==):
        if (substring and value in attr_value) or (not substring and value == attr_value):
            matching_elems.append(elem)

    matching_elements_count = len(matching_elems)

    if matching_elements_count < min_count:
        success = False
        if show_error:
            self.logger.error(
                "%s matching element%s found, expected at least %d",
                "Only {}".format(matching_elements_count) if matching_elems else "No",
                "s" if matching_elements_count > 1 else "",
                min_count,
            )
    else:
        success = True
        self.logger.info(
            "Found %d matching elements.%s",
            matching_elements_count,
            " This is {} the minimum {} element{} probed for.".format(
                "exactly" if matching_elements_count == min_count else "more than",
                min_count,
                "s" if min_count > 1 else "",
            ),
        )

    if self.take_screenshots:
        self.take_screenshot()

    return (success, matching_elements_count)
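
The per-element filter in the loop above can be isolated from Playwright and tried on plain strings. This is a minimal sketch of the same rules; `value_matches` and `count_matches` are hypothetical names, and the candidate strings stand in for what `get_attribute()` or `text_content()` would return.

```python
from __future__ import annotations


def value_matches(candidate: str | None, value: str | None, substring: bool = True) -> bool:
    """Apply the filter rules of the element loop to a single candidate."""
    if value is None:
        # No value given: every element is accepted without inspection.
        return True
    if not candidate:
        # Missing or empty attribute/text: nothing to compare with.
        return False
    return value in candidate if substring else value == candidate


def count_matches(candidates: list[str | None], value: str | None, substring: bool = True) -> int:
    """Count the candidates that pass the filter."""
    return sum(1 for candidate in candidates if value_matches(candidate, value, substring))
```

With `substring=True` (the default) "Teams" matches both "Teams" and "Microsoft Teams"; with `substring=False` only the exact string matches.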

end_session()

End the browser session and close the browser.

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def end_session(self) -> None:
    """End the browser session and close the browser."""

    self.logger.info("Close browser page...")
    self.page.close()
    self.logger.info("Close browser context...")
    self.context.close()
    self.logger.info("Close browser...")
    self.browser.close()
    self.logged_in = False
    self.logger.info("Stop Playwright instance...")
    self.playwright.stop()
    self.logger.info("Browser automation has ended.")
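
end_session releases resources in the reverse order of their creation: page, context, browser, then the Playwright instance. The sketch below illustrates that ordering with `contextlib.ExitStack` from the standard library, which runs registered callbacks last-in, first-out. This is an alternative illustration of the ordering, not how the module implements it, and the resource names are plain strings rather than real Playwright objects.

```python
from contextlib import ExitStack


def teardown_order() -> list:
    """Register resources in creation order and record the close order."""
    closed = []
    with ExitStack() as stack:
        for resource in ("playwright", "browser", "context", "page"):
            # Callbacks run in reverse registration order when the block exits.
            stack.callback(closed.append, resource)
    return closed
```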

find_elem(selector, selector_type='id', role_type=None, wait_state='visible', exact_match=None, regex=False, occurrence=1, iframe=None, repeat_reload=None, repeat_reload_delay=60, show_error=True)

Find a page element.

Parameters:

Name Type Description Default
selector str

The name of the page element or accessible name (for role).

required
selector_type str

One of "id", "name", "class_name", "xpath", "css", "role", "text", "title", "label", "placeholder", "alt".

'id'
role_type str | None

ARIA role when using selector_type="role", e.g., "button", "textbox". If irrelevant then None should be passed for role_type.

None
wait_state str

Defines if we wait for attached (element is part of DOM) or if we wait for elem to be visible (attached, displayed, and has non-zero size).

'visible'
exact_match bool | None

If an exact matching is required. Default is None (not set).

None
regex bool

Should the name be interpreted as a regular expression?

False
occurrence int

If multiple elements match the selector, this defines which one to return. Default is 1 (the first one).

1
iframe str | None

Is the element in an iFrame? Then provide the name of the iframe with this parameter.

None
repeat_reload int | None

For pages that are not dynamically updated and require a reload to show an update, the number of page reloads can be configured.

None
repeat_reload_delay int

Number of seconds to wait between page reloads.

60
show_error bool

Show an error if not found or not visible.

True

Returns:

Name Type Description
Locator Locator | None

The web element or None in case an error occurred.
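
The `occurrence` parameter is 1-based, while element positions in a locator are 0-based. A minimal sketch of the conversion follows; `occurrence_to_index` is a hypothetical helper, and returning None for invalid input is an assumption made for this example only.

```python
from __future__ import annotations


def occurrence_to_index(occurrence: int) -> int | None:
    """Convert a 1-based occurrence into a 0-based index."""
    index = occurrence - 1
    if index < 0:
        # Occurrence must be >= 1; None signals invalid input (assumption).
        return None
    return index
```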

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def find_elem(
    self,
    selector: str,
    selector_type: str = "id",
    role_type: str | None = None,
    wait_state: str = "visible",
    exact_match: bool | None = None,
    regex: bool = False,
    occurrence: int = 1,
    iframe: str | None = None,
    repeat_reload: int | None = None,
    repeat_reload_delay: int = 60,
    show_error: bool = True,
) -> Locator | None:
    """Find a page element.

    Args:
        selector (str):
            The name of the page element or accessible name (for role).
        selector_type (str, optional):
            One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
            "label", "placeholder", "alt".
        role_type (str | None, optional):
            ARIA role when using selector_type="role", e.g., "button", "textbox".
            If irrelevant then None should be passed for role_type.
        wait_state (str, optional):
            Defines if we wait for attached (element is part of DOM) or
            if we wait for elem to be visible (attached, displayed, and has non-zero size).
        exact_match (bool | None, optional):
            If an exact matching is required. Default is None (not set).
        regex (bool, optional):
            Should the name be interpreted as a regular expression?
        occurrence (int, optional):
            If multiple elements match the selector, this defines which one to return.
            Default is 1 (the first one).
        iframe (str | None):
            Is the element in an iFrame? Then provide the name of the iframe with this parameter.
        repeat_reload (int | None):
            For pages that are not dynamically updated and require a reload to show an update,
            a number of page reloads can be configured.
        repeat_reload_delay (float | None):
            Number of seconds to wait before each page reload.
        show_error (bool, optional):
            Show an error if not found or not visible.


    Returns:
        Locator:
            The web element or None in case an error occurred.

    """

    failure_message = "Cannot find page element with selector -> '{}' ({}){}{}{}".format(
        selector,
        selector_type,
        " and role type -> '{}'".format(role_type) if role_type else "",
        " in iframe -> '{}'".format(iframe) if iframe else "",
        ", occurrence -> {}".format(occurrence) if occurrence > 1 else "",
    )
    success_message = "Found page element with selector -> '{}' ('{}'){}{}{}".format(
        selector,
        selector_type,
        " and role type -> '{}'".format(role_type) if role_type else "",
        " in iframe -> '{}'".format(iframe) if iframe else "",
        ", occurrence -> {}".format(occurrence) if occurrence > 1 else "",
    )

    def do_find() -> Locator | None:
        # Determine the locator for the element:
        locator = self.get_locator(
            selector=selector,
            selector_type=selector_type,
            role_type=role_type,
            exact_match=exact_match,
            iframe=iframe,
            regex=regex,
        )
        if not locator:
            if show_error:
                self.logger.error(failure_message)
            else:
                self.logger.warning(failure_message)
            return None

        # Wait for the element to be visible - don't use logic like
        # locator.count() as this does not wait but fails immediately if elements
        # are not yet loaded:

        try:
            index = occurrence - 1  # convert to 0-based index
            if index < 0:  # basic validation
                self.logger.error("Occurrence must be >= 1")
                return None
            self.logger.debug(
                "Wait for locator to find %selement with selector -> '%s' (%s%s%s) and state -> '%s'%s...",
                "occurrence #{} of ".format(occurrence) if occurrence > 1 else "",
                selector,
                "selector type -> '{}'".format(selector_type),
                ", role type -> '{}'".format(role_type) if role_type else "",
                ", using regular expression" if regex else "",
                wait_state,
                " in iframe -> '{}'".format(iframe) if iframe else "",
            )

            locator = locator.first if occurrence == 1 else locator.nth(index)
            # Wait for the element to be in the desired state:
            locator.wait_for(state=wait_state)
        except PlaywrightError as pe:
            if show_error and repeat_reload is None:
                self.logger.error("%s (%s)", failure_message, str(pe))
            else:
                self.logger.warning("%s", failure_message)
            return None
        else:
            self.logger.debug(success_message)

        return locator

    # end def do_find():

    locator = do_find()

    # Retry logic for pages that are not updated dynamically:
    if locator is None and repeat_reload is not None:
        for i in range(repeat_reload):
            self.logger.warning(
                "Wait %f seconds before reloading page -> %s to retrieve updates from server...",
                repeat_reload_delay,
                self.page.url,
            )
            time.sleep(repeat_reload_delay)
            self.logger.warning(
                "Reloading page -> %s (retry %d) to retrieve updates from server...", self.page.url, i + 1
            )
            self.page.reload()
            locator = do_find()
            if locator:
                break
        else:
            self.logger.error(failure_message)

    return locator
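The reload-and-retry behavior can be looked at in isolation from Playwright. The following is a simplified sketch, not the module's code: `find_with_reload`, `fake_find`, and `fake_reload` are hypothetical names, and the injectable `sleep` parameter is an assumption added so the example runs without real delays. It also omits the logging that `find_elem` performs.

```python
import time
from typing import Callable, Optional


def find_with_reload(
    find: Callable[[], Optional[object]],
    reload: Callable[[], None],
    repeat_reload: Optional[int] = None,
    repeat_reload_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[object]:
    """Try to find once; if that fails, wait, reload, and retry."""
    result = find()
    if result is None and repeat_reload is not None:
        for _ in range(repeat_reload):
            sleep(repeat_reload_delay)  # wait before each reload
            reload()
            result = find()
            if result is not None:
                break
    return result


# Simulated page (hypothetical): the element appears after two reloads.
state = {"reloads": 0}


def fake_find() -> Optional[str]:
    return "element" if state["reloads"] >= 2 else None


def fake_reload() -> None:
    state["reloads"] += 1


found = find_with_reload(fake_find, fake_reload, repeat_reload=3, repeat_reload_delay=0.0)
print(found, state["reloads"])
```

With `repeat_reload=None` only the initial attempt is made, which matches the default behavior described for `find_elem`.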

find_elem_and_click(selector, selector_type='id', role_type=None, occurrence=1, scroll_to_element=True, desired_checkbox_state=None, is_navigation_trigger=False, is_popup_trigger=False, is_page_close_trigger=False, wait_until=None, wait_time=0.0, exact_match=None, regex=False, hover_only=False, iframe=None, force=None, click_button=None, click_count=None, click_modifiers=None, repeat_reload=None, repeat_reload_delay=60.0, show_error=True)

Find a page element and click it.

Parameters:

Name Type Description Default
selector str

The selector of the page element.

required
selector_type str

One of "id", "name", "class_name", "xpath", "css", "role", "text", "title", "label", "placeholder", "alt".

'id'
role_type str | None

ARIA role when using selector_type="role", e.g., "button", "textbox". If irrelevant then None should be passed for role_type.

None
occurrence int

If multiple elements match the selector, this defines which one to return. Default is 1 (the first one).

1
scroll_to_element bool

Scroll the element into view.

True
desired_checkbox_state bool | None

If True/False, ensures checkbox matches state. If None then click it in any case.

None
is_navigation_trigger bool

Is the click causing a navigation? Default is False.

False
is_popup_trigger bool

Is the click causing a new browser window to open?

False
is_page_close_trigger bool

Is the click causing the page to close?

False
wait_until str | None

Wait until a certain condition. Options are:
* "commit" - does not wait at all - commit the request and continue
* "load" - waits for the load event (after all resources like images/scripts load). This is the safest strategy for pages that keep loading content in the background like Salesforce.
* "networkidle" - waits until there are no network connections for at least 500 ms. This seems to be the safest one for OpenText Content Server.
* "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed, but subresources may still load).

None
wait_time float

Time in seconds to wait for elements to appear.

0.0
exact_match bool | None

If an exact matching is required. Default is None (not set).

None
regex bool

Should the name be interpreted as a regular expression?

False
hover_only bool

Should we only hover over the element and not click it? Helpful for menus that are opening on hovering.

False
iframe str | None

Is the element in an iFrame? Then provide the name of the iframe with this parameter.

None
force bool | None

If sure the element is interactable and visible (even partly), you can bypass visibility checks by setting this option to True. Default is None (undefined, i.e. using the playwright default which is False)

None
click_button Literal[left, middle, right] | None

Which mouse button to use to do the click. The default is "left". This will be used by playwright if None is passed.

None
click_count int | None

Number of clicks. E.g. 2 for a "double-click".

None
click_modifiers list | None

Key pressed together with the mouse click. Possible values: 'Alt', 'Control', 'ControlOrMeta', 'Meta', 'Shift'. Default is None = no key pressed.

None
repeat_reload int | None

For pages that are not dynamically updated and require a reload to show an update, a number of page reloads can be configured.

None
repeat_reload_delay float | None

Number of seconds to wait before each page reload.

60.0
show_error bool

Show an error if the element is not found or not clickable.

True

Returns:

Name Type Description
bool bool

True if click is successful (or checkbox already in desired state), False otherwise.

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def find_elem_and_click(
    self,
    selector: str,
    selector_type: str = "id",
    role_type: str | None = None,
    occurrence: int = 1,
    scroll_to_element: bool = True,
    desired_checkbox_state: bool | None = None,
    is_navigation_trigger: bool = False,
    is_popup_trigger: bool = False,
    is_page_close_trigger: bool = False,
    wait_until: str | None = None,
    wait_time: float = 0.0,
    exact_match: bool | None = None,
    regex: bool = False,
    hover_only: bool = False,
    iframe: str | None = None,
    force: bool | None = None,
    click_button: str | None = None,
    click_count: int | None = None,
    click_modifiers: list | None = None,
    repeat_reload: int | None = None,
    repeat_reload_delay: float = 60.0,
    show_error: bool = True,
) -> bool:
    """Find a page element and click it.

    Args:
        selector (str):
            The selector of the page element.
        selector_type (str, optional):
            One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
            "label", "placeholder", "alt".
        role_type (str | None, optional):
            ARIA role when using selector_type="role", e.g., "button", "textbox".
            If irrelevant then None should be passed for role_type.
        occurrence (int, optional):
            If multiple elements match the selector, this defines which one to return.
            Default is 1 (the first one).
        scroll_to_element (bool, optional):
            Scroll the element into view.
        desired_checkbox_state (bool | None, optional):
            If True/False, ensures checkbox matches state.
            If None then click it in any case.
        is_navigation_trigger (bool, optional):
            Is the click causing a navigation? Default is False.
        is_popup_trigger (bool, optional):
            Is the click causing a new browser window to open?
        is_page_close_trigger (bool, optional):
            Is the click causing the page to close?
        wait_until (str | None, optional):
            Wait until a certain condition. Options are:
            * "commit" - does not wait at all - commit the request and continue
            * "load" - waits for the load event (after all resources like images/scripts load)
              This is the safest strategy for pages that keep loading content in the background
              like Salesforce.
            * "networkidle" - waits until there are no network connections for at least 500 ms.
              This seems to be the safest one for OpenText Content Server.
            * "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed,
              but subresources may still load).
        wait_time (float):
            Time in seconds to wait for elements to appear.
        exact_match (bool | None, optional):
            If an exact matching is required. Default is None (not set).
        regex (bool, optional):
            Should the name be interpreted as a regular expression?
        hover_only (bool, optional):
            Should we only hover over the element and not click it? Helpful for
            menus that are opening on hovering.
        iframe (str | None, optional):
            Is the element in an iFrame? Then provide the name of the iframe with this parameter.
        force (bool | None, optional):
            If sure the element is interactable and visible (even partly), you can bypass visibility checks
            by setting this option to True. Default is None (undefined, i.e. using the playwright default which is False)
        click_button (Literal['left', 'middle', 'right'] | None, optional):
            Which mouse button to use to do the click. The default is "left". This will be used by playwright if None
            is passed.
        click_count (int | None, optional):
            Number of clicks. E.g. 2 for a "double-click".
        click_modifiers (list | None, optional):
            Key pressed together with the mouse click.
            Possible values: 'Alt', 'Control', 'ControlOrMeta', 'Meta', 'Shift'.
            Default is None = no key pressed.
        repeat_reload (int | None):
            For pages that are not dynamically updated and require a reload to show an update,
            a number of page reloads can be configured.
        repeat_reload_delay (float | None):
            Number of seconds to wait before each page reload.
        show_error (bool, optional):
            Show an error if the element is not found or not clickable.

    Returns:
        bool:
            True if click is successful (or checkbox already in desired state),
            False otherwise.

    """

    success = True  # Final return value

    # If no specific wait until strategy is provided in the
    # parameter, we take the one from the browser automation class:
    if wait_until is None:
        wait_until = self.wait_until

    # Some operations that are done server-side and dynamically update
    # the page may require a waiting time:
    if wait_time > 0.0:
        self.logger.info("Wait for %d milliseconds before clicking...", wait_time * 1000)
        self.page.wait_for_timeout(wait_time * 1000)

    if not selector:
        failure_message = "Missing element selector! Cannot find page element!"
        if show_error:
            self.logger.error(failure_message)
        else:
            self.logger.warning(failure_message)
        return False

    elem = self.find_elem(
        selector=selector,
        selector_type=selector_type,
        role_type=role_type,
        exact_match=exact_match,
        regex=regex,
        occurrence=occurrence,
        iframe=iframe,
        repeat_reload=repeat_reload,
        repeat_reload_delay=repeat_reload_delay,
        show_error=show_error,
    )
    if not elem:
        return not show_error

    try:
        if scroll_to_element:
            self.scroll_to_element(elem)

        # Handle checkboxes if requested:
        if desired_checkbox_state is not None and elem.get_attribute("type") == "checkbox":
            # Let Playwright handle checkbox state:
            elem.set_checked(desired_checkbox_state)
            self.logger.debug("Set checkbox -> '%s' to value -> %s.", selector, desired_checkbox_state)
        # Handle non-checkboxes:
        else:
            # Will this click trigger a navigation?
            if is_navigation_trigger:
                self.logger.debug(
                    "Clicking on navigation-triggering element -> '%s' (%s%s) and wait until -> '%s'...",
                    selector,
                    "selector type -> '{}'".format(selector_type),
                    ", role type -> '{}'".format(role_type) if role_type else "",
                    wait_until,
                )
                with self.page.expect_navigation(wait_until=wait_until):
                    elem.click(force=force, button=click_button, click_count=click_count, modifiers=click_modifiers)
            # Will this click trigger a new popup window?
            elif is_popup_trigger:
                with self.page.expect_popup() as popup_info:
                    elem.click(force=force, button=click_button, click_count=click_count, modifiers=click_modifiers)
                if not popup_info or not popup_info.value:
                    self.logger.info("Popup window did not open as expected!")
                    success = False
                else:
                    self.page = popup_info.value
                    self.logger.info("Move browser automation to popup window -> %s...", self.page.url)
            elif hover_only:
                self.logger.debug(
                    "Hovering over element -> '%s' (%s%s)...",
                    selector,
                    "selector type -> '{}'".format(selector_type),
                    ", role type -> '{}'".format(role_type) if role_type else "",
                )
                elem.hover()
            else:
                self.logger.debug(
                    "Clicking on non-navigating element -> '%s' (%s%s)...",
                    selector,
                    "selector type -> '{}'".format(selector_type),
                    ", role type -> '{}'".format(role_type) if role_type else "",
                )
                elem.click(force=force, button=click_button, click_count=click_count, modifiers=click_modifiers)
                time.sleep(1)
            if success:
                self.logger.debug(
                    "Successfully %s element -> '%s' (%s%s)",
                    "clicked" if not hover_only else "hovered over",
                    selector,
                    "selector type -> '{}'".format(selector_type),
                    ", role type -> '{}'".format(role_type) if role_type else "",
                )

    except PlaywrightError as e:
        if show_error:
            self.logger.error(
                "Cannot click page element -> '%s' (%s); error -> %s", selector, selector_type, str(e)
            )
        else:
            self.logger.warning(
                "Cannot click page element -> '%s' (%s); warning -> %s", selector, selector_type, str(e)
            )
        success = not show_error

    if is_page_close_trigger:
        if self.page == self.main_page:
            self.logger.error("Unexpected try to close main page! Popup page not active! This is not supported!")
            success = False
        else:
            self.page = self.main_page
            self.logger.info("Move browser automation back to main window -> %s...", self.page.url)

    if self.take_screenshots:
        self.take_screenshot()

    return success
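The order of the branches in `find_elem_and_click` decides which flag wins when several are set. The sketch below restates that precedence as a pure function. It is an illustration only: `choose_action` and its return labels are hypothetical names, and `is_checkbox` stands in for the `elem.get_attribute("type") == "checkbox"` check.

```python
from typing import Optional


def choose_action(
    desired_checkbox_state: Optional[bool],
    is_checkbox: bool,
    is_navigation_trigger: bool = False,
    is_popup_trigger: bool = False,
    hover_only: bool = False,
) -> str:
    """Return a label for the branch that would handle the element."""
    # Checkbox handling only applies if a desired state was requested:
    if desired_checkbox_state is not None and is_checkbox:
        return "set_checked"
    # Navigation and popup triggers are checked before hover_only:
    if is_navigation_trigger:
        return "click_and_expect_navigation"
    if is_popup_trigger:
        return "click_and_expect_popup"
    if hover_only:
        return "hover"
    return "click"


print(choose_action(True, is_checkbox=True))
print(choose_action(None, is_checkbox=True))
print(choose_action(None, is_checkbox=False, is_popup_trigger=True, hover_only=True))
```

One consequence of this ordering is that `hover_only=True` has no effect when `is_navigation_trigger` or `is_popup_trigger` is also set; the element is clicked in those cases.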

find_elem_and_set(selector, value, selector_type='id', role_type=None, occurrence=1, is_sensitive=False, press_enter=False, exact_match=None, regex=False, iframe=None, typing=False, show_error=True)

Find a page element and fill it with new text.

Parameters:

Name Type Description Default
selector str

The name of the page element.

required
value str | bool

The new value for the page element (a text string, or a boolean for checkboxes).

required
selector_type str

One of "id", "name", "class_name", "xpath", "css", "role", "text", "title", "label", "placeholder", "alt".

'id'
role_type str | None

ARIA role when using selector_type="role", e.g., "button", "textbox". If irrelevant then None should be passed for role_type.

None
occurrence int

If multiple elements match the selector, this defines which one to return. Default is 1 (the first one).

1
is_sensitive bool

True for suppressing sensitive information in logging.

False
press_enter bool

Whether or not to press "Enter" after entering the value.

False
exact_match bool | None

If an exact matching is required. Default is None (not set).

None
regex bool

Should the name be interpreted as a regular expression?

False
iframe str | None

Is the element in an iFrame? Then provide the name of the iframe with this parameter.

None
typing bool

Not just set the value of the elem but simulate real typing. This is required for pages with fields that do react in a "typeahead" manner.

False
show_error bool

Show an error if the element is not found or not clickable.

True

Returns:

Name Type Description
bool bool

True if successful, False otherwise

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def find_elem_and_set(
    self,
    selector: str,
    value: str | bool,
    selector_type: str = "id",
    role_type: str | None = None,
    occurrence: int = 1,
    is_sensitive: bool = False,
    press_enter: bool = False,
    exact_match: bool | None = None,
    regex: bool = False,
    iframe: str | None = None,
    typing: bool = False,
    show_error: bool = True,
) -> bool:
    """Find a page element and fill it with new text.

    Args:
        selector (str):
            The name of the page element.
        value (str | bool):
            The new value for the page element (a text string, or a boolean for checkboxes).
        selector_type (str, optional):
            One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
            "label", "placeholder", "alt".
        role_type (str | None, optional):
            ARIA role when using selector_type="role", e.g., "button", "textbox".
            If irrelevant then None should be passed for role_type.
        occurrence (int, optional):
            If multiple elements match the selector, this defines which one to return.
            Default is 1 (the first one).
        is_sensitive (bool, optional):
            True for suppressing sensitive information in logging.
        press_enter (bool, optional):
            Whether or not to press "Enter" after entering the value.
        exact_match (bool | None, optional):
            If an exact matching is required. Default is None (not set).
        regex (bool, optional):
            Should the name be interpreted as a regular expression?
        iframe (str | None):
            Is the element in an iFrame? Then provide the name of the iframe with this parameter.
        typing (bool, optional):
            Not just set the value of the elem but simulate real typing.
            This is required for pages with fields that do react in a "typeahead" manner.
        show_error (bool, optional):
            Show an error if the element is not found or not clickable.

    Returns:
        bool:
            True if successful, False otherwise

    """

    success = False  # Final return value

    elem = self.find_elem(
        selector=selector,
        selector_type=selector_type,
        role_type=role_type,
        exact_match=exact_match,
        regex=regex,
        occurrence=occurrence,
        iframe=iframe,
        show_error=show_error,
    )
    if not elem:
        return not show_error

    is_enabled = elem.is_enabled()
    if not is_enabled:
        message = "Cannot set elem -> '{}' ({}) to value -> '{}'. It is not enabled!".format(
            selector, selector_type, value
        )
        if show_error:
            self.logger.error(message)
        else:
            self.logger.warning(message)

        if self.take_screenshots:
            self.take_screenshot()

        return False

    self.logger.info(
        "Set element -> '%s' to value -> '%s'...", selector, value if not is_sensitive else "<sensitive>"
    )

    try:
        # HTML '<select>' can only be identified based on its tag name:
        tag_name = elem.evaluate("el => el.tagName.toLowerCase()")
        # Checkboxes have tag name '<input type="checkbox">':
        input_type = elem.get_attribute("type")

        if tag_name == "select":
            options = elem.locator("option")
            options_count = options.count()
            option_values = [options.nth(i).inner_text().strip().replace("\n", "") for i in range(options_count)]

            if value not in option_values:
                self.logger.warning(
                    "Provided value -> '%s' is not in available drop-down options -> %s. Cannot set it!",
                    value,
                    option_values,
                )
            else:
                # We set the value over the (visible) label:
                elem.select_option(label=value)
                success = True
        elif tag_name == "input" and input_type == "checkbox":
            # Handle checkbox
            if not isinstance(value, bool):
                self.logger.error("Checkbox value must be a boolean!")
            else:
                retry = 0
                while elem.is_checked() != value and retry < 5:
                    try:
                        elem.set_checked(checked=value)
                    except Exception:
                        self.logger.warning("Cannot set checkbox to value -> '%s'. (retry %s).", value, retry)
                    finally:
                        retry += 1

                success = retry < 5  # True if fewer than 5 retries were needed
        else:
            if typing:
                elem.type(value, delay=50)
            else:
                elem.fill(value)
            if press_enter:
                self.page.keyboard.press("Enter")
            success = True
    except PlaywrightError as e:
        message = "Cannot set page element selected by -> '{}' ({}) to value -> '{}'; error -> {}".format(
            selector, selector_type, value, str(e)
        )
        if show_error:
            self.logger.error(message)
        else:
            self.logger.warning(message)
        success = not show_error

    if self.take_screenshots:
        self.take_screenshot()

    return success
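For `<select>` elements, `find_elem_and_set` compares the requested value against the visible option labels after stripping whitespace and removing newlines. The sketch below reproduces only that normalization on plain strings. `normalize_option_labels` is a hypothetical helper name, and the sample labels are made up for illustration.

```python
from typing import List


def normalize_option_labels(raw_labels: List[str]) -> List[str]:
    """Strip surrounding whitespace and drop embedded newlines."""
    return [label.strip().replace("\n", "") for label in raw_labels]


raw = ["  Draft \n", "In\nReview", "Published"]
labels = normalize_option_labels(raw)

print(labels)
print("Published" in labels)  # exact, case-sensitive membership test
print("published" in labels)
```

Note that an embedded newline is removed without inserting a space, so a label rendered on two lines such as "In" / "Review" has to be passed as "InReview" under this normalization.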

find_element_and_download(selector, selector_type='id', role_type=None, exact_match=None, regex=False, iframe=None, download_time=30)

Click a page element to initiate a download.

Parameters:

Name Type Description Default
selector str

The page element to click for download.

required
selector_type str

One of "id", "name", "class_name", "xpath", "css", "role", "text", "title", "label", "placeholder", "alt".

'id'
role_type str | None

ARIA role when using selector_type="role", e.g., "button", "textbox". If irrelevant then None should be passed for role_type.

None
exact_match bool | None

If an exact matching is required. Default is None (not set).

None
regex bool

Should the name be interpreted as a regular expression?

False
iframe str | None

Is the element in an iFrame? Then provide the name of the iframe with this parameter.

None
download_time int

Time in seconds to wait for the download to complete.

30

Returns:

Type Description
str | None

The full file path of the downloaded file, or None if the download failed.

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def find_element_and_download(
    self,
    selector: str,
    selector_type: str = "id",
    role_type: str | None = None,
    exact_match: bool | None = None,
    regex: bool = False,
    iframe: str | None = None,
    download_time: int = 30,
) -> str | None:
    """Click a page element to initiate a download.

    Args:
        selector (str):
            The page element to click for download.
        selector_type (str, optional):
            One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
            "label", "placeholder", "alt".
        role_type (str | None, optional):
            ARIA role when using selector_type="role", e.g., "button", "textbox".
            If irrelevant then None should be passed for role_type.
        exact_match (bool | None, optional):
            If an exact matching is required. Default is None (not set).
        regex (bool, optional):
            Should the name be interpreted as a regular expression?
        iframe (str | None):
            Is the element in an iFrame? Then provide the name of the iframe with this parameter.
        download_time (int, optional):
            Time in seconds to wait for the download to complete.

    Returns:
        str | None:
            The full file path of the downloaded file, or None if the download failed.

    """

    try:
        with self.page.expect_download(timeout=download_time * 1000) as download_info:
            clicked = self.find_elem_and_click(
                selector=selector,
                selector_type=selector_type,
                role_type=role_type,
                exact_match=exact_match,
                regex=regex,
                iframe=iframe,
            )
            if not clicked:
                self.logger.error("Element not found to initiate download.")
                return None

        download = download_info.value
        filename = download.suggested_filename
        save_path = os.path.join(self.download_directory, filename)
        download.save_as(save_path)
    except Exception as e:
        self.logger.error("Download failed; error -> %s", str(e))
        return None

    self.logger.info("Download file to -> %s", save_path)

    return save_path
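Two small conversions happen in `find_element_and_download`: `download_time` is given in seconds but passed to Playwright in milliseconds, and the target path is the download directory joined with the file name suggested by the browser. A minimal sketch of both steps, using hypothetical helper names (`download_timeout_ms`, `build_save_path`) and made-up sample values:

```python
import os


def download_timeout_ms(download_time: int) -> int:
    """Convert a timeout in seconds to milliseconds."""
    return download_time * 1000


def build_save_path(download_directory: str, suggested_filename: str) -> str:
    """Join the download directory and the suggested file name."""
    return os.path.join(download_directory, suggested_filename)


timeout_ms = download_timeout_ms(30)
save_path = build_save_path("downloads", "report.pdf")

print(timeout_ms)
print(save_path)
```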

get_locator(selector, selector_type, role_type=None, exact_match=None, iframe=None, regex=False, filter_has_text=None, filter_has=None, filter_has_not_text=None, filter_has_not=None)

Determine the locator for the given selector type and (optional) role type.

Parameters:

Name Type Description Default
selector str

The selector to find the element on the page.

required
selector_type str

One of "id", "name", "class_name", "xpath", "css", "role", "text", "title", "label", "placeholder", "alt". When using css, the selector becomes a raw CSS selector, and you can skip attribute and value filtering entirely if your selector already narrows it down. Examples for CSS:
* selector="img" - find all img tags (images)
* selector="img[title]" - find all img tags (images) that have a title attribute - independent of its value
* selector="img[title*='Microsoft Teams']" - find all images with a title that contains "Microsoft Teams"
* selector=".toolbar button" - find all buttons inside a .toolbar class

required
role_type str | None

ARIA role when using selector_type="role", e.g., "button", "textbox". If irrelevant then None should be passed for role_type.

None
exact_match bool | None

Controls whether the text or name must match exactly. Default is None (not set, i.e. using Playwright's default).

None
iframe str | None

Is the element in an iFrame? Then provide the name of the iframe with this parameter.

None
regex bool

Should the name be interpreted as a regular expression?

False
filter_has_text str | None

Applies locator.filter(has_text=...) to narrow the selection based on text content.

None
filter_has Locator | None

Applies locator.filter(has=...) to match elements containing a descendant matching the given Locator.

None
filter_has_not_text str | None

Applies locator.filter(has_not_text=...) to exclude elements with matching text content.

None
filter_has_not Locator | None

Applies locator.filter(has_not=...) to exclude elements containing a matching descendant.

None
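For the string-based selector types, the module's source turns the `selector` into a Playwright selector string by adding a prefix or wrapper, and with `regex=True` the selector is compiled into a pattern for the text-based types. The sketch below restates those two steps as pure functions. `build_selector` and `name_or_text` are hypothetical helper names used for illustration; iframe handling and the `get_by_*` calls are left out.

```python
import re
from typing import Optional, Pattern, Union


def build_selector(selector: str, selector_type: str) -> Optional[str]:
    """Map a selector and its type to a Playwright selector string."""
    if selector_type == "id":
        return "#{}".format(selector)
    if selector_type == "name":
        return "[name='{}']".format(selector)
    if selector_type == "class_name":
        return ".{}".format(selector)
    if selector_type == "xpath":
        return "xpath={}".format(selector)
    if selector_type == "css":
        return selector  # raw CSS selector, passed through unchanged
    # "text", "title", "label", "placeholder", "alt", and "role" are
    # resolved via get_by_* methods rather than a selector string.
    return None


def name_or_text(selector: str, regex: bool = False) -> Union[str, Pattern[str]]:
    """Return a compiled pattern if regex is requested, else the string."""
    return re.compile(selector) if regex else selector


print(build_selector("myId", "id"))
print(build_selector("email", "name"))
print(build_selector("//button[text()='Submit']", "xpath"))
```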
Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def get_locator(
    self,
    selector: str,
    selector_type: str,
    role_type: str | None = None,
    exact_match: bool | None = None,
    iframe: str | None = None,
    regex: bool = False,
    filter_has_text: str | None = None,
    filter_has: Locator | None = None,
    filter_has_not_text: str | None = None,
    filter_has_not: Locator | None = None,
) -> Locator | None:
    """Determine the locator for the given selector type and (optional) role type.

    Args:
        selector (str):
            The selector to find the element on the page.
        selector_type (str):
            One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
            "label", "placeholder", "alt".
            When using css, the selector becomes a raw CSS selector, and you can skip attribute
            and value filtering entirely if your selector already narrows it down.
            Examples for CSS:
            * selector="img" - find all img tags (images)
            * selector="img[title]" - find all img tags (images) that have a title attribute - independent of its value
            * selector="img[title*='Microsoft Teams']" - find all images with a title that contains "Microsoft Teams"
            * selector=".toolbar button" - find all buttons inside a .toolbar class
        role_type (str | None, optional):
            ARIA role when using selector_type="role", e.g., "button", "textbox".
            If irrelevant then None should be passed for role_type.
        exact_match (bool | None, optional):
            Controls whether the text or name must match exactly.
            Default is None (not set, i.e. using Playwright's default).
        iframe (str | None):
            Is the element in an iFrame? Then provide the name of the iframe with this parameter.
        regex (bool, optional):
            Should the name be interpreted as a regular expression?
        filter_has_text (str | None, optional):
            Applies `locator.filter(has_text=...)` to narrow the selection based on text content.
        filter_has (Locator | None, optional):
            Applies `locator.filter(has=...)` to match elements containing a descendant matching the given Locator.
        filter_has_not_text (str | None, optional):
            Applies `locator.filter(has_not_text=...)` to exclude elements with matching text content.
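        filter_has_not (Locator | None, optional):
            Applies `locator.filter(has_not=...)` to exclude elements containing a matching descendant.

    Returns:
        Locator | None:
            The locator for the element(s), or None if the selector type is unsupported,
            the role type is missing, or Playwright raises an error.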

    """

    try:
        name_or_text = re.compile(selector) if regex else selector

        match selector_type:
            case "id":
                locator = self.page.locator("#{}".format(selector))
            case "name":
                locator = self.page.locator("[name='{}']".format(selector))
            case "class_name":
                locator = self.page.locator(".{}".format(selector))
            case "xpath":
                locator = self.page.locator("xpath={}".format(selector))
            case "css":
                if iframe is None:
                    locator = self.page.locator(selector)
                else:
                    locator = self.page.locator("iframe[name='{}']".format(iframe)).content_frame.locator(selector)
            case "text":
                if iframe is None:
                    locator = self.page.get_by_text(text=name_or_text)
                else:
                    locator = self.page.locator("iframe[name='{}']".format(iframe)).content_frame.get_by_text(
                        name_or_text
                    )
            case "title":
                locator = self.page.get_by_title(text=name_or_text)
            case "label":
                locator = self.page.get_by_label(text=name_or_text)
            case "placeholder":
                locator = self.page.get_by_placeholder(text=name_or_text)
            case "alt":
                locator = self.page.get_by_alt_text(text=name_or_text)
            case "role":
                if not role_type:
                    self.logger.error("Role type must be specified when using find method 'role'!")
                    return None
                if iframe is None:
                    if regex:
                        locator = self.page.get_by_role(role=role_type, name=name_or_text)
                    else:
                        locator = self.page.get_by_role(role=role_type, name=selector, exact=exact_match)
                else:
                    content_frame = self.page.locator("iframe[name='{}']".format(iframe)).content_frame
                    if regex:
                        locator = content_frame.get_by_role(role=role_type, name=name_or_text)
                    else:
                        locator = content_frame.get_by_role(role=role_type, name=selector, exact=exact_match)
            case _:
                self.logger.error("Unsupported selector type -> '%s'", selector_type)
                return None

        # Apply filter if needed
        if any([filter_has_text, filter_has, filter_has_not_text, filter_has_not]):
            locator = locator.filter(
                has_text=filter_has_text, has=filter_has, has_not_text=filter_has_not_text, has_not=filter_has_not
            )

    except PlaywrightError as e:
        self.logger.error("Failure to determine page locator; error -> %s", str(e))
        return None

    return locator
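
The string-based branches above ("id", "name", "class_name", "xpath", "css") differ only in how the selector is turned into the string handed to page.locator(). A minimal sketch of that mapping, written outside the class and without Playwright; the helper name to_locator_string is hypothetical and not part of the module:

```python
from __future__ import annotations


def to_locator_string(selector: str, selector_type: str) -> str | None:
    """Return the string that would be passed to page.locator(), or None.

    Hypothetical helper for illustration only. It covers just the selector
    types that the source resolves through page.locator().
    """
    templates = {
        "id": "#{}",             # CSS ID selector
        "name": "[name='{}']",   # CSS attribute selector
        "class_name": ".{}",     # CSS class selector
        "xpath": "xpath={}",     # explicit XPath engine prefix
        "css": "{}",             # raw CSS, passed through unchanged
    }
    template = templates.get(selector_type)
    if template is None:
        # "role", "text", "title", ... use the get_by_* methods instead.
        return None
    return template.format(selector)
```

The remaining selector types ("text", "title", "label", "placeholder", "alt", "role") have no string form here because the source resolves them through the page.get_by_* methods.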

get_page(url='', wait_until=None)

Load a page into the browser based on a given URL.

Parameters:

Name Type Description Default
url str

URL to load. If empty just the base URL will be used.

''
wait_until str | None

Wait until a certain condition. Options are:

  • "commit" - does not wait at all - commit the request and continue
  • "load" - waits for the load event (after all resources like images/scripts load). This is the safest strategy for pages that keep loading content in the background like Salesforce.
  • "networkidle" - waits until there are no network connections for at least 500 ms. This seems to be the safest one for OpenText Content Server.
  • "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed, but subresources may still load).

None

Returns:

Name Type Description
bool bool

True if successful, False otherwise.

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def get_page(self, url: str = "", wait_until: str | None = None) -> bool:
    """Load a page into the browser based on a given URL.

    Args:
        url (str):
            URL to load. If empty just the base URL will be used.
        wait_until (str | None, optional):
            Wait until a certain condition. Options are:
            * "commit" - does not wait at all - commit the request and continue
            * "load" - waits for the load event (after all resources like images/scripts load)
              This is the safest strategy for pages that keep loading content in the background
              like Salesforce.
            * "networkidle" - waits until there are no network connections for at least 500 ms.
              This seems to be the safest one for OpenText Content Server.
            * "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed,
              but subresources may still load).

    Returns:
        bool:
            True if successful, False otherwise.

    """

    # If no specific wait until strategy is provided in the
    # parameter, we take the one from the browser automation class:
    if wait_until is None:
        wait_until = self.wait_until

    page_url = self.base_url + url

    try:
        self.logger.debug("Load page -> %s (wait until -> '%s')", page_url, wait_until)

        # The Playwright Response object is different from the requests.response object!
        response = self.page.goto(page_url, wait_until=wait_until)
        if response is None:
            self.logger.warning("Loading of page -> %s completed but no response object was returned.", page_url)
        elif not response.ok:
            # Try to get standard phrase, fall back if unknown
            try:
                phrase = HTTPStatus(response.status).phrase
            except ValueError:
                phrase = "Unknown Status"
            self.logger.error(
                "Response for page -> %s is not OK. Status -> %s/%s",
                page_url,
                response.status,
                phrase,
            )
            return False

    except PlaywrightError as e:
        self.logger.error("Navigation to page -> %s has failed; error -> %s", page_url, str(e))
        return False

    if self.take_screenshots:
        self.take_screenshot()

    return True
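
The status handling in get_page() relies on the standard library's HTTPStatus enum, which raises ValueError for codes it does not know. A minimal standalone sketch of that fallback; the function name status_phrase is an assumption made for this example:

```python
from http import HTTPStatus


def status_phrase(status: int) -> str:
    """Return the standard reason phrase for an HTTP status code.

    Falls back to "Unknown Status" for codes that HTTPStatus does not
    define, mirroring the try/except in get_page().
    """
    try:
        return HTTPStatus(status).phrase
    except ValueError:
        return "Unknown Status"
```

For example, status_phrase(404) yields "Not Found", while a non-standard code such as 799 yields "Unknown Status".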

get_title(wait_until=None)

Get the browser title.

This is handy for validating that a certain page has loaded after get_page().

Retry-safe way to get the page title, even if there's an in-flight navigation.

Parameters:

Name Type Description Default
wait_until str | None

Wait until a certain condition. Options are:

  • "commit" - does not wait at all - commit the request and continue
  • "load" - waits for the load event (after all resources like images/scripts load). This is the safest strategy for pages that keep loading content in the background like Salesforce.
  • "networkidle" - waits until there are no network connections for at least 500 ms. This seems to be the safest one for OpenText Content Server.
  • "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed, but subresources may still load).

None

Returns:

Name Type Description
str str | None

The title of the browser page, or None if it could not be read.

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def get_title(
    self,
    wait_until: str | None = None,
) -> str | None:
    """Get the browser title.

    This is handy for validating that a certain page has loaded after get_page().

    Retry-safe way to get the page title, even if there's an in-flight navigation.

    Args:
        wait_until (str | None, optional):
            Wait until a certain condition. Options are:
            * "commit" - does not wait at all - commit the request and continue
            * "load" - waits for the load event (after all resources like images/scripts load)
              This is the safest strategy for pages that keep loading content in the background
              like Salesforce.
            * "networkidle" - waits until there are no network connections for at least 500 ms.
              This seems to be the safest one for OpenText Content Server.
            * "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed,
              but subresources may still load).

    Returns:
        str | None:
            The title of the browser page, or None if it could not be read.

    """

    for attempt in range(REQUEST_MAX_RETRIES):
        try:
            if wait_until:
                self.page.wait_for_load_state(state=wait_until, timeout=REQUEST_TIMEOUT)
            title = self.page.title()
            if title:
                return title
            time.sleep(REQUEST_RETRY_DELAY)
            self.logger.info("Retry attempt %d/%d", attempt + 1, REQUEST_MAX_RETRIES)
        except Exception as e:
            if "Execution context was destroyed" in str(e):
                self.logger.info(
                    "Execution context was destroyed, retrying after %s seconds...", REQUEST_RETRY_DELAY
                )
                time.sleep(REQUEST_RETRY_DELAY)
                self.logger.info("Retry attempt %d/%d", attempt + 1, REQUEST_MAX_RETRIES)
                continue
            self.logger.error("Could not get page title; error -> %s", str(e))
            break

    return None
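
The retry logic above can be sketched independently of Playwright. In this example the page title read is replaced by an arbitrary callable, and the names read_with_retry and FlakyTitle are hypothetical stand-ins rather than parts of the module:

```python
import time
from typing import Callable, Optional


def read_with_retry(read: Callable[[], str], max_retries: int = 3, delay: float = 0.0) -> Optional[str]:
    """Call read() until it returns a non-empty value or retries run out."""
    for _ in range(max_retries):
        try:
            value = read()
            if value:
                return value
        except Exception as e:
            # Only the "context destroyed" error is treated as transient;
            # any other error ends the loop, as in get_title().
            if "Execution context was destroyed" not in str(e):
                return None
        time.sleep(delay)
    return None


class FlakyTitle:
    """Fake title source: fails once, returns empty once, then succeeds."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self) -> str:
        self.calls += 1
        if self.calls == 1:
            raise RuntimeError("Execution context was destroyed")
        if self.calls == 2:
            return ""
        return "Home"
```

With the default of three attempts, read_with_retry(FlakyTitle()) survives the transient error and the empty title and returns "Home" on the third call.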

install_browser(browser)

Install a browser with a provided name in Playwright.

Parameters:

Name Type Description Default
browser str

Name of the browser to be installed.

required

Returns:

Name Type Description
bool bool

True = installation successful, False = installation failed.

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def install_browser(self, browser: str) -> bool:
    """Install a browser with a provided name in Playwright.

    Args:
        browser (str):
            Name of the browser to be installed.

    Returns:
        bool: True = installation successful, False = installation failed.

    """

    self.logger.info("Installing Browser -> '%s'...", browser)
    process = subprocess.Popen(
        ["playwright", "install", browser],  # noqa: S607
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        shell=False,
    )
    output, error = process.communicate()
    if process.returncode == 0:  # 0 = success
        self.logger.info("Successfully completed installation of browser -> '%s'.", browser)
        self.logger.debug(output.decode())
    else:
        self.logger.error("Installation of browser -> '%s' failed! Error -> %s", browser, error.decode())
        self.logger.error(output.decode())
        return False

    return True
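
The same run-and-check pattern can be written more compactly with subprocess.run. This is a sketch under assumptions, not the module's implementation: run_command is a hypothetical helper, and the Python interpreter stands in for the playwright CLI so the example runs without Playwright installed.

```python
import subprocess
import sys


def run_command(args: list[str]) -> tuple[bool, str, str]:
    """Run a command without a shell and report (success, stdout, stderr)."""
    completed = subprocess.run(
        args,
        capture_output=True,  # collect stdout and stderr like the PIPE setup above
        text=True,            # decode bytes to str
        shell=False,
        check=False,          # inspect the return code ourselves
    )
    return completed.returncode == 0, completed.stdout, completed.stderr


# Stand-in for ["playwright", "install", browser]:
ok, out, err = run_command([sys.executable, "-c", "print('installed')"])
```

A return code of 0 maps to success, matching the check in install_browser().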

run_login(user_field='otds_username', password_field='otds_password', login_button='loginbutton', page='', wait_until=None, selector_type='id')

Login to target system via the browser.

Parameters:

Name Type Description Default
user_field str

The name of the web HTML field to enter the user name. Defaults to "otds_username".

'otds_username'
password_field str

The name of the HTML field to enter the password. Defaults to "otds_password".

'otds_password'
login_button str

The name of the HTML login button. Defaults to "loginbutton".

'loginbutton'
page str

The URL to the login page. Defaults to "".

''
wait_until str | None

Wait until a certain condition. Options are:

  • "commit" - does not wait at all - commit the request and continue
  • "load" - waits for the load event (after all resources like images/scripts load). This is the safest strategy for pages that keep loading content in the background like Salesforce.
  • "networkidle" - waits until there are no network connections for at least 500 ms. This seems to be the safest one for OpenText Content Server.
  • "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed, but subresources may still load).

None
selector_type str

One of "id", "name", "class_name", "xpath", "css", "role", "text", "title", "label", "placeholder", "alt". Default is "id".

'id'

Returns:

Name Type Description
bool bool

True = success, False = error.

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def run_login(
    self,
    user_field: str = "otds_username",
    password_field: str = "otds_password",
    login_button: str = "loginbutton",
    page: str = "",
    wait_until: str | None = None,
    selector_type: str = "id",
) -> bool:
    """Login to target system via the browser.

    Args:
        user_field (str, optional):
            The name of the web HTML field to enter the user name. Defaults to "otds_username".
        password_field (str, optional):
            The name of the HTML field to enter the password. Defaults to "otds_password".
        login_button (str, optional):
            The name of the HTML login button. Defaults to "loginbutton".
        page (str, optional):
            The URL to the login page. Defaults to "".
        wait_until (str | None, optional):
            Wait until a certain condition. Options are:
            * "commit" - does not wait at all - commit the request and continue
            * "load" - waits for the load event (after all resources like images/scripts load)
              This is the safest strategy for pages that keep loading content in the background
              like Salesforce.
            * "networkidle" - waits until there are no network connections for at least 500 ms.
              This seems to be the safest one for OpenText Content Server.
            * "domcontentloaded" - waits for the DOMContentLoaded event (HTML is parsed,
              but subresources may still load).
        selector_type (str, optional):
            One of "id", "name", "class_name", "xpath", "css", "role", "text", "title",
            "label", "placeholder", "alt".
            Default is "id".

    Returns:
        bool:
            True = success, False = error.

    """

    # If no specific wait until strategy is provided in the
    # parameter, we take the one from the browser automation class:
    if wait_until is None:
        wait_until = self.wait_until

    self.logged_in = False

    if (
        not self.get_page(url=page, wait_until=wait_until)
        or not self.find_elem_and_set(selector=user_field, selector_type=selector_type, value=self.user_name)
        or not self.find_elem_and_set(
            selector=password_field, selector_type=selector_type, value=self.user_password, is_sensitive=True
        )
        or not self.find_elem_and_click(
            selector=login_button, selector_type=selector_type, is_navigation_trigger=True, wait_until=wait_until
        )
    ):
        self.logger.error(
            "Cannot log into target system using URL -> %s and user -> '%s'!",
            self.base_url,
            self.user_name,
        )
        return False

    self.logger.debug("Wait for -> '%s' to assure login is completed and target page is loaded.", wait_until)
    self.page.wait_for_load_state(wait_until)

    title = self.get_title()
    if not title:
        self.logger.error(
            "Cannot read page title after login - you may have the wrong 'wait until' strategy configured! Strategy used -> '%s'.",
            wait_until,
        )
        return False

    if "Verify" in title:
        self.logger.error("Site is asking for a verification token. You may need to whitelist your IP!")
        return False
    if "Login" in title:
        self.logger.error("Authentication failed. You may have given the wrong password!")
        return False

    self.logger.info("Login completed successfully! Page title -> '%s'", title)
    self.logged_in = True

    return True
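
After the form is submitted, run_login() decides between success and failure purely from the page title. That decision can be isolated as a small pure function; the name classify_login_title and the returned labels are assumptions made for this sketch:

```python
from typing import Optional


def classify_login_title(title: Optional[str]) -> str:
    """Map a post-login page title to an outcome label.

    The checks and their order follow run_login(); the labels are
    illustrative only.
    """
    if not title:
        return "no_title"               # wrong wait strategy is a likely cause
    if "Verify" in title:
        return "verification_required"  # site asks for a verification token
    if "Login" in title:
        return "authentication_failed"  # still on the login page
    return "ok"
```

Only the "ok" outcome corresponds to run_login() setting logged_in to True.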

sanitize_filename(filename)

Sanitize a string to be safe for use as a filename.

  • Replaces spaces with underscores
  • Removes unsafe characters
  • Converts to lowercase
  • Removes trailing dots

Parameters:

Name Type Description Default
filename str

The filename to sanitize.

required
Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def sanitize_filename(self, filename: str) -> str:
    """Sanitize a string to be safe for use as a filename.

    - Replaces spaces with underscores
    - Removes unsafe characters
    - Converts to lowercase
    - Removes trailing dots

    Args:
        filename (str):
            The filename to sanitize.

    Returns:
        str:
            The sanitized filename, or "untitled" if nothing is left.

    """

    filename = filename.lower()
    filename = filename.replace(" ", "_")
    filename = re.sub(r'[<>:"/\\|?*\x00-\x1F]', "", filename)  # Remove unsafe chars
    filename = re.sub(r"\.+$", "", filename)  # Remove trailing dots
    filename = filename.strip()
    if not filename:
        filename = "untitled"

    return filename
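
To see the effect of each step, the method body can be reproduced as a standalone function (same logic as above, with the unused self parameter dropped) and applied to a few inputs:

```python
import re


def sanitize_filename(filename: str) -> str:
    """Standalone copy of the method above, for experimentation."""
    filename = filename.lower()
    filename = filename.replace(" ", "_")
    filename = re.sub(r'[<>:"/\\|?*\x00-\x1F]', "", filename)  # remove unsafe chars
    filename = re.sub(r"\.+$", "", filename)  # remove trailing dots
    filename = filename.strip()
    return filename or "untitled"


print(sanitize_filename("My Report: Q1/Q2?.png.."))  # my_report_q1q2.png
print(sanitize_filename("..."))                      # untitled
```

Note that the function does not shorten long names; only unsafe characters and trailing dots are removed.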

scroll_to_element(element)

Scroll an element into view to make it clickable.

Parameters:

Name Type Description Default
element Locator

Web element that has been identified before.

required
Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def scroll_to_element(self, element: Locator) -> None:
    """Scroll an element into view to make it clickable.

    Args:
        element (Locator):
            Web element that has been identified before.

    """

    if not element:
        self.logger.error("Undefined element! Cannot scroll to it.")
        return

    try:
        element.scroll_into_view_if_needed()
    except PlaywrightError as e:
        self.logger.warning("Cannot scroll element -> %s into view; error -> %s", str(element), str(e))

set_timeout(wait_time)

Set the time the browser waits for tasks to finish (e.g. fully loading a page).

This setting is valid for the whole browser session and not just for a single command.

Parameters:

Name Type Description Default
wait_time float

The time in seconds to wait.

required
Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def set_timeout(self, wait_time: float) -> None:
    """Set the time the browser waits for tasks to finish (e.g. fully loading a page).

    This setting is valid for the whole browser session and not just
    for a single command.

    Args:
        wait_time (float):
            The time in seconds to wait.

    """

    self.logger.debug("Setting default timeout to -> %s seconds...", str(wait_time))
    self.page.set_default_timeout(wait_time * 1000)
    self.logger.debug("Setting navigation timeout to -> %s seconds...", str(wait_time))
    self.page.set_default_navigation_timeout(wait_time * 1000)

setup_playwright(browser)

Initialize Playwright browser automation.

Parameters:

Name Type Description Default
browser str

Name of the browser engine. Supported:

  • chromium
  • chrome
  • msedge
  • webkit
  • firefox

required

Returns:

Name Type Description
bool bool

True = Success, False = Error.

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def setup_playwright(self, browser: str) -> bool:
    """Initialize Playwright browser automation.

    Args:
        browser (str):
            Name of the browser engine. Supported:
            * chromium
            * chrome
            * msedge
            * webkit
            * firefox

    Returns:
        bool:
            True = Success, False = Error.

    """

    try:
        self.logger.debug("Creating Playwright instance...")
        self.playwright = sync_playwright().start()
    except Exception:
        self.logger.error("Failed to start Playwright!")
        return False

    result = True

    # Install and launch the selected browser in Playwright:
    match browser:
        case "chromium":
            try:
                self.browser: Browser = self.playwright.chromium.launch(
                    headless=self.headless, slow_mo=100 if not self.headless else None, proxy=self.proxy
                )
            except Exception:
                result = self.install_browser(browser=browser)
                if result:
                    self.browser: Browser = self.playwright.chromium.launch(
                        headless=self.headless, slow_mo=100 if not self.headless else None, proxy=self.proxy
                    )

        case "chrome":
            try:
                self.browser: Browser = self.playwright.chromium.launch(
                    channel="chrome",
                    headless=self.headless,
                    slow_mo=100 if not self.headless else None,
                    proxy=self.proxy,
                )
            except Exception:
                result = self.install_browser(browser=browser)
                if result:
                    self.browser: Browser = self.playwright.chromium.launch(
                        channel="chrome",
                        headless=self.headless,
                        slow_mo=100 if not self.headless else None,
                        proxy=self.proxy,
                    )

        case "msedge":
            try:
                self.browser: Browser = self.playwright.chromium.launch(
                    channel="msedge",
                    headless=self.headless,
                    slow_mo=100 if not self.headless else None,
                    proxy=self.proxy,
                )
            except Exception:
                result = self.install_browser(browser=browser)
                if result:
                    self.browser: Browser = self.playwright.chromium.launch(
                        channel="msedge",
                        headless=self.headless,
                        slow_mo=100 if not self.headless else None,
                        proxy=self.proxy,
                    )

        case "webkit":
            try:
                self.browser: Browser = self.playwright.webkit.launch(
                    headless=self.headless, slow_mo=100 if not self.headless else None, proxy=self.proxy
                )
            except Exception:
                result = self.install_browser(browser=browser)
                if result:
                    self.browser: Browser = self.playwright.webkit.launch(
                        headless=self.headless, slow_mo=100 if not self.headless else None, proxy=self.proxy
                    )

        case "firefox":
            try:
                self.browser: Browser = self.playwright.firefox.launch(
                    headless=self.headless, slow_mo=100 if not self.headless else None, proxy=self.proxy
                )
            except Exception:
                result = self.install_browser(browser=browser)
                if result:
                    self.browser: Browser = self.playwright.firefox.launch(
                        headless=self.headless, slow_mo=100 if not self.headless else None, proxy=self.proxy
                    )
        case _:
            self.logger.error("Unknown browser -> '%s'. Cannot install and launch it.", browser)
            result = False

    return result
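
The five branches above differ only in the Playwright engine and the optional channel. One way to express that is a lookup table; this is a sketch of an alternative structure, not how the module is written, and BROWSER_LAUNCH_TABLE and resolve_launch are hypothetical names:

```python
from __future__ import annotations

# browser name -> (Playwright engine attribute, channel or None)
BROWSER_LAUNCH_TABLE: dict[str, tuple[str, str | None]] = {
    "chromium": ("chromium", None),
    "chrome": ("chromium", "chrome"),
    "msedge": ("chromium", "msedge"),
    "webkit": ("webkit", None),
    "firefox": ("firefox", None),
}


def resolve_launch(browser: str, headless: bool) -> tuple[str, dict] | None:
    """Return (engine, launch keyword arguments), or None if unsupported."""
    entry = BROWSER_LAUNCH_TABLE.get(browser)
    if entry is None:
        return None
    engine, channel = entry
    # slow_mo=100 only when the browser is visible, as in the source.
    kwargs: dict = {"headless": headless, "slow_mo": None if headless else 100}
    if channel is not None:
        kwargs["channel"] = channel
    return engine, kwargs
```

With a started Playwright instance, the result could be applied as getattr(playwright, engine).launch(**kwargs, proxy=proxy), keeping the install-and-retry fallback in one place.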

take_screenshot()

Take a screenshot of the current browser window and save it as a PNG file.

Returns:

Name Type Description
bool bool

True if successful, False otherwise

Source code in packages/pyxecm/src/pyxecm_customizer/browser_automation.py
def take_screenshot(self) -> bool:
    """Take a screenshot of the current browser window and save it as a PNG file.

    Returns:
        bool:
            True if successful, False otherwise

    """

    screenshot_file = "{}/{}-{:02d}.png".format(
        self.screenshot_directory,
        self.screenshot_names,
        self.screenshot_counter,
    )
    self.logger.debug("Save browser screenshot to -> %s", screenshot_file)

    try:
        self.page.screenshot(path=screenshot_file, full_page=self.screenshot_full_page)
        self.screenshot_counter += 1
    except Exception as e:
        self.logger.error("Failed to take screenshot; error -> %s", e)
        return False

    return True
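
The screenshot path is built from a directory, a base name, and a counter padded to at least two digits. A small standalone sketch of that format string; the function name screenshot_path and the sample values are made up for illustration:

```python
def screenshot_path(directory: str, name: str, counter: int) -> str:
    """Build the screenshot file path the same way take_screenshot() does."""
    return "{}/{}-{:02d}.png".format(directory, name, counter)


print(screenshot_path("/tmp/shots", "login", 3))    # /tmp/shots/login-03.png
print(screenshot_path("/tmp/shots", "login", 100))  # /tmp/shots/login-100.png
```

Because the counter only increases after a successful screenshot, a failed attempt reuses the same file name on the next call.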