If you’re here because you now have to fix the method that had this link commented in it, I’m sorry.

The Hard-Knocks of XPATH 1.0 Life

Work at a current client has led to the need for dynamically generated XPATHs to locate elements on the page, thanks to a variety of restrictions that I won’t get into. These elements lack ids or other defining attributes that would make them easy to spot, and since we didn’t want to build fragile, hierarchy-based XPATH selectors, we settled on using the text itself as our unique identifier.



@Then("^the header with text \"([^\"]*)\" appears$")
public void seeHeader(String header) {
    // Naive version: the expected text is spliced straight into the XPATH
    String xpath = "//span[contains(text(),'" + header + "')]";
    // Whatever you want to do with it
}


Scenario Outline: Header text appears on each page
  Given I have a new user
  When I go to page <number>
  Then the header with text "<header>" appears

  Examples:
    | number | header           |
    | 1      | Kendrick Lamar   |
    | 2      | Childish Gambino |

The Complication

Though it is irritating when the text changes, we have created unique selectors out of nothing. This is especially useful for writing tests for this application, where many of our XPATHs need to pick out one specific option on the page to interact with it. Things started out simple, but as we progressed further through the application, these paths kept breaking for seemingly no reason. The text matched exactly, yet the XPATH wouldn’t find anything. After ripping out a few hairs, we discovered that the app’s content is encoded in UTF-8 and is full of similar-but-not-really-the-same characters: spaces that are really thin or half-width Unicode spaces, capital Ys that are actually similar-looking Greek letters, and so on. XPATH 2.0 has functions (such as lower-case and normalize-unicode) that help with this, but our framework is only capable of using XPATH 1.0 as of now. So we’re limited to XPATH 1.0’s translate function, which has its own issues with special characters, namely quotes.
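If you suspect the same problem in your own app, dumping the code points of the text you copied from the page makes the impostors visible. This is only a minimal debugging sketch; the class and method names (LookalikeDemo, codePoints) are made up for illustration and are not part of our framework.

```java
final class LookalikeDemo {
    private LookalikeDemo() {}

    /** Renders every code point of the text as U+XXXX, separated by spaces. */
    static String codePoints(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // A Latin capital Y and a Greek capital upsilon look alike but are different characters.
        System.out.println(codePoints("Y"));        // U+0059
        System.out.println(codePoints("\u03A5"));   // U+03A5
        // A regular space versus a thin space between two letters.
        System.out.println(codePoints("a b"));      // U+0061 U+0020 U+0062
        System.out.println(codePoints("a\u2009b")); // U+0061 U+2009 U+0062
    }
}
```

Anything that shows up outside the plain ASCII range is a candidate for the constants in the helper further down.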

The Sanitized Solution

XPATH 1.0 can’t use escaped quotes like Java can; it relies on the concat function to get things working in that case. See the fruit of my labor, and if you want to save yourself a frustrating afternoon of getting quotes and parens to match correctly, I recommend you not modify it much. It was designed so that, hopefully, you will only ever need to extend the constants that contain the UTF-8 characters you want to standardize.


String APOSTROPHES = "\u02BC";
String SPACES = "\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200A";
String QUOTES = "\u201C\u201D";

// value should be typed with plain ASCII characters and must not contain a straight double quote (")
public String getSanitizedTranslate(String field, String value, boolean equals) {
    String sanitizedApostrophes = "";
    String sanitizedSpaces = "";
    String sanitizedQuotes = "";
    // Builds sanitized versions based on length of unsanitary entries.
    for (int i = 0; i < APOSTROPHES.length(); i++) {
        sanitizedApostrophes += "'";
    }
    for (int i = 0; i < SPACES.length(); i++) {
        sanitizedSpaces += " ";
    }
    for (int i = 0; i < QUOTES.length(); i++) {
        sanitizedQuotes += "\"";
    }
    // "=" for an exact match, "," when the result is wrapped in contains(...)
    String equality = equals ? "=" : ",";
    // Touch this if you dare
    return "translate(" + field + ",\"ABCDEFGHIJKLMNOPQRSTUVWXYZ" + APOSTROPHES + SPACES + QUOTES +
            "\", concat(\"abcdefghijklmnopqrstuvwxyz" + sanitizedApostrophes + sanitizedSpaces + "\",'" +
            sanitizedQuotes + "'))" + equality + "\"" + value.toLowerCase() + "\"";
}

XPATH’s translate function maps characters one-to-one by position: the first character of the second argument is replaced with the first character of the third, the second with the second, and so on (above, A translates to a). So we build our sanitized replacement strings to the same length as our UTF-8 constants, which contain all those odd characters we have come across in the app.
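To see the positional mapping in isolation, you can evaluate translate with the XPATH 1.0 engine that ships with the JDK (javax.xml.xpath). This is a small sketch, not part of our test framework; TranslateDemo and eval are names I made up for it.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;

final class TranslateDemo {
    private TranslateDemo() {}

    /** Evaluates an XPath 1.0 expression against an empty document and returns its string value. */
    static String eval(String expression) {
        try {
            Document empty = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            return XPathFactory.newInstance().newXPath().evaluate(expression, empty);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Position by position: A becomes a, B becomes b, R becomes r.
        System.out.println(eval("translate('BAR', 'ABR', 'abr')")); // bar
        // A thin space in the second argument lines up with a regular space in the third.
        System.out.println(eval("translate('Kendrick\u2009Lamar', 'KL\u2009', 'kl ')")); // kendrick lamar
        // A character with no partner in the third argument is removed outright.
        System.out.println(eval("translate('a-b', '-', '')")); // ab
    }
}
```

The last call is the reason the replacement strings have to match the constants in length: if the third argument comes up short, the leftover characters are deleted instead of replaced.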

The equals parameter controls how the expression ends: pass true when you want an exact match (the result ends in ="value"), or false when you are wrapping it in a function like contains (the result ends in ,"value").
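One thing the helper does not handle is a value that itself contains straight quotes. Because XPATH 1.0 has no escape character, the usual workaround is to split the value and stitch it back together with concat. The sketch below shows that general technique; XPathLiteral and quote are hypothetical names, and it is not wired into getSanitizedTranslate.

```java
final class XPathLiteral {
    private XPathLiteral() {}

    /** Returns the value as an XPath 1.0 string literal, using concat() only when both quote types appear. */
    static String quote(String value) {
        if (!value.contains("'")) {
            return "'" + value + "'";
        }
        if (!value.contains("\"")) {
            return "\"" + value + "\"";
        }
        // Both quote types present: wrap each apostrophe-free piece in apostrophes
        // and join the pieces with a double-quoted apostrophe.
        StringBuilder sb = new StringBuilder("concat(");
        String[] parts = value.split("'", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                sb.append(",\"'\",");
            }
            sb.append("'").append(parts[i]).append("'");
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        System.out.println(quote("Kendrick Lamar"));     // 'Kendrick Lamar'
        System.out.println(quote("it's"));               // "it's"
        System.out.println(quote("say \"hi\" y'all"));   // concat('say "hi" y',"'",'all')
    }
}
```

If you ever need this, the output of quote would replace the hard-coded "\"" + value.toLowerCase() + "\"" at the end of the return statement, but I have not tested that combination in our suite.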

In Use

String exactMatchPath = "//element[" + getSanitizedTranslate("@value", "Kendrick Lamar", true) + "]";

String containsMatchPath = "//element[contains(" + getSanitizedTranslate("@value", "Kendrick", false) + ")]";
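If you want to convince yourself that these paths behave as described before plugging them into a browser test, here is a rough standalone check using the JDK's built-in XPATH 1.0 engine. The sample markup, the class name SanitizedXPathCheck, and the matches and repeat helpers are all assumptions for the sake of the demo, and the helper is a condensed copy of the one above so the file compiles on its own.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class SanitizedXPathCheck {
    // Same constants as the helper in this post.
    static final String APOSTROPHES = "\u02BC";
    static final String SPACES = "\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200A";
    static final String QUOTES = "\u201C\u201D";

    // Hypothetical sample markup: the gap inside the name is a thin space (U+2009).
    static final String PAGE = "<root><element value=\"Kendrick\u2009Lamar\"/></root>";

    static String repeat(String s, int times) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < times; i++) {
            sb.append(s);
        }
        return sb.toString();
    }

    // Condensed copy of getSanitizedTranslate; it builds the same kind of expression.
    static String getSanitizedTranslate(String field, String value, boolean equals) {
        String from = "\"ABCDEFGHIJKLMNOPQRSTUVWXYZ" + APOSTROPHES + SPACES + QUOTES + "\"";
        String to = "concat(\"abcdefghijklmnopqrstuvwxyz" + repeat("'", APOSTROPHES.length())
                + repeat(" ", SPACES.length()) + "\",'" + repeat("\"", QUOTES.length()) + "')";
        return "translate(" + field + "," + from + "," + to + ")"
                + (equals ? "=" : ",") + "\"" + value.toLowerCase() + "\"";
    }

    /** True when the XPath selects at least one node in the given XML. */
    static boolean matches(String xml, String xpath) {
        try {
            return (Boolean) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        String plain = "//element[@value=\"Kendrick Lamar\"]";
        String exact = "//element[" + getSanitizedTranslate("@value", "Kendrick Lamar", true) + "]";
        String partial = "//element[contains(" + getSanitizedTranslate("@value", "Kendrick", false) + ")]";
        System.out.println(matches(PAGE, plain));   // false: the thin space is not a regular space
        System.out.println(matches(PAGE, exact));   // true
        System.out.println(matches(PAGE, partial)); // true
    }
}
```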

