Substring component settings let you apply one or more string manipulation methods to a dimension so that reports show the desired dimension items.
Substring is available only on dimensions, and is retroactive to the data it is applied to. It is an immediate data transformation that happens before filtering or other analysis operations are applied.
Take a part of a string based on its position relative to the beginning or end of the string. The From the Left and From the Right methods provide two drop-down lists: From (where the output starts) and To (where the output ends).
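As a rough illustration of position-based extraction, the following Python sketch takes a slice counted from either end of a string. The function names `from_left` and `from_right`, and the choice of zero-based, end-exclusive offsets, are assumptions made for this example; the text above does not state how the From and To positions are counted.

```python
def from_left(value: str, start: int, end: int) -> str:
    """Return the characters between two offsets counted from the start of the string.

    Assumption: offsets are zero-based and the end offset is exclusive.
    """
    return value[start:end]


def from_right(value: str, start: int, end: int) -> str:
    """Return the characters between two offsets counted from the end of the string.

    Assumption: an offset of 0 means the very end of the string.
    """
    n = len(value)
    return value[max(n - end, 0):max(n - start, 0)]


sku = "SKU-12345-US"
prefix = from_left(sku, 0, 3)    # first three characters
country = from_right(sku, 0, 2)  # last two characters
```

With the sample value, `prefix` is `"SKU"` and `country` is `"US"`.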
Where a match can occur more than once, an index of 1 represents the first match. If the index is higher than the number of matches available, No value options apply.

Use this method for fields that use a delimiter to separate multiple string values. You can either extract an individual element to use as the output, or convert the string into an object array schema element. For example, given the value "Fox,Turtle,Rabbit,Wolf" with a comma delimiter and an index of 3, the output is "Rabbit". If the index is higher than the number of delimited elements, No value options apply.

For use with fields that contain URLs. Using the example URL https://example.com/store/index.html?cid=campaign#cart, the following options are available:

- Protocol: the output is "https://".
- Host: the output is "example.com".
- Path: the output is "store/index.html".
- Query value: with the "cid" query key, the output is "campaign".
- Hash value: the output is "cart".

If the input is not a valid URL or if the desired URL component is not present, No value options apply.
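The delimiter and URL methods described above can be approximated in Python to check what output to expect from a given field value. This is a sketch only: the function names `delimited_element` and `url_components` and the dictionary keys are hypothetical, `None` stands in for the No value case, and Python's `urllib.parse` is used as a stand-in for whatever parser the product uses.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def delimited_element(value: str, delimiter: str, index: int) -> Optional[str]:
    """Return the element at a 1-based index, or None when the index is out of range."""
    parts = value.split(delimiter)
    if 1 <= index <= len(parts):
        return parts[index - 1]
    return None  # stands in for "No value"


def url_components(url: str, query_key: str) -> Optional[dict]:
    """Split a URL into the pieces listed above, or return None for an invalid URL."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        return None  # not a usable URL: stands in for "No value"
    query = parse_qs(parts.query)
    return {
        "protocol": parts.scheme + "://",
        "host": parts.hostname,
        "path": parts.path.lstrip("/"),
        # A missing query key or fragment is reported as None.
        "query_value": query.get(query_key, [None])[0],
        "hash": parts.fragment or None,
    }


animal = delimited_element("Fox,Turtle,Rabbit,Wolf", ",", 3)
pieces = url_components("https://example.com/store/index.html?cid=campaign#cart", "cid")
```

Here `animal` is `"Rabbit"`, and `pieces` holds the same five values as the list above.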
Trim white space or special characters from the string.
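A minimal Python analogue of trimming, assuming that characters are removed from both ends of the string; the text above does not say which ends are trimmed or how special characters are specified, so the sample value and the choice of `#` are illustrative only.

```python
raw = "  ##summer-sale##  "

# Remove leading and trailing white space.
without_space = raw.strip()

# Remove a chosen special character from both ends of the result.
without_special = without_space.strip("#")
```

After these two steps, `without_space` is `"##summer-sale##"` and `without_special` is `"summer-sale"`.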
Apply regular expressions to a dimension to retrieve the desired value.
Customer Journey Analytics uses a subset of the Perl regex syntax. If the input does not match the regular expression and the Output format is blank, No value options apply. The following expressions are supported:
Expression | Description |
---|---|
`a` | A single character `a`. |
`a\|b` | A single character `a` or `b`. |
`[abc]` | A single character `a`, `b`, or `c`. |
`[^abc]` | Any single character except `a`, `b`, or `c`. |
`[a-z]` | Any single character in the range `a`-`z`. |
`[a-zA-Z0-9]` | Any single character in the ranges `a`-`z`, `A`-`Z`, or the digits `0`-`9`. |
`^` | Matches the beginning of the line. |
`$` | Matches the end of the line. |
`\A` | Start of string. |
`\z` | End of string. |
`.` | Matches any character. |
`\s` | Any whitespace character. |
`\S` | Any non-whitespace character. |
`\d` | Any digit. |
`\D` | Any non-digit. |
`\w` | Any letter, number, or underscore. |
`\W` | Any non-word character. |
`\b` | Any word boundary. |
`\B` | Any position that is not a word boundary. |
`\<` | Start of word. |
`\>` | End of word. |
`(...)` | Capture everything enclosed. |
`(?:...)` | Non-marking capture. Groups the enclosed expression but prevents the match from being referenced in the output string. |
`a?` | Zero or one of `a`. |
`a*` | Zero or more of `a`. |
`a+` | One or more of `a`. |
`a{3}` | Exactly 3 of `a`. |
`a{3,}` | 3 or more of `a`. |
`a{3,6}` | Between 3 and 6 of `a`. |
Output placeholders are also supported. You can use these sequences in the Output format any number of times and in any order to achieve the desired string output.
Output placeholder sequence | Description |
---|---|
`$&` | Outputs what matched the whole expression. |
`$n` | Outputs what matched the nth sub expression. For example, `$1` outputs the first sub expression. |
`` $` `` | Outputs the text between the end of the last match found (or the start of the text if no previous match was found), and the start of the current match. |
`$+` | Outputs what matched the last marked sub expression in the regular expression. |
`$$` | Outputs the string character `$`. |
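To see how an expression and an Output format combine, the placeholders can be emulated in Python. This is a sketch under several assumptions: the function name `apply_regex` is hypothetical, only the first match is considered (so `` $` `` is simply the text before that match), any non-match returns `None` as a stand-in for No value, and Python's `re` module is not the engine the product uses. A few tokens from the expression table, such as `\<` and `\>`, are not word anchors in Python.

```python
import re
from typing import Optional


def apply_regex(value: str, pattern: str, output_format: str) -> Optional[str]:
    """Search value with pattern and build the output from placeholder sequences."""
    match = re.search(pattern, value)
    if match is None:
        return None  # simplification: every non-match is treated as "No value"

    group_count = match.re.groups  # number of capturing groups in the pattern

    def expand(token: re.Match) -> str:
        text = token.group(0)
        if text == "$$":
            return "$"                      # literal dollar sign
        if text == "$&":
            return match.group(0)           # the whole match
        if text == "$`":
            return value[:match.start()]    # text before the match
        if text == "$+":
            # Last marked sub expression, taken here as the highest-numbered group.
            return (match.group(group_count) or "") if group_count else ""
        n = int(text[1:])                   # "$n" form
        return (match.group(n) or "") if n <= group_count else ""

    # Replace each placeholder sequence found in the output format.
    return re.sub(r"\$(?:\$|&|`|\+|\d+)", expand, output_format)


result = apply_regex(
    "https://example.com/store/index.html",
    r"^https?://([^/]+)/(.*)$",
    "$1 -> $2",
)
```

With this input, `result` is `"example.com -> store/index.html"`. Placeholders can be repeated or reordered in the output format, for example `"$2 on $1"`.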