dbt hub

dbt test: expect_column_values_to_not_match_regex_list

USE CASE

String matching

APPLIES TO

Column

This page describes the expect_column_values_to_not_match_regex_list test from the dbt-expectations package. The test verifies that strings in a specified column do not match any patterns from a list of regular expressions. This is particularly useful when you need to enforce certain formatting rules or prevent specific patterns in string data.

How it Works

The test checks each entry in a selected column against a list of regular expressions, ensuring that none of the entries match any of the provided patterns. This validation is crucial to avoid unwanted or malformed data entering your system.

Steps and Conditions:

  1. Column Selection: Choose the column whose values you want to validate against the regex list.
  2. Regex List Configuration: Provide a list of regular expressions that should not match any part of the column's entries.
  3. Optional Configuration:
    • is_raw: Set to True if the provided regex patterns are raw strings that need escaping.
    • flags: Specify any regex flags like i for case-insensitive matching.
    • match_on: Define whether to apply 'OR' (default) or 'AND' logic across the regex patterns.
    • Row Condition: Apply conditions to select specific rows for testing.
  4. Execution: The test scans each value in the column and checks it against the regex conditions using the specified logic.
  5. Outcome:
    • Pass: If no values in the column match any regex patterns from the list, the test passes.
    • Fail: If any value matches a regex pattern, the test fails.

Example Usage: E-commerce

In an E-commerce platform, ensuring that product descriptions are free from unwanted characters or patterns is crucial for maintaining content quality.

Consider a scenario where the product_descriptions table has a description_text column that should not contain reference codes or any HTML-like tags.


models:
  - name: product_descriptions
    columns:
      - name: description_text
        tests:
          - dbt_expectations.expect_column_values_to_not_match_regex_list:
              regex_list: ["[<>/]", "\[\[.*?\]\]"]  # Reject HTML-like tags and wiki-style links
              is_raw: True
              flags: i

In this example, the test ensures that no product description includes patterns like <tag> or [[link]], which are typical in HTML or markup languages. By preventing these patterns, the E-commerce site maintains a clear and professionally formatted product catalogue.

The only data observability platform built into your dbt code

  • Get monitors on your production tables out-of-the-box with zero configuration
  • Add tests to your code in bulk with a simple UI
  • Track test results over time
  • Set owners and create meaningful alerts
  • Triage incidents faster using our end-to-end column-level lineage graph
An illustration showing a debugging console with dbt run commands and an error message. The image includes stylized icons: a smiling orange bar chart, the purple dbt logo, and a magnifying glass containing a pink error icon.