Skip to content

Personalised customer emails 📨

Tip

Solution to the following problem:

Result Set

Regardless of the database, the result set should look like:

company_name company_email_address salutation_name
Fractal Factory billiam@fractal-factory.co.uk William
Friends For Hire joe.trib@f4hire.com Joey
Some Company admin@somecompany.com null
Expand for the DDL
solution(company_name, company_email_address, salutation_name) as (
    values
        ('Fractal Factory',  'billiam@fractal-factory.co.uk', 'William'),
        ('Friends For Hire', 'joe.trib@f4hire.com',           'Joey'),
        ('Some Company',     'admin@somecompany.com',         null)
)

Solution

Some SQL solutions per database are provided below.

DuckDB

This DuckDB solution uses the Jaccard similarity with a 25% match threshold, but this isn't the only way to solve this problem.

with contact_preferences as (
    select
        company.full_name as company_name,
        company.email_address as company_email_address,
        individual.first_name,

        /* Could also use `jaro_similarity`, `jaro_winkler_similarity`, `levenshtein` */
        jaccard(
            lower(split_part(company.email_address, '@', 1)),
            lower(individual.full_name)
        ) as similarity,

        row_number() over(
            partition by company_name
            order by similarity desc
        ) as contact_preference
    from customer_relationships as relationships
        left join customers as company
            on relationships.parent_customer_id = company.customer_id
        left join customers as individual
            on relationships.child_customer_id = individual.customer_id
)

select
    company_name,
    company_email_address,
    /* Set a match threshold of 25% */
    if(similarity >= 0.25, first_name, null) as salutation_name
from contact_preferences
where contact_preference = 1
order by company_name

SQL Server

This SQL Server solution uses the Soundex differences with a 3 (out of 4) match threshold, but this isn't the only way to solve this problem.

with contact_preferences as (
    select
        company.full_name as company_name,
        company.email_address as company_email_address,
        individual.first_name,
        individual.last_name,

        difference(
            left(company.email_address, -1 + charindex('@', company.email_address)),
            individual.full_name
        ) as similarity,

        row_number() over(
            partition by company.full_name
            order by difference(
                left(company.email_address, -1 + charindex('@', company.email_address)),
                individual.full_name
            ) desc
        ) as contact_preference
    from customer_relationships as relationships
        left join customers as company
            on relationships.parent_customer_id = company.customer_id
        left join customers as individual
            on relationships.child_customer_id = individual.customer_id
)

select
    company_name,
    company_email_address,
    /* Set a match threshold of 3 */
    iif(similarity >= 3, first_name, null) as salutation_name
from contact_preferences
where contact_preference = 1
order by company_name