79193212

Date: 2024-11-15 16:33:52
Score: 3

TL;DR

As mentioned in the comments, you might want to look at the synonyms API, which made its way into the stack starting with version 8.10.

Creating a synonyms set is as easy as this:

PUT _synonyms/my-synonyms-set
{
  "synonyms_set": [
    {
      "id": "test-1",
      "synonyms": "hello, hi, ciao"
    }
  ]
}
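
If you later need to check what is stored, or change a single rule without resending the whole set, the same API family has per-set and per-rule endpoints. A minimal sketch, reusing the set name and rule id from above (the extra synonym howdy is only an illustration, not part of the original example):

```console
GET _synonyms/my-synonyms-set

PUT _synonyms/my-synonyms-set/test-1
{
  "synonyms": "hello, hi, ciao, howdy"
}
```

The first request returns the rules in the set; the second replaces only the rule with id test-1.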

Demo

Based on your specific case, I am creating the following synonyms:

PUT _synonyms/sport_teams_synonyms
{
  "synonyms_set": [
    {
      "synonyms": "dallas mavericks => mavs, dallasmavs, mavericks"
    },
    {
      "synonyms": "portland trail blazers, trail blazers => ptb"
    }
  ]
}

Then create the following index:

PUT sport_teams_match
{
  "settings": {
    "analysis": {
      "filter": {
        "sts_filter": {
          "type": "synonym_graph",
          "synonyms_set": "sport_teams_synonyms",
          "updateable": true
        }
      },
      "analyzer": {
        "sport_teams_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "sts_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "team_1": {
        "type": "text",
        "search_analyzer": "sport_teams_analyzer"
      },
      "team_2": {
        "type": "text",
        "search_analyzer": "sport_teams_analyzer"
      }
    }
  }
}

Load some documents:

PUT _bulk
{ "index" : { "_index" : "sport_teams_match"} }
{ "team_1" : "mavs", "team_2": "lakers" }
{ "index" : { "_index" : "sport_teams_match"} }
{ "team_1" : "trail blazers", "team_2": "lakers" }

The following search queries should find the first document:

GET sport_teams_match/_search?q=team_1:"Mavericks"
GET sport_teams_match/_search?q=team_1:"Dallas Mavericks"

Awesome, let's try with Trail Blazers:

GET sport_teams_match/_search?q=team_1:"Trail Blazers"

Uh-oh, not working? Why? The _analyze API to the rescue. Given a specific analyzer pipeline and some text, this API returns the extracted tokens.

POST sport_teams_match/_analyze
{
  "analyzer": "sport_teams_analyzer",
  "text":     "Trail Blazers"
}

POST sport_teams_match/_analyze
{
  "analyzer": "standard",
  "text":     "trail blazers"
}

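You will see that sport_teams_analyzer, which is used at search time, rewrites "Trail Blazers" to the single token ptb, while the standard analyzer, which is used at index time because the mapping only sets search_analyzer, produces the tokens trail and blazers. The document was indexed as trail and blazers, so a search for ptb cannot match it.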

How can we fix this? ptb might not be such a good synonym after all.
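
One possible direction, offered as a sketch rather than the definitive fix: turn the second rule from an explicit mapping (=>) into a list of equivalent synonyms, so that the search analyzer keeps the original words next to ptb instead of replacing them. The rule below is my assumption about what you want to match. Because the set is referenced from a search analyzer with "updateable": true, updating it through the synonyms API should reload that analyzer without reindexing.

```console
PUT _synonyms/sport_teams_synonyms
{
  "synonyms_set": [
    {
      "synonyms": "dallas mavericks => mavs, dallasmavs, mavericks"
    },
    {
      "synonyms": "portland trail blazers, trail blazers, ptb"
    }
  ]
}

GET sport_teams_match/_search?q=team_1:"Trail Blazers"
```

With equivalent synonyms, the query for "Trail Blazers" should expand to all three forms, one of which matches the indexed tokens trail and blazers. The trade-off is that a search for ptb would then also match documents containing trail blazers, so check that this is what you want.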

Posted by: Paulo