Taking this detailed and helpful post as a base, let’s add English stemming while keeping WordNet-based synonym token replacement.
First, to add the WordNet Prolog file to existing Elasticsearch nodes (Ubuntu in my case), perform the following:
- sudo su #switch to superuser to access the Elasticsearch folders freely
- wget http://wordnetcode.princeton.edu/3.0/WNprolog-3.0.tar.gz #download the ANSI Prolog version of the WordNet db
- tar -xvzf WNprolog-3.0.tar.gz #extract the archive (creates a prolog directory)
- cd /etc/elasticsearch #go to the Elasticsearch config directory
- mkdir analysis #create the analysis subdirectory
- mv /home/onehydraadmin/prolog/wn_s.pl /etc/elasticsearch/analysis/wn_s.pl #move the WordNet file to the new directory (the source path assumes the archive was extracted in /home/onehydraadmin)
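To get a feel for what wn_s.pl contains before handing it to Elasticsearch, here is a minimal Python sketch that groups words by synset. It assumes each fact has the shape s(synset_id, w_num, 'word', ss_type, sense_number, tag_count). as described in the WordNet Prolog documentation; the function name and the sample lines (with made-up synset ids) are mine, for illustration only.

```python
import re
from collections import defaultdict

# Assumed shape of one fact: s(synset_id,w_num,'word',ss_type,sense_number,tag_count).
S_FACT = re.compile(r"^s\((\d+),(\d+),'(.*)',([a-z]),(\d+),(\d+)\)\.$")

# Illustrative lines only; the synset ids are invented, not taken from WordNet.
SAMPLE_LINES = [
    "s(100000001,1,'baby',n,1,0).",
    "s(100000001,2,'child',n,1,0).",
    "s(100000002,1,'course of action',n,1,0).",
    "this line is not a fact and is skipped",
]


def parse_synsets(lines):
    """Group words by synset id; words sharing an id are treated as synonyms."""
    groups = defaultdict(list)
    for line in lines:
        match = S_FACT.match(line.strip())
        if match is None:
            continue  # ignore anything that does not look like an s(...) fact
        synset_id, word = match.group(1), match.group(3)
        # Prolog escapes a single quote inside an atom by doubling it.
        groups[synset_id].append(word.replace("''", "'"))
    return dict(groups)


if __name__ == "__main__":
    for synset_id, words in parse_synsets(SAMPLE_LINES).items():
        print(synset_id, words)
```

To run it against the real file, pass an open file handle, for example parse_synsets(open("/etc/elasticsearch/analysis/wn_s.pl")). Note the multi-word entry in the sample: entries like that matter once a stop filter is involved.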
Now we are able to create an Elasticsearch index that can access the WordNet db.
What do we need in terms of synonym mapping? We need both synonyms and queries to be tokenized with the English stemmer after English stop words are removed. Then query tokens need to be mapped to tokens in the synonyms source. After that, the resulting list of synonym tokens needs to act as the search query tokens against the indexed documents.
To achieve this, we create an index with a custom synonym analyser that utilises three filters (the order matters!): english_stop, english_stemmer, synonym.
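The flow described above can be sketched in a few lines of Python. This is a toy model only: the stop list, the stemming rules and the synonym groups are tiny stand-ins I made up, not Elasticsearch's real english stop list, stemmer or WordNet data. It only shows why stemming both the synonyms and the query lets a plural query reach a synonym.

```python
# Tiny stand-in for the _english_ stop list (assumption, not the real list).
STOP_WORDS = {"a", "an", "and", "of", "the"}


def toy_stem(token):
    """Very rough stand-in for an English stemmer; NOT the real algorithm."""
    if token.endswith("ies"):
        return token[:-3] + "i"
    if token.endswith("y") and len(token) > 2:
        return token[:-1] + "i"
    if token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def analyze(text, synonym_groups):
    """Apply the three steps in order: stop words, stemming, synonym expansion."""
    tokens = [t for t in text.split() if t not in STOP_WORDS]          # english_stop
    stems = [toy_stem(t) for t in tokens]                              # english_stemmer
    # The synonym source is stemmed too, so it lines up with the stemmed tokens.
    stemmed_groups = [{toy_stem(w) for w in group} for group in synonym_groups]
    expanded = set()
    for stem in stems:                                                 # synonym
        expanded.add(stem)
        for group in stemmed_groups:
            if stem in group:
                expanded |= group
    return expanded


if __name__ == "__main__":
    groups = [{"baby", "child"}]
    print(analyze("the babies", groups))
    print(analyze("child", groups))
```

A query and a document match when their token sets overlap, which is the behaviour the analyser below is meant to give.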
PUT request to http://localhost:9200/synonym_test/
{
  "settings" : {
    "index" : {
      "analysis" : {
        "analyzer" : {
          "synonym" : {
            "tokenizer" : "standard",
            "filter" : ["english_stop", "english_stemmer", "synonym"]
          }
        },
        "filter" : {
          "synonym" : {
            "type": "synonym",
            "format": "wordnet",
            "synonyms_path": "analysis/wn_s.pl"
          },
          "english_stop": {
            "type": "stop",
            "stopwords": "_english_"
          },
          "english_stemmer": {
            "type": "stemmer",
            "language": "english"
          }
        }
      }
    }
  },
  "mappings" : {
    "_default_": {
      "properties" : {
        "name" : {
          "type" : "string",
          "analyzer" : "synonym"
        }
      }
    }
  }
}
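Before indexing anything, it can help to check what the new analyser actually emits for a given text. Below is a minimal Python sketch that builds a request to the index's _analyze endpoint. It assumes your Elasticsearch version accepts a JSON body with analyzer and text fields and returns a tokens array (check the documentation for your version); the function names are mine.

```python
import json
import urllib.request

HOST = "http://localhost:9200"  # same host as in the requests in this post


def build_analyze_request(text, index="synonym_test", analyzer="synonym"):
    """Build (but do not send) a request to the index's _analyze endpoint."""
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{HOST}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def tokens_from_response(payload):
    """Pull the token strings out of a parsed _analyze response."""
    return [entry["token"] for entry in payload.get("tokens", [])]


def analyze(text):
    """Send the request; needs a running node, so it is not called here."""
    with urllib.request.urlopen(build_analyze_request(text)) as response:
        return tokens_from_response(json.load(response))
```

With a node running, analyze("babies") should list the stemmed token together with any synonyms the WordNet file contributes, which makes it easy to see whether the filter order does what you expect.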
Following the blog post, let’s insert two values into the index: “baby” and “child”:
POST request to http://localhost:9200/synonym_test/project/1
{
  "name" : "baby"
}
POST request to http://localhost:9200/synonym_test/project/2
{
  "name" : "child"
}
Now we can search with singular and plural queries alike and still get all synonyms in the response.
POST request to http://localhost:9200/synonym_test/_search?pretty=true
{
  "query" : {
    "match": {
      "name": {
        "query": "babies"
      }
    }
  }
}
Response (captured on an index named projects6, hence the _index value):
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.30685282,
    "hits": [
      {
        "_index": "projects6",
        "_type": "project",
        "_id": "1",
        "_score": 0.30685282,
        "_source": {
          "name": "baby"
        }
      },
      {
        "_index": "projects6",
        "_type": "project",
        "_id": "2",
        "_score": 0.19178301,
        "_source": {
          "name": "child"
        }
      }
    ]
  }
}
Hi,
I tried the above method but am facing the issue below.
{"error":{"root_cause":[{"type":"illegal_argument_exception","reason":"failed to build synonyms"}],"type":"illegal_argument_exception","reason":"failed to build synonyms","caused_by":{"type":"parse_exception","reason":"Invalid synonym rule at line 109","caused_by":{"type":"illegal_argument_exception","reason":"term: course of action analyzed to a token (action) with position increment != 1 (got: 2)"}}},"status":400}
Could you please suggest a fix?
Thanks,
Ashwin rao
Hi Ashwin,
From the discussion on the Elasticsearch GitHub page https://github.com/elastic/elasticsearch/issues/27481, it seems the ordering of the filters should be changed for version 6.
Cheers,
Anna
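For anyone hitting the same error: the message is consistent with the stop filter dropping “of” from “course of action” while the synonym file is being parsed, which leaves a gap in token positions. The Python sketch below builds the same settings with english_stop moved after synonym. This particular ordering is my assumption and is untested on version 6; it is not quoted from the GitHub issue, and the helper names are mine.

```python
import json

ORIGINAL_ORDER = ["english_stop", "english_stemmer", "synonym"]
# Assumed alternative for version 6: remove stop words only after synonyms.
REORDERED = ["english_stemmer", "synonym", "english_stop"]


def build_settings(filter_order):
    """Return the index settings from the post with the given filter order."""
    return {
        "settings": {"index": {"analysis": {
            "analyzer": {
                "synonym": {"tokenizer": "standard", "filter": list(filter_order)}
            },
            "filter": {
                "synonym": {
                    "type": "synonym",
                    "format": "wordnet",
                    "synonyms_path": "analysis/wn_s.pl",
                },
                "english_stop": {"type": "stop", "stopwords": "_english_"},
                "english_stemmer": {"type": "stemmer", "language": "english"},
            },
        }}}
    }


def stop_runs_before_synonym(filter_order):
    """True when stop words are removed before the synonym filter sees tokens."""
    return filter_order.index("english_stop") < filter_order.index("synonym")


if __name__ == "__main__":
    print(json.dumps(build_settings(REORDERED), indent=2))
```

The printed JSON can be used as the settings part of the PUT request body from the post. Keep in mind that changing the order also changes which tokens the synonym filter sees, so it is worth re-running the “babies” search afterwards.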