Lightning-Fast Search with MeiliSearch and Python
What is MeiliSearch?
Meilisearch is a RESTful search API. It aims to be a ready-to-go solution for everyone who wants a fast and relevant search experience for their end-users ⚡️🔎
Step 1: Install the dependencies
Let’s install the necessary dependencies. Run the command below to install the MeiliSearch Python client.
pip3 install meilisearch
Step 2: Run Meilisearch
Download and run a Meilisearch instance.
# Install Meilisearch
curl -L https://install.meilisearch.com | sh
# Launch Meilisearch
./meilisearch --master-key=masterKey
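Before moving on, it can help to confirm that the instance is actually reachable. The sketch below assumes the default address used in this article and calls the /health route using only the standard library. The helper name is_meilisearch_up is hypothetical and not part of the Meilisearch client.

```python
import json
import urllib.request


def is_meilisearch_up(base_url: str = "http://127.0.0.1:7700", timeout: float = 2.0) -> bool:
    """Return True if the instance at base_url reports itself as available."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as response:
            payload = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a body that is not JSON.
        return False
    return isinstance(payload, dict) and payload.get("status") == "available"


if __name__ == "__main__":
    print("Meilisearch reachable:", is_meilisearch_up())
```

If this prints False, check that the ./meilisearch process is still running in another terminal.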
Step 3: Code with Python and MeiliSearch
Let’s create a basic Python script that demonstrates how to use MeiliSearch for searching.
import meilisearch
client = meilisearch.Client('http://127.0.0.1:7700', 'masterKey')
# An index is where the documents are stored.
index = client.index('movies')
documents = [
{ 'id': 1, 'title': 'Carol', 'genres': ['Romance', 'Drama'] },
{ 'id': 2, 'title': 'Wonder Woman', 'genres': ['Action', 'Adventure'] },
{ 'id': 3, 'title': 'Life of Pi', 'genres': ['Adventure', 'Drama'] },
{ 'id': 4, 'title': 'Mad Max: Fury Road', 'genres': ['Adventure', 'Science Fiction'] },
{ 'id': 5, 'title': 'Moana', 'genres': ['Fantasy', 'Action']},
{ 'id': 6, 'title': 'Philadelphia', 'genres': ['Drama'] },
]
# If the index 'movies' does not exist, Meilisearch creates it when you first add the documents.
index.add_documents(documents)  # returns a task object; indexing runs asynchronously
Documents: A document is like a container for data, made up of different parts called fields. Each field has a name (attribute) and something it holds (value). Think of documents as the building blocks of a Meilisearch database. To find a document, you have to put it into a special storage area called an index.
Indexes: An index is like a folder for documents with some rules. You can think of it as a table in SQL or a collection in MongoDB.
An index has a name (uid) and has three important things inside it:
A special key that helps identify documents (like a unique ID).
Customizable settings, which are like instructions for how the index works.
It can hold as many documents as you want.
Adding Search Capabilities to Your Data Using MeiliSearch
Now that we have a better understanding of how MeiliSearch operates, let’s tackle a straightforward problem: working with a CSV file containing information about various places.
This CSV file includes several fields, such as name, is_premium, hashtags, and languages.
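To make the later parsing code easier to follow, here is a small sketch of what one row of such a file might look like and how it can be read. The layout is an assumption: list columns are taken to be stored as Python-style list literals (which is what ast.literal_eval expects) and flags as the strings True and False. The sample row reuses values that appear later in this article, and parse_flag and read_rows are hypothetical helpers.

```python
import ast
import csv
import io

# Assumed layout of data.csv: a header line, with list columns stored as Python list literals.
SAMPLE_CSV = '''name,is_premium,prebooking_required,hashtags,languages
Climb Central Delhi,True,True,"['chilled', 'party']","['English']"
'''


def parse_flag(value: str) -> bool:
    """Interpret a CSV cell as a boolean by comparing its text.

    bool("False") is True because the string is non-empty, so bool() is not enough.
    """
    return value.strip().lower() in {"true", "1", "yes"}


def read_rows(csv_text: str) -> list:
    """Parse CSV text into a list of dictionaries with typed values."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows.append(
            {
                "name": row["name"],
                "is_premium": parse_flag(row["is_premium"]),
                "prebooking_required": parse_flag(row["prebooking_required"]),
                "hashtags": ast.literal_eval(row["hashtags"]),
                "languages": ast.literal_eval(row["languages"]),
            }
        )
    return rows


if __name__ == "__main__":
    print(read_rows(SAMPLE_CSV))
```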
Step 1: Index creation
First, we need to create an index that will store our places data.
import meilisearch
from meilisearch.errors import MeilisearchApiError
from meilisearch.index import Index


# Define a function to get or create a Meilisearch index
def get_or_create_index(client: meilisearch.Client, index_name: str) -> Index:
    try:
        return client.get_index(index_name)
    except MeilisearchApiError:
        # The index does not exist yet, so ask Meilisearch to create it
        client.create_index(index_name)
        return client.index(index_name)


if __name__ == "__main__":
    index_name = "places"
    client = meilisearch.Client('http://127.0.0.1:7700', 'masterKey')
    index = get_or_create_index(client, index_name)
Step 2: Create Documents
Now, let’s create documents for our places index.
To effectively manage our places data, we need to define a schema. This step will ensure that our data is structured and organized for efficient indexing and searching.
{
name: str
is_premium: bool
prebooking_required: bool
hashtags: List[str]
languages: List[str]
}
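The schema above is shorthand rather than runnable Python. One way to express it in code, shown here only as a sketch, is a dataclass whose fields mirror the schema. The name Place is hypothetical and introduced for illustration; the script in this article builds plain dictionaries instead.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Place:
    name: str
    is_premium: bool
    prebooking_required: bool
    hashtags: List[str] = field(default_factory=list)
    languages: List[str] = field(default_factory=list)


if __name__ == "__main__":
    place = Place("Climb Central Delhi", True, True, ["party"], ["English"])
    # asdict() yields a plain dictionary, which is the shape a document takes.
    print(asdict(place))
```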
Now with the given schema let’s create documents.
import ast
import csv
import meilisearch
import time

# Define the CSV file and database connection
csv_file = "data.csv"


def place_schema(
    name: str,
    is_premium: bool,
    prebooking_required: bool,
    hashtags: list,
    languages: list,
):
    place_document = {
        "name": name,
        "is_premium": is_premium,
        "prebooking_required": prebooking_required,
        "hashtags": [hashtag for hashtag in hashtags],
        "languages": [language for language in languages],
    }
    return place_document


def index_places(index):
    with open(csv_file, "r", newline="", encoding="utf-8") as csvfile:
        csvreader = csv.DictReader(csvfile)
        documents = []
        for row in csvreader:
            name = row["name"]
            # bool("False") is True, so compare the text instead
            # (assumes the cells hold "True" or "False")
            is_premium = row["is_premium"].strip().lower() == "true"
            prebooking_required = row["prebooking_required"].strip().lower() == "true"
            # The list columns are stored as Python list literals
            hashtags = ast.literal_eval(row["hashtags"])
            languages = ast.literal_eval(row["languages"])
            documents.append(
                place_schema(
                    name=name,
                    is_premium=is_premium,
                    prebooking_required=prebooking_required,
                    hashtags=hashtags,
                    languages=languages,
                )
            )

    # Adding documents is asynchronous: we get a task back and poll its status
    task = index.add_documents(documents)
    task_id = task.task_uid
    task_status = index.get_task(task_id)
    while task_status.status not in ["succeeded", "failed"]:
        time.sleep(1)
        task_status = index.get_task(task_id)
    print(f"Task Status: {task_status.status}")
    if task_status.status == "failed":
        print(f"Error message: {task_status.error['message']}")


def get_or_create_index(client: meilisearch.Client, index_name: str):
    client.create_index(index_name)
    return client.index(index_name)


if __name__ == "__main__":
    index_name = "places"
    client = meilisearch.Client("http://127.0.0.1:7700", "masterKey")
    indexes = client.get_indexes()
    index = get_or_create_index(client, index_name)
    index_places(index=index)
The index_places function is responsible for four tasks:
Creating a list of documents: First, we collect all the documents that need to be indexed into a list called documents. These documents follow the schema we defined earlier. Each document is a Python dictionary with the fields name, is_premium, prebooking_required, hashtags, and languages.
Adding documents to the 'places' index: Once the list is ready, we call MeiliSearch’s add_documents method to add the documents to the 'places' index. This method returns a task object, from which we extract the task_uid.
Monitoring task status: The unique identifier (task_uid) lets us track the progress of the task, which here is adding documents to the index. We use a while loop to check the task status repeatedly until it is either "succeeded" or "failed".
Error handling: If the task status is "failed", we retrieve the error message to understand why the operation did not succeed. This is a critical step in debugging and in making sure the indexing process works smoothly.
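A polling loop of this kind waits forever if the task never reaches a final state. A slightly more defensive variant, sketched below, adds a timeout. The names wait_for_status and fetch_status are hypothetical: fetch_status stands in for any callable that returns the current status string, for example lambda: index.get_task(task_id).status in the script above. Treating "canceled" as a final state is an assumption about newer Meilisearch versions.

```python
import time
from typing import Callable

# Statuses after which a task will not change any more.
FINAL_STATUSES = {"succeeded", "failed", "canceled"}


def wait_for_status(
    fetch_status: Callable[[], str],
    timeout_seconds: float = 30.0,
    interval_seconds: float = 0.5,
) -> str:
    """Poll fetch_status() until it returns a final status or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status in FINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still '{status}' after {timeout_seconds} seconds")
        time.sleep(interval_seconds)


if __name__ == "__main__":
    # Stand-in for a real task: "enqueued", then "processing", then "succeeded".
    fake_statuses = iter(["enqueued", "processing", "succeeded"])
    print(wait_for_status(lambda: next(fake_statuses), interval_seconds=0.01))
```

Recent versions of the Python client also provide a wait_for_task method that serves the same purpose, so a hand-written loop is mainly useful when you want custom behavior.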
Let’s proceed by executing the script to generate an index and populate it with documents.
Oops! It seems we’ve encountered an issue.
Task Status: failed
Error message: The primary key inference failed as the engine did not
find any field ending with `id` in its name. Please specify the primary key
manually using the `primaryKey` query parameter.
That’s right, we haven’t defined a primary key, and our documents have no ‘id’ field. Let’s fix this by specifying a primary key in our script.
Here’s the updated code:
def get_or_create_index(client: meilisearch.Client, index_name: str):
    client.create_index(index_name, {"primaryKey": "name"})
    return client.index(index_name)
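To see why the first run failed, the rule quoted in the error message can be approximated in a few lines. This is a simplified illustration based only on that message (look for a field whose name ends with id), not the engine's actual implementation. The helper candidate_primary_keys is hypothetical, and ignoring case is an assumption.

```python
from typing import Any, Dict, List


def candidate_primary_keys(document: Dict[str, Any]) -> List[str]:
    """Return the field names that end with "id", ignoring case."""
    return [name for name in document if name.lower().endswith("id")]


if __name__ == "__main__":
    movie = {"id": 1, "title": "Carol", "genres": ["Romance", "Drama"]}
    place = {"name": "Climb Central Delhi", "is_premium": True, "hashtags": ["party"]}
    print(candidate_primary_keys(movie))  # ['id']
    print(candidate_primary_keys(place))  # []
```

The movie documents from the first example carry an id field, which is why they were indexed without any extra configuration, while the place documents offer no candidate at all.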
Now let’s run the script again.
Task Status: failed
Error message: Document identifier `"Climb Central Delhi"` is invalid.
A document identifier can be of type integer or string,
only composed of alphanumeric characters (a-z A-Z 0-9), hyphens (-) and
underscores (_).
Okay, it seems our document identifier format is not suitable. Let’s create a function to convert the name into the required format.
import re


def make_primary_document_identifier(string):
    # Replace every run of disallowed characters with a single underscore
    string = re.sub(r"[^a-zA-Z0-9]+", "_", string)
    string = string.lower()
    return string


def place_schema(
    name: str,
    is_premium: bool,
    prebooking_required: bool,
    hashtags: list,
    languages: list,
):
    place_document = {
        "name": make_primary_document_identifier(string=name),
        "is_premium": is_premium,
        "prebooking_required": prebooking_required,
        "hashtags": [hashtag for hashtag in hashtags],
        "languages": [language for language in languages],
    }
    return place_document
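As a quick check of this normalization, the sketch below runs the same conversion on the name from the error message and validates the result against the rule quoted there (letters, digits, hyphens, and underscores only). The helper is_valid_identifier is hypothetical and added purely for illustration.

```python
import re

# Rule quoted in the error message: only a-z, A-Z, 0-9, hyphens and underscores.
VALID_IDENTIFIER = re.compile(r"[a-zA-Z0-9_-]+")


def make_primary_document_identifier(string: str) -> str:
    """Same normalization as above: collapse disallowed runs to "_" and lowercase."""
    return re.sub(r"[^a-zA-Z0-9]+", "_", string).lower()


def is_valid_identifier(value: str) -> bool:
    """Check a value against the identifier rule from the error message."""
    return VALID_IDENTIFIER.fullmatch(value) is not None


if __name__ == "__main__":
    raw = "Climb Central Delhi"
    slug = make_primary_document_identifier(raw)
    print(is_valid_identifier(raw), slug, is_valid_identifier(slug))
    # Different names can normalize to the same identifier.
    print(make_primary_document_identifier("A&B") == make_primary_document_identifier("A B"))
```

Because a document added with an existing primary key replaces the earlier one, it may be worth checking for such collisions before indexing a large file.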
Now let’s run the script again.
Task Status: succeeded
Nice!
Now, you can visit http://localhost:7700/ to see your index. You've successfully created an index and added documents to it.
Step 3: Let’s Do a Basic Search
Let’s start by creating a Python script named search.py. This script will include a search function that accepts a search query and returns the search results. Below is the code for search.py:
# search.py
import argparse
import meilisearch


def search(search_string):
    return index.search(search_string)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--query", help="Search query", default=None)
    args = parser.parse_args()
    index_name = "places"
    client = meilisearch.Client("http://127.0.0.1:7700", "masterKey")
    index = client.get_index(index_name)
    if args.query is not None:
        print(search(args.query))
    else:
        print(search(""))
Let’s run the search script:
python search.py --query climb
Result:
{
"hits": [
{
"name": "climb_central_delhi",
"is_premium": true,
"prebooking_required": true,
"hashtags": [
"chilled",
"ecofriendly",
"wellness",
"yoga",
"familyfriendly",
"familyrun",
"lively",
"cycling",
"organic",
"party"
],
"languages": [
"English"
]
}
],
"query": "climb",
"processingTimeMs": 18,
"limit": 20,
"offset": 0,
"estimatedTotalHits": 1
}
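The response is an ordinary dictionary, so pulling out just the parts you need is straightforward. The sketch below assumes the response shape shown above (hits, estimatedTotalHits, processingTimeMs). The helper summarize is hypothetical, and the sample dictionary is a trimmed copy of that output.

```python
from typing import Any, Dict


def summarize(response: Dict[str, Any]) -> str:
    """Build a one-line summary from a search response dictionary."""
    names = [hit.get("name", "<unnamed>") for hit in response.get("hits", [])]
    total = response.get("estimatedTotalHits", len(names))
    elapsed = response.get("processingTimeMs", "?")
    return f"{total} hit(s) in {elapsed} ms: {', '.join(names)}"


if __name__ == "__main__":
    sample = {
        "hits": [{"name": "climb_central_delhi", "is_premium": True}],
        "query": "climb",
        "processingTimeMs": 18,
        "estimatedTotalHits": 1,
    }
    print(summarize(sample))
```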
Step 4: Filtering
The first step is to enable Meilisearch to filter places by hashtags. To do this, we need to add hashtags to the list of filterable attributes. This allows us to perform precise searches based on hashtags.
Here’s the code snippet for adding or updating filterable attributes in your Python script:
def add_or_update_filterable_attributes(index):
    index.update_filterable_attributes(["hashtags"])


if __name__ == "__main__":
    index_name = "places"
    client = meilisearch.Client("http://127.0.0.1:7700", "masterKey")
    indexes = client.get_indexes()
    index = get_or_create_index(client, index_name)
    add_or_update_filterable_attributes(index)
Make sure to run the script to update the filter attributes.
In this example, we’ll search for places with the party hashtag.
import argparse
import meilisearch


def search(search_string):
    return index.search(
        search_string,
        {
            "filter": ["hashtags = party"]
        },
    )


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--query", help="Search query", default=None)
    args = parser.parse_args()
    index_name = "places"
    client = meilisearch.Client("http://127.0.0.1:7700", "masterKey")
    index = client.get_index(index_name)
    if args.query is not None:
        print(search(args.query))
    else:
        print(search(""))
Let’s run the search script:
python search.py
Result:
Upon running the search script with the party filter and an empty query, you’ll receive a response containing the places that have party as one of their hashtags. Here’s an example of the response:
{
"hits":[
{
"name":"climb_central_delhi",
"is_premium":true,
"prebooking_required":true,
"hashtags":[
"chilled",
"ecofriendly",
"wellness",
"party"
],
"languages":[
"English"
]
},
{
"name":"akbran_tour",
"is_premium":true,
"prebooking_required":true,
"hashtags":[
"daytours",
"walking",
"party"
],
"languages":[
"English",
"Spanish",
"Russian",
"French",
"German"
]
},
{
"name":"1_karbala_rd",
"is_premium":true,
"prebooking_required":true,
"hashtags":[
"crafts",
"boats",
"party"
],
"languages":[
"English"
]
}
],
"query":"",
"processingTimeMs":5,
"limit":3,
"offset":0,
"estimatedTotalHits":3
}
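Filters can also be combined. As a sketch, the helper below builds the filter value for several hashtags at once, relying on Meilisearch's array syntax, in which top-level entries are combined with AND and the entries of a nested list with OR. The name build_hashtag_filter is hypothetical, and quoting every value is a defensive choice so that hashtags containing spaces still parse. Hashtags that themselves contain quotes are rejected rather than escaped.

```python
from typing import List, Sequence, Union

FilterValue = List[Union[str, List[str]]]


def build_hashtag_filter(hashtags: Sequence[str], match_all: bool = True) -> FilterValue:
    """Return a value for the "filter" search option.

    match_all=True  -> every hashtag must be present (AND)
    match_all=False -> any one hashtag is enough (OR)
    """
    for tag in hashtags:
        if "'" in tag or '"' in tag:
            raise ValueError(f"quotes are not handled by this sketch: {tag!r}")
    conditions = [f"hashtags = '{tag}'" for tag in hashtags]
    if match_all:
        return list(conditions)
    # A nested list expresses OR between its entries.
    return [conditions] if conditions else []


if __name__ == "__main__":
    print(build_hashtag_filter(["party", "yoga"]))
    print(build_hashtag_filter(["party", "yoga"], match_all=False))
    # Usage with the index from this article (requires a running instance):
    # index.search("", {"filter": build_hashtag_filter(["party", "yoga"])})
```

Any attribute used in a filter has to be listed as filterable first, just as hashtags was in this step.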
Conclusion
In this article, we saw how to make your search functions lightning-fast using MeiliSearch and Python. This powerful tool enhances search capabilities in your projects, making it a valuable addition for developers looking to provide efficient search functionality for their users.
But here’s the exciting part: MeiliSearch has more advanced features we haven’t discussed yet. It can do even more amazing things to enhance your search capabilities. In the next article, we’ll explore these advanced features and show you how to use them to improve your search experience.
If you found this article helpful, do share it with your peers. Feedback and suggestions on the article are most welcome! ❤️