This article explains how to implement a search with Elasticsearch in a Symfony project.
Install Elasticsearch
- Install an Elasticsearch server and see your indexed documents
- Install the Elastica library and the FOSElasticaBundle on Symfony
- Index your first documents
- Search your documents with a search/filter form (based on simple criteria)
- Organize your code in the right way
Set of data used in this article
To build our search, we are going to create "basic" entities that we can index: blog posts (articles).
<?php

namespace Obtao\BlogBundle\Entity;

use Doctrine\ORM\Mapping as ORM;
use FOS\ElasticaBundle\Configuration\Search;

/**
 * Article
 *
 * @ORM\Table(name="article")
 * @Search(repositoryClass="Obtao\BlogBundle\Entity\SearchRepository\ArticleRepository")
 * @ORM\HasLifecycleCallbacks
 * @ORM\Entity(repositoryClass="Obtao\BlogBundle\Entity\Repository\ArticleRepository")
 */
class Article
{
    /**
     * @var integer
     *
     * @ORM\Column(name="id", type="integer", nullable=false)
     * @ORM\Id
     * @ORM\GeneratedValue(strategy="IDENTITY")
     */
    protected $id;

    /**
     * @var string
     *
     * @ORM\Column(name="title", type="string", length=250, nullable=false)
     */
    protected $title;

    /**
     * @var string
     *
     * @ORM\Column(type="text", nullable=false)
     */
    protected $content;

    /**
     * @ORM\Column(name="created_at", type="datetime")
     */
    protected $createdAt;

    /**
     * @ORM\Column(name="published_at", type="datetime", nullable=true)
     */
    protected $publishedAt;

    /**
     * @ORM\PrePersist
     */
    public function prePersist()
    {
        $this->createdAt = new \DateTime();
    }

    public function isPublished()
    {
        return (null !== $this->getPublishedAt());
    }

    // other getters and setters
}
Create and configure the mapping file
To implement the indexing, you must specify the format of your documents: the field names, their types, the filters to apply to the search query and to the indexed string, and so on.
We will only talk about the mapping of the entities, not about the configuration of the index itself. For more details about the way to index and search the documents, read our article.
In your config.yml file, import a new fos_elastica.yml file that will contain all the configuration, and add the required parameters (host and port) in your parameters.yml file.
# app/config/config.yml
imports:
    - { resource: fos_elastica.yml }

# app/config/parameters.yml.dist
parameters:
    elastic_host: localhost
    elastic_port: 9200

# app/config/fos_elastica.yml
fos_elastica:
    clients:
        default: { host: %elastic_host%, port: %elastic_port% }
    indexes:
        obtao_blog:
            client: default
            types:
                article:
                    mappings:
                        id:
                            type: integer
                        createdAt:
                            type: date
                        publishedAt:
                            type: date
                        published:
                            type: boolean
                        title: ~
                        content: ~
                    persistence:
                        driver: orm
                        model: Obtao\BlogBundle\Entity\Article
                        finder: ~
                        provider: ~
                        listener: ~
- Clients: defines which clients are available for search (here, only a "default" client).
- Indexes: the name of the index in Elasticsearch. You can compare it with the name of a database in SQL. It gathers types of documents as a database gathers tables.
- Types: the different types of documents that will be indexed. In this example, we have only one type: the article. A type can be compared to a database table.
- Mappings: the list of your document properties and their types. It can be compared to the columns of an SQL table.
- Persistence: defines how FOSElasticaBundle indexes your documents from your Symfony entities.
  - Driver: the driver to use (here, as is often the case, the ORM).
  - Model: the Symfony entity from which the Elasticsearch document is built. This is the easiest way to index documents: reuse the model classes already defined in your application.
  - Finder: the search interface. For the moment, we use the default one. With this service, you can send a search to Elasticsearch.
  - Provider: the indexing interface. For the moment, we use the default one. With this service, you can define how documents are indexed in Elasticsearch.
  - Listener: the Doctrine events that trigger the indexing (by default: insert, update and delete, which covers most cases).
Now, you know the “basic” configuration and how to define your Elasticsearch documents depending on your Doctrine entities.
Index / See your indexing results
It is now time to insert some test data in your database, and then to fill your Elasticsearch index with the command:
$ app/console fos:elastica:populate
This command uses the provider and loops over all your Doctrine objects to fill the index.
The method is simple: for each entry defined in your mapping configuration (fos_elastica.yml), the corresponding getter is called and the returned value is inserted in your Elasticsearch index.
So you can index computed data that does not exist in Doctrine, simply by defining a getter (as we do here with published/isPublished).
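The "one mapped field, one getter call" idea can be sketched in plain PHP. This is only an illustration under assumptions: buildDocument() and DemoArticle are hypothetical names invented for this example, and the bundle's real transformer is more involved than this loop.

```php
<?php
// Hypothetical helper, for illustration only: it mimics the idea that each
// mapped field is read through its getter. It is NOT the bundle's transformer.
function buildDocument($entity, array $fields)
{
    $document = array();
    foreach ($fields as $field) {
        // Try getFoo() first, then isFoo() (handy for booleans such as "published").
        foreach (array('get', 'is') as $prefix) {
            $method = $prefix . ucfirst($field);
            if (method_exists($entity, $method)) {
                $document[$field] = $entity->$method();
                break;
            }
        }
    }

    return $document;
}

// Stripped-down stand-in for the Article entity (assumption: two fields only).
class DemoArticle
{
    private $title = 'Hello Elasticsearch';
    private $publishedAt = null;

    public function getTitle()
    {
        return $this->title;
    }

    public function getPublishedAt()
    {
        return $this->publishedAt;
    }

    // Computed value: there is no "published" column in the database.
    public function isPublished()
    {
        return null !== $this->getPublishedAt();
    }
}

$document = buildDocument(new DemoArticle(), array('title', 'published'));
echo json_encode($document), "\n";
```

Here "published" ends up in the document even though the entity only stores a publication date, which is exactly what the mapping above relies on.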
Once the indexing is over, you can see your documents in the "Browser" tab of the Head plugin (http://localhost:9200/_plugin/head/).
Thanks to the listener, each time a Doctrine insert, update or delete is performed, the corresponding document is updated in the Elasticsearch index.
Create a Symfony search object
We are going to create a search object. The form that handles the search will be based on it.
This object contains the properties that will be mapped to our filter, sort and pagination criteria. Here is what it might look like:
<?php

namespace Obtao\BlogBundle\Model;

class ArticleSearch
{
    // beginning of the publication range
    protected $dateFrom;

    // end of the publication range
    protected $dateTo;

    // published or not
    protected $isPublished;

    protected $title;

    public function __construct()
    {
        // initialise dateFrom to "one year ago" and dateTo to "today"
        $date = new \DateTime();
        $interval = new \DateInterval('P1Y');
        $date->sub($interval);
        $date->setTime(0, 0, 0);

        $this->dateFrom = $date;
        $this->dateTo = new \DateTime();
        $this->dateTo->setTime(23, 59, 59);
    }

    public function setDateFrom($dateFrom)
    {
        // ignore empty values coming from the form
        if ($dateFrom instanceof \DateTime) {
            $dateFrom->setTime(0, 0, 0);
            $this->dateFrom = $dateFrom;
        }

        return $this;
    }

    public function getDateFrom()
    {
        return $this->dateFrom;
    }

    public function setDateTo($dateTo)
    {
        // ignore empty values coming from the form
        if ($dateTo instanceof \DateTime) {
            $dateTo->setTime(23, 59, 59);
            $this->dateTo = $dateTo;
        }

        return $this;
    }

    public function clearDates()
    {
        $this->dateTo = null;
        $this->dateFrom = null;
    }

    public function getDateTo()
    {
        return $this->dateTo;
    }

    public function getIsPublished()
    {
        return $this->isPublished;
    }

    public function setIsPublished($isPublished)
    {
        $this->isPublished = $isPublished;

        return $this;
    }

    public function getTitle()
    {
        return $this->title;
    }

    public function setTitle($title)
    {
        $this->title = $title;

        return $this;
    }
}
In this object, we have defined our search criteria. The goal is not necessarily to expose every field of the indexed object as it is. For example, we do not want to allow searching on the "content" property, which is why the ArticleSearch object has no "content" property. Conversely, the "publishedAt" property of the Article object becomes "dateFrom" and "dateTo" in the ArticleSearch object, as we want to search the articles published between two dates. We have also defined an "isPublished" property, as we want to retrieve either the published or the unpublished articles. We could have added two properties, "createdFrom" and "createdTo", to find all the articles created between two dates.
Actually, the date filters ("dateFrom"/"dateTo") and the "isPublished" filter are not compatible: specifying any date implies that we only search published articles (as we filter on the publication date). So specifying a date range and asking for the unpublished articles will never return any result. It is not a great example, but our goal is not to build the app of the century, only to show what you can do with Elasticsearch and Symfony.
You can obviously add other properties depending on your needs and wishes.
Create the associated search form
This object is bound to a form that lets us choose our criteria. Here is the form (classic but efficient, as you can see):
<?php

namespace Obtao\BlogBundle\Form\Type;

use Obtao\BlogBundle\Model\ArticleSearch;
use Symfony\Component\Form\AbstractType;
use Symfony\Component\Form\FormBuilderInterface;
use Symfony\Component\OptionsResolver\OptionsResolverInterface;

class ArticleSearchType extends AbstractType
{
    public function buildForm(FormBuilderInterface $builder, array $options)
    {
        $builder
            ->add('title', null, array(
                'required' => false,
            ))
            ->add('dateFrom', 'date', array(
                'required' => false,
                'widget' => 'single_text',
            ))
            ->add('dateTo', 'date', array(
                'required' => false,
                'widget' => 'single_text',
            ))
            ->add('isPublished', 'choice', array(
                'choices' => array('false' => 'no', 'true' => 'yes'),
                'required' => false,
            ))
            ->add('search', 'submit')
        ;
    }

    public function setDefaultOptions(OptionsResolverInterface $resolver)
    {
        parent::setDefaultOptions($resolver);
        $resolver->setDefaults(array(
            // do not pass the CSRF token in the URL (the form is no longer CSRF-protected)
            'csrf_protection' => false,
            'data_class' => 'Obtao\BlogBundle\Model\ArticleSearch'
        ));
    }

    public function getName()
    {
        return 'article_search_type';
    }
}
Search with Elasticsearch
Now you can implement your search function. The principle is simple: in a controller, you instantiate the search form and, if it is submitted, you call the search(ArticleSearch $articleSearch) method of the ArticleRepository class. To keep the code well organized and do things the right way, you should place everything related to building the search queries in a dedicated class.
Remember, we have compared a search in Elasticsearch with a query in a database. You would never create a Doctrine query in a controller, right? Here, it is the same situation (those who answered "yes" can run naked in the nettles for 20 minutes).
Here is the controller (simplified) :
<?php

namespace Obtao\BlogBundle\Controller;

use Obtao\BlogBundle\Form\Type\ArticleSearchType;
use Obtao\BlogBundle\Model\ArticleSearch;
use Symfony\Bundle\FrameworkBundle\Controller\Controller;
use Symfony\Component\HttpFoundation\Request;

class ArticleController extends Controller
{
    public function listAction(Request $request)
    {
        $articleSearch = new ArticleSearch();

        $articleSearchForm = $this->get('form.factory')
            ->createNamed(
                '',
                'article_search_type',
                $articleSearch,
                array(
                    'action' => $this->generateUrl('obtao-article-search'),
                    'method' => 'GET'
                )
            );
        $articleSearchForm->handleRequest($request);
        $articleSearch = $articleSearchForm->getData();

        $elasticaManager = $this->container->get('fos_elastica.manager');
        $results = $elasticaManager->getRepository('ObtaoBlogBundle:Article')->search($articleSearch);

        return $this->render('ObtaoBlogBundle:Article:list.html.twig', array(
            'results' => $results,
            'articleSearchForm' => $articleSearchForm->createView(),
        ));
    }
}
This should work. Finally, here is the search method that builds the query for Elasticsearch. If you are reading this article, this is probably what you came for, so thank you for still being here.
<?php

namespace Obtao\BlogBundle\Entity\SearchRepository;

use FOS\ElasticaBundle\Repository;
use Obtao\BlogBundle\Model\ArticleSearch;

class ArticleRepository extends Repository
{
    public function search(ArticleSearch $articleSearch)
    {
        // we create a query that returns all the articles,
        // but if the title criterion is specified, we use it
        if ($articleSearch->getTitle() != null && $articleSearch->getTitle() != '') {
            $query = new \Elastica\Query\Match();
            $query->setFieldQuery('article.title', $articleSearch->getTitle());
            $query->setFieldFuzziness('article.title', 0.7);
            $query->setFieldMinimumShouldMatch('article.title', '80%');
        } else {
            $query = new \Elastica\Query\MatchAll();
        }
        $baseQuery = $query;

        // then we create the filters depending on the chosen criteria
        $boolFilter = new \Elastica\Filter\Bool();

        /*
            Dates filter
            We add this filter only if the isPublished filter is not set to "false"
        */
        if ("false" != $articleSearch->getIsPublished()
            && null !== $articleSearch->getDateFrom()
            && null !== $articleSearch->getDateTo()
        ) {
            $boolFilter->addMust(new \Elastica\Filter\Range('publishedAt',
                array(
                    'gte' => \Elastica\Util::convertDate($articleSearch->getDateFrom()->getTimestamp()),
                    'lte' => \Elastica\Util::convertDate($articleSearch->getDateTo()->getTimestamp())
                )
            ));
        }

        // Published or not filter
        if ($articleSearch->getIsPublished() !== null) {
            $boolFilter->addMust(
                new \Elastica\Filter\Terms('published', array($articleSearch->getIsPublished()))
            );
        }

        $filtered = new \Elastica\Query\Filtered($baseQuery, $boolFilter);
        $query = \Elastica\Query::create($filtered);

        return $this->find($query);
    }
}
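To picture what this method asks Elasticsearch, it can help to write the request body by hand. The sketch below builds roughly the same structure as a plain PHP array, assuming an Elasticsearch 1.x server (where the "filtered" query exists). buildSearchBody() is a hypothetical helper written for this illustration, the date format is an assumption, and the exact JSON that Elastica serializes may differ in its details.

```php
<?php
// Hypothetical helper (not part of Elastica or FOSElasticaBundle): it mirrors
// the repository logic with plain arrays so the request body can be inspected.
function buildSearchBody($title, $isPublished, $dateFrom = null, $dateTo = null)
{
    if (null !== $title && '' !== $title) {
        // Fuzzy match on the title, as in the repository method.
        $query = array('match' => array('article.title' => array(
            'query' => $title,
            'fuzziness' => 0.7,
            'minimum_should_match' => '80%',
        )));
    } else {
        // Serialized as {"match_all": {}}.
        $query = array('match_all' => new \stdClass());
    }

    $must = array();

    // The date range is skipped when unpublished articles are requested.
    if ('false' !== $isPublished && null !== $dateFrom && null !== $dateTo) {
        $must[] = array('range' => array('publishedAt' => array(
            // Assumption: ISO 8601 dates; Elastica uses its own date helper.
            'gte' => $dateFrom->format(\DateTime::ATOM),
            'lte' => $dateTo->format(\DateTime::ATOM),
        )));
    }

    if (null !== $isPublished) {
        $must[] = array('terms' => array('published' => array($isPublished)));
    }

    return array('query' => array('filtered' => array(
        'query' => $query,
        'filter' => array('bool' => array('must' => $must)),
    )));
}

$utc = new \DateTimeZone('UTC');
$body = buildSearchBody(
    'symfony',
    'true',
    new \DateTime('2014-01-01 00:00:00', $utc),
    new \DateTime('2014-01-31 23:59:59', $utc)
);
echo json_encode($body, JSON_PRETTY_PRINT), "\n";
```

Printing the body this way is also a convenient debugging aid: the JSON can be pasted into any HTTP client and sent to the index's _search endpoint to compare the results with what the repository returns.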
That's it. I will spare you the template, which is of no interest in this article (you can find it here). You now have a list of articles with a small search form that allows you to filter the results.
In the next article, we will see how to paginate this list and add sort criteria (in addition to our filters).
To learn more about Elasticsearch in a Symfony project (but not only), read our other articles on the topic.
In your file app/config/fos_elastica.yml, at line 5, you configure your host and port.
I want to do the same thing, but connecting to a cluster.
I tried this:
default:
    - { host: localhost, port: 9200 }
    - { host: localhost, port: 9201 }
    - { host: localhost, port: 9202 }
But it does not work.
Do you know what I should do?
Thanks in advance
Hi,
What error do you have?
An Elasticsearch cluster is a group of nodes handled by a master node. Your application can contact this master node.
By default, the first node to start becomes the master node. When the other nodes start and look for a cluster to join, a master already exists, so they become plain "data" nodes.
If you have the Head plugin on your server, take a look at its homepage. You will see your cluster and the attached nodes.
In versions >1.0.0, the master node is marked by a star before its name (a dot for the other nodes).
In versions <1.0.0, the master node was marked in orange (white for the others).
A first try would be to connect only to the master and see whether your application can reach it.
In the Head plugin screen, you can get more information about a node; the http_address value contains the address and port to use.
Hope this will help,
François
Hi Francois,
Thanks for the quick answer
I think there is something wrong with your understanding of the cluster connection.
As you can see at http://elastica.io/getting-started/installation.html#section-connect-cluster, the Elastica library itself shows how to configure a connection to a cluster: we need to pass all the nodes we want to connect to.
The Ruby and Perl libraries work in the same way, and the reason is high availability: if one of the nodes goes down, the client still has other nodes to connect to, and the application keeps working.
That’s the reason of my question.
You are right! Master nodes are there to handle shard allocation.
I wanted to know whether you can connect to one of your nodes (your master node).
Did your error occur when the YAML config file was read?
If you succeed in connecting to one of your nodes, please try this configuration:
fos_elastica:
    clients:
        default:
            servers:
                - { host: localhost, port: 9200 }
                - { host: localhost, port: 9201 }
                - { host: localhost, port: 9202 }
If the first node (9200) is down, you will see 2 requests:
1 to 9200 => returns an error
1 to 9201 => returns your response
I don't know whether a round-robin algorithm is applied here, but your use case is covered: if one of your nodes is down, there is a spare one.
François
When I run the command fos:elastica:populate, I get this error: Fatal error: Class ‘Obtao\BlogBundle\Entity\Repository\ArticleRepository’ not found in C:\NetBeansProjects\elasticsearch\vendor\doctrine\orm\lib\Doctrine\ORM\Repository\DefaultRepositoryFactory.php on line 75.
Could you please help me figure this out?
Me too.
@Francois, please reply to this question. I hope to get your answer asap.
Thanks.
@Gamo Nana
The issue has been fixed.
The steps are as follows:
1. Create file ..\Acme\DemoBundle\Resources\config\doctrine\Article.orm.yml
with content:
# src/AppBundle/Resources/config/doctrine/Product.orm.yml
Acme\DemoBundle\Entity\Article:
    type: entity
    table: article
    id:
        id:
            type: integer
            generator: { strategy: AUTO }
    fields:
        title:
            type: string
            length: 100
        content:
            type: string
            length: 100
        createdAt:
            type: datetime
            length: 100
        publishedAt:
            type: datetime
            length: 100
2. Add the getter functions to Article.php
    /**
     * Get id
     *
     * @return integer
     */
    public function getId()
    {
        return $this->id;
    }

    /**
     * Get title
     *
     * @return string
     */
    public function getTitle()
    {
        return $this->title;
    }

    /**
     * Get content
     *
     * @return string
     */
    public function getContent()
    {
        return $this->content;
    }

    /**
     * Get createdAt
     *
     * @return \DateTime
     */
    public function getCreatedAt()
    {
        return $this->createdAt;
    }

    /**
     * Get publishedAt
     *
     * @return \DateTime
     */
    public function getPublishedAt()
    {
        return $this->publishedAt;
    }

    /**
     * @ORM\PrePersist
     */
    public function prePersist()
    {
        $this->createdAt = new \DateTime();
    }

    public function isPublished()
    {
        return (null !== $this->getPublishedAt());
    }

    // other getters and setters
}
I tried it and it worked for me.
Thanks
Hi !
Are you using the blog sandbox from GitHub, or creating a new application?
If you are not using the sandbox, just remove this line
* @ORM\Entity(repositoryClass="Obtao\BlogBundle\Entity\Repository\ArticleRepository")
from the Article entity.
Hi,
Great article series. Are you planning on writing something about "Organize your code in the right way"? I would be very interested in your thoughts on that.
Thanks, Hannes
I can't seem to get past this error: Attempted to call an undefined method named "search" of class "FOS\ElasticaBundle\Repository", in the listAction function of the controller.
Great tutorial on Symfony2 and Elasticsearch. Symfony2 should include this tutorial in its cookbook.