Elasticsearch can index huge data sets, both documents and numeric data.
Before v1.0.0, facets made it possible to compute statistics over a list of indexed documents (tag distribution, mean, standard deviation, …).
Over time, the use of facets evolved: developers wanted to use them for more and more complex statistics.

To meet this need, the great Elasticsearch team introduced aggregations:
- Metrics: sum, minimum, maximum, mean, …
- Buckets: term, date or value distributions (and many others)

Aggregations can contain sub-aggregations, so we can compute statistics inside a term distribution. How crazy is that?
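
As a rough illustration of a metric nested inside a bucket, the sketch below builds a search body as a plain Python dictionary. The field names (`tags`, `price`) and the aggregation names (`by_term`, `metric_stats`) are hypothetical; only the `aggs`, `terms` and `stats` keys come from the Elasticsearch query DSL.

```python
import json


def build_stats_by_term(term_field, metric_field, size=10):
    """Build a search body: a terms bucket with a stats metric inside it."""
    return {
        "size": 0,  # ask for no hits, only the aggregation results
        "aggs": {
            "by_term": {  # hypothetical name for the bucket aggregation
                "terms": {"field": term_field, "size": size},
                "aggs": {
                    # sub-aggregation: computed once per bucket
                    "metric_stats": {"stats": {"field": metric_field}}
                },
            }
        },
    }


# "tags" and "price" are placeholder fields; use the ones from your mapping.
body = build_stats_by_term("tags", "price")
print(json.dumps(body, indent=2))
```

Sent as the body of a search request, this would return one bucket per term, each carrying its own count, min, max, avg and sum.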

The goal of this post is to describe one of the many uses of aggregations: statistics.

In this post, we’re going to sort and paginate our list of articles with Symfony, Elasticsearch and the WhiteOctoberPagerfantaBundle. This post follows the one about Indexing and simple search with Elasticsearch and Symfony, which you should read first. We assume you have already implemented the search system, so not all the code will be shown here.
Paginating with Symfony is as easy as adding some properties to our Search model and asking the WhiteOctoberPagerfantaBundle to handle the pagination.
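
For readers curious about what the pagination boils down to on the Elasticsearch side, here is a minimal sketch of the `from`/`size` arithmetic, written in Python rather than PHP. The `paginate` helper and the `publishedAt` field are hypothetical; in the Symfony project the bundle takes care of this.

```python
def paginate(query, page, per_page, sort_field=None, direction="asc"):
    """Return a search body for the given 1-based page number."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    body = {
        "query": query,
        "from": (page - 1) * per_page,  # number of results to skip
        "size": per_page,               # number of results to return
    }
    if sort_field is not None:
        body["sort"] = [{sort_field: {"order": direction}}]
    return body


# Third page of 10 articles, newest first ("publishedAt" is a placeholder).
body = paginate({"match_all": {}}, page=3, per_page=10,
                sort_field="publishedAt", direction="desc")
print(body["from"], body["size"])
```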

Updating the model

First, let’s update the search model to add some properties that handle sorting and pagination (as we said before, we won’t show the whole code, only what has changed since the previous post). As usual, you can find the full project on GitHub.

The XML configuration for FOSElasticaBundle is powerful, but in some cases it is not enough.
To solve this, FOSElasticaBundle lets you override the transformation from a Doctrine object to an Elasticsearch document: ModelToElasticSearchTransformer.

When and why hydrate objects with Transformers?

When you use Elasticsearch and the FOSElasticaBundle Finders, the responses received from Elasticsearch are automatically transformed into Doctrine objects.

In some cases, you will need to display information that comes from a join: a photo, a category translation, …
Unfortunately, the hydration is only done on the objects themselves: FOSElasticaBundle transforms the ids that match your search into objects via a select * in (:ids).
The consequence is that, for each entry in your results list, Doctrine will run a query for the joined object (for example a Category). For 100 results in 100 different categories, 100 simple queries will be executed. You can imagine what happens with a complex model and several joins.

Fortunately, you can avoid that by overriding the Transformer.
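
The cost difference between loading joined objects one by one and loading them in a single batch can be illustrated with a small simulation. This is a Python sketch with a hypothetical `FakeDatabase` that only counts queries; it is not the FOSElasticaBundle transformer API.

```python
class FakeDatabase:
    """Hypothetical stand-in for the database: it only counts queries."""

    def __init__(self, categories):
        self.categories = categories
        self.query_count = 0

    def find_category(self, category_id):
        self.query_count += 1  # one query per joined object
        return self.categories[category_id]

    def find_categories(self, category_ids):
        self.query_count += 1  # one query for the whole batch
        return {cid: self.categories[cid] for cid in category_ids}


articles = [{"id": n, "category_id": n} for n in range(1, 101)]
categories = {n: f"Category {n}" for n in range(1, 101)}

# Lazy loading: each result triggers its own query for the category.
lazy_db = FakeDatabase(categories)
lazy = [(a["id"], lazy_db.find_category(a["category_id"])) for a in articles]

# Batched loading: collect the ids first, then fetch them all at once.
batch_db = FakeDatabase(categories)
by_id = batch_db.find_categories({a["category_id"] for a in articles})
batched = [(a["id"], by_id[a["category_id"]]) for a in articles]

print(lazy_db.query_count, batch_db.query_count)  # 100 1
```

Both approaches produce the same list; only the number of round trips differs.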

In a non-relational database system, joins can be missed. Fortunately, Elasticsearch provides solutions to meet this need:

Array Type

Read the doc on elasticsearch.org

As its name suggests, it can be an array of native types (string, int, …) but also an array of objects (the basis used for “objects” and “nested”).
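
One point worth keeping in mind with arrays of objects: by default Elasticsearch flattens them into multi-valued fields, so the link between the fields of a single object is lost, whereas the nested type keeps each object separate. The Python simulation below illustrates the difference; the helper functions and the `authors` data are hypothetical and do not call Elasticsearch.

```python
def flatten(field, objects):
    """Mimic the default indexing of an array of objects: one list per key."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(f"{field}.{key}", []).append(value)
    return flat


def matches_flattened(flat, field, criteria):
    """Each criterion is checked against the whole array, not one object."""
    return all(value in flat.get(f"{field}.{key}", [])
               for key, value in criteria.items())


def matches_nested(objects, criteria):
    """All criteria must hold inside the same object."""
    return any(all(obj.get(key) == value for key, value in criteria.items())
               for obj in objects)


authors = [{"first": "John", "last": "Smith"},
           {"first": "Alice", "last": "White"}]
criteria = {"first": "John", "last": "White"}  # no such author exists

flat = flatten("authors", authors)
print(matches_flattened(flat, "authors", criteria))  # True (false positive)
print(matches_nested(authors, criteria))             # False
```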

Elasticsearch allows you to make advanced searches. Some users may want to export their search results to Excel (or LibreOffice Calc…) to work on the data.

As we explained in our post Export data to a csv file with Symfony, the goal of a (successful) export is to limit the effect on the server.

With Elasticsearch, we can use the scan and scroll functions to iterate over the results batch by batch.
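
The shape of such a loop can be sketched as follows, in Python and against a hypothetical in-memory `FakeScrollBackend` instead of a real cluster. The sketch assumes the scan behaviour where the initial search returns only a scroll id and each later scroll call returns the next batch, until an empty batch signals the end.

```python
class FakeScrollBackend:
    """Hypothetical in-memory stand-in for a scan and scroll session."""

    def __init__(self, documents, batch_size):
        self.documents = documents
        self.batch_size = batch_size
        self.offset = 0

    def search(self):
        # Assumed scan behaviour: no hits yet, only a scroll id.
        self.offset = 0
        return {"_scroll_id": "fake-scroll-id", "hits": {"hits": []}}

    def scroll(self, scroll_id):
        hits = self.documents[self.offset:self.offset + self.batch_size]
        self.offset += len(hits)
        return {"_scroll_id": scroll_id, "hits": {"hits": hits}}


def iterate_all(backend):
    """Yield every hit while holding a single batch in memory at a time."""
    scroll_id = backend.search()["_scroll_id"]
    while True:
        page = backend.scroll(scroll_id)
        hits = page["hits"]["hits"]
        if not hits:
            break  # an empty batch means the scroll is exhausted
        yield from hits
        scroll_id = page["_scroll_id"]  # always reuse the latest id


backend = FakeScrollBackend([{"_id": n} for n in range(7)], batch_size=3)
ids = [hit["_id"] for hit in iterate_all(backend)]
print(ids)
```

Because `iterate_all` is a generator, an export can write each hit as it arrives instead of collecting the whole result set first.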

This article covers the implementation of a search with Elasticsearch in a Symfony project.

Install Elasticsearch

This article is the second one about Elasticsearch.
Its goal is not to explain what Elasticsearch is; you can read more about that on other blogs, like this one (in French). Instead, it will take you to the heart of the matter and help you understand the following articles.

We have recently worked on search engines in different domains:

  • 2 multi-criteria front office search engines (for frontend users)
  • 2 multi-criteria back office search engines (for administrators)
  • one statistical engine (computations, aggregations, sums, means, …)

Building these engines on a database would have been too costly (in development time and in performance) and would have given only approximate results (for both statistics and full-text search).

For these reasons, we decided to use Elasticsearch, a search engine based on Lucene. Here is the result of our adventures: 11 articles (including this one):

I recently had to export a huge set of data to a CSV file. This is easy and fast to do if you don’t care about memory and user experience. I wanted memory consumption to stay flat as the volume of data grows.

I got inspiration from this post (in French) but, despite what the post says, some tests with the memory_get_usage function showed that memory consumption increased quickly with the amount of data.
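
The general idea of a constant-memory export can be sketched like this, in Python rather than the PHP of the post: rows are produced lazily and written one at a time, so the full data set never sits in memory. The `fetch_rows` data source and the column names are hypothetical.

```python
import csv
import io


def fetch_rows(total, batch_size):
    """Hypothetical data source: produces rows lazily, batch by batch."""
    for start in range(0, total, batch_size):
        for n in range(start, min(start + batch_size, total)):
            yield (n, f"item {n}")


def export_csv(rows, output):
    """Write rows to a file-like object one at a time; return the row count."""
    writer = csv.writer(output)
    writer.writerow(["id", "label"])  # header line
    count = 0
    for row in rows:
        writer.writerow(row)  # nothing is accumulated between iterations
        count += 1
    return count


# A StringIO keeps the example self-contained; a real export would write
# to a file or directly to the HTTP response stream.
buffer = io.StringIO()
written = export_csv(fetch_rows(total=5, batch_size=2), buffer)
print(written)
```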