The type called another_type and the index called another are shown to emphasize that Elasticsearch is multi-tenant, meaning that a single server can store multiple indexes and multiple types.
In the Elasticsearch documentation and related material, we often see the term “mapping type”, which is actually the name of the type inside the index, such as my_type and another_type in the figure above. When we talk about types in Elasticsearch, it is usually this definition of type. It is not to be confused with the type key inside each mapping definition that determines how the data inside the documents are handled by Elasticsearch.
Elasticsearch has the ability to be schema-less, which means that documents can be indexed without explicitly providing a schema.
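If you do not specify a mapping, Elasticsearch will by default generate one dynamically from the fields it detects in the documents being indexed. The detected types are not always the ones we want, however. A timestamp that ends up mapped as a string, for example, prevents the date histogram facet from working properly.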
By explicitly specifying the schema, we can avoid these problems.
The mapping is usually provided to Elasticsearch as JSON. It is structured hierarchically, and its root is the name of the type the mapping applies to.
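To make the hierarchy concrete, here is a minimal sketch that builds such a mapping as a Python dictionary and prints it as JSON. The field names title and created_at are made up for illustration, and the field type names (string, date) assume an Elasticsearch version that still has mapping types and facets.

```python
import json

# Hypothetical mapping for my_type; the field names are illustrative only.
mapping = {
    "my_type": {                  # root: the name of the mapping type
        "properties": {           # the fields of documents of this type
            "title": {"type": "string"},
            # Declared explicitly so it is not left to dynamic detection.
            "created_at": {"type": "date"},
        }
    }
}

print(json.dumps(mapping, indent=2))
```

Declaring created_at as a date up front is the kind of explicit choice that keeps date-based features such as the date histogram facet usable.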
At the root level of the mapping, right under the type name, Elasticsearch supports a few “special” fields to configure how we should treat metadata that is not part of the document being posted, such as its type, id and size. Elasticsearch also supports field types such as geo_point and ip, which can be used to effectively index and search geographical locations and IPv4 addresses respectively. Using the multi_field type, we can even index a single document field into multiple virtual fields. We’ll elaborate on this in a future article.
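As a rough sketch of how these pieces can sit together in one mapping, the Python snippet below builds a mapping with two root-level metadata settings and three fields. The field names (location, client_ip, name, raw) are hypothetical, and the option names follow the syntax of older Elasticsearch releases as I remember it, so check them against the documentation for your version before relying on them.

```python
import json

# Illustrative only: field names are invented and options assume an older release.
mapping = {
    "my_type": {
        # "Special" root-level fields that configure document metadata.
        "_source": {"enabled": True},
        "_size": {"enabled": True},
        "properties": {
            "location": {"type": "geo_point"},   # geographical location
            "client_ip": {"type": "ip"},         # IPv4 address
            # One document field indexed into two virtual fields.
            "name": {
                "type": "multi_field",
                "fields": {
                    "name": {"type": "string", "index": "analyzed"},
                    "raw": {"type": "string", "index": "not_analyzed"},
                },
            },
        },
    }
}

body = json.dumps(mapping, indent=2)
print(body)
```

With a mapping along these lines, name would be searchable as analyzed text while name.raw keeps the exact value.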
There are two ways of providing a mapping to Elasticsearch. The most common way is during index creation:
curl -XPOST ...:9200/my_index -d '{
"settings" : {
# .. index settings
},
"mappings" : {
"my_type" : {
# mapping for my_type
}
}
}'
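If you prefer to generate the request body rather than write it by hand, a small helper along these lines could do it. This is only a sketch: create_index_body is a hypothetical helper, not part of any Elasticsearch client, and the settings and field names are example values.

```python
import json

def create_index_body(settings, mappings):
    """Return the JSON body for index creation: index settings plus one mapping per type."""
    return json.dumps({"settings": settings, "mappings": mappings}, indent=2)

# Example values only; substitute your own settings and mapping.
body = create_index_body(
    settings={"number_of_shards": 1},
    mappings={"my_type": {"properties": {"created_at": {"type": "date"}}}},
)
print(body)
```

The printed JSON can be passed to curl with -d, in place of the placeholder comments shown above.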
Another way of providing the mapping is using the Put Mapping API.
curl -XPUT 'http://localhost:9200/my_index/my_type/_mapping' -d '
{
"my_type" : {
# mapping for my_type
}
}
'
Note that the type (my_type) is duplicated in the request path and the request body.
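The duplication is easy to get wrong when writing requests by hand, so here is a small sketch that derives both the path and the body from a single type name. put_mapping_request is a hypothetical helper, it only builds the URL and body and sends nothing, and the created_at field is an example.

```python
import json

def put_mapping_request(base_url, index, doc_type, type_mapping):
    """Build the URL and JSON body for a Put Mapping request.

    The type name is used twice: once in the path and once as the body's root key.
    """
    url = "{}/{}/{}/_mapping".format(base_url.rstrip("/"), index, doc_type)
    body = json.dumps({doc_type: type_mapping})
    return url, body

url, body = put_mapping_request(
    "http://localhost:9200",
    "my_index",
    "my_type",
    {"properties": {"created_at": {"type": "date"}}},  # example field
)
print("PUT", url)
print(body)
```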
This API enables us to update the mapping for an already existing index, but with some limitations regarding potential conflicts. New mapping definitions can be added to the existing mapping, and existing fields may have their configuration updated, but changing the type of an existing field is considered a conflict and is not accepted. It is, however, possible to pass ignore_conflicts=true as a parameter to the Put Mapping API, but doing so does not guarantee the expected result, as already indexed documents are not re-indexed automatically with the new mapping.
Because of this, specifying the mapping during creation of the indexes is recommended over using the Put Mapping API in most cases.
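To get a feel for which updates are likely to be rejected, a mapping change can be checked on the client side before it is sent. The sketch below is my own illustration, not Elasticsearch's actual merge logic: it assumes that a conflict means an existing field being given a different type, and find_type_conflicts and the field names are hypothetical.

```python
def find_type_conflicts(existing_props, new_props):
    """Return the names of fields whose type differs between two "properties" dicts."""
    conflicts = []
    for field, new_def in new_props.items():
        old_def = existing_props.get(field)
        if old_def is None:
            continue  # a new field is an addition, not a conflict
        if old_def.get("type") != new_def.get("type"):
            conflicts.append(field)
    return sorted(conflicts)

existing = {"title": {"type": "string"}, "views": {"type": "string"}}
proposed = {"views": {"type": "integer"}, "created_at": {"type": "date"}}

print(find_type_conflicts(existing, proposed))  # prints ['views']
```

Here created_at is a plain addition, while views would change from string to integer and is therefore flagged.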
I have now introduced you to the schema/mapping in Elasticsearch and demonstrated how the mapping is a hierarchical definition of data types. In a later article, I will go into more detail about a workflow I use when I explore new datasets with Elasticsearch.