Finding Indices and Document Structure

Elasticsearch can seem a bit complicated at first, but once you know where to look, it becomes easy to find your indices, understand the document structure, and write efficient queries.

Let's look at it in two parts:
(1) how to find indices and their structure
(2) how to narrow a query

1. Finding Indices and Document Structure

List all indices

GET _cat/indices?v

This lists every index along with its health, status, document count, and size on disk.
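If you save that output to a string, a few lines of Python can make it easier to scan. The following is only a sketch: `parse_cat_table` is a hypothetical helper name, and the sample text is made up to show the general shape of the table, not taken from a real cluster.

```python
def parse_cat_table(text):
    """Turn the whitespace-aligned output of `GET _cat/indices?v` into dicts.

    Assumes every column has a value on every row; rows with empty
    columns (for example closed indices) would need extra handling.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Made-up sample output, only to illustrate the format.
sample = """\
health status index  uuid   pri rep docs.count docs.deleted store.size pri.store.size
green  open   users  aaa111   1   1       1200            3      1.1mb          560kb
yellow open   orders bbb222   1   1      98000           12     45.2mb         45.2mb
"""

rows = parse_cat_table(sample)

# Indices whose health is anything other than green deserve a closer look.
not_green = [row["index"] for row in rows if row["health"] != "green"]
print(not_green)
```

Because the column names come from the header row, the same helper should work if your version prints additional columns.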

View the index mapping (structure)

This is the most important step:

GET <your_index_name>/_mapping

Here you will find:

Field names
Data types (text, keyword, date, integer)
Nested object structure

Example (simplified; the real response wraps this in the index name and a "mappings" key):

{
  "properties": {
    "name": { "type": "text" },
    "email": { "type": "keyword" },
    "createdAt": { "type": "date" }
  }
}
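Real mappings are often deeply nested, so it can help to flatten them into dotted field paths. The sketch below assumes you pass in the `properties` object from the mapping response; `flatten_mapping` is a hypothetical helper, and the `address` and `name.keyword` entries are invented here to show nested objects and multi-fields.

```python
def flatten_mapping(properties, prefix=""):
    """Return {dotted.field.path: type} for a mapping's "properties" dict."""
    fields = {}
    for name, spec in properties.items():
        path = prefix + name
        # Object fields usually carry no explicit "type" in the mapping.
        fields[path] = spec.get("type", "object")
        # Recurse into the sub-fields of object and nested fields.
        if "properties" in spec:
            fields.update(flatten_mapping(spec["properties"], path + "."))
        # Multi-fields (for example name.keyword) live under "fields".
        for sub_name, sub_spec in spec.get("fields", {}).items():
            fields[f"{path}.{sub_name}"] = sub_spec.get("type", "unknown")
    return fields


# The example mapping from above, extended with made-up entries.
properties = {
    "name": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
    "email": {"type": "keyword"},
    "createdAt": {"type": "date"},
    "address": {
        "type": "nested",
        "properties": {"city": {"type": "keyword"}},
    },
}

flat = flatten_mapping(properties)
for path, field_type in sorted(flat.items()):
    print(f"{path}: {field_type}")
```

The flattened view makes it quick to check whether a field is `text` or `keyword` before you decide which query to use on it.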

Look at sample documents

GET <your_index_name>/_search
{
  "size": 5
}

This helps you understand:

What the actual data looks like
How each field is actually used
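To compare a handful of sample documents at a glance, you could summarize which fields appear and what kinds of values they hold. This is a rough sketch: `summarize_sources` is a hypothetical helper, and the response below is made-up data that only mirrors the `hits.hits[]._source` layout of a search response.

```python
def summarize_sources(response):
    """Map each top-level _source field to the Python value types seen in it."""
    seen = {}
    for hit in response["hits"]["hits"]:
        for field, value in hit.get("_source", {}).items():
            seen.setdefault(field, set()).add(type(value).__name__)
    return {field: sorted(types) for field, types in seen.items()}


# Made-up search response, trimmed to the parts used here.
response = {
    "hits": {
        "hits": [
            {"_id": "1", "_source": {"name": "John", "status": "ACTIVE", "age": 31}},
            {"_id": "2", "_source": {"name": "Ana", "status": "INACTIVE", "age": None}},
        ]
    }
}

summary = summarize_sources(response)
print(summary)
```

A field that shows more than one value type, or that is often null, is a good candidate to double-check against the mapping.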

Check field capabilities

GET <your_index_name>/_field_caps?fields=*

This shows:

Which fields exist
Their types, and whether each one is searchable and aggregatable
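The field capabilities response groups each field by type, which is handy when you want to know what you can aggregate on. A minimal sketch, assuming the usual `fields -> <field> -> <type> -> {searchable, aggregatable}` layout; `aggregatable_fields` is a hypothetical helper and the response is made-up sample data.

```python
def aggregatable_fields(field_caps):
    """List fields that are aggregatable under every type they are mapped as."""
    result = []
    for field, by_type in field_caps["fields"].items():
        if all(caps.get("aggregatable", False) for caps in by_type.values()):
            result.append(field)
    return sorted(result)


# Made-up response, trimmed to the keys used here.
field_caps = {
    "indices": ["users"],
    "fields": {
        "name": {"text": {"type": "text", "searchable": True, "aggregatable": False}},
        "email": {"keyword": {"type": "keyword", "searchable": True, "aggregatable": True}},
        "createdAt": {"date": {"type": "date", "searchable": True, "aggregatable": True}},
    },
}

usable = aggregatable_fields(field_caps)
print(usable)
```

A field can appear under more than one type when you query several indices at once, which is why the helper checks every type entry.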

If you use Kibana

You can also:

Index Management → inspect your indices
Discover tab → explore the data

2. Best strategy for narrowing a query

Core idea: filter first, then score

Use a `bool` query

{
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "john" } }
      ],
      "filter": [
        { "term": { "status": "ACTIVE" } },
        { "range": { "createdAt": { "gte": "now-7d" } } }
      ]
    }
  }
}

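Why this works well:

must → full-text search (contributes to the relevance score)
filter → yes/no conditions such as exact matches and ranges (no scoring, so faster, and the results can be cached)

Tip: put every condition that does not need to affect the score in filter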
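If you build queries in code, keeping scored and unscored clauses in separate lists makes this split hard to get wrong. The sketch below is one possible way to do it; `build_bool_query` is a hypothetical helper, not part of any client library.

```python
import json


def build_bool_query(scored, unscored):
    """Put scoring clauses in "must" and yes/no clauses in "filter"."""
    bool_body = {}
    if scored:
        bool_body["must"] = list(scored)
    if unscored:
        bool_body["filter"] = list(unscored)
    return {"query": {"bool": bool_body}}


body = build_bool_query(
    scored=[{"match": {"name": "john"}}],
    unscored=[
        {"term": {"status": "ACTIVE"}},
        {"range": {"createdAt": {"gte": "now-7d"}}},
    ],
)

# Serializing with json.dumps guarantees syntactically valid JSON.
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a `_search` request.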

Use the right query for the field type

Field Type	Query
`text`	`match`, `match_phrase`
`keyword`	`term`, `terms`
`date`	`range`
number	`range`, `term`

Not recommended (a full-text query on an exact-value `keyword` field):

{ "match": { "status": "ACTIVE" } }

Recommended:

{ "term": { "status": "ACTIVE" } }
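The table above can be turned into a small lookup so that the query type always follows the mapped field type. This is a simplified sketch under stated assumptions: `clause_for` is a hypothetical helper, it treats only `text` as a full-text type, and it infers `range` or `terms` from the shape of the value you pass.

```python
FULL_TEXT_TYPES = {"text"}


def clause_for(field, field_type, value):
    """Pick a query clause based on the mapped field type and the value shape."""
    if field_type in FULL_TEXT_TYPES:
        # Analyzed text: use a full-text query.
        return {"match": {field: value}}
    if isinstance(value, dict):
        # A dict such as {"gte": ..., "lte": ...} is treated as range bounds.
        return {"range": {field: value}}
    if isinstance(value, (list, tuple)):
        # Several exact values: match any of them.
        return {"terms": {field: list(value)}}
    # Single exact value on a keyword, number, or date field.
    return {"term": {field: value}}


print(clause_for("status", "keyword", "ACTIVE"))
print(clause_for("name", "text", "john"))
print(clause_for("createdAt", "date", {"gte": "now-7d"}))
print(clause_for("country", "keyword", ["DE", "FR"]))
```

You would feed `field_type` from the mapping rather than guessing it, which is exactly why looking at the mapping comes first.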

Shrink the dataset with filters

"filter": [
  { "term": { "country": "DE" } },
  { "range": { "price": { "lte": 100 } } }
]

Elasticsearch then works on fewer documents, so the query runs faster.

Use `_source` filtering

{
  "_source": ["name", "email"],
  "query": {
    "bool": {
      "must": [
        { "match": { "name": "john" } }
      ],
      "filter": [
        { "term": { "status": "ACTIVE" } },
        { "range": { "createdAt": { "gte": "now-7d" } } }
      ]
    }
  }
}

Skipping fields you do not need makes the response smaller and faster to transfer.
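When queries are assembled in code, `_source` can be added as a final step without touching the query itself. A minimal sketch: `with_source_filter` is a hypothetical helper, and it emits either the plain list form or the `includes`/`excludes` object form of `_source`.

```python
import json


def with_source_filter(body, includes=None, excludes=None):
    """Return a copy of a search body with a _source filter added."""
    new_body = dict(body)  # shallow copy so the caller's dict is untouched
    if includes and not excludes:
        # Short form: just the list of fields to return.
        new_body["_source"] = list(includes)
    elif includes or excludes:
        source = {}
        if includes:
            source["includes"] = list(includes)
        if excludes:
            source["excludes"] = list(excludes)
        new_body["_source"] = source
    return new_body


base = {"query": {"match": {"name": "john"}}}
slim = with_source_filter(base, includes=["name", "email"])
no_blob = with_source_filter(base, excludes=["description"])

print(json.dumps(slim))
print(json.dumps(no_blob))
```

Building the body as a dictionary and serializing it also avoids hand-written JSON slips such as a missing comma between keys.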

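Use aggregations to understand the data

{
  "size": 0,
  "aggs": {
    "status_count": {
      "terms": { "field": "status" }
    }
  }
}

This assumes `status` is mapped as `keyword`, as in the filters above. If it is a `text` field with a `keyword` sub-field, aggregate on `status.keyword` instead.

This is useful for:

understanding the data distribution
debugging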
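The buckets of a terms aggregation are easy to turn into a plain value-to-count table. The sketch below assumes the standard `aggregations -> <name> -> buckets` layout; `bucket_counts` is a hypothetical helper and the numbers are made up.

```python
def bucket_counts(response, agg_name):
    """Return {bucket key: doc_count} for a terms aggregation."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Made-up response, trimmed to the parts used here.
response = {
    "hits": {"hits": []},
    "aggregations": {
        "status_count": {
            "buckets": [
                {"key": "ACTIVE", "doc_count": 840},
                {"key": "INACTIVE", "doc_count": 150},
                {"key": "active", "doc_count": 10},
            ]
        }
    },
}

counts = bucket_counts(response, "status_count")
print(counts)
```

In this invented sample, the stray lowercase `active` bucket is the kind of data inconsistency that explains why an exact `term` filter on `ACTIVE` might return fewer documents than expected.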

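How to debug a query

Start with:

match_all

Add conditions one at a time and check how the hit count changes after each step.
To see why a specific document matched (or did not), send your query in the request body to:

GET <your_index>/_explain/<document_id>

Or add this to the search body to see where the time is spent:

"profile": true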
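The step-by-step approach can be scripted so that each intermediate query is generated for you. This is only a sketch: `narrowing_steps` is a hypothetical helper that starts from `match_all` and adds one clause per step; running each body against your index and comparing hit counts is left to you.

```python
def narrowing_steps(must, filters):
    """Yield search bodies from match_all to the full bool query, one clause at a time."""
    yield {"query": {"match_all": {}}}
    clauses = [("must", c) for c in must] + [("filter", c) for c in filters]
    bool_body = {}
    for kind, clause in clauses:
        bool_body.setdefault(kind, []).append(clause)
        # Copy the lists so earlier steps are not changed by later appends.
        yield {"query": {"bool": {k: list(v) for k, v in bool_body.items()}}}


steps = list(
    narrowing_steps(
        must=[{"match": {"name": "john"}}],
        filters=[
            {"term": {"status": "ACTIVE"}},
            {"range": {"createdAt": {"gte": "now-7d"}}},
        ],
    )
)

for number, body in enumerate(steps):
    print(number, body)
```

If the hit count drops to zero at a particular step, the clause added in that step is the one to inspect against the mapping.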

Pro Strategy (how professionals approach it)

Understand the mapping first
Look at real documents
Write queries that match the field types
Use filters as much as possible
Test performance

Optimized query example

{
  "_source": ["name", "email"],
  "query": {
    "bool": {
      "must": [
        { "match": { "description": "electric vehicle charging" } }
      ],
      "filter": [
        { "term": { "country": "DE" } },
        { "range": { "createdAt": { "gte": "now-30d" } } }
      ]
    }
  }
}