At PipeCandy, apart from a lot of ML and automation, we have a custom reporting and analysis team as well. This team, not being tech savvy, needs a way to query data from records of ~150 million people and ~10 million companies. The requirement was to perform partial search, exact search, fuzzy match, etc. on text data. We opted for Elasticsearch to do the job, using a cluster managed by AWS.

After getting familiar with the APIs and indexing, the question in front of the team was: "How big a cluster do we need?" I'll walk you through some basic math we did while selecting the infrastructure for Elasticsearch. One needs to account for four things: storage, computation, network, and data distribution.
Every Elasticsearch instance type has an upper cap on the disk you can use with it, so this calculation matters. Also, not all of the raw disk is available for your data: part of it is reserved for the operating system, and Elasticsearch needs free headroom of its own for internal operations such as segment merges.
So if you opt for a cluster with 100GB of space, you'll have no issues with read and write operations until you need more than 50GB (100 - 25 - 25) of space for your data. Index a percentage of your total data, measure the space it consumes, and use the same math to project the total space you'll consume on the cluster. Keep in mind that I haven't taken the replication factor into account.
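Here's a minimal sketch of that math in Python. The overhead split and the sample measurement are illustrative assumptions, not measured values; the replication factor is included since the text above leaves it out.

```python
# A minimal sketch of the storage math above, under assumed numbers.

RAW_DISK_GB = 100        # total disk on the cluster
OS_RESERVED_GB = 25      # assumed reserved for the OS and system services
ES_HEADROOM_GB = 25      # assumed kept free for Elasticsearch's own operations

usable_gb = RAW_DISK_GB - OS_RESERVED_GB - ES_HEADROOM_GB  # 50 GB

# Extrapolate from a sample: suppose indexing 1% of the records
# consumed 0.8 GB on disk (a hypothetical measurement).
sample_fraction = 0.01
sample_size_gb = 0.8
replication_factor = 1   # replica copies per shard

projected_gb = (sample_size_gb / sample_fraction) * (1 + replication_factor)
print(f"usable: {usable_gb} GB, projected need: {projected_gb:.0f} GB")
```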
The cost of computation will determine the CPU resource requirement. It depends on factors such as the complexity of your queries (fuzzy matching, for instance, is heavier than an exact term lookup), how frequently they are run, and the rate at which you index new documents.
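To make the cost differences concrete, here's a hedged sketch using the official Python client: the three query styles from our requirement (partial, exact, fuzzy), with the response's `took` field (server-side milliseconds, a standard Elasticsearch response field) as a rough cost signal. The endpoint, index, and field names are hypothetical.

```python
# A rough way to compare query cost: run each query style and read the
# response's "took" field. Index and field names here are hypothetical.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed endpoint

queries = {
    # Partial match: names starting with the given prefix.
    "partial": {"match_phrase_prefix": {"name": "pipe"}},
    # Exact match: the keyword field must equal the value verbatim.
    "exact": {"term": {"name.keyword": "PipeCandy"}},
    # Fuzzy match: tolerates small typos, at a higher CPU cost.
    "fuzzy": {"fuzzy": {"name": {"value": "pipcandy", "fuzziness": "AUTO"}}},
}

for label, query in queries.items():
    resp = es.search(index="companies", body={"query": query})
    print(label, "took", resp["took"], "ms")
```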
The network requirement will depend on the size of the result documents your queries return. Generally, high-compute machines will have better network bandwidth as well. One way to lower your network requirement is to ask for only those fields in the result that you actually use. This can be done via source filtering.
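Source filtering is a standard Elasticsearch feature; here's a minimal sketch, again with a hypothetical endpoint, index, and field names.

```python
# Ask only for the fields we actually use; each hit's "_source" then
# carries just those fields, shrinking the payload over the network.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed endpoint

resp = es.search(
    index="people",                             # hypothetical index
    body={
        "_source": ["name", "company_domain"],  # return only these fields
        "query": {"match": {"name": "ashutosh"}},
    },
)
for hit in resp["hits"]["hits"]:
    print(hit["_source"])
```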
The size of the cluster can be scaled vertically or horizontally. Assuming the cost is the same for the two scaling options, try for a horizontally scaled cluster if possible (more machines over higher-configuration machines) to ensure your data is replicated in multiple physical locations. This provides better fault tolerance and better search throughput, since your queries can be executed on all replicas in parallel (a minimal sketch of the relevant index settings closes this post). One should keep in mind that the OS, the JVM, and indexing will also have their own overheads, so the machine configuration should be big enough to take care of these.

I've discussed four points: storage, computation, network, and data distribution. If you think I've omitted some point which should be considered for resource calculation, please let us know in the comments. Feedback is welcome.

This post was written by Ashutosh. He's a part of our kickass tech team.
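As promised above, here's a minimal sketch of the index settings that drive data distribution. `number_of_shards` and `number_of_replicas` are standard Elasticsearch settings; the index name and the numbers themselves are illustrative assumptions.

```python
# Create an index whose data is spread across shards and copied to
# replicas, so searches can run on primaries and replicas in parallel.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed endpoint

es.indices.create(
    index="companies",  # hypothetical index name
    body={
        "settings": {
            "number_of_shards": 3,    # split the data across machines
            "number_of_replicas": 1,  # one extra copy of each shard
        }
    },
)
```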