Amazing Speed: Elasticsearch for the .NET Developer. Codestock, 2015

This presentation is from a talk I gave at Codestock 2015, which had over 900 attendees. Between 50 and 60 people attended this talk.

Here’s the text of the slides:
Amazing Speed:
Elasticsearch for the .NET Developer
4:05 PM / 301-D
Adrian Carr
With special guest: Shaunak Kashyap, Developer Advocate, Elastic.co
A little about me…
• Software Engineer & Team Lead
o The White Stone Group in Knoxville, TN.
• Previously
o Jewelry Television
o Alltel/Fidelity
o One terrible non-profit in Boone, NC that I don’t like to talk about.
• Several small startups

Experience:
• Full stack developer: C#, Ruby/Rails, JavaScript, Java, etc. “Business Problem Solver”

A little background…
• We have some large clients, with lots of patients.

Really slow sometimes…
• Some legacy database decisions that weren’t so good back in the day.
• Worked just fine when volume was lower.
• Re-architecting would mean changing too many apps.
• Maybe you just have too much data to search efficiently, or it isn’t structured for search.

I’ve seen this before…

• Jewelry Television
• At the time, a lot of the database design wasn’t very good, but it worked.

o ~$200 million when I started in 2005
o ~$550 million three years later

The JTV Solution:

• It was awesome!

I’m a Business Problem Solver…

So….

A little research…
I found Elasticsearch, which is built on top of Lucene.

I found:
Wikipedia

And here:
GitHub

And here:
StackOverflow

What is Elasticsearch?
Elasticsearch is a distributed, open source search and analytics engine, designed for horizontal scalability, reliability, and easy management.

So, I did a proof of concept

Terminology
• Node: DB Instance. (Java process running Elasticsearch)
• Cluster: Database cluster. One or more nodes with the same cluster name
• Index: Database, logical grouping of tables
• Type: Database Table
• Document: Like a row. JSON, key-value pairs
• Fields: Columns
• Shard: A slice of an index, spread across nodes so they can share the work. (Elasticsearch mostly handles these automagically)
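To make the analogy concrete, here is a minimal sketch in Python (standard library only) of how index, type, and document ID combine into the REST address of one document, as it worked in the Elasticsearch releases current at the time of this talk (later versions dropped types). The names `clinic`, `patient`, and the sample fields are hypothetical, and nothing is sent over the network.

```python
import json

def document_url(index, doc_type, doc_id, host="http://localhost:9200"):
    """Build the REST address of one document: /index/type/id."""
    return f"{host}/{index}/{doc_type}/{doc_id}"

# Hypothetical names, chosen only to mirror the database analogy.
url = document_url("clinic", "patient", 42)

# The "row" is a JSON object; its keys are the fields (the "columns").
document = {"first_name": "Ada", "last_name": "Lovelace", "city": "Knoxville"}
body = json.dumps(document)

print(url)
print(body)
```

Sending `body` to `url` with an HTTP PUT would store the document under that ID.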

Tools
• Elasticsearch (Obviously)
o Demo of how to install and run. (It’s easy.)

• Browser: http://localhost:9200/, etc.

• Sense: a fantastic Chrome plugin

• C# with Nest

How to Load & Synchronize Data?
• It depends…

This is what I did:
• SQL Server Triggers on data that matters
• Windows Service
o Nest, C#, Elasticsearch’s Bulk API (I found 10,000 rows per batch to be the sweet spot for insertion speed)
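As a rough illustration of the batching idea (not the actual Windows Service, which used Nest and C#), the sketch below splits rows into batches of 10,000 and serializes each batch in the newline-delimited format the Bulk API expects: an action line followed by the document itself. It is written in Python with the standard library only; the index name `clinic`, the type `patient`, and the row shape are hypothetical, and the `_type` entry reflects the Elasticsearch versions of that era.

```python
import json
from itertools import islice

def batches(rows, size=10_000):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def bulk_body(rows, index, doc_type, id_field="id"):
    """Serialize rows as a Bulk API payload, one JSON object per line."""
    lines = []
    for row in rows:
        action = {"index": {"_index": index, "_type": doc_type, "_id": row[id_field]}}
        lines.append(json.dumps(action))  # what to do, and where
        lines.append(json.dumps(row))     # the document itself
    return "\n".join(lines) + "\n"        # the payload must end with a newline

# Hypothetical sample data standing in for rows read from SQL Server.
rows = ({"id": n, "last_name": f"Patient{n}"} for n in range(25_000))
payloads = [bulk_body(chunk, "clinic", "patient") for chunk in batches(rows)]
print(len(payloads))
```

Each payload would then be sent as the body of a POST to the `_bulk` endpoint. The best batch size depends on document size and hardware, so 10,000 is a starting point to measure against rather than a rule.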

Gotchas
• Elasticsearch.net vs Nest
• Documentation on Elasticsearch is fantastic. Documentation on Nest is very sparse.

More Gotchas
• It’s a different way of thinking.
o Not RDBMS
o Joins? Nope
• Case sensitivity
• Index size: disk space can grow a lot
• Integration with existing applications. Are your existing users going to have their cheese moved?
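One common way case sensitivity bites (an assumption about what this slide refers to): the default analyzer lowercases text when it is indexed, while an exact term lookup does not analyze the search input, so a capitalized search term finds nothing. The toy model below imitates that behavior in plain Python. It is not Elasticsearch code, and `analyze`, `term_lookup`, and `match_lookup` are hypothetical names.

```python
def analyze(text):
    """Toy stand-in for an analyzer: lowercase, then split on whitespace."""
    return text.lower().split()

# Build a tiny inverted index: term -> set of document ids.
docs = {1: "Knoxville Tennessee", 2: "Boone North Carolina"}
inverted = {}
for doc_id, text in docs.items():
    for term in analyze(text):
        inverted.setdefault(term, set()).add(doc_id)

def term_lookup(term):
    """Exact lookup: the input is NOT analyzed, so case must match the index."""
    return inverted.get(term, set())

def match_lookup(text):
    """Analyzed lookup: the input goes through the same analyzer as the documents."""
    hits = set()
    for term in analyze(text):
        hits |= term_lookup(term)
    return hits

print(term_lookup("Boone"))   # no hits: the index only holds "boone"
print(match_lookup("Boone"))  # finds document 2
```

The practical takeaway is to know which of your queries are analyzed and which are not, and to lowercase the input yourself when using the exact kind.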

There is much more to know…

• Analyzers: these can be very complex
• Scaling: Elasticsearch has a lot built in, but it can be tuned quite a bit.
• Field weighting: relevance of a blog post’s title vs body vs comments
• This is just a start.
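To give a flavor of field weighting, here is a minimal sketch that builds a `multi_match` query body in which each field name carries a `^boost` suffix. It is plain Python that only produces the JSON; the field names `title`, `body`, and `comments` and the boost values are hypothetical and would need tuning against real data.

```python
import json

def weighted_query(text, boosts):
    """Build a multi_match query body; `field^N` weights matches in that field by N."""
    fields = [f"{name}^{boost}" for name, boost in boosts.items()]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}

# Hypothetical weights: a hit in the title counts most, a hit in the comments least.
body = weighted_query("elasticsearch", {"title": 3, "body": 1, "comments": 0.5})
print(json.dumps(body, indent=2))
```

The resulting JSON can be pasted into Sense as the body of a `_search` request.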

Questions?
• Code samples will eventually be at http://adriancarr.com
• Contact:
o adriancarr@gmail.com
o shaunak@elastic.co
o https://discuss.elastic.co/

Thank You!
