This presentation is from a talk I gave at Codestock 2015, which had over 900 attendees. Between 50 and 60 people attended this talk.
Here’s the text of the slides:
Elasticsearch for the .NET Developer
4:05 PM / 301-D
With special guest: Shaunak Kashyap, Developer Advocate, Elastic.co
A little about me…
• Software Engineer & Team Lead
o The White Stone Group in Knoxville, TN.
o Jewelry Television
o One terrible non-profit in Boone, NC that I don’t like to talk about.
• Several small startups
A little background…
• We have some large clients, with lots of patients.
Really slow sometimes…
• Some legacy database decisions that weren’t so good back in the day.
• Worked just fine when volume was lower.
• Re-architecting would mean changing too many apps.
• Maybe you just have too much data to search efficiently, or it isn’t structured for search.
I’ve seen this before…
• Jewelry Television
• At the time, a lot of the database design wasn’t very good, but it worked.
o ~$200 million when I started in 2005
o ~$550 million three years later
The JTV Solution:
• It was awesome!
I’m a Business Problem Solver…
A little research…
I found Elasticsearch, built on top of Lucene.
What is Elasticsearch?
Elasticsearch is a distributed, open source search and analytics engine, designed for horizontal scalability, reliability, and easy management.
So, I did a proof of concept
• Node: DB Instance. (Java process running Elasticsearch)
• Cluster: Database Cluster. One or more nodes with the same cluster name
• Index: Database, logical grouping of tables
• Type: Database Table
• Document: Like a row. JSON, key-value pairs
• Fields: Columns
• Shard: A slice of an index (each shard is its own Lucene index) that gets spread across the nodes. (Elasticsearch mostly handles this automagically.)
• Elasticsearch (Obviously)
o Demo of how to install and run. (It’s easy.)
• Browser http://localhost:9200/, etc.
• Sense: a fantastic Chrome plugin
• C# with Nest
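To make the terminology above concrete, here is a minimal sketch of how an index, a type, and a document ID combine into the URL for a single document on a local node. It is written in plain Python rather than C# with Nest, only to show the shape of the request. The index name "clinic", the type name "patient", the field names, and the helper document_url are all hypothetical. The type segment reflects the Elasticsearch API as it stood around the time of this talk; later versions removed types.

```python
import json

# Address of a local node, as in the browser example above.
BASE_URL = "http://localhost:9200"


def document_url(index, doc_type, doc_id):
    """Build the REST path for one document: /{index}/{type}/{id}."""
    return f"{BASE_URL}/{index}/{doc_type}/{doc_id}"


# A document is a JSON object; its keys are the fields ("columns").
# These field names and values are made up for illustration.
patient = {"first_name": "Ada", "last_name": "Lovelace", "city": "Knoxville"}

url = document_url("clinic", "patient", 42)  # hypothetical index and type
body = json.dumps(patient)

print(url)
print(body)
```

Sending body to url with an HTTP PUT would store the document, and a GET on the same URL would read it back.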
How to Load & Synchronize Data?
• It depends…
This is what I did:
• SQL Server Triggers on data that matters
• Windows Service
o Nest, C#, and Elasticsearch’s Bulk API. (I found 10,000 rows at a time to be the sweet spot for insertion speed.)
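As a rough sketch of the batching idea (not the actual Windows Service code, which used C# and Nest), the Python below splits rows into groups of 10,000 and builds the newline-delimited JSON body that the _bulk endpoint accepts. The function names batches and bulk_body, the "clinic" index, the "patient" type, and the row shape are assumptions made for illustration. The "_type" key matches the type concept used in this talk; newer Elasticsearch versions no longer accept it.

```python
import json

BATCH_SIZE = 10_000  # the batch size reported above as the sweet spot


def batches(rows, size=BATCH_SIZE):
    """Yield successive lists of at most `size` rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


def bulk_body(rows, index, doc_type):
    """Build a _bulk request body: one action line, then one source line,
    per document, each terminated by a newline."""
    lines = []
    for row in rows:
        action = {"index": {"_index": index, "_type": doc_type, "_id": row["id"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(row))
    return "\n".join(lines) + "\n"  # the body must end with a newline


# Hypothetical rows standing in for what the SQL Server triggers flagged.
rows = ({"id": i, "name": f"row {i}"} for i in range(25_000))
sizes = [len(bulk_body(b, "clinic", "patient").splitlines()) // 2
         for b in batches(rows)]
print(sizes)  # [10000, 10000, 5000]
```

Each body would be sent as one POST to /_bulk, so 25,000 changed rows become three requests instead of 25,000.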
• Elasticsearch.net vs Nest
• Documentation on Elasticsearch is fantastic. Documentation on Nest is very sparse.
• It’s a different way of thinking.
o Not RDBMS
o Joins? Nope
• Case sensitivity
• Index size: disk space can grow a lot
• Integration with existing application. Are your existing users going to have their cheese moved?
There is much more to know…
• Analyzers: These can be very complex
• Scaling: Elastic has a lot built in, but it can be tuned quite a bit.
• Field weighting: Relevance of blog post vs body vs comments
• This is just a start.
• Code samples will eventually be at https://adriancarr.com
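Until those samples are posted, here is one hedged illustration of the field weighting point above: a multi_match query body in which each field carries a boost, written as field^boost. The field names (title, body, comments), the boost values, and the helper weighted_query are hypothetical, and the sketch is plain Python rather than C# with Nest.

```python
import json


def weighted_query(text, boosts):
    """Build a multi_match query body where each field is boosted
    using the field^boost syntax."""
    fields = [f"{field}^{boost}" for field, boost in boosts.items()]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


# A match in the title counts more than one in the body,
# and one in the comments counts less.
query = weighted_query("elasticsearch", {"title": 3, "body": 1, "comments": 0.5})
print(json.dumps(query, indent=2))
```

The resulting JSON can be pasted into Sense and run against an index's _search endpoint to see how the boosts change the ordering of results.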