Good advice on how to implement micro-services on .NET
published on 2022/10/09
We've been doing services/micro-services for many years now and have learnt by getting many things wrong and, eventually, some things right. I'm not aware of a single resource that covers everything, but here are some high-level things to look into:
- BEWARE shared libraries. It's tempting to stick things into shared libraries rather than duplicate code, but this is a common pitfall that can take years to correct once established. Business logic belongs in services. Be careful about locking the entire system into one set of dependencies.
- Duplication between services is OK. Don't jump to stick things into libraries in the early days.
- Understand and accept the fallacies of distributed computing.
- Do create/maintain service templates that get the basics right and let people spin up services easily. Don't make these overly complex, just the basics.
- Beware of using things like NServiceBus/MassTransit etc. to communicate between services. You are going to lock yourself into them hard: major version upgrades become system-wide, and only .NET can communicate with your system. We ended up having to do a rewrite to get out of this situation.
- Services should expose simple APIs and rely on plain contracts/protocols as much as possible.
- Internally, services can use whatever they want (Rabbit, SQS, Kafka, Akka) but shouldn't let this bleed over onto other services. There are occasional exceptions to this, however.
- Avoid creating official service clients. Tempting, but it often leads to logic/behavior drifting into the client libraries which makes it impossible to onboard another tech stack. Protocols and contracts!
- Monitoring: have a comprehensive strategy for collecting and alerting on:
  - Logging
  - Metrics
  - Tracing
  - Service health
  - Performance (error rates, latency, status codes, trends)
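To make the "protocols and contracts" point from the list above a bit more concrete, here is a minimal sketch of a versioned JSON message that a producer and a consumer agree on without sharing a client library. It is written in Python only to stress that the contract is language-neutral; the `order.placed` event, its field names, and the helper functions are all hypothetical and not from any real system.

```python
import json
from dataclasses import dataclass

# Hypothetical contract: the event type and field names are illustrative
# assumptions. The contract is the JSON shape, not this code.
SUPPORTED_VERSIONS = {1}


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    amount_cents: int


def encode_order_placed(order_id: str, amount_cents: int) -> str:
    """Producer side: emit plain JSON that any tech stack can read."""
    return json.dumps({
        "type": "order.placed",
        "version": 1,
        "data": {"order_id": order_id, "amount_cents": amount_cents},
    })


def decode_order_placed(raw: str) -> OrderPlaced:
    """Consumer side: check type and version, read only the fields it needs."""
    msg = json.loads(raw)
    if msg.get("type") != "order.placed":
        raise ValueError(f"unexpected message type: {msg.get('type')!r}")
    if msg.get("version") not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported version: {msg.get('version')!r}")
    data = msg["data"]
    # Unknown extra fields are ignored so the producer can add to the payload
    # without forcing every consumer to upgrade at the same time.
    return OrderPlaced(order_id=str(data["order_id"]),
                       amount_cents=int(data["amount_cents"]))
```

Each service owns its own small encoder/decoder like this. That is deliberate duplication, and it keeps behavior out of a shared client that every team would otherwise have to upgrade together.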
Resilience
- Retries are critical.
- Understand when you can and when you can't retry (idempotency).
- Strongly favor idempotency because transient errors/timeouts will happen and you will have to retry.
- Try to push most resilience down to the infrastructure level (load balancing/failover, retries, circuit breaking). Service meshes have most of this functionality built in.
- Beyond simple retries, you will need backoffs and then replays. Many people (including us) rely on queues + error/dead letter queues or streams to do this (eventual consistency).
- Transactions/Consistency: This is a major one for people moving from a monolith, where you typically have one DB and one transaction. Things are now spread over the network and might involve multiple services/DBs/cloud services that can't participate in the same transaction. For many things, you need to move to eventual consistency with retries.
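As a rough illustration of how the retry, backoff, idempotency, and dead-letter points above fit together, here is a minimal in-memory sketch. Everything in it is an assumption made for the example: `TransientError`, `PaymentHandler`, `process_with_retry`, and the message shape are hypothetical names, the "dead letter queue" is just a list, and the set of processed ids would be a durable store in a real service.

```python
import random
import time


class TransientError(Exception):
    """An error worth retrying (timeout, connection reset, and so on)."""


def backoff_delays(retries, base=0.1, cap=5.0):
    """Yield one exponential delay with full jitter per allowed retry."""
    for n in range(retries):
        yield random.uniform(0, min(cap, base * 2 ** n))


class PaymentHandler:
    """Idempotent handler: a message id that was already applied is skipped."""

    def __init__(self, charge):
        self._charge = charge      # callable(account, amount), may raise TransientError
        self._processed = set()    # assumption: durable storage in practice

    def handle(self, message_id, account, amount):
        if message_id in self._processed:
            return  # duplicate delivery or retry after success: do nothing
        self._charge(account, amount)
        self._processed.add(message_id)


def process_with_retry(handler, message, dead_letters, max_retries=3, sleep=time.sleep):
    """Try once, retry with backoff, then park the message for later replay."""
    delays = backoff_delays(max_retries)
    while True:
        try:
            handler.handle(message["id"], message["account"], message["amount"])
            return True
        except TransientError:
            delay = next(delays, None)
            if delay is None:
                dead_letters.append(message)  # retries exhausted
                return False
            sleep(delay)
```

Note that in this sketch the charge and the "processed" marker are not written atomically, so a crash between the two could still cause a duplicate. A real implementation would need to close that gap, for example by recording the id in the same transaction as the effect.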
Development/Testing
- Ensure services can be developed and tested in isolation and that this experience is good. Requiring the whole system to run in local dev is a scaling time bomb. It's very hard to fix this once it's set in.
- You are going to need a good mocking strategy.
- You will want to lean more on isolated integration tests (automated tests that poke at a single service while it's running).
There's probably more, but off the top of my head, those are the most common headaches.
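One way to picture the isolated testing advice above is to run the code under test against a tiny in-process stub of its dependency, so nothing else from the system has to be running. This is only a sketch using the Python standard library; the inventory service, its `/stock/<sku>` route, and the `can_fulfil` function are hypothetical names invented for the example.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubInventory(BaseHTTPRequestHandler):
    """Stand-in for a hypothetical inventory service; returns canned JSON."""

    def do_GET(self):
        sku = self.path.rsplit("/", 1)[-1]
        body = json.dumps({"sku": sku, "in_stock": 3}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


def can_fulfil(host, port, sku, quantity):
    """Code under test: asks the inventory dependency over plain HTTP."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", f"/stock/{sku}")
        payload = json.loads(conn.getresponse().read())
    finally:
        conn.close()
    return payload["in_stock"] >= quantity


def run_isolated_check():
    """Start the stub on a free port, exercise the logic, then shut down."""
    server = HTTPServer(("127.0.0.1", 0), StubInventory)  # port 0: OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return can_fulfil(host, port, "abc", 2), can_fulfil(host, port, "abc", 5)
    finally:
        server.shutdown()
        server.server_close()
```

Because the stub speaks the same wire contract as the real dependency, the test exercises real HTTP and JSON handling, while the canned stock level of 3 keeps the outcome deterministic.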
This is sensible advice.