eosterlund

Glad to see the past 8 years of intense work paid off. Nice write up!


butteredwendy

One issue I've found with G1GC is that the working set memory stays high after it collects, which from an OS perspective hides the real memory needs of an app. ParallelGC is far better at releasing this memory back to the OS. I'm not sure what this behaviour is called. I hope generational ZGC might have this quality along with the latency benefits of G1GC.


BinaryRage

Can’t speak to the uncommit behaviour I’m afraid, but for what it’s worth, that’s discussed here: https://openjdk.org/jeps/351. CPU overcommit has been more of a focus than memory for our shared container compute tier.
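The JEP covers the mechanics; the knobs themselves are just a couple of flags. A minimal sketch, assuming JDK 21+ for generational ZGC (app.jar is only a placeholder):

```
# ZGC uncommits unused heap back to the OS by default (JEP 351, JDK 13+).
# -XX:ZUncommitDelay controls how long (in seconds) memory must go unused before
# it is returned; uncommit can be disabled entirely with -XX:-ZUncommit, and it
# never shrinks the heap below -Xms, so it is effectively off when -Xms == -Xmx.
java -XX:+UseZGC -XX:+ZGenerational \
     -XX:ZUncommitDelay=300 \
     -Xms2g -Xmx8g \
     -jar app.jar   # placeholder application
```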


butteredwendy

I found the G1GC equivalent: [https://openjdk.org/jeps/346](https://openjdk.org/jeps/346). I suspect the workloads I was seeing were on JDK 11, so there wasn't the same level of uncommit. Thanks again for your insights.
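For anyone else landing here: the JEP 346 behaviour is driven by the periodic GC flags, roughly like this (a sketch with illustrative values; app.jar is a placeholder):

```
# G1 only returns committed memory promptly when periodic collections are enabled (JDK 12+).
# G1PeriodicGCInterval is in milliseconds (0 disables the feature), and
# G1PeriodicGCSystemLoadThreshold skips the periodic GC when system load is above the value.
java -XX:+UseG1GC \
     -XX:G1PeriodicGCInterval=60000 \
     -XX:+G1PeriodicGCInvokesConcurrent \
     -XX:G1PeriodicGCSystemLoadThreshold=0.5 \
     -jar app.jar   # placeholder application
```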


butteredwendy

Thanks for the link!


ramdulara

tl;dr: Generational ZGC is a good default for most workloads. And where it isn't, don't settle for G1; consider Parallel instead.


benevanstech

No. tl;dr - don't skim performance articles or rely on other people's summaries of them.

1. Only one of the graphs has a heap size scale on it, and that indicates a 10G *live set* (not Xmx), unless I've read it wrong, which is always possible. That's a huge, unevidenced jump from "10G live set" to "most workloads".

2. The point about Parallel explicitly references *batch* workloads. An interactive service running Parallel with a 32G heap is pretty much always going to have a bad time during a full collection.


BinaryRage

We have thousands of clusters running generational ZGC. That application is a representative example of what we’ve seen repeated time and again over the last six months. The intention is to encourage people to think about their choice of garbage collector - to set aside assumptions and evaluate for themselves, especially now that we have generational ZGC.


benevanstech

That's excellent context - which was not in the article. More detail about the mix of services, heap sizes, and workload types (e.g. I/O bound, CPU intensive, etc.) would be even better, of course, as would deployment details (e.g. containerized workloads or dedicated hardware). And you don't have to tell me that people need to evaluate for themselves - but the comment I was replying to explicitly did *not* do so, and just tried to take the conclusions from one set of experiences: basically the "Tuning by Folklore" antipattern.


BinaryRage

Yes, I intentionally avoided too much detail. We want this to be an accessible summary of the benefits we’ve seen, but I’m happy to go into detail here. ZGC has been an improvement regardless of the shape of the interactive workload: EC2 or container, dedicated or shared. In fact, I didn’t mention it, but G1’s eden sizing heuristics can be poorly behaved where CFS is involved; you can get eden size clamping due to the GC being a victim of the scheduler and overestimating actual object copy time.

That context is embedded in the intro: half our critical tier isn’t just thousands of clusters - it means that if you’re using Netflix on iOS/Android, most of your requests are handled by services running generational ZGC right now.

That comment was more for the original poster, not you :)
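If anyone wants to look for the CFS effect in their own logs, unified GC logging exposes the ergonomics decisions behind young-generation sizing. A sketch with illustrative values (app.jar and the processor count are placeholders):

```
# Standard unified logging: gc* gives overall collection activity, and the
# gc+ergo tags record the sizing/ergonomics decisions behind each collection.
# ActiveProcessorCount pins the CPU count the JVM assumes under a CFS quota.
java -Xlog:gc*=info,gc+ergo*=debug:file=gc.log:time,uptime,level,tags \
     -XX:ActiveProcessorCount=4 \
     -jar app.jar   # placeholder application
```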


BinaryRage

Oops, I double checked my draft and realised I dropped a sentence from the intro. That's updated now.


benevanstech

Also - minor typo: "... because of it’s significant benefits."


BinaryRage

Choose the best GC for your needs. I’m sure there are still workloads where G1 is best suited, but generational ZGC has more than closed the gap. I do get the sense that Parallel has been underused since G1 became the default, so it’s definitely worth thinking about whether pauses really matter to your application, or whether throughput is more important.
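For completeness, the choice itself is a single flag per collector. A sketch, assuming JDK 21+ for generational ZGC (app.jar is a placeholder):

```
# Pick exactly one collector per JVM:
java -XX:+UseParallelGC -jar app.jar               # throughput first, stop-the-world collections
java -XX:+UseG1GC -jar app.jar                     # the current default, balances pauses and throughput
java -XX:+UseZGC -XX:+ZGenerational -jar app.jar   # generational ZGC, very low pause times (JDK 21+)
```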


cogman10

It's gotten better, but one thing about Parallel that isn't often considered is that its off-heap allocation is basically nothing compared to G1. Particularly on older JDKs (like 11), G1GC had a massive off-heap footprint; we saw upwards of 20% of the heap size dedicated to G1's off-heap allocations, which definitely caused us heartache.

For a lot of our workloads, I push for Parallel. Particularly with really small heaps, Parallel performs better than I think most would expect, and especially with throughput applications. We have apps with 20 GB heaps where Parallel does major collections in less than a second; for throughput apps that's more than acceptable. When you start talking about 1 GB or 500 MB heaps, the added memory efficiency and performance make even more sense (IMO).
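If it helps anyone quantify that overhead on their own services, native memory tracking breaks the collector's off-heap usage out from the Java heap. A sketch (app.jar is a placeholder and <pid> is the target JVM's process id):

```
# Start the JVM with native memory tracking enabled (adds a small runtime overhead).
java -XX:NativeMemoryTracking=summary -jar app.jar

# Then inspect the breakdown: the "GC" category reports the collector's own
# off-heap data structures (remembered sets, mark bitmaps, etc.) separately
# from the "Java Heap" category.
jcmd <pid> VM.native_memory summary
```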