Imagine you have 400GB of data on disk (SSD) representing data that is needed to service requests. Some of this access is rather random, but some of the data is very 'hot' and frequently accessed. The hot data is not specifically of any particular kind that you can easily partition off; it's simply that the access patterns are not random, but strongly skewed.

Reading this data into an on-heap LRU is a self-defeating proposition: the data set is significantly larger than RAM, and although this improves average application latency and throughput somewhat by avoiding work, garbage collection becomes very heavyweight, since an LRU by definition puts a lower bound on object lifetimes. With the throughput collector or CMS, this causes lots of thrashing in the young generation and survivor spaces. A larger heap makes GC times longer, not shorter, and eats into the OS's cache for the data on disk. A soft reference cache is better than an LRU, but suffers from many of the same problems on the GC side.

In this application, a weak reference cache for such data is a massive performance win for the throughput collector and (i)CMS. It incurs no extra GC overhead, and young generation collections are easy to keep below target latency goals - the throughput-to-GC-time tradeoff curve as a function of young generation size is smooth. (A minimal sketch of such a cache appears at the end of this post.)

The application is latency sensitive, but not terribly so - the median request is about 3ms. Tolerance at the 99th percentile is about 40ms, and ideally nothing should ever take more than 100ms. With the throughput collector the first two goals are met, but the last one is not - a full GC of about 2 seconds leads to high latency somewhere past the 99.99th percentile. (i)CMS has a more tunable young collector - by tuning the size of the eden and survivor spaces and how many bounces an object survives before tenure, less is tenured. (i)CMS lives much longer without a full GC than the throughput collector, but a full GC is over 10 seconds long, which is entirely unacceptable. G1GC just doesn't seem to like this workload at all.

For this application, the ideal would be the ParNew young collector (or an upgraded throughput young collector that has functional tuning of tenuring thresholds), backed by a tenured heap that can avoid full GCs. This tenured heap could be G1GC or a new concurrent collector. With what is available now, a ParNew grafted in front of a G1GC for tenured space would probably be great for this application. However, I don't think G1's design works with a separate young generation in front of it. I have little confidence that any voices outside of the ivory tower will be listened to. Granted, I haven't tried to use and tune G1GC for this in over a year and a half, and perhaps it is better now.

Let me state things more clearly, since I believe I have not explained it well: there are two concerns here. The first is long application pause times, caused by full GCs and, to a lesser extent, minor GCs. The second is overall application performance, accelerated by heavy use of caches with weakly referenced values. These are related, but depending on which collector we are talking about they may be more or less intertwined. The reason the weak reference caches work so well is that the objects in them are never tenured. The caches do not have poor GC side-effects for CMS or the throughput collector: it is _cheaper_ to throw these objects out and regenerate them as a consequence of minor GCs than to tenure them and collect them later.
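To make the weak-reference cache concrete, here is a minimal sketch of the pattern described above, assuming a loader function that rereads a value from SSD on a miss. The class and member names (`WeakValueCache`, `loader`) are illustrative, not taken from the actual application.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Cache whose values are only weakly reachable through the cache itself.
// Any minor GC may clear entries that nothing else references, so cached
// values never need to survive long enough to be tenured.
final class WeakValueCache<K, V> {
    private final ConcurrentMap<K, WeakReference<V>> map = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // regenerates a value, e.g. by rereading it from SSD

    WeakValueCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        WeakReference<V> ref = map.get(key);
        V value = (ref != null) ? ref.get() : null;
        if (value == null) {
            // Entry was never cached or was cleared by the collector: reload and re-cache.
            // Two threads may race here and both reload; for a cache that is harmless.
            value = loader.apply(key);
            map.put(key, new WeakReference<>(value));
        }
        return value;
    }
}
```

A production version would also purge cleared `WeakReference` entries from the map (typically via a `ReferenceQueue`), or simply use a library cache such as Guava's `CacheBuilder.newBuilder().weakValues()`. The detail that matters for the GC argument is only that values held this way die young: minor GCs reclaim them instead of the tenured collector.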
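For the (i)CMS young-generation tuning mentioned above, the relevant knobs are standard HotSpot flags. This is an annotated list rather than an exact command line, and the sizes and thresholds are assumptions for illustration, not the application's actual settings.

```
-Xms8g -Xmx8g                   # fixed overall heap size (illustrative)
-Xmn2g                          # young generation size (eden + survivor spaces)
-XX:SurvivorRatio=6             # ratio of eden to each survivor space
-XX:MaxTenuringThreshold=4      # how many minor-GC "bounces" an object survives before tenure
-XX:+UseConcMarkSweepGC         # CMS for the tenured generation
-XX:+CMSIncrementalMode         # the "i" in (i)CMS: incremental mode
-XX:+PrintTenuringDistribution  # log survivor ages at each minor GC, to guide the tuning above
```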