Garbage collection, in Java and in memory-managed environments more generally, is partly responsible for the huge productivity gains of the past several decades, as development has moved increasingly to such environments. This has come at a cost: less attention is paid to how much memory programs consume. Of course, the obvious answer is to just increase the VM heap size for the application. But sometimes reducing the memory footprint is crucial, for example on underpowered machines, or maybe you just want to cram more stuff into memory before you run out of address space.
Take note of the following facts, and you may be able to significantly reduce memory usage with only a few small tweaks to your code:
- In most JVMs, every object has an overhead of 2 words.
- References take up 4 bytes, since they are essentially pointers (this is true on 32-bit systems, and some 64-bit JVMs are able to compress references to 4 bytes as well).
- Objects are forced to align on 64-bit boundaries, which means that all objects occupy memory in multiples of 8 bytes.
- Arrays have an extra 4-byte overhead to store the length of the array. Yes, this means that an array can hold at most about 2 billion elements, roughly Integer.MAX_VALUE.
- An instance of an array is basically an object, even if it holds primitives.
- Unlike in C, each row of a multidimensional array is a separate object with its own memory overhead: it is an array of arrays. In C, a multidimensional array is stored as a single contiguous block and indexed with pointer arithmetic.
- All numerical data types in Java are signed, with the exception of char, which is an unsigned 16-bit value.
- Booleans — even primitive booleans — take up one byte each. The other 7 bits are wasted.
In other words: Java uses a shit ton of memory.
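The array-of-arrays item above is one of the easier ones to sidestep. As a minimal sketch (the class `FlatIntGrid` and its methods are hypothetical names, not from any library), a two-dimensional grid can be stored in a single `int[]` in row-major order, so you pay for one array header instead of one per row:

```java
/** A rows x cols grid of ints kept in one flat array (row-major order). */
final class FlatIntGrid {
    private final int rows;
    private final int cols;
    private final int[] cells; // a single array object for the whole grid

    FlatIntGrid(int rows, int cols) {
        if (rows < 0 || cols < 0) {
            throw new IllegalArgumentException("dimensions must be non-negative");
        }
        this.rows = rows;
        this.cols = cols;
        // multiplyExact throws ArithmeticException if rows * cols overflows an int
        this.cells = new int[Math.multiplyExact(rows, cols)];
    }

    /** Maps a (row, col) pair to its position in the flat array. */
    private int index(int row, int col) {
        if (row < 0 || row >= rows || col < 0 || col >= cols) {
            throw new IndexOutOfBoundsException("(" + row + ", " + col + ")");
        }
        return row * cols + col;
    }

    int get(int row, int col) {
        return cells[index(row, col)];
    }

    void set(int row, int col, int value) {
        cells[index(row, col)] = value;
    }
}
```

With this layout a 1000 x 1000 grid is one array object rather than 1001, at the cost of doing the index arithmetic yourself.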
You’ll most likely run into VM heap space problems when you have to deal with large sets of data, such as huge sparsely connected graphs. A common scenario: you have a big array of objects, and you eventually have to face the reality that you are probably wasting lots of memory on object headers and alignment padding.
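To put rough numbers on that waste, here is a back-of-the-envelope sketch that just applies the rules of thumb from the list above. The constants are assumptions (4-byte words and references, 8-byte alignment), the class and method names are made up for illustration, and the results are estimates rather than measurements from a real JVM:

```java
/** Rough size estimates built from the rules of thumb, not from a real JVM. */
final class FootprintEstimate {
    // Assumed values: 4-byte words and references, 8-byte alignment.
    static final long WORD_BYTES = 4;
    static final long HEADER_BYTES = 2 * WORD_BYTES; // 2-word object header
    static final long REFERENCE_BYTES = 4;
    static final long ARRAY_LENGTH_BYTES = 4;        // extra length field in arrays
    static final long ALIGNMENT_BYTES = 8;

    /** Rounds a size up to the next multiple of the alignment. */
    static long align(long bytes) {
        return (bytes + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    /** Estimated size of one object whose fields add up to fieldBytes. */
    static long objectBytes(long fieldBytes) {
        return align(HEADER_BYTES + fieldBytes);
    }

    /** Estimated size of an array of `length` elements, each elementBytes wide. */
    static long arrayBytes(long length, long elementBytes) {
        return align(HEADER_BYTES + ARRAY_LENGTH_BYTES + length * elementBytes);
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        // Hypothetical node with one int and one boolean: 5 bytes of payload.
        long perNode = objectBytes(4 + 1);
        // An array of references plus one separate object per element.
        long asObjects = arrayBytes(n, REFERENCE_BYTES) + n * perNode;
        // One int[] and one boolean[] holding the same payload.
        long asPrimitives = arrayBytes(n, 4) + arrayBytes(n, 1);
        System.out.println("array of objects:     " + asObjects + " bytes");
        System.out.println("arrays of primitives: " + asPrimitives + " bytes");
    }
}
```

Under these assumptions each 5-byte node is padded out to 16 bytes and costs another 4 bytes for its reference, so a million of them come to about 20 MB, against about 5 MB for the same data in two primitive arrays.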
The best advice I can offer in these cases is to try using arrays of primitives instead of arrays of objects containing primitives. After that, it’s time to consider more RAM or splitting the work across multiple processes.
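As an illustration of that advice, here is one possible shape for graph edges. This is a sketch, not a definitive design: `Edge` and `EdgeTable` are hypothetical classes, and the fixed capacity is an assumption made to keep the example short. The first class is the usual one-object-per-edge layout; the second keeps the same fields in parallel primitive arrays and hands out an `int` index where you would otherwise hold a reference:

```java
/** Array-of-objects layout: every edge is its own heap object. */
final class Edge {
    final int source;
    final int target;
    final float weight;

    Edge(int source, int target, float weight) {
        this.source = source;
        this.target = target;
        this.weight = weight;
    }
}

/** Same data as parallel primitive arrays: three array objects in total. */
final class EdgeTable {
    private final int[] sources;
    private final int[] targets;
    private final float[] weights;
    private int size; // number of edges stored so far

    EdgeTable(int capacity) {
        sources = new int[capacity];
        targets = new int[capacity];
        weights = new float[capacity];
    }

    /** Appends an edge and returns its index, which stands in for a reference. */
    int add(int source, int target, float weight) {
        if (size == sources.length) {
            throw new IllegalStateException("EdgeTable is full");
        }
        sources[size] = source;
        targets[size] = target;
        weights[size] = weight;
        return size++;
    }

    int source(int edge) { return sources[check(edge)]; }
    int target(int edge) { return targets[check(edge)]; }
    float weight(int edge) { return weights[check(edge)]; }
    int size() { return size; }

    /** Rejects indexes that do not refer to a stored edge. */
    private int check(int edge) {
        if (edge < 0 || edge >= size) {
            throw new IndexOutOfBoundsException("edge " + edge);
        }
        return edge;
    }
}
```

The trade-off is ergonomics: an edge is no longer something you can pass around as an object, only an index into the table, so this is worth doing mainly for the few collections that dominate your heap.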