The Size of Java Objects
As Java programmers, we don't necessarily think about how the objects our programs use map to the machine's underlying memory. The Java and JVM specifications do not specify the amount of memory an object takes up, and ordinary Java code does not let you do pointer arithmetic or directly manipulate an object's memory.
In fact, you could implement JVM objects as another language's map. That implementation would be slow and use an excessive amount of memory, but it would not violate the JVM spec.
In practice, when using HotSpot, you can reason about the size of an object from its source code. The OpenJDK project includes Java Object Layout (JOL), a tool that lets you examine and print the internal memory layout of an object. This post is not about how to use JOL, but the basic usage is quite straightforward. I've linked the code I've written for this post, and you can read the detailed samples linked from the JOL project page.
The most accurate way to understand the memory use of your application is to use controlled experiments and measurement in real-world scenarios. However, those approaches take orders of magnitude longer than napkin math. Remember: if it's worth doing, it's worth doing badly.
What follows is the sizes of common Java objects as measured on my system. The results generalize to most production x86_64 systems, but there are exceptions. Most importantly, a JVM using 32GB or more of heap can no longer use compressed references, which makes objects larger, and there are other configurations that can affect the results.[1]
These values may be surprising, but they reflect the requirement that an object's size must be a multiple of the processor's word size (8 bytes on x86_64).
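The rounding rule lends itself to napkin math. The sketch below is an estimate, not a measurement: it assumes a 12-byte object header (typical for HotSpot on x86_64 with compressed class pointers) and 8-byte alignment, and it ignores any padding the JVM inserts between fields. The names `ObjectSizeEstimate`, `alignUp`, and `estimate` are made up for illustration. JOL remains the authority on real layouts.

```java
/** Napkin-math estimate of an object's shallow size. Not a measurement. */
final class ObjectSizeEstimate {
    // Assumption: 12-byte header (mark word plus compressed class pointer).
    static final long HEADER_BYTES = 12;
    // Assumption: objects are aligned to 8 bytes, the HotSpot default on x86_64.
    static final long ALIGNMENT = 8;

    /** Rounds a raw byte count up to the next multiple of ALIGNMENT. */
    static long alignUp(long bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    /** Header plus the summed sizes of the instance fields, rounded up. */
    static long estimate(long fieldBytes) {
        return alignUp(HEADER_BYTES + fieldBytes);
    }

    public static void main(String[] args) {
        System.out.println(estimate(0)); // no fields, e.g. new Object(): 16
        System.out.println(estimate(4)); // one int field, e.g. Integer: 16
        System.out.println(estimate(8)); // one long field, e.g. Long: 24
    }
}
```

Under these assumptions a bare Object and an Integer both come out at 16 bytes, because the int field fits in the 4 bytes that would otherwise be padding.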
Arrays have a chunk of contiguous memory that contains either primitives or references to Java objects. An array requires the 16 bytes any object takes up, followed by storage for the values contained in it. In general, each element takes up the "logical size" of its type (1 byte for a byte, 4 for an int, 8 for a long), but booleans require an entire byte to ensure that reads and writes are atomic.
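As a rough sketch of that arithmetic, the estimate below adds the 16 bytes mentioned above to the element storage and rounds up to a multiple of 8. The 8-byte alignment is an assumption about a default HotSpot x86_64 configuration, and `ArraySizeEstimate` is a hypothetical helper, not part of JOL or the JDK.

```java
/** Napkin-math estimate of an array's size. Not a measurement. */
final class ArraySizeEstimate {
    // From the text: 16 bytes before the first element.
    static final long ARRAY_HEADER_BYTES = 16;
    // Assumption: 8-byte object alignment (HotSpot default on x86_64).
    static final long ALIGNMENT = 8;

    /** 16 bytes plus element storage, rounded up to a multiple of 8. */
    static long estimate(long length, long bytesPerElement) {
        long raw = ARRAY_HEADER_BYTES + length * bytesPerElement;
        return (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        System.out.println(estimate(10, Byte.BYTES));    // byte[10]: 32
        System.out.println(estimate(10, Integer.BYTES)); // int[10]: 56
        System.out.println(estimate(10, Long.BYTES));    // long[10]: 96
        System.out.println(estimate(10, 1));             // boolean[10], 1 byte each: 32
    }
}
```

For an array of references, the element size would be 4 bytes if compressed references are in use and 8 bytes otherwise; which applies depends on the JVM configuration.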
Strings are a wrapper around an array holding their characters. The size of a string ends up being 24 bytes for the String object and its three fields, plus the size of its underlying array.
Prior to Java 9, a string's characters were always stored in UTF-16 encoding (2 bytes per character). Starting with Java 9, the default storage for strings uses a Latin-1 encoding for compatible strings, falling back to UTF-16 for other strings. So, ASCII strings (and a few others) can be stored with 1 byte per additional character, while arbitrary strings require 2 bytes per character.[2]
| String | Size (chars) | Size (bytes) |
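The string sizes described above can be approximated in a few lines. This sketch assumes Java 9 or later with compact strings enabled (the default), a 16-byte array header, and 8-byte alignment; `StringSizeEstimate` and its methods are hypothetical names. It treats a string as Latin-1 when every char is at most 0xFF.

```java
/** Napkin-math estimate of a String's size on Java 9+. Not a measurement. */
final class StringSizeEstimate {
    // From the text: 24 bytes for the String object and its fields.
    static final long STRING_OBJECT_BYTES = 24;
    // From the text: 16 bytes before an array's first element.
    static final long ARRAY_HEADER_BYTES = 16;
    // Assumption: 8-byte object alignment.
    static final long ALIGNMENT = 8;

    /** True if every char fits in one Latin-1 byte. */
    static boolean fitsLatin1(String s) {
        return s.chars().allMatch(c -> c <= 0xFF);
    }

    /** String object plus its backing byte array. */
    static long estimate(String s) {
        long bytesPerChar = fitsLatin1(s) ? 1 : 2;
        long raw = ARRAY_HEADER_BYTES + s.length() * bytesPerChar;
        long array = (raw + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
        return STRING_OBJECT_BYTES + array;
    }

    public static void main(String[] args) {
        System.out.println(estimate("hello"));       // Latin-1, 5 chars: 48
        System.out.println(estimate("hello\u4e16")); // UTF-16, 6 chars: 56
    }
}
```

A single character outside Latin-1 switches the whole string to 2 bytes per character, which is why the second example pays for 12 bytes of character data rather than 7.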
Collections introduce some complications. Before, we only had to consider the space used up by a single object, and perhaps an inline array. ArrayLists work similarly, as they're essentially a wrapper around an array. However, a HashMap is slightly more complicated. It contains its own fields, and an array, which contains an entry object for each key/value inserted in the map.
If a collection object is backed by an array, it requires a strategy for sizing that array and growing it as needed. Depending on whether the array is full or not, the memory usage per element may vary significantly. ArrayList and HashMap both allow control over this process, via constructors and ArrayList#ensureCapacity(int).
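The sizing controls mentioned above look roughly like this in practice. The class and method names are hypothetical. `ArrayList#trimToSize()` is an additional standard method not discussed in the text, and the `expected / 0.75` calculation assumes HashMap's default load factor.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrates the standard sizing hooks on ArrayList and HashMap. */
final class CollectionSizingDemo {
    /** Allocates the backing array once, via the capacity constructor. */
    static List<Integer> presizedList(int expected) {
        ArrayList<Integer> list = new ArrayList<>(expected);
        for (int i = 0; i < expected; i++) {
            list.add(i);
        }
        return list;
    }

    /** Reserves room after construction, then drops any unused slots. */
    static List<Integer> filledThenTrimmed(int n) {
        ArrayList<Integer> list = new ArrayList<>();
        list.ensureCapacity(n); // grow the backing array up front
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        list.trimToSize();      // shrink the backing array to the element count
        return list;
    }

    /** Requests a table large enough to hold the entries without resizing. */
    static Map<Integer, String> presizedMap(int expected) {
        // Assumption: default load factor of 0.75.
        int capacity = (int) Math.ceil(expected / 0.75);
        Map<Integer, String> map = new HashMap<>(capacity);
        for (int i = 0; i < expected; i++) {
            map.put(i, "value-" + i);
        }
        return map;
    }
}
```

None of these calls change what the collection contains. They only affect how much backing storage is allocated, which is why the effect is visible in tools like JOL but not through the collection's public API.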
The table below shows some examples of the space usage for both collections. These figures are just the space for the collection itself, not the objects contained in the collection. Unless those objects are frequently reused, they will typically use up more space than the collections that contain them.
Additionally, the table shows space use for both dense and relatively sparse collections. In the case of ArrayList, you can call ensureCapacity(int) to size the backing array and ensure no space is wasted (if you happen to know the size up front).[3] In the case of HashMap, the wasted space is relatively small compared to the space needed for HashMap$Node objects, but in an ArrayList, it can mean a 50% increase in space per element.
| Class | Collection Size | EnsureCapacity | Size (bytes) | bytes/element |
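The per-element overhead can also be approximated by hand. Everything in this sketch is an assumption about a typical OpenJDK HotSpot build on x86_64 with compressed references: 4-byte references, 24 bytes for an ArrayList object, 48 bytes for a HashMap object, 32 bytes per HashMap$Node, ArrayList growth of 1.5x from a default capacity of 10, and a HashMap table that starts at 16 slots and doubles once the entry count exceeds 75% of its length. The class name is hypothetical, and the figures are estimates rather than the measured values from the table.

```java
/** Napkin-math estimates for ArrayList and HashMap overhead. Not measurements. */
final class CollectionSizeEstimate {
    // All constants are assumptions (compressed references, 8-byte alignment).
    static final long REF_BYTES = 4;
    static final long ARRAY_HEADER_BYTES = 16;
    static final long ARRAY_LIST_BYTES = 24;
    static final long HASH_MAP_BYTES = 48;
    static final long HASH_MAP_NODE_BYTES = 32;

    static long alignUp(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    /** Size of a reference array with the given number of slots. */
    static long refArray(long slots) {
        return alignUp(ARRAY_HEADER_BYTES + slots * REF_BYTES);
    }

    /** Capacity after adding n >= 1 elements one by one to new ArrayList<>(). */
    static int grownCapacity(int n) {
        int capacity = 10;              // assumed default initial capacity
        while (capacity < n) {
            capacity += capacity >> 1;  // assumed growth: 1.5x
        }
        return capacity;
    }

    /** ArrayList object plus a backing array of the given capacity. */
    static long arrayList(int capacity) {
        return ARRAY_LIST_BYTES + refArray(capacity);
    }

    /** Table length after inserting n entries into new HashMap<>(). */
    static int tableLength(int n) {
        int length = 16;                // assumed default table length
        while (n > length * 0.75) {     // assumed default load factor
            length <<= 1;               // table length stays a power of 2
        }
        return length;
    }

    /** HashMap object, its table, and one node per entry. */
    static long hashMap(int n) {
        return HASH_MAP_BYTES + refArray(tableLength(n)) + n * HASH_MAP_NODE_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(arrayList(100));                // exactly sized: 440
        System.out.println(arrayList(grownCapacity(100))); // grown to 109 slots: 480
        System.out.println(hashMap(100));                  // 256-slot table: 4288
    }
}
```

Under these assumptions the node objects dominate a HashMap's footprint, while for an ArrayList the unused tail of the backing array is the main source of variation.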
[1] For completeness, tests were run on openjdk build 18.104.22.168 on an x86_64 CPU on Linux. I'm not aware of changes in how the JRE lays out ordinary objects between versions, and I believe the JVM memory layout for a processor architecture is unaffected by the operating system. I also have no experience with non-HotSpot JVMs or non-x86 JVMs.
[2] It's not directly related to the space an individual string uses, but there's one other optimization related to size. Strings are given special treatment when using the G1 garbage collector with string deduplication enabled: duplicate strings will continue to be separate objects, but share their underlying array of bytes/characters. If your strings are unique or short-lived, you can just add up the space taken up by each string. But if your strings live long enough to be deduplicated by the garbage collector, adding up the space for each string will overcount the actual memory used.
[3] HashMaps don't give you explicit control over the size of their backing array, as it must be a power of 2 to let the collection easily convert a hash value into an array index.