I am using Chronicle Map to temporarily store and look up a very large number of KV pairs (several billion, in fact). I don't need durability or replication, and I'm using memory-mapped files rather than pure off-heap memory. Average key length is 8 bytes.

For smallish data sets - up to 200 million entries - I get throughput of around 1M entries per second, i.e. it takes approx 200 seconds to create the entries, which is stunning; but by 400 million entries the map has slowed down significantly, and it takes 1500 seconds to create them.

I have run tests on both a Mac OS X machine (16GB, quad core, 500GB SSD) and a ProLiant G6 server running Linux (8 cores, 64GB RAM, 300GB RAID 1, not SSD). Both platforms exhibit the same behaviour.

If it helps, here's the map setup:

    try {
        f = File.createTempFile(name, ".map");
        catalog = ChronicleMapBuilder
                .of(String.class, Long.class)
                .entries(size)
                .averageKeySize(8)
                .createPersistedTo(f);
    } catch (IOException ioe) {
        // blah
    }
And a simple writer test:

    // negated start time, so (now + currentTimeMillis()) is elapsed ms
    long now = -System.currentTimeMillis();
    long count = 400_000_000L;
    for (long i = 0; i < count; i++) {
        catalog.put(Long.toString(i), i);
        if ((i % 1_000_000) == 0) {
            System.out.println(i + ": " + (now + System.currentTimeMillis()));
        }
    }
    System.out.println(count + ": " + (now + System.currentTimeMillis()));
    catalog.close();

So my question is: is there some sort of tuning I can do to improve this, e.g. changing the number of segments or using a different key type (e.g. CharSequence), or is this simply an artefact of the OS paging such large files?
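For example, one knob I was considering is the segment count. Here's a hedged sketch of what I had in mind - minSegments() appears to be the relevant builder hint, and 512 is just a placeholder value, not a recommendation:

    catalog = ChronicleMapBuilder
            .of(String.class, Long.class)
            .entries(size)
            .averageKeySize(8)
            // Hedged sketch: hint the builder to use at least this many
            // segments, spreading entries (and per-segment work) more
            // thinly. 512 is a placeholder value.
            .minSegments(512)
            .createPersistedTo(f);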

  • Ensure you use the latest available Chronicle Map version (currently 3.3.0-beta; the next, 3.4.0-beta, is due within days)

  • Do use garbage-free techniques; even for a test like this it could matter, because garbage collection may kick in:

  • Use CharSequence as the key type and LongValue as the value type.
  • Simple test code could look like this:

    import java.io.File;
    import java.io.IOException;

    import net.openhft.chronicle.core.values.LongValue;
    import net.openhft.chronicle.map.ChronicleMap;
    import net.openhft.chronicle.values.Values;

    public class VinceTest {
        public static void main(String[] args) throws IOException {
            long count = 400_000_000L;
            File f = File.createTempFile("vince", ".map");
            f.deleteOnExit();
            try (ChronicleMap<CharSequence, LongValue> catalog = ChronicleMap
                    .of(CharSequence.class, LongValue.class)
                    .entries(count)
                    .averageKeySize(8.72)
                    .putReturnsNull(true)
                    .createPersistedTo(f)) {
                long prev = System.currentTimeMillis();
                // key and value instances are reused: no per-iteration garbage
                StringBuilder key = new StringBuilder();
                LongValue value = Values.newHeapInstance(LongValue.class);
                for (long i = 1; i <= count; i++) {
                    key.setLength(0);
                    key.append(i);
                    value.setValue(i);
                    catalog.put(key, value);
                    if ((i % 1_000_000) == 0) {
                        long now = System.currentTimeMillis();
                        // ms per million inserts == average ns per single insert
                        System.out.printf("Average ns to insert per mi #%d: %d\n",
                                (i / 1_000_000), now - prev);
                        prev = now;
                    }
                }
                // file size in MB; the original expression was
                // MEGABYTES.convert(f.length(), BYTES), whose import is not shown
                System.out.println("file size " + (f.length() / (1024 * 1024)) + " MB");
            }
        }
    }
  • In the above source, note the use of putReturnsNull(true) to avoid accidental garbage creation from the returned value (not an issue in this test, because all keys are unique and put() always returns null, but it may be an issue in your production code)

  • Ensure you specify the right averageKeySize(). In this test, the average key size is actually closer to 9 bytes (because most keys are bigger than 100 000 000, hence 9 digits long). But it is better to be as precise as possible: the exact value is 8.72 for this particular test with a count of 400 000 000 (a quick way to compute it is sketched below).
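A quick way to verify that 8.72 figure for keys that are the decimal strings "1" to "400000000" - plain arithmetic with no Chronicle dependency; the class name AvgKeySize is just for illustration:

    public class AvgKeySize {
        public static void main(String[] args) {
            long count = 400_000_000L;
            long totalChars = 0;
            // d-digit numbers occupy the range [10^(d-1), 10^d - 1]
            for (long lo = 1, digits = 1; lo <= count; lo *= 10, digits++) {
                long hi = Math.min(lo * 10 - 1, count);
                totalChars += (hi - lo + 1) * digits;
            }
            System.out.println((double) totalChars / count); // prints 8.7222...
        }
    }

It prints 8.7222..., matching the value passed to averageKeySize() above.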

  • Thanks very much! I'm not sure I can use 3.3 because I am restricted to Java 7, but I'll change the example above to work for 2.x and let you know how it goes. – Vince Jan 6, 2016 at 16:29
  • It could compile for Chronicle Map 2, but a slightly different interface should be used: DataValueClasses instead of Values. – leventov Jan 6, 2016 at 22:15
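For reference, on Chronicle Map 2 the value instantiation would look roughly like this - a hedged sketch from memory of the 2.x API, where DataValueClasses lives in net.openhft.lang.model and LongValue in net.openhft.lang.values:

    // Hedged 2.x sketch: DataValueClasses replaces Values for creating
    // reusable flyweight value instances; newDirectInstance() is the
    // off-heap variant and newInstance() the on-heap one, as I recall.
    LongValue value = DataValueClasses.newDirectInstance(LongValue.class);
    value.setValue(i);
    catalog.put(key, value);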
