Do Java 8 streams produce slow code?

There is a lot of hype about functional programming, and particularly about the new Java 8 streams API. It is advertised as a good replacement for good old loops and the imperative paradigm. Indeed, sometimes it looks nice and does the job well. But what about performance?

For example, here is a good article on the topic: Java 8: No more loops. With a loop you can do the whole job in a single iteration. But with the new stream API you chain multiple loops, which makes it much slower (is that right?). Look at their first sample. In most cases the loop will not even walk through the whole array. However, to do the filtering with the new stream API you have to cycle through the whole array to filter out all candidates, and only then can you get the first one.

The article mentions laziness:

We first use the filter operation to find all articles that have the Java tag, then used the findFirst() operation to get the first occurrence. Since streams are lazy and filter returns a stream, this approach only processes elements until it finds the first match.

What does the author mean by laziness?

I ran a simple test, and it shows that the good old loop solution is about 10x faster than the stream approach.

public void test() {
    List<String> list = Arrays.asList(
            "First string",
            "Second string",
            "Third string",
            "Good string",
            "Another",
            "Best",
            "Super string",
            "Light",
            "Better",
            "For string",
            "Not string",
            "Great",
            "Super change",
            "Very nice",
            "Super cool",
            "Nice",
            "Very good",
            "Not yet string",
            "Let's do the string",
            "First string",
            "Low string",
            "Big bunny",
            "Superstar",
            "Last");

    long start = System.currentTimeMillis();
    for (int i = 0; i < 100000000; i++) {
        getFirstByLoop(list);
    }
    long end = System.currentTimeMillis();

    System.out.println("Loop: " + (end - start));

    start = System.currentTimeMillis();
    for (int i = 0; i < 100000000; i++) {
        getFirstByStream(list);
    }
    end = System.currentTimeMillis();

    System.out.println("Stream: " + (end - start));
}

public String getFirstByLoop(List<String> list) {

    for (String s : list) {
        if (s.endsWith("string")) {
            return s;
        }
    }

    return null;
}

public Optional<String> getFirstByStream(List<String> list) {
    return list.stream().filter(s -> s.endsWith("string")).findFirst();
}

The results (in milliseconds) were:

Loop: 517

Stream: 5790

BTW, if I use String[] instead of List, the difference is even bigger: almost 100x!

QUESTION: Should I use the old imperative loop approach if I'm looking for the best code performance? Is the FP paradigm just about making code "more concise and readable", and not about performance?

OR

is there something I missed, and could the new stream API be at least as efficient as the imperative loop approach?

2 answers

  • answered 2018-01-14 11:00 Eugene

    Laziness is about how elements are taken from the source of the stream: on demand. If more elements are needed, they are taken; otherwise they are not. Here is an example:

        Arrays.asList(1, 2, 3, 4, 5)
              .stream()
              .peek(x -> System.out.println("before filter : " + x))
              .filter(x -> x > 2)
              .peek(System.out::println)
              .anyMatch(x -> x > 3);
    

    Notice how each element goes through the entire pipeline of stages; that is, filter is applied to one element at a time, not to all of them up front, which is why filter returns a Stream<Integer> rather than a finished collection. This allows the stream to short-circuit: anyMatch never even processes 5, since there is no need to.
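
    The same effect can be checked for the filter/findFirst pipeline from the question. The sketch below is only an illustration: the class name LazyFindFirstDemo, the helper countExamined, and the sample data are made up here, and the counter inside the predicate exists purely to make the laziness visible.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; the name and the sample data are illustrative only.
class LazyFindFirstDemo {

    // Returns how many elements the filter predicate actually examined
    // before findFirst() produced its result.
    static int countExamined(List<String> list) {
        AtomicInteger examined = new AtomicInteger();
        list.stream()
            .filter(s -> {
                examined.incrementAndGet(); // side effect for demonstration only
                return s.endsWith("string");
            })
            .findFirst();
        return examined.get();
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("Another", "Best", "Good string", "Light", "Last");
        // The first match is the third element, so only three elements are examined.
        System.out.println("Elements examined: " + countExamined(list)); // prints 3
    }
}
```

    With a sequential stream the predicate stops being called once the first match is found, so the stream version does not walk the whole list either.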

    Note that not all intermediate operations are lazy in this element-by-element sense. For example, sorted and distinct keep state across elements; these are called stateful intermediate operations. Think about it this way: to actually sort the elements, you need to traverse the entire source. One more example that is not intuitive is flatMap, but this is not guaranteed and is seen more as a bug; more to read here
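
    To see the barrier that sorted introduces, one can record the order in which elements pass the stages before and after it. This is a small sketch under assumptions: SortedBarrierDemo and trace are hypothetical names, and peek is used only for tracing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
class SortedBarrierDemo {

    // Records the order in which elements pass the stage before sorted()
    // and the stage after it, for a sequential stream.
    static List<String> trace(List<Integer> source) {
        List<String> events = new ArrayList<>();
        source.stream()
              .peek(x -> events.add("before:" + x))
              .sorted()
              .peek(x -> events.add("after:" + x))
              .forEach(x -> { /* terminal operation, triggers the pipeline */ });
        return events;
    }

    public static void main(String[] args) {
        // Every "before" event is recorded ahead of the first "after" event.
        System.out.println(trace(Arrays.asList(3, 1, 2)));
        // prints [before:3, before:1, before:2, after:1, after:2, after:3]
    }
}
```

    Compare this with the filter example above, where each element travels through the whole pipeline before the next one is pulled from the source.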

    As for speed, it depends on how you measure. Writing micro-benchmarks in Java is not easy, and the de facto tool for that is JMH; you can try that out. There are numerous posts here on SO showing that streams are indeed slower (which is normal, since they carry some infrastructure overhead), but the difference is usually not big enough to care about.
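
    Short of setting up JMH, a rough sanity check can at least warm up both variants and consume their results, two things the test in the question does not do. The following is a hand-rolled sketch, not a substitute for JMH: the class name RoughLoopVsStream, the sample data, and the iteration counts are all assumptions, and the numbers it prints will vary between machines and JVM versions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical harness; class name, data, and iteration counts are assumptions.
class RoughLoopVsStream {

    // Accumulates results so the JIT cannot discard the measured calls as dead code.
    static long sink;

    static String firstByLoop(List<String> list) {
        for (String s : list) {
            if (s.endsWith("string")) {
                return s;
            }
        }
        return null;
    }

    static Optional<String> firstByStream(List<String> list) {
        return list.stream().filter(s -> s.endsWith("string")).findFirst();
    }

    // Elapsed nanoseconds for `iterations` calls of the loop variant.
    static long timeLoop(List<String> list, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            String s = firstByLoop(list);
            sink += (s == null) ? 0 : s.length();
        }
        return System.nanoTime() - start;
    }

    // Elapsed nanoseconds for `iterations` calls of the stream variant.
    static long timeStream(List<String> list, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += firstByStream(list).map(String::length).orElse(0);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("Another", "Best", "Light", "Good string", "Last");
        int iterations = 1_000_000;

        // Warm-up rounds: let the JIT compile both variants before anything is recorded.
        for (int round = 0; round < 5; round++) {
            timeLoop(list, iterations);
            timeStream(list, iterations);
        }

        // Measured rounds, repeated so that one-off effects show up as outliers.
        for (int round = 0; round < 5; round++) {
            long loopNs = timeLoop(list, iterations);
            long streamNs = timeStream(list, iterations);
            System.out.println("round " + round + ": loop " + loopNs / iterations
                    + " ns/op, stream " + streamNs / iterations + " ns/op");
        }
        System.out.println("(ignore) sink = " + sink);
    }
}
```

    Even with these precautions the figures are only indicative; JMH additionally handles forking, statistics, and several JIT pitfalls that a loop around System.nanoTime() cannot.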

  • answered 2018-01-14 11:00 Stephen C

    QUESTION: Should I use the old imperative loop approach if I'm looking for the best code performance?

    Right now, probably yes. Various benchmarks seem to suggest that streams are slower than loops for most tests, though not catastrophically slower.

    Counter examples:

    It is possible to do equivalent things with loops, but you can't do it with just loops.

    But the bottom line is that performance is complicated and streams are not (yet) a magic bullet for speeding up your code.

    Is the FP paradigm just about making code "more concise and readable", and not about performance?

    Not exactly. It is certainly true that the FP paradigm is more concise and (to someone who is familiar with it) more readable.

    However, by expressing the code using the FP paradigm, you are also expressing it in a way that could potentially be optimized in ways that are much harder to achieve with code expressed using loops and assignment. FP code is also more amenable to formal methods, i.e. formal proof of correctness.
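
    One way to picture this: because a stream pipeline states what to compute rather than how to iterate, the execution strategy can be swapped without rewriting the logic. The sketch below uses a made-up class DeclarativeSumDemo and a toy sum-of-squares computation. It only shows that the sequential and parallel pipelines give the same answer as the loop; it makes no claim that the parallel version is faster, which depends on the workload and the machine.

```java
import java.util.stream.LongStream;

// Hypothetical demo class; the computation is a toy example.
class DeclarativeSumDemo {

    // Imperative version: iteration order and accumulation are spelled out by hand.
    static long sumOfSquaresLoop(long n) {
        long total = 0;
        for (long i = 1; i <= n; i++) {
            total += i * i;
        }
        return total;
    }

    // Declarative version: the same pipeline can run sequentially or in parallel.
    static long sumOfSquaresStream(long n, boolean parallel) {
        LongStream numbers = LongStream.rangeClosed(1, n);
        if (parallel) {
            numbers = numbers.parallel();
        }
        return numbers.map(i -> i * i).sum();
    }

    public static void main(String[] args) {
        long n = 1_000;
        System.out.println(sumOfSquaresLoop(n));          // prints 333833500
        System.out.println(sumOfSquaresStream(n, false)); // prints 333833500
        System.out.println(sumOfSquaresStream(n, true));  // prints 333833500
    }
}
```

    Doing the same with the loop version would mean introducing threads and combining partial sums by hand.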