Need help understanding an article on parallel stream performance gains

I have been reading this article about parallel streams. It's quite long-winded, and I understood everything up to the part about how parallel streams work. I will quote the section I have difficulty understanding:

"Parallelization requires: A pool of threads to execute the subtasks, Dividing the initial task into subtasks, Distributing subtasks to threads, Collating the results. Without entering the details, all this implies some overhead. It will show amazing results when:

  • Some tasks imply blocking for a long time, such as accessing a remote service, or

  • There are not many threads running at the same time, and in particular no other parallel stream.

If all subtasks imply intense calculation, the potential gain is limited by the number of available processors. Java 8 will by default use as many threads as they are processors on the computer, so, for intensive tasks, the result is highly dependent upon what other threads may be doing at the same time. Of course, if each subtask is essentially waiting, the gain may appear to be huge."
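
For reference, here is a minimal sketch I would use to check the default thread count the quote mentions. It assumes that parallel streams run on the common `ForkJoinPool` unless told otherwise, and that the calling thread also takes part in the work, so the reported parallelism is typically one less than the processor count. The class and method names are my own.

```java
import java.util.concurrent.ForkJoinPool;

public class PoolSizeCheck {

    // Number of processors the JVM reports as available.
    static int processors() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Target parallelism of the common pool (assumed to be the pool
    // that parallel streams use by default).
    static int commonPoolParallelism() {
        return ForkJoinPool.commonPool().getParallelism();
    }

    public static void main(String[] args) {
        System.out.println("processors: " + processors());
        System.out.println("common pool parallelism: " + commonPoolParallelism());
    }
}
```

The exact numbers depend on the machine and on JVM settings, so I have not listed an expected output.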

I don't understand two statements in the quote above: the first bullet point and the final sentence.

The first statement: "Some tasks imply blocking for a long time, such as accessing a remote service"

My understanding is that the performance gain would be large relative to executing the same tasks in a concurrent programming environment rather than a parallel processing environment. Is that correct?

The second statement: "Of course, if each subtask is essentially waiting, the gain may appear to be huge."

I haven't a clue what the author means here.
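
To make the question concrete, here is a minimal sketch of the situation I think the author is describing. It is my own guess, not code from the article: `fakeRemoteCall` is a hypothetical stand-in for a remote service that simply sleeps for 200 ms, so each subtask is "essentially waiting" and uses almost no CPU. The same pipeline is timed once sequentially and once in parallel.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class BlockingStreamDemo {

    // Hypothetical stand-in for a remote call: the thread only sleeps,
    // so it waits without doing any real computation.
    static int fakeRemoteCall(int id) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return id * 2;
    }

    // Runs the pipeline over ids 0..taskCount-1, sequentially or in parallel.
    static List<Integer> run(boolean parallel, int taskCount) {
        IntStream ids = IntStream.range(0, taskCount);
        if (parallel) {
            ids = ids.parallel();
        }
        return ids.mapToObj(BlockingStreamDemo::fakeRemoteCall)
                  .collect(Collectors.toList());
    }

    // Returns the elapsed wall-clock time of one run, in milliseconds.
    static long timeMillis(boolean parallel, int taskCount) {
        long start = System.nanoTime();
        run(parallel, taskCount);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("sequential ms: " + timeMillis(false, 8));
        System.out.println("parallel ms:   " + timeMillis(true, 8));
    }
}
```

I would expect the sequential run to take roughly 8 × 200 ms, since each call waits for the previous one, and the parallel run to take noticeably less, since several threads can wait at the same time. Is that the "huge gain" the author is referring to, and is it only apparent because no real computation is being sped up?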