Adding lines, like adding cores, introduces complexities that don’t occur when speeding up a single one. The jobs have to be divided carefully so that the work gets done as efficiently as possible. Otherwise, cores sit idle, or worse still, things get done in the wrong order, or twice, or not at all.
Dividing the work results in a certain amount of overhead. People at the different lines often need information from someone at another line, and the time they spend communicating eats into the time they can spend processing. Even if they can look up the information they need without disturbing the people on the other line, they still have to run around and look for it. In CPU terms, this is called cache snooping.
The largest problems occur if two cores are doing different things to the same data at once, a situation called a race condition. If one core is trundling through a database applying a currency conversion to all the numbers in several columns of several tables, and another is calculating column totals in the same tables, they can make quite a mess.
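To make the mess concrete, here is a minimal sketch in Python. It does not use real cores: two generators stand in for the two workers, and a schedule string decides whose turn it is, so the unlucky interleaving can be reproduced every time. The names convert, total and run, the four-row column and the conversion rate of 0.5 are all assumptions made up for this illustration.

```python
RATE = 0.5  # assumed conversion rate, chosen so the arithmetic is exact


def convert(column):
    """Worker A: convert the column in place, one row per step."""
    for i in range(len(column)):
        column[i] = column[i] * RATE
        yield  # pause so the other worker can take a turn


def total(column, result):
    """Worker B: add up the column, one row per step."""
    for i in range(len(column)):
        result["sum"] += column[i]
        yield  # pause so the other worker can take a turn


def run(schedule, column):
    """Let the workers take turns in the order given by schedule."""
    result = {"sum": 0}
    workers = {"A": convert(column), "B": total(column, result)}
    for name in schedule:
        next(workers[name], None)  # advance that worker by one row
    return result["sum"]


# One worker at a time: the total is entirely in one currency.
print(run("BBBBAAAA", [10, 20, 30, 40]))  # total taken before conversion
print(run("AAAABBBB", [10, 20, 30, 40]))  # total taken after conversion

# Interleaved: B adds up the column while A is only halfway through.
print(run("AABBBBAA", [10, 20, 30, 40]))  # two converted rows plus two unconverted rows
```

The interleaved total mixes two converted rows with two unconverted ones, so it matches neither the original currency nor the new one. On real hardware the interleaving depends on timing, which is why such bugs tend to appear only occasionally.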
To prevent this, the process running on one core can lock the data it is using. This solves the problem, though at the cost of having the other core sit idle until the lock is released. If you lock every step of a program, it runs serially, not in parallel. Even if the work is spread over several cores, it will take just as long if each one is waiting for the previous step to be completed. In fact, the overhead means it can take even longer.
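Locking can be sketched with Python's standard threading module. This is an illustration under stated assumptions, not a recipe: threads stand in for cores, and the names column, column_lock, convert and total, along with the rate of 0.5, are hypothetical. Each worker holds the lock for its entire pass over the data, so the other has to wait.

```python
import threading

RATE = 0.5                      # assumed conversion rate
column = [10, 20, 30, 40]       # hypothetical shared data
column_lock = threading.Lock()  # guards every access to column
result = {}


def convert():
    """Convert the whole column while holding the lock."""
    with column_lock:
        for i in range(len(column)):
            column[i] = column[i] * RATE


def total():
    """Add up the whole column while holding the lock."""
    with column_lock:
        result["sum"] = sum(column)


threads = [threading.Thread(target=convert), threading.Thread(target=total)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Whichever worker gets the lock first, the total is taken either
# entirely before or entirely after the conversion, never in between.
print(result["sum"])
```

The total is now always in a single currency, but the two workers no longer overlap at all: while one holds the lock, the other waits, which is exactly the serial behaviour described above.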