Modern distributed real-time embedded applications have high processing requirements associated with strict deadlines. For some applications, such constraints cannot be met by existing single-core embedded platforms. One solution is to parallelize the execution of the applications by allowing networked nodes to distribute their workload to remote nodes with spare capacity. In that context, this paper presents a holistic timing analysis for fixed-priority fork-join parallel/distributed tasks. Furthermore, we extend the holistic approach to consider the interaction between parallel threads and the messages exchanged over a flexible time-triggered switched Ethernet network, and we show how the pessimism in the worst-case response time computation of such tasks can be reduced by considering the pipeline effect that occurs in such distributed systems. To evaluate the performance and correctness of the holistic model, this paper includes a numerical evaluation based on a real automotive application. The results show that the proposed method is effective in distributing the load across different nodes, allowing a significant reduction in the worst-case response time of the tasks. Moreover, the paper reports an implementation of the model in a Linux library, called parallel/distributed real-time, as well as the corresponding results obtained on a real testbed. These results are in accordance with the predictions of the holistic timing analysis.