Coded modulation (CM) is the combination of forward error correction (FEC) and multilevel constellations. Coherent optical communication systems result in a four-dimensional (4D) signal space, which naturally leads to 4D-CM transceivers. A practically attractive design paradigm is to use a bit-wise decoder, where the detection process is (suboptimally) separated into two steps: soft-decision demapping followed by binary decoding. In this paper, bit-wise decoders are studied from an information-theoretic viewpoint. 4D constellations with up to 4096 constellation points are considered. Metrics to predict the post-FEC bit-error rate (BER) of bit-wise decoders are analyzed. The mutual information is shown to fail at predicting the post-FEC BER of bit-wise decoders and the so-called generalized mutual information is shown to be a much more robust metric. For the suboptimal scheme under consideration, it is also shown that constellations that transmit and receive information in each polarization and quadrature independently (e.g., PM-QPSK, PM-16QAM, and PM-64QAM) outperform the best 4D constellations designed for uncoded transmission. Theoretical gains are as high as 4 dB, which are then validated via numerical simulations of low-density parity check codes.