It's because memory channels are quite dense; most of the routing on the PCB is memory channels running into the GPU. It might be easier on a dual-GPU card, but I don't see how it would ever be possible through an SLI/CrossFire bridge without seriously increasing the density of its tracks. Shared memory channels on a single card with multiple GPUs would also lead to a staggering increase in its footprint.
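To put rough numbers on the density point, here's a quick back-of-the-envelope sketch. The 256-bit bus width and 7 Gbit/s per-pin rate are illustrative assumptions, not figures for any specific card, and the function names are made up for this example. It only counts data lines and ignores address, command, clock, and power lines, which would push the real count higher.

```python
import math


def bandwidth_gb_per_s(bus_width_bits: int, gbit_per_s_per_pin: float) -> float:
    """Peak bandwidth in GB/s, assuming one data line per bit of bus width."""
    return bus_width_bits * gbit_per_s_per_pin / 8


def data_lines_needed(target_gb_per_s: float, gbit_per_s_per_pin: float) -> int:
    """Data lines a link needs to carry target_gb_per_s at the given per-line rate."""
    return math.ceil(target_gb_per_s * 8 / gbit_per_s_per_pin)


# Assumed example values (hypothetical card): 256-bit bus, 7 Gbit/s per pin.
local_bandwidth = bandwidth_gb_per_s(256, 7.0)

# Data lines a bridge would need to match that at the same per-line rate.
bridge_lines = data_lines_needed(local_bandwidth, 7.0)

print(f"local VRAM bandwidth: {local_bandwidth} GB/s")
print(f"data lines needed across the bridge: {bridge_lines}")
```

So under those assumptions, a bridge that lets one GPU read the other's VRAM at full speed needs as many data lines as the memory bus itself, unless each line runs much faster.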
It might be possible in the future with higher-density VRAM modules, smaller GPU fabrication processes, and better materials in the circuits that allow them to be smaller with the same thermal and electrical characteristics. Another option would be some sort of shared onboard memory controller (similar in idea to a RAID controller) that controls how the GPUs access the VRAM, but that could add a performance overhead that's not acceptable right now.