Resource-constrained devices increasingly rely on wireless communication for the reliable and low-latency transmission of short messages. However, especially the implementation of adequate integrity protection of time-critical messages places a significant burden on these devices. We address this issue by proposing BP-MAC, a fast and memory-efficient approach for computing message authentication codes based on the well-established Carter-Wegman construction. Our key idea is to offload resource-intensive computations to idle phases and thus save valuable time in latency-critical phases, i.e., when new data awaits processing. Therefore, BP-MAC leverages a universal hash function designed for the bitwise preprocessing of integrity protection to later only require a few XOR operations during the latency-critical phase. Our evaluation on embedded hardware shows that BP-MAC outperforms the state-of-the-art in terms of latency and memory overhead, notably for small messages, as required to adequately protect resource-constrained devices with stringent security and latency requirements.