The Network-on-Chip (NoC) is rapidly gaining a foothold as the communication fabric in complex System-on-Chip (SoC) architectures. Scalability is the NoC's most valuable asset, which makes it ideal for larger designs. However, ever-diminishing feature sizes have rendered the interconnect the primary bottleneck in terms of both latency and power consumption in on-chip systems. It is, therefore, imperative to optimize the network infrastructure to maximize performance. Research has primarily focused on architectural improvements within the router and on the development of deadlock avoidance/recovery schemes. The latter tend to rely on fairly complex algorithms, which are sometimes infeasible to implement in NoCs due to their resource-constrained nature. In this paper, we propose a new NoC topology and architecture that injects data into the network using four sub-NICs (Network Interface Controllers), rather than one NIC, per node. It is shown that this scheme achieves significant improvements in network latency and energy consumption with only negligible area overhead and added complexity compared with existing architectures. In fact, in the case of mesh network topologies, the proposed scheme provides substantial savings in area as well, because it requires fewer routers. Cycle-accurate simulation validates these claims. Most importantly, it is also shown that this implementation is inherently deadlock-free, thus eliminating the need to rely on specialized, resource-hungry algorithms for deadlock avoidance.