A hardware sorter suitable for VLSI implementation is proposed. It operates in a parallel and pipelined fashion, so that the actual sorting time is absorbed by the input/output time. A detailed VLSI implementation is described whose device count compares very favorably with that of existing static RAM.
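
To illustrate what it means for the sorting time to be absorbed by the input/output time, the following is a minimal software sketch of a generic linear array of compare-and-shift cells. It is an assumption made for illustration only: the abstract does not specify the cell design, and the names `ShiftSorter`, `push`, `pop`, and `sort_stream` are hypothetical. Each input cycle models all cells comparing against the incoming item at once, so the only cycles counted are the n input cycles and the n output cycles.

```python
from typing import List, Optional, Tuple


class ShiftSorter:
    """Software model of a linear array of compare-and-shift cells.

    Illustrative assumption only; not the design described in the paper.
    """

    def __init__(self, capacity: int) -> None:
        # Occupied cells are kept in ascending order; empty cells (None) trail.
        self.cells: List[Optional[int]] = [None] * capacity
        self.clock = 0  # counts input and output cycles only

    def push(self, x: int) -> None:
        """One input cycle: every cell compares its content with x."""
        if self.cells[-1] is not None:
            raise OverflowError("sorter is full")
        old = self.cells
        # In hardware these comparisons would happen simultaneously.
        # A cell shifts right if it is empty or holds a value larger than x.
        shift = [c is None or c > x for c in old]
        new: List[Optional[int]] = []
        for i in range(len(old)):
            if not shift[i]:
                new.append(old[i])      # keep own value
            elif i == 0 or not shift[i - 1]:
                new.append(x)           # first shifting cell takes the input
            else:
                new.append(old[i - 1])  # take the left neighbour's value
        self.cells = new
        self.clock += 1

    def pop(self) -> Optional[int]:
        """One output cycle: emit the smallest item, shift the rest left."""
        head = self.cells[0]
        self.cells = self.cells[1:] + [None]
        self.clock += 1
        return head


def sort_stream(items: List[int]) -> Tuple[List[int], int]:
    """Feed items in one per cycle, then read them out one per cycle."""
    sorter = ShiftSorter(len(items))
    for x in items:
        sorter.push(x)
    out = [sorter.pop() for _ in items]
    return out, sorter.clock


if __name__ == "__main__":
    result, cycles = sort_stream([5, 2, 9, 1, 5])
    print(result, cycles)
```

In this model the array is already in sorted order after the last input cycle, so no cycles are spent on sorting beyond those needed to load and unload the data. The software loop over cells stands in for comparisons that a hardware array would perform in parallel.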