Given an $n$-bit array $A$, the succinct rank data structure problem asks to construct a data structure using space $n + r$ bits for $r \ll n$, supporting rank queries of the form $\mathrm{rank}(u) = \sum_{i=0}^{u-1} A[i]$. In this paper, we design a new succinct rank data structure with $r = n/(\log n)^{\Omega(t)} + n^{1-c}$ and query time $O(t)$ for some constant $c > 0$, improving the previous best-known result by Pǎtraşcu, which has $r = n/\left(\frac{\log n}{t}\right)^{\Omega(t)} + \tilde{O}(n^{3/4})$ bits of redundancy. For $r > n^{1-c}$, our space-time tradeoff matches the cell-probe lower bound by Pǎtraşcu and Viola, which asserts that $r$ must be at least $n/(\log n)^{O(t)}$. Moreover, one can avoid the $n^{1-c}$-bit lookup table when the data structure is implemented in the cell-probe model, achieving $r = \lceil n/(\log n)^{\Omega(t)} \rceil$, which matches the lower bound for the full range of parameters. En route to our new data structure design, we establish an interesting connection between succinct data structures and approximate nonnegative tensor decomposition. Our connection shows that for specific problems, to construct a space-efficient data structure, it suffices to approximate a particular tensor by a sum of (few) nonnegative rank-1 tensors. For the rank problem, we explicitly construct such an approximation, which yields an explicit construction of the data structure.
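
To make the query in the problem statement concrete, the following Python sketch implements $\mathrm{rank}(u) = \sum_{i=0}^{u-1} A[i]$ with a textbook block-prefix-count baseline. It is not the construction of this paper: the class name `BlockRank` and the parameter `block_size` are assumptions introduced only for illustration. The baseline stores one counter per block, so its redundancy is roughly $(n/b) \cdot \log n$ bits for block size $b$, and a query scans up to $b$ bits inside a block. This is far from the tradeoff stated above.

```python
class BlockRank:
    """Illustrative baseline (hypothetical, not the paper's structure):
    per-block prefix counts plus a scan inside one block."""

    def __init__(self, bits, block_size=64):
        self.bits = list(bits)
        self.block_size = block_size
        # prefix[j] = number of ones in bits[0 : j * block_size]
        self.prefix = [0]
        for start in range(0, len(self.bits), block_size):
            ones_in_block = sum(self.bits[start:start + block_size])
            self.prefix.append(self.prefix[-1] + ones_in_block)

    def rank(self, u):
        """Return A[0] + ... + A[u-1] for 0 <= u <= n."""
        if not 0 <= u <= len(self.bits):
            raise IndexError("u must satisfy 0 <= u <= n")
        j, offset = divmod(u, self.block_size)
        start = j * self.block_size
        # Count of all full blocks before position u, plus the partial block.
        return self.prefix[j] + sum(self.bits[start:start + offset])


A = [1, 0, 1, 1, 0, 0, 1, 0]
rank_structure = BlockRank(A, block_size=4)
print([rank_structure.rank(u) for u in range(len(A) + 1)])
```

For the array `A` above, `rank(5)` counts the ones among `A[0..4]`, that is, `1 + 0 + 1 + 1 + 0 = 3`.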