[][src]Struct fid_rs::fid::Fid

pub struct Fid { /* fields omitted */ }

FID (Fully Indexable Dictionary).

This class can handle bit sequence of virtually arbitrary length.

In fact, N (FID's length) is designed to be limited to: N <= 2^64.
It should be enough for almost all usecases since a binary data of length of 2^64 consumes 2^21 = 2,097,152 TB (terabyte), which is hard to handle by state-of-the-art computer architecture.

Implementation detail

Index<u64>'s implementation is trivial.

select() just uses binary search of rank() results.

rank()'s implementation is standard but non-trivial. So here explains implementation of rank().

rank()'s implementation

Say you have the following bit vector.

00001000 01000001 00000100 11000000 00100000 00000101 10100000 00010000 001 ; (N=67)

Answer rank(48) in O(1) time-complexity and o(N) space-complexity.

Naively, you can count the number of '1' from left to right. You will find rank(48) == 10 but it took O(N) time-complexity.

To reduce time-complexity to O(1), you can use memonization technique.
Of course, you can memonize results of rank(i) for every i ([0, N-1]).

Bit vector;   0  0  0  0  1  0  0  0  0  1  0  0  0  0  0  1  0  0  0  0  0  1  0  0  1  1  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  1  0  1  [1]  0  1  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  1 ; (N=67)
Memo rank(i); 0  0  0  0  1  1  1  1  1  2  2  2  2  2  2  3  3  3  3  3  3  4  4  4  5  6  6  6  6  6  6  6  6  6  7  7  7  7  7  7  7  7  7  7  7  8  8  9  10  10 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 13

From this memo, you can answer rank(48) == 10 in constant time, although space-complexity for this memo is O(N) > o(N).

To reduce space-complexity using memonization, we divide the bit vector into Chunk and Block.

Bit vector; 00001000 01000001 00000100 11000000 00100000 00000101 [1]0100000 00010000 001  ; (N=67)
Chunk;     |                  7                    |                13                  |  ; (size = (log N)^2 = 36)
Block;     |0 |1 |1  |2 |2 |3  |3 |4 |6  |6 |6  |7 |0 |0  |0 |2 |4    |4 |4  |5 |5 |5  |6| ; (size = (log N) / 2 = 3)

Now you want to answer rank(48). 48-th bit is in the 2nd chunk, and in the 5th block in the chunk.
So the rank(48) is at least:

7 (value of 1st chunk) + 2 (value of 4th block in the 2nd chunk)

Then, focus on 3 bits in 5th block in the 2nd chunk; [1]01.
As you can see, only 1 '1' is included up to 48-th bit (101 has 2 '1's but 2nd '1' is 50-th bit, irrelevant to rank(48)).

Therefore, the rank(48) is calculated as:

7 (value of 1st chunk) + 2 (value of 4th block in the 2nd chunk) + 1 ('1's in 5th block up to 48-th bit)

OK. That's all... Wait!
rank() must be in O(1) time-complexity.

Counting '1's in a block must also be O(1), while using o(N) space.
We use Table for this purpose.

Block contentNumber of '1's in block
0000
0011
0101
0112
1001
1012
1102
1113

This table is constructed in build(). So we can find the number of '1's in block in O(1) time.
Note that this table has O(log N) = o(N) length.

In summary:

rank() = (value of left chunk) + (value of left block) + (value of table keyed by inner block bits).

Methods

impl Fid[src]

pub fn rank(&self, i: u64) -> u64[src]

Returns the number of 1 in [0, i] elements of the Fid.

Panics

When i >= length of the Fid.

Implementation detail

 00001000 01000001 00000100 11000000 00100000 00000101 00100000 00010000 001  Raw data (N=67)
                                                          ^
                                                          i = 51
|                  7                    |                13                |  Chunk (size = (log N)^2 = 36)
                                        ^
               chunk_left            i_chunk = 1      chunk_right

|0 |1 |1  |2 |2 |3  |3 |4 |6  |6 |6  |7 |0 |0  |0 |2 |3 |3 |4  |4 |4 |5  |5|  Block (size = log N / 2 = 3)
                                                        ^
                                                     i_block = 17
                                             block_left | block_right
  1. Find i_chunk. i_chunk = i / chunk_size.
  2. Get chunk_left = Chunks[i_chunk - 1] only if i_chunk > 0.
  3. Get rank from chunk_left if chunk_left exists.
  4. Get chunk_right = Chunks[i_chunk].
  5. Find i_block. i_block = (i - i_chunk * chunk_size) / block size.
  6. Get block_left = chunk_right.blocks[ i_block - 1]_ only if _i_block` > 0.
  7. Get rank from block_left if block_left exists.
  8. Get inner-block data _block_bits. block_bits must be of block size length, fulfilled with 0 in right bits.
  9. Calculate rank of block_bits in O(1) using a table memonizing block size bit's popcount.

pub fn rank0(&self, i: u64) -> u64[src]

Returns the number of 0 in [0, i] elements of the Fid.

Panics

When i >= length of the Fid.

pub fn select(&self, num: u64) -> Option<u64>[src]

Returns the minimum position (0-origin) i where rank(i) == num of num-th 1 if exists. Else returns None.

Panics

When num > length of the Fid.

Implementation detail

Binary search using rank().

pub fn select0(&self, num: u64) -> Option<u64>[src]

Returns the minimum position (0-origin) i where rank(i) == num of num-th 0 if exists. Else returns None.

Panics

When num > length of the Fid.

pub fn len(&self) -> u64[src]

Returns bit length of this FID.

impl<'iter> Fid[src]

Important traits for FidIter<'iter>
pub fn iter(&'iter self) -> FidIter<'iter>[src]

Creates an iterator over FID's bit vector.

Examples

use fid_rs::Fid;

let fid = Fid::from("1010_1010");
for (i, bit) in fid.iter().enumerate() {
    assert_eq!(bit, fid[i as u64]);
}

Trait Implementations

impl<'_> From<&'_ [bool]> for Fid[src]

fn from(bits: &[bool]) -> Self[src]

Constructor from slice of boolean.

Examples

use fid_rs::Fid;

let bits = [false, true, true, true];
let fid = Fid::from(&bits[..]);
assert_eq!(fid[0], false);
assert_eq!(fid[1], true);
assert_eq!(fid[2], true);
assert_eq!(fid[3], true);

Panics

When:

  • bits is empty.

impl<'_> From<&'_ str> for Fid[src]

fn from(s: &str) -> Self[src]

Constructor from string representation of bit sequence.

  • '0' is interpreted as 0.
  • '1' is interpreted as 1.
  • '_' is just ignored.

Examples

use fid_rs::Fid;

let fid = Fid::from("01_11");
assert_eq!(fid[0], false);
assert_eq!(fid[1], true);
assert_eq!(fid[2], true);
assert_eq!(fid[3], true);

Panics

When:

  • s contains any character other than '0', '1', and '_'.
  • s does not contain any '0' or '1'

impl Index<u64> for Fid[src]

type Output = bool

The returned type after indexing.

fn index(&self, index: u64) -> &Self::Output[src]

Returns i-th element of the Fid.

Panics

When i >= length of the Fid.

Auto Trait Implementations

impl RefUnwindSafe for Fid

impl Send for Fid

impl Sync for Fid

impl Unpin for Fid

impl UnwindSafe for Fid

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized
[src]

impl<T> Borrow<T> for T where
    T: ?Sized
[src]

impl<T> BorrowMut<T> for T where
    T: ?Sized
[src]

impl<T> From<T> for T[src]

impl<T, U> Into<U> for T where
    U: From<T>, 
[src]

impl<T, U> TryFrom<U> for T where
    U: Into<T>, 
[src]

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>, 
[src]

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.