Sample SNPs
Fast ordered sampling of rows from large text or binary files. Special cases for DNA variant files (.bed, VCF, HapMap, etc).
Binary file input class.
#include <varfiles.hpp>
Public Member Functions | |
GbinFileI () | |
Default constructor. | |
GbinFileI (const string &fileName, const size_t &nCols, const size_t &elemSize) | |
File name constructor. | |
GbinFileI (const GbinFileI &in)=default | |
Copy constructor. | |
GbinFileI & | operator= (const GbinFileI &in)=default |
Copy assignment. | |
GbinFileI (GbinFileI &&in)=default | |
Move constructor. | |
GbinFileI & | operator= (GbinFileI &&in)=default |
Move assignment. | |
~GbinFileI () | |
Destructor. | |
void | open () |
Open stream to read. | |
void | sample (GbinFileO &out, const uint64_t &n) |
Sample rows and save to a binary file. | |
uint64_t | nlines () |
Number of rows in the object. | |
Protected Member Functions | |
virtual uint64_t | _numLines () |
Get number of rows in the binary file. | |
Binary file input class.
Reads binary files.
GbinFileI::GbinFileI (const string &fileName, const size_t &nCols, const size_t &elemSize) [inline]
File name constructor.
[in] | fileName | file name including extension |
[in] | nCols | number of columns, or elements in a row |
[in] | elemSize | size of each element in bytes |
virtual uint64_t GbinFileI::_numLines () [protected, virtual]
Get number of rows in the binary file.
Requires knowledge of the number of elements in a row and their size in bytes. Throws a string object "Number of elements not divisible by row size" if the file size in bytes is not divisible by the row size in bytes (the product of the number of columns and the element size).
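The divisibility check described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name rowsFromFileSize and its parameters are hypothetical, and the zero row size guard is an added assumption.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of varfiles.hpp): derive the number of rows
// from the file size in bytes, the number of columns, and the element size.
uint64_t rowsFromFileSize(uint64_t fileBytes, uint64_t nCols, uint64_t elemSize) {
    const uint64_t rowBytes = nCols * elemSize;  // bytes occupied by one row
    if (rowBytes == 0) {
        // Assumption: a zero row size is treated as an error here.
        throw std::string("Row size is zero");
    }
    if (fileBytes % rowBytes != 0) {
        // Same message as the one quoted in the documentation.
        throw std::string("Number of elements not divisible by row size");
    }
    return fileBytes / rowBytes;
}
```

For example, a 120-byte file with 3 columns of 4-byte elements has 12-byte rows and therefore 10 rows; a 121-byte file with the same layout triggers the exception.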
void GbinFileI::sample (GbinFileO &out, const uint64_t &n)
Sample rows and save to a binary file.
Samples \(n\) rows without replacement from the file represented by the current object and saves them to the out object. Uses Vitter's method [3]. The number of samples must be smaller than the number of rows in the file.
[in] | out | output object |
[in] | n | number of rows to sample |
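To illustrate what ordered sampling without replacement produces, here is a minimal sketch of selection sampling (Knuth's Algorithm S). It is a simpler baseline, not Vitter's method and not the library's code. Vitter's algorithms yield the same kind of ordered sample but avoid one random draw per row by generating skip lengths. The function name sampleRowIndices is hypothetical.

```cpp
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: choose n of N row indices without replacement,
// returned in increasing order. Assumes n <= N.
std::vector<uint64_t> sampleRowIndices(uint64_t N, uint64_t n, std::mt19937_64 &rng) {
    std::vector<uint64_t> picked;
    picked.reserve(n);
    for (uint64_t i = 0; i < N && picked.size() < n; ++i) {
        const uint64_t remaining = N - i;             // rows not yet examined
        const uint64_t needed    = n - picked.size(); // samples still to take
        // Select row i with probability needed / remaining.
        std::uniform_int_distribution<uint64_t> draw(0, remaining - 1);
        if (draw(rng) < needed) {
            picked.push_back(i);
        }
    }
    return picked;
}
```

Because rows are visited in file order, the selected indices come out sorted, so the corresponding rows can be copied to the output in a single forward pass over the input.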