Sample SNPs
Fast ordered sampling of rows from large text or binary files. Special cases for DNA variant files (.bed, VCF, HapMap, etc).
TPED file input class.
#include <varfiles.hpp>
Public Member Functions | |
TpedFileI () | |
Default constructor. | |
TpedFileI (const string &stubName) | |
File name constructor. | |
TpedFileI (const TpedFileI &in)=default | |
Copy constructor. | |
TpedFileI & | operator= (const TpedFileI &in)=default |
Copy assignment. | |
TpedFileI (TpedFileI &&in)=default | |
Move constructor. | |
TpedFileI & | operator= (TpedFileI &&in)=default |
Move assignment. | |
~TpedFileI () | |
Destructor. | |
void | open () |
Open stream to read. | |
void | sample (TpedFileO &out, const uint64_t &n) |
Sample SNPs and save to the output TPED object. | |
uint64_t | nsnp () |
Number of SNPs in the object. | |
uint64_t | nindiv () |
Number of individuals in the object. | |
Public Member Functions inherited from sampFiles::TpedFile | |
TpedFile () | |
Default constructor. | |
TpedFile (const string &stubName) | |
File name constructor. | |
TpedFile (const TpedFile &in)=default | |
Copy constructor. | |
TpedFile & | operator= (const TpedFile &in)=default |
Copy assignment. | |
TpedFile (TpedFile &&in)=default | |
Move constructor. | |
TpedFile & | operator= (TpedFile &&in)=default |
Move assignment. | |
~TpedFile () | |
Destructor. | |
void | close () |
Close stream. | |
Public Member Functions inherited from sampFiles::GtxtFile | |
GtxtFile () | |
Default constructor. | |
GtxtFile (const string &fileName) | |
Constructor with file name. | |
GtxtFile (const string &fileName, const bool &head) | |
Constructor with file name and header indicator. | |
GtxtFile (const GtxtFile &in)=default | |
Copy constructor. | |
GtxtFile & | operator= (const GtxtFile &in)=default |
Copy assignment. | |
GtxtFile (GtxtFile &&in)=default | |
Move constructor. | |
GtxtFile & | operator= (GtxtFile &&in)=default |
Move assignment. | |
~GtxtFile () | |
Destructor. | |
Public Member Functions inherited from sampFiles::VarFile | |
VarFile (const VarFile &in)=default | |
Copy constructor. | |
VarFile & | operator= (const VarFile &in)=default |
Copy assignment. | |
VarFile (VarFile &&in)=default | |
Move constructor. | |
VarFile & | operator= (VarFile &&in)=default |
Move assignment. | |
~VarFile () | |
Destructor. | |
Protected Member Functions | |
uint64_t | _famLines () |
Get number of lines in the _tfamFile. | |
uint64_t | _famLines (fstream &fam) |
Copy the .tfam file and count number of lines. | |
void | _famCopy (fstream &fam) |
Copy the .tfam file. | |
uint64_t | _numLines () |
Get number of rows in the text file. | |
Protected Member Functions inherited from sampFiles::VarFile | |
VarFile () | |
Default constructor (protected) | |
Additional Inherited Members | |
Protected Attributes inherited from sampFiles::TpedFile | |
fstream | _tfamFile |
Corresponding .tfam file stream. | |
string | _fileStub |
File name stub (minus the extension) | |
Protected Attributes inherited from sampFiles::GtxtFile | |
string | _fileName |
File name. | |
bool | _head |
Is there a header? | |
Protected Attributes inherited from sampFiles::VarFile | |
fstream | _varFile |
Variant file stream. | |
TPED file input class.
Reads TPED files and the corresponding .tfam files as necessary.
TpedFileI::TpedFileI (const string &stubName) [inline]
File name constructor.
[in] | stubName | file name minus the extension |
void TpedFileI::_famCopy (fstream &fam) [protected]
Copy the .tfam file.
The current object's .tfam file is copied to the provided file stream, which should already be open for writing. If it is not, the function throws a string object containing "Output .fam filestream not open".
[in] | fam | .tfam file stream |
uint64_t TpedFileI::_famLines () [protected]
Get number of lines in the _tfamFile.
Assumes Unix-like line endings. The result is equal to the number of individuals. The _tfamFile should already be open for reading.
uint64_t TpedFileI::_famLines (fstream &fam) [protected]
Copy the .tfam file and count number of lines.
Assumes Unix-like line endings. The result is equal to the number of individuals. The current object's .tfam file is copied to the provided file stream, which should already be open for writing. If it is not, the function throws a string object containing "Output .fam filestream not open".
[in] | fam | .tfam file stream |
uint64_t TpedFileI::_numLines () [protected]
Get number of rows in the text file.
Assumes Unix-like line endings. The header, if present, is not counted. It is overridden in some, but not all, derived classes.
void TpedFileI::sample (TpedFileO &out, const uint64_t &n)
Sample SNPs and save to the output TPED object.
Sample \(n\) SNPs without replacement from the file represented by the current object and save to the out object. Uses Vitter's [3] method. The number of samples has to be smaller than the number of SNPs in the file.
[in] | out | output object |
[in] | n | number of SNPs to sample |
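The idea of ordered sampling without replacement can be sketched with selection sampling (Knuth's Algorithm S). This is a simpler technique than Vitter's skip-based method, swapped in here for illustration only: it visits every row instead of jumping ahead, but it also yields the chosen rows in file order. The function orderedSample is hypothetical and not part of the library.

```cpp
#include <cstdint>
#include <random>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns n distinct row indexes from [0, N) in
// increasing order, each n-subset being equally likely. This is selection
// sampling, not Vitter's faster skip-generating algorithm.
std::vector<std::uint64_t> orderedSample(std::uint64_t N, std::uint64_t n,
                                         std::mt19937_64 &rng) {
    if (n > N) {
        throw std::invalid_argument("cannot sample more rows than available");
    }
    std::vector<std::uint64_t> picked;
    picked.reserve(n);
    std::uint64_t needed = n;
    for (std::uint64_t row = 0; row < N && needed > 0; ++row) {
        const std::uint64_t remaining = N - row;
        // Keep this row with probability needed / remaining.
        std::uniform_int_distribution<std::uint64_t> draw(0, remaining - 1);
        if (draw(rng) < needed) {
            picked.push_back(row);
            --needed;
        }
    }
    return picked;
}
```

Because the indexes come out sorted, a caller can read the input file in a single forward pass and copy only the selected rows to the output.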