Sample SNPs
Fast ordered sampling of rows from large text or binary files. Special cases for DNA variant files (.bed, VCF, HapMap, etc).
sampFiles::TpedFileI Class Reference

TPED file input class.

#include <varfiles.hpp>

Inherits members from sampFiles::TpedFile, sampFiles::GtxtFile, and sampFiles::VarFile.

Public Member Functions

 TpedFileI ()
 Default constructor.
 
 TpedFileI (const string &stubName)
 File name constructor.
 
 TpedFileI (const TpedFileI &in)=default
 Copy constructor.
 
TpedFileI & operator= (const TpedFileI &in)=default
 Copy assignment.
 
 TpedFileI (TpedFileI &&in)=default
 Move constructor.
 
TpedFileI & operator= (TpedFileI &&in)=default
 Move assignment.
 
 ~TpedFileI ()
 Destructor.
 
void open ()
 Open stream to read.
 
void sample (TpedFileO &out, const uint64_t &n)
 Sample SNPs and save to TPED file.
 
uint64_t nsnp ()
 Number of SNPs in the object.
 
uint64_t nindiv ()
 Number of individuals in the object.
 
- Public Member Functions inherited from sampFiles::TpedFile
 TpedFile ()
 Default constructor.
 
 TpedFile (const string &stubName)
 File name constructor.
 
 TpedFile (const TpedFile &in)=default
 Copy constructor.
 
TpedFile & operator= (const TpedFile &in)=default
 Copy assignment.
 
 TpedFile (TpedFile &&in)=default
 Move constructor.
 
TpedFile & operator= (TpedFile &&in)=default
 Move assignment.
 
 ~TpedFile ()
 Destructor.
 
void close ()
 Close stream.
 
- Public Member Functions inherited from sampFiles::GtxtFile
 GtxtFile ()
 Default constructor.
 
 GtxtFile (const string &fileName)
 Constructor with file name.
 
 GtxtFile (const string &fileName, const bool &head)
 Constructor with file name and header indicator.
 
 GtxtFile (const GtxtFile &in)=default
 Copy constructor.
 
GtxtFile & operator= (const GtxtFile &in)=default
 Copy assignment.
 
 GtxtFile (GtxtFile &&in)=default
 Move constructor.
 
GtxtFile & operator= (GtxtFile &&in)=default
 Move assignment.
 
 ~GtxtFile ()
 Destructor.
 
- Public Member Functions inherited from sampFiles::VarFile
 VarFile (const VarFile &in)=default
 Copy constructor.
 
VarFile & operator= (const VarFile &in)=default
 Copy assignment.
 
 VarFile (VarFile &&in)=default
 Move constructor.
 
VarFile & operator= (VarFile &&in)=default
 Move assignment.
 
 ~VarFile ()
 Destructor.
 

Protected Member Functions

uint64_t _famLines ()
 Get the number of lines in the _tfamFile.
 
uint64_t _famLines (fstream &fam)
 Copy the .tfam file and count the number of lines.
 
void _famCopy (fstream &fam)
 Copy the .tfam file.
 
uint64_t _numLines ()
 Get the number of rows in the text file.
 
- Protected Member Functions inherited from sampFiles::VarFile
 VarFile ()
 Default constructor (protected)
 

Additional Inherited Members

- Protected Attributes inherited from sampFiles::TpedFile
fstream _tfamFile
 Corresponding .tfam file stream.
 
string _fileStub
 File name stub (minus the extension)
 
- Protected Attributes inherited from sampFiles::GtxtFile
string _fileName
 File name.
 
bool _head
 Is there a header?
 
- Protected Attributes inherited from sampFiles::VarFile
fstream _varFile
 Variant file stream.
 

Detailed Description

TPED file input class.

Reads TPED files and the corresponding .tfam files as necessary.

Constructor & Destructor Documentation

◆ TpedFileI()

sampFiles::TpedFileI::TpedFileI ( const string &  stubName)
inline

File name constructor.

Parameters
[in] stubName: file name minus the extension
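
As a rough illustration of what a file name stub stands for, the sketch below derives the two file names that a stub presumably maps to. The helper `tpedNames` is hypothetical and not part of sampFiles; appending `.tped` and `.tfam` directly to the stub is an assumption based on the description above and on the usual naming of transposed PLINK file sets.

```cpp
#include <string>
#include <utility>

// Hypothetical helper, not part of sampFiles: build the two file names
// that a TPED stub is assumed to represent.
std::pair<std::string, std::string> tpedNames(const std::string &stubName) {
    // Assumption: the extensions are appended directly to the stub.
    return {stubName + ".tped", stubName + ".tfam"};
}
```

Under this assumption, a stub such as `data/study1` would refer to `data/study1.tped` and `data/study1.tfam`.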

Member Function Documentation

◆ _famCopy()

void TpedFileI::_famCopy ( fstream &  fam)
protected

Copy the .tfam file.

The current object's .tfam file is copied to the provided file stream, which should already be open for writing. If it is not, the function throws a string object "Output .fam filestream not open".

Parameters
[in] fam: .tfam file stream

◆ _famLines() [1/2]

uint64_t TpedFileI::_famLines ( )
protected

Get the number of lines in the _tfamFile.

Assumes Unix-like line endings. The result is equal to the number of individuals. The _tfamFile should already be open for reading.

Returns
number of lines in _tfamFile

◆ _famLines() [2/2]

uint64_t TpedFileI::_famLines ( fstream &  fam)
protected

Copy the .tfam file and count the number of lines.

Assumes Unix-like line endings. The result is equal to the number of individuals. The current object's .tfam file is copied to the provided file stream, which should already be open for writing. If it is not, the function throws a string object "Output .fam filestream not open".

Parameters
[in] fam: .tfam file stream
Returns
number of lines in _tfamFile

◆ _numLines()

uint64_t TpedFileI::_numLines ( )
protected

Get the number of rows in the text file.

Assumes Unix-like line endings. The header, if present, is not counted. Overridden in some, but not all, derived classes.

Returns
number of rows

◆ sample()

void TpedFileI::sample ( TpedFileO &  out, const uint64_t &  n)

Sample SNPs and save to TPED file.

Sample \(n\) SNPs without replacement from the file represented by the current object and save them to the out object. Uses Vitter's [3] method. The number of samples has to be smaller than the number of SNPs in the file.

Parameters
[in] out: output object
[in] n: number of SNPs to sample
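
To show what ordered sampling without replacement produces, the sketch below picks \(n\) row indices in increasing order using selection sampling (Knuth's Algorithm S). This is a simpler technique than Vitter's skip-based method and is not the library's implementation: it draws one random number per row, whereas Vitter's method jumps over unselected rows. The function name `orderedSampleSketch` and the choice of `std::invalid_argument` for the size check are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <stdexcept>
#include <vector>

// Selection sampling (Knuth's Algorithm S), used here in place of Vitter's
// method: returns n distinct row indices from [0, total), in increasing order.
std::vector<std::uint64_t> orderedSampleSketch(std::uint64_t total,
                                               std::uint64_t n,
                                               std::mt19937_64 &rng) {
    // Mirrors the documented constraint: n must be smaller than the row count.
    if (n >= total) {
        throw std::invalid_argument("n must be smaller than the number of rows");
    }
    std::vector<std::uint64_t> picked;
    picked.reserve(static_cast<std::size_t>(n));
    std::uniform_real_distribution<double> unif(0.0, 1.0);
    for (std::uint64_t i = 0; i < total && picked.size() < n; ++i) {
        const double remaining = static_cast<double>(total - i);
        const double needed = static_cast<double>(n - picked.size());
        // Keep row i with probability needed / remaining.
        if (unif(rng) * remaining < needed) {
            picked.push_back(i);
        }
    }
    return picked;
}
```

Because the indices come out sorted, the selected rows can be copied to the output in a single forward pass over the input file, which is the property that makes ordered sampling attractive for large files.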
