Sample SNPs
Fast ordered sampling of rows from large text or binary files. Special cases for DNA variant files (.bed, VCF, HapMap, etc).
sampFiles::VcfFileI Class Reference

VCF file input class.

#include <varfiles.hpp>


Public Member Functions

 VcfFileI ()
 Default constructor.
 
 VcfFileI (const string &fileName)
 File name constructor.
 
 VcfFileI (const VcfFileI &in)=default
 Copy constructor.
 
VcfFileI & operator= (const VcfFileI &in)=default
 Copy assignment.
 
 VcfFileI (VcfFileI &&in)=default
 Move constructor.
 
VcfFileI & operator= (VcfFileI &&in)=default
 Move assignment.
 
 ~VcfFileI ()
 Destructor.
 
void open ()
 Open stream to read.
 
void sample (VcfFileO &out, const uint64_t &n)
 Sample SNPs and save to VCF file.
 
uint64_t nsnp ()
 Number of SNPs in the object.
 
- Public Member Functions inherited from sampFiles::VcfFile
 VcfFile ()
 Default constructor.
 
 VcfFile (const string &fileName)
 Constructor with file name.
 
 VcfFile (const VcfFile &in)=default
 Copy constructor.
 
VcfFile & operator= (const VcfFile &in)=default
 Copy assignment.
 
 VcfFile (VcfFile &&in)=default
 Move constructor.
 
VcfFile & operator= (VcfFile &&in)=default
 Move assignment.
 
 ~VcfFile ()
 Destructor.
 
void close ()
 Close stream.
 
- Public Member Functions inherited from sampFiles::GtxtFile
 GtxtFile ()
 Default constructor.
 
 GtxtFile (const string &fileName)
 Constructor with file name.
 
 GtxtFile (const string &fileName, const bool &head)
 Constructor with file name and header indicator.
 
 GtxtFile (const GtxtFile &in)=default
 Copy constructor.
 
GtxtFile & operator= (const GtxtFile &in)=default
 Copy assignment.
 
 GtxtFile (GtxtFile &&in)=default
 Move constructor.
 
GtxtFile & operator= (GtxtFile &&in)=default
 Move assignment.
 
 ~GtxtFile ()
 Destructor.
 
- Public Member Functions inherited from sampFiles::VarFile
 VarFile (const VarFile &in)=default
 Copy constructor.
 
VarFile & operator= (const VarFile &in)=default
 Copy assignment.
 
 VarFile (VarFile &&in)=default
 Move constructor.
 
VarFile & operator= (VarFile &&in)=default
 Move assignment.
 
 ~VarFile ()
 Destructor.
 

Protected Member Functions

uint64_t _numLines ()
 Get number of SNPs in the VCF file.
 
- Protected Member Functions inherited from sampFiles::VarFile
 VarFile ()
 Default constructor (protected)
 

Additional Inherited Members

- Protected Attributes inherited from sampFiles::GtxtFile
string _fileName
 File name.
 
bool _head
 Is there a header?
 
- Protected Attributes inherited from sampFiles::VarFile
fstream _varFile
 Variant file stream.
 

Detailed Description

VCF file input class.

Reads VCF files, skipping or copying the header as necessary; .idx files are ignored.

Constructor & Destructor Documentation

◆ VcfFileI()

sampFiles::VcfFileI::VcfFileI ( const string &  fileName)
inline

File name constructor.

Parameters
[in]  fileName  file name including extension

Member Function Documentation

◆ _numLines()

uint64_t VcfFileI::_numLines ( )
protected

Get number of SNPs in the VCF file.

Assumes Unix-like line endings. Header is not counted.

Returns
number of SNPs
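
The counting rule described above (header not counted, Unix-like line endings assumed) can be sketched as follows. This is a minimal illustration, not the library's implementation: `countVcfRecords` is a hypothetical helper name, and treating every line that starts with `#` as header is an assumption based on the VCF convention for meta-information and column-header lines.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper, not part of sampFiles.
// Counts data records in a VCF stream. Lines starting with '#'
// (meta-information and the column header) are not counted.
// Assumes Unix-like '\n' line endings; empty lines are ignored.
uint64_t countVcfRecords(std::istream &in) {
    uint64_t nRecords = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line[0] != '#') {
            ++nRecords;
        }
    }
    // A hard read error would make the count unreliable.
    assert(!in.bad());
    return nRecords;
}

// Convenience wrapper for in-memory text, handy for quick checks.
uint64_t countVcfRecords(const std::string &text) {
    std::istringstream in(text);
    return countVcfRecords(in);
}
```

Passing an open `std::ifstream` to the first overload counts the records of a file on disk; the string overload exists only to make the sketch easy to try without a file.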

◆ sample()

void VcfFileI::sample ( VcfFileO &  out,
const uint64_t &  n 
)

Sample SNPs and save to VCF file.

Sample \(n\) SNPs without replacement from the file represented by the current object and save them to the out object. Uses Vitter's [3] method. The number of samples must be smaller than the number of SNPs in the file.

Parameters
[in]  out  output object
[in]  n  number of SNPs to sample
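
To illustrate what ordered sampling without replacement means here, the sketch below picks \(n\) record indices in increasing order from a file with a known number of records. It is not the library's code and it does not implement Vitter's method: it uses the simpler selection-sampling technique (Knuth's Algorithm S), which produces the same kind of sample but examines every record instead of skipping ahead. The name `orderedSample` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper, not part of sampFiles.
// Returns n distinct record indices from [0, total) in increasing order,
// using selection sampling (Knuth's Algorithm S). Vitter's methods reach
// the same result faster by drawing skip lengths between selected records.
std::vector<uint64_t> orderedSample(uint64_t total, uint64_t n, std::mt19937_64 &rng) {
    // Mirrors the documented precondition: fewer samples than records.
    assert(n < total);
    std::vector<uint64_t> picked;
    picked.reserve(n);
    for (uint64_t i = 0; i < total && picked.size() < n; ++i) {
        const uint64_t remaining = total - i;         // records not yet examined
        const uint64_t needed = n - picked.size();    // samples still to draw
        // Select record i with probability needed / remaining.
        std::uniform_int_distribution<uint64_t> draw(0, remaining - 1);
        if (draw(rng) < needed) {
            picked.push_back(i);
        }
    }
    return picked;
}
```

Because the selection probability reaches 1 once the remaining records equal the samples still needed, the function always returns exactly `n` indices. A caller could then walk the input file once and copy only the records at those indices to the output, which preserves the original row order.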

The documentation for this class was generated from the following files: