Sample SNPs

Fast ordered sampling of rows from large text or binary files. Special cases for DNA variant files (.bed, VCF, HapMap, etc.).
 
Text file input class.
#include <varfiles.hpp>
Public Member Functions | |
| GtxtFileI () | |
| Default constructor.  | |
| GtxtFileI (const string &fileName) | |
| File name constructor.  | |
| GtxtFileI (const string &fileName, const bool &head) | |
| File name constructor with header specification.  | |
| GtxtFileI (const GtxtFileI &in)=default | |
| Copy constructor.  | |
| GtxtFileI & | operator= (const GtxtFileI &in)=default | 
| Copy assignment.  | |
| GtxtFileI (GtxtFileI &&in)=default | |
| Move constructor.  | |
| GtxtFileI & | operator= (GtxtFileI &&in)=default | 
| Move assignment.  | |
| ~GtxtFileI () | |
| Destructor.  | |
| void | open () | 
| Open stream to read.  | |
| void | sample (GtxtFileO &out, const uint64_t &n, const bool &headSkip) | 
| Sample rows and save to a text file.  | |
| void | sample (const uint64_t &n, const bool &headSkip, const char &delim, vector< string > &out) | 
| Sample rows and export to a vector of strings.  | |
| uint64_t | nlines () | 
| Number of rows in the file.  | |
Protected Member Functions | |
| virtual uint64_t | _numLines () | 
| Get number of rows in the text file.  | |
Text file input class.
Reads text files, skipping or copying the header as necessary.
      
| GtxtFileI::GtxtFileI | ( | const string & | fileName | ) | inline |
File name constructor.
| [in] | fileName | file name including extension | 
      
| GtxtFileI::GtxtFileI | ( | const string & | fileName, |
| const bool & | head | ||
| ) | inline |
File name constructor with header specification.
| [in] | fileName | file name including extension | 
| [in] | head | header presence | 
      
| virtual uint64_t GtxtFileI::_numLines | ( | ) | protected virtual |
Get number of rows in the text file.
Assumes Unix-like line endings. The header, if present, is not counted. Overridden in some, but not all, derived classes.
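The counting rule described above can be illustrated with a minimal sketch. The function `countRows` is a hypothetical stand-alone helper, not the library's `_numLines()` implementation; it assumes that rows are terminated by `'\n'` and that a header, when present, occupies exactly the first line.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper, not part of varfiles.hpp.
// Counts data rows in a stream, excluding the header line if there is one.
uint64_t countRows(std::istream &in, bool hasHeader) {
    uint64_t n = 0;
    std::string line;
    while (std::getline(in, line)) {  // splits on '\n' only (Unix-like endings)
        ++n;
    }
    assert(in.eof());                 // the loop should stop at end of input, not on a read error
    if (hasHeader && n > 0) {
        --n;                          // the header is not counted as a row
    }
    return n;
}
```

A final row without a trailing newline is still counted, because `std::getline` returns the last partial line before reporting end of input.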
| void GtxtFileI::sample | ( | const uint64_t & | n, | 
| const bool & | headSkip, | ||
| const char & | delim, | ||
| vector< string > & | out | ||
| ) | 
Sample rows and export to a vector of strings.
Sample \(n\) rows without replacement from the file represented by the current object and output a vector of strings. Each field, as separated by the specified delimiter, is stored as an element of the vector. Uses Vitter's [3] method. The number of samples must be smaller than the number of rows in the file. The output vector is cleared if it is not empty.
| [in] | n | number of rows to sample | 
| [in] | headSkip | skip header? Ignored if there is no header | 
| [in] | delim | field delimiter | 
| [out] | out | output vector | 
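The layout of the output vector can be illustrated with a short sketch, assuming the sampled rows are already available as strings. The helper `rowsToFields` is hypothetical and is not the library's code; it only shows how fields from consecutive rows end up as consecutive vector elements, and how any previous contents of the output vector are discarded.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not part of varfiles.hpp.
// Flattens rows into one vector, one element per delimited field.
void rowsToFields(const std::vector<std::string> &rows, char delim,
                  std::vector<std::string> &out) {
    assert(delim != '\n');  // rows are assumed to be split on newlines already
    out.clear();            // discard any previous contents
    for (const std::string &row : rows) {
        std::istringstream fields(row);
        std::string field;
        while (std::getline(fields, field, delim)) {
            out.push_back(field);  // fields of all rows are stored back to back
        }
    }
}
```

With this layout, a caller that knows the number of fields per row can recover row boundaries by index arithmetic. Whether the library guarantees a fixed number of fields per row is not stated here.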
| void GtxtFileI::sample | ( | GtxtFileO & | out, | 
| const uint64_t & | n, | ||
| const bool & | headSkip | ||
| ) | 
Sample rows and save to a text file.
Sample \(n\) lines without replacement from the file represented by the current object and save them to the out object. Uses Vitter's [3] method. The number of samples must be smaller than the number of rows in the file.
| [in] | out | output object | 
| [in] | n | number of rows to sample | 
| [in] | headSkip | skip header? Ignored if there is no header |
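Ordered sampling without replacement from one stream to another can be sketched as follows. This is not Vitter's method and not the library's implementation: it uses the simpler one-pass selection sampling (Knuth's Algorithm S), which visits every row. The function `sampleLines` and its parameters `nRows` and `hasHeader` are hypothetical, and the sketch assumes that a header which is not skipped is copied to the output.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <random>
#include <sstream>
#include <string>

// Hypothetical helper, not part of varfiles.hpp.
// Copies n of the nRows data lines from `in` to `out`, preserving file order.
// Returns the number of lines actually written (excluding the header).
uint64_t sampleLines(std::istream &in, std::ostream &out, uint64_t nRows, uint64_t n,
                     bool hasHeader, bool headSkip, std::mt19937_64 &rng) {
    assert(n < nRows);  // mirrors the documented precondition
    std::string line;
    if (hasHeader && std::getline(in, line) && !headSkip) {
        out << line << '\n';  // assumption: an unskipped header is copied through
    }
    uint64_t picked = 0;
    for (uint64_t seen = 0; seen < nRows && picked < n && std::getline(in, line); ++seen) {
        // Keep this line with probability (n - picked) / (nRows - seen).
        std::uniform_int_distribution<uint64_t> draw(0, nRows - seen - 1);
        if (draw(rng) < n - picked) {
            out << line << '\n';
            ++picked;
        }
    }
    return picked;
}
```

Because each line is kept with probability (still needed)/(still available), exactly `n` lines are written when the stream really contains `nRows` rows, and they appear in their original order. Methods that jump ahead by a random skip length avoid drawing a random number for every row, which matters for very large files.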