SNAP Library 3.0, User Reference
2016-07-20 17:56:49
SNAP, a general purpose, high performance system for analysis and manipulation of large networks
Table class: Relational table with columnar data storage. More...
#include <table.h>
Public Member Functions | |
void | AddIntCol (const TStr &ColName) |
Adds an integer column with name ColName . More... | |
void | AddFltCol (const TStr &ColName) |
Adds a float column with name ColName . More... | |
void | AddStrCol (const TStr &ColName) |
Adds a string column with name ColName . More... | |
void | GroupByIntColMP (const TStr &GroupBy, THashMP< TInt, TIntV > &Grouping, TBool UsePhysicalIds=true) const |
Groups/hashes by a single column with integer values, using OpenMP multi-threading. More... | |
TTable () | |
TTable (TTableContext *Context) | |
TTable (const Schema &S, TTableContext *Context) | |
TTable (TSIn &SIn, TTableContext *Context) | |
TTable (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) | |
Constructor to build table out of a hash table of int->int. More... | |
TTable (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) | |
Constructor to build table out of a hash table of int->float. More... | |
TTable (const TTable &Table) | |
Copy constructor. More... | |
TTable (const TTable &Table, const TIntV &RowIds) | |
void | SaveSS (const TStr &OutFNm) |
Saves table schema and content to a TSV file. More... | |
void | SaveBin (const TStr &OutFNm) |
Saves table schema and content to a binary file. More... | |
void | Save (TSOut &SOut) |
Saves table schema and content to a binary format. More... | |
void | Dump (FILE *OutF=stdout) const |
Prints table contents to a text file. More... | |
void | AddRow (const TTableRow &Row) |
Adds row with values taken from given TTableRow. More... | |
TTableContext * | GetContext () |
Returns the context. More... | |
TTableContext * | ChangeContext (TTableContext *Context) |
Changes the current context. Moves all object items to the new context. More... | |
TInt | GetColIdx (const TStr &ColName) const |
Gets index of column ColName among columns of the same type in the schema. More... | |
TInt | GetIntVal (const TStr &ColName, const TInt &RowIdx) |
Gets the value of integer attribute ColName at row RowIdx . More... | |
TFlt | GetFltVal (const TStr &ColName, const TInt &RowIdx) |
Gets the value of float attribute ColName at row RowIdx . More... | |
TStr | GetStrVal (const TStr &ColName, const TInt &RowIdx) const |
Gets the value of string attribute ColName at row RowIdx . More... | |
TInt | GetStrMapById (TInt ColIdx, TInt RowIdx) const |
Gets the integer mapping of the string at column ColIdx at row RowIdx . More... | |
TInt | GetStrMapByName (const TStr &ColName, TInt RowIdx) const |
Gets the integer mapping of the string at column ColName at row RowIdx . More... | |
TStr | GetStrValById (TInt ColIdx, TInt RowIdx) const |
Gets the value of the string attribute at column ColIdx at row RowIdx . More... | |
TStr | GetStrValByName (const TStr &ColName, const TInt &RowIdx) const |
Gets the value of the string attribute at column ColName at row RowIdx . More... | |
TIntV | GetIntRowIdxByVal (const TStr &ColName, const TInt &Val) const |
Gets the rows containing Val in int column ColName . More... | |
TIntV | GetStrRowIdxByMap (const TStr &ColName, const TInt &Map) const |
Gets the rows containing int mapping Map in str column ColName . More... | |
TIntV | GetFltRowIdxByVal (const TStr &ColName, const TFlt &Val) const |
Gets the rows containing Val in flt column ColName . More... | |
TInt | RequestIndexInt (const TStr &ColName) |
Creates Index for Int Column ColName . More... | |
TInt | RequestIndexFlt (const TStr &ColName) |
Creates Index for Flt Column ColName . More... | |
TInt | RequestIndexStrMap (const TStr &ColName) |
Creates Index for Str Column ColName . More... | |
TStr | GetStr (const TInt &KeyId) const |
Gets the string with KeyId . More... | |
TInt | GetIntValAtRowIdx (const TInt &ColIdx, const TInt &RowIdx) |
Gets the integer value at column ColIdx and row RowIdx . More... | |
TFlt | GetFltValAtRowIdx (const TInt &ColIdx, const TInt &RowIdx) |
Gets the float value at column ColIdx and row RowIdx . More... | |
Schema | GetSchema () |
Gets the schema of this table. More... | |
TVec< PNEANet > | ToGraphSequence (TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal=TInt::Mn, TInt EndVal=TInt::Mx) |
Creates a sequence of graphs based on values of column SplitAttr and windows specified by JumpSize and WindowSize. More... | |
TVec< PNEANet > | ToVarGraphSequence (TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals) |
Creates a sequence of graphs based on values of column SplitAttr and intervals specified by SplitIntervals. More... | |
TVec< PNEANet > | ToGraphPerGroup (TStr GroupAttr, TAttrAggr AggrPolicy) |
Creates a sequence of graphs based on grouping specified by GroupAttr. More... | |
PNEANet | ToGraphSequenceIterator (TStr SplitAttr, TAttrAggr AggrPolicy, TInt WindowSize, TInt JumpSize, TInt StartVal=TInt::Mn, TInt EndVal=TInt::Mx) |
Creates the graph sequence one at a time. More... | |
PNEANet | ToVarGraphSequenceIterator (TStr SplitAttr, TAttrAggr AggrPolicy, TIntPrV SplitIntervals) |
Creates the graph sequence one at a time. More... | |
PNEANet | ToGraphPerGroupIterator (TStr GroupAttr, TAttrAggr AggrPolicy) |
Creates the graph sequence one at a time. More... | |
PNEANet | NextGraphIterator () |
Calls to this must be preceded by a call to one of the above ToGraph*Iterator functions. More... | |
TBool | IsLastGraphOfSequence () |
Checks if the end of the graph sequence is reached. More... | |
TStr | GetSrcCol () const |
Gets the name of the column to be used as src nodes in the graph. More... | |
void | SetSrcCol (const TStr &Src) |
Sets the name of the column to be used as src nodes in the graph. More... | |
TStr | GetDstCol () const |
Gets the name of the column to be used as dst nodes in the graph. More... | |
void | SetDstCol (const TStr &Dst) |
Sets the name of the column to be used as dst nodes in the graph. More... | |
void | AddEdgeAttr (const TStr &Attr) |
Adds column to be used as graph edge attribute. More... | |
void | AddEdgeAttr (TStrV &Attrs) |
Adds columns to be used as graph edge attributes. More... | |
void | AddSrcNodeAttr (const TStr &Attr) |
Adds column to be used as src node attribute of the graph. More... | |
void | AddSrcNodeAttr (TStrV &Attrs) |
Adds columns to be used as src node attributes of the graph. More... | |
void | AddDstNodeAttr (const TStr &Attr) |
Adds column to be used as dst node attribute of the graph. More... | |
void | AddDstNodeAttr (TStrV &Attrs) |
Adds columns to be used as dst node attributes of the graph. More... | |
void | AddNodeAttr (const TStr &Attr) |
Handles the common case where src and dst both belong to the same "universe" of entities. More... | |
void | AddNodeAttr (TStrV &Attrs) |
Handles the common case where src and dst both belong to the same "universe" of entities. More... | |
void | SetCommonNodeAttrs (const TStr &SrcAttr, const TStr &DstAttr, const TStr &CommonAttrName) |
Sets the columns to be used as both src and dst node attributes. More... | |
TStrV | GetSrcNodeIntAttrV () const |
Gets src node int attribute name vector. More... | |
TStrV | GetDstNodeIntAttrV () const |
Gets dst node int attribute name vector. More... | |
TStrV | GetEdgeIntAttrV () const |
Gets edge int attribute name vector. More... | |
TStrV | GetSrcNodeFltAttrV () const |
Gets src node float attribute name vector. More... | |
TStrV | GetDstNodeFltAttrV () const |
Gets dst node float attribute name vector. More... | |
TStrV | GetEdgeFltAttrV () const |
Gets edge float attribute name vector. More... | |
TStrV | GetSrcNodeStrAttrV () const |
Gets src node str attribute name vector. More... | |
TStrV | GetDstNodeStrAttrV () const |
Gets dst node str attribute name vector. More... | |
TStrV | GetEdgeStrAttrV () const |
Gets edge str attribute name vector. More... | |
TAttrType | GetColType (const TStr &ColName) const |
Gets type of column ColName . More... | |
TInt | GetNumRows () const |
Gets total number of rows in this table. More... | |
TInt | GetNumValidRows () const |
Gets number of valid, i.e. not deleted, rows in this table. More... | |
THash< TInt, TInt > | GetRowIdMap () const |
Gets a map of logical to physical row ids. More... | |
TRowIterator | BegRI () const |
Gets iterator to the first valid row of the table. More... | |
TRowIterator | EndRI () const |
Gets iterator to the last valid row of the table. More... | |
TRowIteratorWithRemove | BegRIWR () |
Gets iterator with remove to the first valid row. More... | |
TRowIteratorWithRemove | EndRIWR () |
Gets iterator with remove to the last valid row. More... | |
void | GetPartitionRanges (TIntPrV &Partitions, TInt NumPartitions) const |
Partitions the table into NumPartitions ranges and populates Partitions with them. More... | |
void | Rename (const TStr &Column, const TStr &NewLabel) |
Renames a column. More... | |
void | Unique (const TStr &Col) |
Removes rows with duplicate values in given column. More... | |
void | Unique (const TStrV &Cols, TBool Ordered=true) |
Removes rows with duplicate values in given columns. More... | |
void | Select (TPredicate &Predicate, TIntV &SelectedRows, TBool Remove=true) |
Selects rows that satisfy given Predicate . More... | |
void | Select (TPredicate &Predicate) |
void | Classify (TPredicate &Predicate, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0) |
void | SelectAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp, TIntV &SelectedRows, TBool Remove=true) |
Selects rows using atomic compare operation. More... | |
void | SelectAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp) |
void | ClassifyAtomic (const TStr &Col1, const TStr &Col2, TPredComp Cmp, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0) |
void | SelectAtomicConst (const TStr &Col, const TPrimitive &Val, TPredComp Cmp, TIntV &SelectedRows, PTable &SelectedTable, TBool Remove=true, TBool Table=true) |
Selects rows where the value of Col matches given primitive Val . More... | |
template<class T > | |
void | SelectAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp) |
template<class T > | |
void | SelectAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp, PTable &SelectedTable) |
template<class T > | |
void | ClassifyAtomicConst (const TStr &Col, const T &Val, TPredComp Cmp, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0) |
void | SelectAtomicIntConst (const TStr &Col, const TInt &Val, TPredComp Cmp) |
void | SelectAtomicIntConst (const TStr &Col, const TInt &Val, TPredComp Cmp, PTable &SelectedTable) |
void | SelectAtomicStrConst (const TStr &Col, const TStr &Val, TPredComp Cmp) |
void | SelectAtomicStrConst (const TStr &Col, const TStr &Val, TPredComp Cmp, PTable &SelectedTable) |
void | SelectAtomicFltConst (const TStr &Col, const TFlt &Val, TPredComp Cmp) |
void | SelectAtomicFltConst (const TStr &Col, const TFlt &Val, TPredComp Cmp, PTable &SelectedTable) |
void | Group (const TStrV &GroupBy, const TStr &GroupColName, TBool Ordered=true, TBool UsePhysicalIds=true) |
Groups rows depending on values of GroupBy columns. More... | |
void | Count (const TStr &CountColName, const TStr &Col) |
Counts number of unique elements. More... | |
void | Order (const TStrV &OrderBy, TStr OrderColName="", TBool ResetRankByMSC=false, TBool Asc=true) |
Orders the rows lexicographically according to the values in columns of OrderBy (ascending when Asc is true, descending otherwise). More... | |
void | Aggregate (const TStrV &GroupByAttrs, TAttrAggr AggOp, const TStr &ValAttr, const TStr &ResAttr, TBool Ordered=true) |
Aggregates values of ValAttr after grouping with respect to GroupByAttrs. Results are stored in the new attribute ResAttr. More... | |
void | AggregateCols (const TStrV &AggrAttrs, TAttrAggr AggOp, const TStr &ResAttr) |
Aggregates attributes in AggrAttrs across columns. More... | |
TVec< PTable > | SpliceByGroup (const TStrV &GroupByAttrs, TBool Ordered=true) |
Splices table into subtables according to a grouping statement. More... | |
PTable | Join (const TStr &Col1, const TTable &Table, const TStr &Col2) |
Performs equijoin. More... | |
PTable | Join (const TStr &Col1, const PTable &Table, const TStr &Col2) |
PTable | ThresholdJoin (const TStr &KeyCol1, const TStr &JoinCol1, const TTable &Table, const TStr &KeyCol2, const TStr &JoinCol2, TInt Threshold, TBool PerJoinKey=false) |
PTable | SelfJoin (const TStr &Col) |
Joins table with itself, on values of Col . More... | |
PTable | SelfSimJoin (const TStrV &Cols, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold) |
PTable | SelfSimJoinPerGroup (const TStr &GroupAttr, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold) |
Performs join if the distance between two rows is less than the specified threshold. More... | |
PTable | SelfSimJoinPerGroup (const TStrV &GroupBy, const TStr &SimCol, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold) |
Performs join if the distance between two rows is less than the specified threshold. More... | |
PTable | SimJoin (const TStrV &Cols1, const TTable &Table, const TStrV &Cols2, const TStr &DistanceColName, const TSimType &SimType, const TFlt &Threshold) |
Performs join if the distance between two rows is less than the specified threshold. More... | |
void | SelectFirstNRows (const TInt &N) |
Selects first N rows from the table. More... | |
void | Defrag () |
Releases memory of deleted rows, and defrags. More... | |
void | StoreIntCol (const TStr &ColName, const TIntV &ColVals) |
Adds entire int column to table. More... | |
void | StoreFltCol (const TStr &ColName, const TFltV &ColVals) |
Adds entire flt column to table. More... | |
void | StoreStrCol (const TStr &ColName, const TStrV &ColVals) |
Adds entire str column to table. More... | |
void | UpdateFltFromTable (const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal=0.0) |
void | UpdateFltFromTableMP (const TStr &KeyAttr, const TStr &UpdateAttr, const TTable &Table, const TStr &FKeyAttr, const TStr &ReadAttr, TFlt DefaultFltVal=0.0) |
void | SetFltColToConstMP (TInt UpdateColIdx, TFlt DefaultFltVal) |
PTable | Union (const TTable &Table) |
Returns union of this table with given Table . More... | |
PTable | Union (const PTable &Table) |
PTable | UnionAll (const TTable &Table) |
Returns union of this table with given Table , preserving duplicates. More... | |
PTable | UnionAll (const PTable &Table) |
void | UnionAllInPlace (const TTable &Table) |
Same as TTable::ConcatTable. More... | |
void | UnionAllInPlace (const PTable &Table) |
PTable | Intersection (const TTable &Table) |
Returns intersection of this table with given Table . More... | |
PTable | Intersection (const PTable &Table) |
PTable | Minus (TTable &Table) |
Returns table with rows that are present in this table but not in given Table . More... | |
PTable | Minus (const PTable &Table) |
PTable | Project (const TStrV &ProjectCols) |
Returns table with only the columns in ProjectCols . More... | |
void | ProjectInPlace (const TStrV &ProjectCols) |
Keeps only the columns specified in ProjectCols . More... | |
void | ColGenericOp (const TStr &Attr1, const TStr &Attr2, const TStr &ResAttr, TArithOp op) |
Performs columnwise arithmetic operation. More... | |
void | ColGenericOpMP (TInt ArgColIdx1, TInt ArgColIdx2, TAttrType ArgType1, TAttrType ArgType2, TInt ResColIdx, TArithOp op) |
void | ColAdd (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise addition. See TTable::ColGenericOp. More... | |
void | ColSub (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise subtraction. See TTable::ColGenericOp. More... | |
void | ColMul (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise multiplication. See TTable::ColGenericOp. More... | |
void | ColDiv (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise division. See TTable::ColGenericOp. More... | |
void | ColMod (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs columnwise modulus. See TTable::ColGenericOp. More... | |
void | ColMin (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs min of two columns. See TTable::ColGenericOp. More... | |
void | ColMax (const TStr &Attr1, const TStr &Attr2, const TStr &ResultAttrName="") |
Performs max of two columns. See TTable::ColGenericOp. More... | |
void | ColGenericOp (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr, TArithOp op, TBool AddToFirstTable) |
Performs columnwise arithmetic operation with column of given table. More... | |
void | ColAdd (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise addition with column of given table. More... | |
void | ColSub (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise subtraction with column of given table. More... | |
void | ColMul (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise multiplication with column of given table. More... | |
void | ColDiv (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise division with column of given table. More... | |
void | ColMod (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &ResAttr="", TBool AddToFirstTable=true) |
Performs columnwise modulus with column of given table. More... | |
void | ColGenericOp (const TStr &Attr1, const TFlt &Num, const TStr &ResAttr, TArithOp op, const TBool floatCast) |
Performs arithmetic op of column values and given Num . More... | |
void | ColGenericOpMP (const TInt &ColIdx1, const TInt &ColIdx2, TAttrType ArgType, const TFlt &Num, TArithOp op, TBool ShouldCast) |
void | ColAdd (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs addition of column values and given Num . More... | |
void | ColSub (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs subtraction of column values and given Num . More... | |
void | ColMul (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs multiplication of column values and given Num . More... | |
void | ColDiv (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs division of column values and given Num . More... | |
void | ColMod (const TStr &Attr1, const TFlt &Num, const TStr &ResultAttrName="", const TBool floatCast=false) |
Performs modulus of column values and given Num . More... | |
void | ColConcat (const TStr &Attr1, const TStr &Attr2, const TStr &Sep="", const TStr &ResAttr="") |
Concatenates two string columns. More... | |
void | ColConcat (const TStr &Attr1, TTable &Table, const TStr &Attr2, const TStr &Sep="", const TStr &ResAttr="", TBool AddToFirstTable=true) |
Concatenates string column with column of given table. More... | |
void | ColConcatConst (const TStr &Attr1, const TStr &Val, const TStr &Sep="", const TStr &ResAttr="") |
Concatenates column values with given string value. More... | |
void | ReadIntCol (const TStr &ColName, TIntV &Result) const |
Reads values of entire int column into Result . More... | |
void | ReadFltCol (const TStr &ColName, TFltV &Result) const |
Reads values of entire float column into Result . More... | |
void | ReadStrCol (const TStr &ColName, TStrV &Result) const |
Reads values of entire string column into Result . More... | |
void | InitIds () |
Adds explicit row ids and initializes the hash set mapping ids to physical rows. More... | |
PTable | IsNextK (const TStr &OrderCol, TInt K, const TStr &GroupBy, const TStr &RankColName="") |
Distance based filter. More... | |
void | PrintSize () |
void | PrintContextSize () |
TSize | GetMemUsedKB () |
Returns approximate memory used by table in [KB]. More... | |
TSize | GetContextMemUsedKB () |
Returns approximate memory used by table context in [KB]. More... | |
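The Join entry above is described only as "Performs equijoin". As a rough illustration of what that operation computes, here is a minimal, library-independent sketch. TinyTable and EquiJoin are hypothetical names introduced for this example; they are not part of SNAP, and the hash-based build/probe strategy is an assumption of the sketch, not a statement about how TTable::Join is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a two-column table stored column by column.
struct TinyTable {
  std::vector<int> Key;          // join column
  std::vector<std::string> Val;  // payload column
};

// Returns one (left payload, right payload) pair for every pair of rows
// whose keys are equal, which is the row set an equijoin produces.
std::vector<std::pair<std::string, std::string>> EquiJoin(
    const TinyTable& Left, const TinyTable& Right) {
  // In columnar storage every column must hold one entry per row.
  assert(Left.Key.size() == Left.Val.size());
  assert(Right.Key.size() == Right.Val.size());

  // Build phase: map each right-hand key to the rows that carry it.
  std::unordered_map<int, std::vector<std::size_t>> Index;
  for (std::size_t j = 0; j < Right.Key.size(); ++j) {
    Index[Right.Key[j]].push_back(j);
  }

  // Probe phase: emit every matching row pair, in left-row order.
  std::vector<std::pair<std::string, std::string>> Out;
  for (std::size_t i = 0; i < Left.Key.size(); ++i) {
    auto It = Index.find(Left.Key[i]);
    if (It == Index.end()) { continue; }
    for (std::size_t j : It->second) {
      Out.emplace_back(Left.Val[i], Right.Val[j]);
    }
  }
  return Out;
}
```

Duplicate keys on both sides yield every combination of matching rows, so a key that appears twice in each table contributes four output rows.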
Static Public Member Functions | |
static void | SetMP (TInt Value) |
static TInt | GetMP () |
static TStr | NormalizeColName (const TStr &ColName) |
Adds suffix to column name if it doesn't exist. More... | |
static TStrV | NormalizeColNameV (const TStrV &Cols) |
Adds suffix to column name if it doesn't exist. More... | |
static PTable | New () |
static PTable | New (TTableContext *Context) |
static PTable | New (const Schema &S, TTableContext *Context) |
static PTable | New (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) |
Returns pointer to a table constructed from given int->int hash. More... | |
static PTable | New (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) |
Returns pointer to a table constructed from given int->float hash. More... | |
static PTable | New (const PTable Table) |
Returns pointer to a new table created from given Table . More... | |
static void | GetSchema (const TStr &InFNm, Schema &S, const char &Separator= '\t') |
Populates S with the schema read from the file InFNm . More... | |
static PTable | LoadSS (const Schema &S, const TStr &InFNm, TTableContext *Context, const char &Separator= '\t', TBool HasTitleLine=false) |
Loads table from spread sheet (TSV, CSV, etc). Note: HasTitleLine = true is not supported. Please comment title lines instead. More... | |
static PTable | LoadSS (const Schema &S, const TStr &InFNm, TTableContext *Context, const TIntV &RelevantCols, const char &Separator= '\t', TBool HasTitleLine=false) |
Loads table from spread sheet - but only load the columns specified by RelevantCols. Note: HasTitleLine = true is not supported. Please comment title lines instead. More... | |
static PTable | Load (TSIn &SIn, TTableContext *Context) |
Loads table from a binary format. More... | |
static PTable | TableFromHashMap (const THash< TInt, TInt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) |
Builds table from hash table of int->int. More... | |
static PTable | TableFromHashMap (const THash< TInt, TFlt > &H, const TStr &Col1, const TStr &Col2, TTableContext *Context, const TBool IsStrKeys=false) |
Builds table from hash table of int->float. More... | |
static PTable | GetNodeTable (const PNEANet &Network, TTableContext *Context) |
Extracts node TTable from PNEANet. More... | |
static PTable | GetEdgeTable (const PNEANet &Network, TTableContext *Context) |
Extracts edge TTable from PNEANet. More... | |
static PTable | GetEdgeTablePN (const PNGraphMP &Network, TTableContext *Context) |
Extracts edge TTable from parallel graph PNGraphMP. More... | |
static PTable | GetFltNodePropertyTable (const PNEANet &Network, const TIntFltH &Property, const TStr &NodeAttrName, const TAttrType &NodeAttrType, const TStr &PropertyAttrName, TTableContext *Context) |
Extracts node and edge property TTables from THash. More... | |
static TTableIterator | GetMapPageRank (const TVec< PNEANet > &GraphSeq, TTableContext *Context, const double &C=0.85, const double &Eps=1e-4, const int &MaxIter=100) |
Gets sequence of PageRank tables from given GraphSeq . More... | |
static TTableIterator | GetMapHitsIterator (const TVec< PNEANet > &GraphSeq, TTableContext *Context, const int &MaxIter=20) |
Gets sequence of Hits tables from given GraphSeq . More... | |
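The LoadSS entries above note that HasTitleLine = true is not supported and that title lines should be commented out instead. The sketch below shows, independently of SNAP, what reading such a file can look like. SplitFields and ReadRows are hypothetical helpers, and treating a leading '#' as the comment marker is an assumption of this example that the listing above does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Splits Line on Separator, keeping empty fields (including a trailing one).
std::vector<std::string> SplitFields(const std::string& Line, char Separator) {
  std::vector<std::string> Fields;
  std::size_t Start = 0;
  while (true) {
    std::size_t End = Line.find(Separator, Start);
    if (End == std::string::npos) {
      Fields.push_back(Line.substr(Start));
      break;
    }
    Fields.push_back(Line.substr(Start, End - Start));
    Start = End + 1;
  }
  return Fields;
}

// Reads data rows from In, skipping blank lines and lines that start with
// CommentChar. The '#' default is an assumption of this sketch.
std::vector<std::vector<std::string>> ReadRows(std::istream& In,
                                               std::size_t NumCols,
                                               char Separator = '\t',
                                               char CommentChar = '#') {
  std::vector<std::vector<std::string>> Rows;
  std::string Line;
  while (std::getline(In, Line)) {
    if (Line.empty() || Line[0] == CommentChar) { continue; }
    std::vector<std::string> Fields = SplitFields(Line, Separator);
    assert(Fields.size() == NumCols);  // each row must match the schema width
    Rows.push_back(Fields);
  }
  return Rows;
}
```

With this convention a title line such as "#src<TAB>dst" stays in the file for human readers but never reaches the row data.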
Protected Member Functions | |
void | InvalidatePhysicalGroupings () |
void | InvalidateAffectedGroupings (const TStr &Attr) |
void | IncrementNext () |
Increments the next vector and sets last, NumRows and NumValidRows. More... | |
void | ClassifyAux (const TIntV &SelectedRows, const TStr &LabelName, const TInt &PositiveLabel=1, const TInt &NegativeLabel=0) |
Adds a label attribute with positive labels on selected rows and negative labels on the rest. More... | |
const char * | GetContextKey (TInt Val) const |
Gets the Key of the Context StringVals pool. Used by ToGraph method in conv.cpp. More... | |
TStr | GetStrVal (TInt ColIdx, TInt RowIdx) const |
Gets the value in column with id ColIdx at row RowIdx . More... | |
void | AddStrVal (const TInt &ColIdx, const TStr &Val) |
Adds Val in column with id ColIdx . More... | |
void | AddStrVal (const TStr &Col, const TStr &Val) |
Adds Val in column with name Col . More... | |
TStr | GetIdColName () const |
Gets name of the id column of this table. More... | |
TStr | GetSchemaColName (TInt Idx) const |
Gets name of the column with index Idx in the schema. More... | |
TAttrType | GetSchemaColType (TInt Idx) const |
Gets type of the column with index Idx in the schema. More... | |
void | AddSchemaCol (const TStr &ColName, TAttrType ColType) |
Adds column with name ColName and type ColType to the schema. More... | |
TBool | IsColName (const TStr &ColName) const |
void | AddColType (const TStr &ColName, TPair< TAttrType, TInt > ColType) |
Adds column with name ColName and type ColType to the ColTypeMap. More... | |
void | AddColType (const TStr &ColName, TAttrType ColType, TInt Index) |
Adds column with name ColName and type ColType to the ColTypeMap. More... | |
void | DelColType (const TStr &ColName) |
Removes column with name ColName from the ColTypeMap. More... | |
TPair< TAttrType, TInt > | GetColTypeMap (const TStr &ColName) const |
Gets column type and index of ColName . More... | |
TStr | RenumberColName (const TStr &ColName) const |
Returns a re-numbered column name based on number of existing columns with conflicting names. More... | |
TStr | DenormalizeColName (const TStr &ColName) const |
Removes suffix from column name if it exists. More... | |
Schema | DenormalizeSchema () const |
Removes suffix from column names in the Schema. More... | |
TBool | IsAttr (const TStr &Attr) |
Checks if Attr is an attribute of this table schema. More... | |
void | AddTable (const TTable &T) |
Adds all the rows of the input table. Allows duplicate rows (not a union). More... | |
void | ConcatTable (const PTable &T) |
Appends all rows of T to this table, and recalculates indices. More... | |
void | AddRow (const TRowIterator &RI) |
Adds row corresponding to RI . More... | |
void | AddRow (const TIntV &IntVals, const TFltV &FltVals, const TStrV &StrVals) |
Adds row with values corresponding to the given vectors by type. More... | |
void | AddGraphAttribute (const TStr &Attr, TBool IsEdge, TBool IsSrc, TBool IsDst) |
Adds names of columns to be used as graph attributes. More... | |
void | AddGraphAttributeV (TStrV &Attrs, TBool IsEdge, TBool IsSrc, TBool IsDst) |
Adds vector of names of columns to be used as graph attributes. More... | |
void | CheckAndAddIntNode (PNEANet Graph, THashSet< TInt > &NodeVals, TInt NodeId) |
Checks if given NodeId was seen earlier; if not, adds it to Graph and hashmap NodeVals . More... | |
template<class T > | |
TInt | CheckAndAddFltNode (T Graph, THash< TFlt, TInt > &NodeVals, TFlt FNodeVal) |
Checks if given FNodeVal was seen earlier; if not, adds it to Graph and hashmap NodeVals . More... | |
void | AddEdgeAttributes (PNEANet &Graph, int RowId) |
Adds attributes of edge corresponding to RowId to the Graph . More... | |
void | AddNodeAttributes (TInt NId, TStrV NodeAttrV, TInt RowId, THash< TInt, TStrIntVH > &NodeIntAttrs, THash< TInt, TStrFltVH > &NodeFltAttrs, THash< TInt, TStrStrVH > &NodeStrAttrs) |
Takes as parameters, and updates, maps NodeXAttrs: Node Id –> (attribute name –> Vector of attribute values). More... | |
PNEANet | BuildGraph (const TIntV &RowIds, TAttrAggr AggrPolicy) |
Makes a single pass over the rows in the given row id set, and creates nodes, edges, assigns node and edge attributes. More... | |
void | InitRowIdBuckets (int NumBuckets) |
Initializes the RowIdBuckets vector which will be used for the graph sequence creation. More... | |
void | FillBucketsByWindow (TStr SplitAttr, TInt JumpSize, TInt WindowSize, TInt StartVal, TInt EndVal) |
Fills RowIdBuckets with sets of row ids. More... | |
void | FillBucketsByInterval (TStr SplitAttr, TIntPrV SplitIntervals) |
Fills RowIdBuckets with sets of row ids. More... | |
TVec< PNEANet > | GetGraphsFromSequence (TAttrAggr AggrPolicy) |
Returns a sequence of graphs. More... | |
PNEANet | GetFirstGraphFromSequence (TAttrAggr AggrPolicy) |
Returns the first graph of the sequence. More... | |
PNEANet | GetNextGraphFromSequence () |
Returns the next graph in sequence corresponding to RowIdBuckets. More... | |
template<class T > | |
T | AggregateVector (TVec< T > &V, TAttrAggr Policy) |
Aggregates vector into a single scalar value according to a policy. More... | |
void | GroupingSanityCheck (const TStr &GroupBy, const TAttrType &AttrType) const |
Checks if grouping key exists and matches given attr type. More... | |
template<class T > | |
void | GroupByIntCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const |
Groups/hashes by a single column with integer values. More... | |
template<class T > | |
void | GroupByFltCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const |
Groups/hashes by a single column with float values. Returns hash table with grouping. More... | |
template<class T > | |
void | GroupByStrCol (const TStr &GroupBy, T &Grouping, const TIntV &IndexSet, TBool All, TBool UsePhysicalIds=true) const |
Groups/hashes by a single column with string values. Returns hash table with grouping. More... | |
template<class T > | |
void | UpdateGrouping (THash< T, TIntV > &Grouping, T Key, TInt Val) const |
Template for utility function to update a grouping hash map. More... | |
template<class T > | |
void | UpdateGrouping (THashMP< T, TIntV > &Grouping, T Key, TInt Val) const |
Template for utility function to update a parallel grouping hash map. More... | |
void | PrintGrouping (const THash< TGroupKey, TIntV > &Grouping) const |
TInt | CompareRows (TInt R1, TInt R2, const TAttrType &CompareByType, const TInt &CompareByIndex, TBool Asc=true) |
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics). More... | |
TInt | CompareRows (TInt R1, TInt R2, const TVec< TAttrType > &CompareByTypes, const TIntV &CompareByIndices, TBool Asc=true) |
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics). More... | |
TInt | GetPivot (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc) |
Gets pivot element for QSort. More... | |
TInt | Partition (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc) |
Partitions vector for QSort. More... | |
void | ISort (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true) |
Performs insertion sort on given vector V . More... | |
void | QSort (TIntV &V, TInt StartIdx, TInt EndIdx, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true) |
Performs QSort on given vector V . More... | |
void | Merge (TIntV &V, TInt Idx1, TInt Idx2, TInt Idx3, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true) |
Helper function for parallel QSort. More... | |
void | QSortPar (TIntV &V, const TVec< TAttrType > &SortByTypes, const TIntV &SortByIndices, TBool Asc=true) |
Performs QSort in parallel on given vector V . More... | |
bool | IsRowValid (TInt RowIdx) const |
Checks if RowIdx corresponds to a valid (i.e. not deleted) row. More... | |
TInt | GetLastValidRowIdx () |
Gets the id of the last valid row of the table. More... | |
void | RemoveFirstRow () |
Removes first valid row of the table. More... | |
void | RemoveRow (TInt RowIdx, TInt PrevRowIdx) |
Removes row with id RowIdx . More... | |
void | KeepSortedRows (const TIntV &KeepV) |
Removes all rows that are not mentioned in the SORTED vector KeepV . More... | |
void | SetFirstValidRow () |
Sets the first valid row of the TTable. More... | |
PTable | InitializeJointTable (const TTable &Table) |
Initializes an empty table for the join of this table with the given table. More... | |
void | AddJointRow (const TTable &T1, const TTable &T2, TInt RowIdx1, TInt RowIdx2) |
Adds joint row T1[RowIdx1]<=>T2[RowIdx2]. More... | |
void | ThresholdJoinInputCorrectness (const TStr &KeyCol1, const TStr &JoinCol1, const TTable &Table, const TStr &KeyCol2, const TStr &JoinCol2) |
void | ThresholdJoinCountCollisions (const TTable &TB, const TTable &TS, const TIntIntVH &T, TInt JoinColIdxB, TInt KeyColIdxB, TInt KeyColIdxS, THash< TIntPr, TIntTr > &Counters, TBool ThisIsSmaller, TAttrType JoinColType, TAttrType KeyType) |
PTable | ThresholdJoinOutputTable (const THash< TIntPr, TIntTr > &Counters, TInt Threshold, const TTable &Table) |
void | ThresholdJoinCountPerJoinKeyCollisions (const TTable &TB, const TTable &TS, const TIntIntVH &T, TInt JoinColIdxB, TInt KeyColIdxB, TInt KeyColIdxS, THash< TIntTr, TIntTr > &Counters, TBool ThisIsSmaller, TAttrType JoinColType, TAttrType KeyType) |
PTable | ThresholdJoinPerJoinKeyOutputTable (const THash< TIntTr, TIntTr > &Counters, TInt Threshold, const TTable &Table) |
void | ResizeTable (int RowCount) |
Resizes the table to hold RowCount rows. More... | |
int | GetEmptyRowsStart (int NewRows) |
Gets the start index to a chunk of empty rows of size NewRows . More... | |
void | AddSelectedRows (const TTable &Table, const TIntV &RowIDs) |
Adds rows from Table that correspond to ids in RowIDs . More... | |
void | AddNRows (int NewRows, const TVec< TIntV > &IntColsP, const TVec< TFltV > &FltColsP, const TVec< TIntV > &StrColMapsP) |
Adds NewRows rows from the given vectors for each column type. More... | |
void | AddNJointRowsMP (const TTable &T1, const TTable &T2, const TVec< TIntPrV > &JointRowIDSet) |
Adds rows from T1 and T2 to this table in a parallel manner. Used by Join. More... | |
void | UpdateTableForNewRow () |
Updates table state after adding one or more rows. More... | |
void | GroupAux (const TStrV &GroupBy, THash< TGroupKey, TPair< TInt, TIntV > > &Grouping, TBool Ordered, const TStr &GroupColName, TBool KeepUnique, TIntV &UniqueVec, TBool UsePhysicalIds=true) |
Helper function for grouping. More... | |
void | StoreGroupCol (const TStr &GroupColName, const TVec< TPair< TInt, TInt > > &GroupAndRowIds) |
Parallel helper function for grouping. Parallel grouping by complex keys is currently not supported. More... | |
void | Reindex () |
Reinitializes row ids. More... | |
void | AddIdColumn (const TStr &IdColName) |
Adds a column of explicit integer identifiers to the rows. More... | |
void | GetCollidingRows (const TTable &T, THashSet< TInt > &Collisions) |
Gets set of row ids of rows common with table T . More... | |
Static Protected Member Functions | |
static void | LoadSSPar (PTable &NewTable, const Schema &S, const TStr &InFNm, const TIntV &RelevantCols, const char &Separator, TBool HasTitleLine) |
Loads data in parallel from the input file at InFNm into NewTable. Only works when NewTable has no string columns. More... | |
static void | LoadSSSeq (PTable &NewTable, const Schema &S, const TStr &InFNm, const TIntV &RelevantCols, const char &Separator, TBool HasTitleLine) |
Sequentially loads data from input file at InFNm into NewTable. More... | |
static TInt | CompareKeyVal (const TInt &K1, const TInt &V1, const TInt &K2, const TInt &V2) |
static TInt | CheckSortedKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
static void | ISortKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
static TInt | GetPivotKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
static TInt | PartitionKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
static void | QSortKeyVal (TIntV &Key, TIntV &Val, TInt Start, TInt End) |
Protected Attributes | |
TTableContext * | Context |
Execution Context. More... | |
Schema | Sch |
Table Schema. More... | |
TCRef | CRef |
TInt | NumRows |
Number of rows in the table (valid and invalid). More... | |
TInt | NumValidRows |
Number of valid rows in the table (i.e. rows that were not logically removed). More... | |
TInt | FirstValidRow |
Physical index of first valid row. More... | |
TInt | LastValidRow |
Physical index of last valid row. More... | |
TIntV | Next |
A vector describing the logical order of the rows. Next[i] is the successor of row i. Table iterators follow the order dictated by Next. More... | |
TVec< TIntV > | IntCols |
TVec< TFltV > | FltCols |
Data columns of floating point attributes. More... | |
TVec< TIntV > | StrColMaps |
Data columns of integer mappings of string attributes. More... | |
THash< TStr, TPair< TAttrType, TInt > > | ColTypeMap |
A mapping from column name to column type and column index among columns of the same type. More... | |
TStr | IdColName |
TIntIntH | RowIdMap |
Mapping of permanent row ids to physical id. More... | |
THash< TStr, THash< TInt, TIntV > > | IntColIndexes |
Indexes for Int Columns. More... | |
THash< TStr, THash< TInt, TIntV > > | StrMapColIndexes |
Indexes for String Columns. More... | |
THash< TStr, THash< TFlt, TIntV > > | FltColIndexes |
Indexes for Float Columns. More... | |
THash< TStr, GroupStmt > | GroupStmtNames |
Maps user-given grouping statement names to their group-by attributes. More... | |
THash< GroupStmt, THash< TInt, TGroupKey > > | GroupIDMapping |
Maps grouping statements to their (group id –> group-by key) mapping. More... | |
THash< GroupStmt, THash < TGroupKey, TIntV > > | GroupMapping |
Maps grouping statements to their (group-by key –> group id) mapping. More... | |
TStr | SrcCol |
Column (attribute) to serve as src nodes when constructing the graph. More... | |
TStr | DstCol |
Column (attribute) to serve as dst nodes when constructing the graph. More... | |
TStrV | EdgeAttrV |
List of columns (attributes) to serve as edge attributes. More... | |
TStrV | SrcNodeAttrV |
List of columns (attributes) to serve as source node attributes. More... | |
TStrV | DstNodeAttrV |
List of columns (attributes) to serve as destination node attributes. More... | |
TStrTrV | CommonNodeAttrs |
List of attribute pairs with values common to source and destination and their common given name. More... | |
TVec< TIntV > | RowIdBuckets |
Partitioning of row ids into buckets corresponding to different graph objects when generating a sequence of graphs. More... | |
TInt | CurrBucket |
Current row id bucket - used when generating a sequence of graphs using an iterator. More... | |
TAttrAggr | AggrPolicy |
Aggregation policy used for solving conflicts between different values of an attribute of the same node. More... | |
TInt | IsNextDirty |
Flag to signify whether the rows are stored in logical sequence or reordered. Used for optimizing GetPartitionRanges. More... | |
Static Protected Attributes | |
static const TInt | Last = -1 |
Special value for Next vector entry - last row in table. More... | |
static const TInt | Invalid = -2 |
Special value for Next vector entry - logically removed row. More... | |
static TInt | UseMP = 1 |
Global switch for choosing multi-threaded versions of TTable functions. More... | |
Friends | |
class | TPt< TTable > |
class | TRowIterator |
class | TRowIteratorWithRemove |
template<class PGraph > | |
PGraph | TSnap::ToGraph (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy) |
template<class PGraph > | |
PGraph | TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy) |
template<class PGraph > | |
PGraph | TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, TAttrAggr AggrPolicy) |
template<class PGraph > | |
PGraph | TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy) |
template<class PGraph > | |
PGraph | TSnap::ToNetwork (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, PTable NodeTable, const TStr &NodeCol, TStrV &NodeAttrV, TAttrAggr AggrPolicy) |
int | TSnap::LoadCrossNet (TCrossNet &Graph, PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV) |
int | TSnap::LoadMode (TModeNet &Graph, PTable Table, const TStr &NCol, TStrV &NodeAttrV) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToGraphMP (PTable Table, const TStr &SrcCol, const TStr &DstCol) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToGraphMP3 (PTable Table, const TStr &SrcCol, const TStr &DstCol) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP2 (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &SrcAttrs, TStrV &DstAttrs, TStrV &EdgeAttrs, TAttrAggr AggrPolicy) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, TAttrAggr AggrPolicy) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TAttrAggr AggrPolicy) |
template<class PGraphMP > | |
PGraphMP | TSnap::ToNetworkMP (PTable Table, const TStr &SrcCol, const TStr &DstCol, TStrV &EdgeAttrV, PTable NodeTable, const TStr &NodeCol, TStrV &NodeAttrV, TAttrAggr AggrPolicy) |
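The `Next` vector and its two special values (`Last = -1`, `Invalid = -2`) listed among the protected attributes can be pictured as a singly linked list laid over the physical rows. The following is a minimal standalone sketch of that idea, not SNAP code: `RowList`, `LogicalOrder`, and this free-standing `RemoveRow` are hypothetical names, and only the sentinel values and the `(RowIdx, PrevRowIdx)` parameter pair are taken from the listing.

```cpp
#include <cassert>
#include <vector>

// Sentinel values mirroring TTable::Last and TTable::Invalid.
const int kLast = -1;     // Next entry of the last row in logical order
const int kInvalid = -2;  // Next entry of a logically removed row

// Hypothetical stand-in for the row bookkeeping of a table.
struct RowList {
  std::vector<int> Next;  // Next[i] is the successor of physical row i
  int FirstValidRow;      // physical index of the first valid row, or kLast
};

// Collects physical row indices in logical order by following Next.
std::vector<int> LogicalOrder(const RowList& L) {
  std::vector<int> Order;
  for (int Row = L.FirstValidRow; Row != kLast; Row = L.Next[Row]) {
    assert(Row >= 0 && L.Next[Row] != kInvalid);
    Order.push_back(Row);
  }
  return Order;
}

// Logically removes RowIdx, whose predecessor in logical order is PrevRowIdx.
// Pass kLast as PrevRowIdx when RowIdx is the first valid row.
void RemoveRow(RowList& L, int RowIdx, int PrevRowIdx) {
  assert(L.Next[RowIdx] != kInvalid);
  if (PrevRowIdx == kLast) {
    L.FirstValidRow = L.Next[RowIdx];
  } else {
    L.Next[PrevRowIdx] = L.Next[RowIdx];
  }
  L.Next[RowIdx] = kInvalid;  // the data stays in place; only the link changes
}
```

In this sketch, removal never moves data, which is why a separate count of valid rows and a later compaction step are useful.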
TTable::TTable | ( | ) |
Definition at line 302 of file table.cpp.
TTable::TTable | ( | TTableContext * | Context | ) |
Definition at line 305 of file table.cpp.
TTable::TTable | ( | const Schema & | S, |
TTableContext * | Context | ||
) |
Definition at line 308 of file table.cpp.
TTable::TTable | ( | TSIn & | SIn, |
TTableContext * | Context | ||
) |
Definition at line 337 of file table.cpp.
TTable::TTable | ( | const THash< TInt, TInt > & | H, |
const TStr & | Col1, | ||
const TStr & | Col2, | ||
TTableContext * | Context, | ||
const TBool | IsStrKeys = false |
||
) |
Constructor to build table out of a hash table of int->int.
Definition at line 365 of file table.cpp.
TTable::TTable | ( | const THash< TInt, TFlt > & | H, |
const TStr & | Col1, | ||
const TStr & | Col2, | ||
TTableContext * | Context, | ||
const TBool | IsStrKeys = false |
||
) |
Constructor to build table out of a hash table of int->float.
Definition at line 392 of file table.cpp.
|
inline |
Copy constructor.
Definition at line 918 of file table.h.
Definition at line 418 of file table.cpp.
Adds column with name ColName
and type ColType
to the ColTypeMap.
Definition at line 661 of file table.h.
Adds column with name ColName
and type ColType
to the ColTypeMap.
Definition at line 666 of file table.h.
|
inline |
Adds column to be used as dst node attribute of the graph.
Definition at line 1171 of file table.h.
|
inline |
Adds columns to be used as dst node attributes of the graph.
Definition at line 1173 of file table.h.
|
inline |
Adds column to be used as graph edge attribute.
Definition at line 1163 of file table.h.
|
inline |
Adds columns to be used as graph edge attributes.
Definition at line 1165 of file table.h.
|
inlineprotected |
Adds attributes of edge corresponding to RowId
to the Graph
.
Definition at line 3375 of file table.cpp.
void TTable::AddFltCol | ( | const TStr & | ColName | ) |
Adds a float column with name ColName
.
Definition at line 4657 of file table.cpp.
|
protected |
Adds names of columns to be used as graph attributes.
Definition at line 965 of file table.cpp.
Adds vector of names of columns to be used as graph attributes.
Definition at line 972 of file table.cpp.
|
protected |
Adds a column of explicit integer identifiers to the rows.
Definition at line 1880 of file table.cpp.
void TTable::AddIntCol | ( | const TStr & | ColName | ) |
Adds an integer column with name ColName
.
Definition at line 4650 of file table.cpp.
|
protected |
Adds joint row T1[RowIdx1]<=>T2[RowIdx2].
Definition at line 1937 of file table.cpp.
|
protected |
Adds rows from T1 and T2 to this table in a parallel manner. Used by Join.
Definition at line 4419 of file table.cpp.
|
inline |
Handles the common case where src and dst both belong to the same "universe" of entities.
Definition at line 1175 of file table.h.
|
inline |
Handles the common case where src and dst both belong to the same "universe" of entities.
Definition at line 1177 of file table.h.
|
inlineprotected |
Takes as parameters, and updates, maps NodeXAttrs: Node Id –> (attribute name –> Vector of attribute values).
Definition at line 3394 of file table.cpp.
|
protected |
Adds NewRows
rows from the given vectors for each column type.
Definition at line 4398 of file table.cpp.
|
protected |
Adds row corresponding to RI
.
Definition at line 4272 of file table.cpp.
|
protected |
Adds row with values corresponding to the given vectors by type.
Definition at line 4294 of file table.cpp.
|
inline |
Adds row with values taken from given TTableRow.
Definition at line 993 of file table.h.
Adds column with name ColName
and type ColType
to the schema.
Definition at line 652 of file table.h.
Adds rows from Table
that correspond to ids in RowIDs
.
Definition at line 4376 of file table.cpp.
|
inline |
Adds column to be used as src node attribute of the graph.
Definition at line 1167 of file table.h.
|
inline |
Adds columns to be used as src node attributes of the graph.
Definition at line 1169 of file table.h.
void TTable::AddStrCol | ( | const TStr & | ColName | ) |
Adds a string column with name ColName
.
Definition at line 4664 of file table.cpp.
Adds Val
in column with id ColIdx
.
Definition at line 951 of file table.cpp.
Adds Val
in column with name Col
.
Definition at line 957 of file table.cpp.
|
protected |
Adds all the rows of the input table. Allows duplicate rows (not a union).
Definition at line 3952 of file table.cpp.
void TTable::Aggregate | ( | const TStrV & | GroupByAttrs, |
TAttrAggr | AggOp, | ||
const TStr & | ValAttr, | ||
const TStr & | ResAttr, | ||
TBool | Ordered = true |
||
) |
Aggregates values of ValAttr after grouping with respect to GroupByAttrs. Results are stored as a new attribute ResAttr.
Definition at line 1565 of file table.cpp.
Aggregates attributes in AggrAttrs across columns.
Definition at line 1730 of file table.cpp.
Aggregates vector into a single scalar value according to a policy.
Used for choosing an attribute value for a node when the node appears in several records and has conflicting attribute values.
Definition at line 1551 of file table.h.
|
inline |
|
inline |
Makes a single pass over the rows in the given row id set, and creates nodes, edges, assigns node and edge attributes.
Definition at line 3425 of file table.cpp.
TTableContext * TTable::ChangeContext | ( | TTableContext * | Context | ) |
Changes the current context. Moves all object items to the new context.
Definition at line 901 of file table.cpp.
|
protected |
|
inlineprotected |
Definition at line 5287 of file table.cpp.
void TTable::Classify | ( | TPredicate & | Predicate, |
const TStr & | LabelName, | ||
const TInt & | PositiveLabel = 1 , |
||
const TInt & | NegativeLabel = 0 |
||
) |
Definition at line 2785 of file table.cpp.
void TTable::ClassifyAtomic | ( | const TStr & | Col1, |
const TStr & | Col2, | ||
TPredComp | Cmp, | ||
const TStr & | LabelName, | ||
const TInt & | PositiveLabel = 1 , |
||
const TInt & | NegativeLabel = 0 |
||
) |
Definition at line 2846 of file table.cpp.
|
inline |
Definition at line 1292 of file table.h.
|
protected |
Adds a label attribute with positive labels on selected rows and negative labels on the rest.
Definition at line 4671 of file table.cpp.
Performs columnwise addition. See TTable::ColGenericOp.
Definition at line 4793 of file table.cpp.
void TTable::ColAdd | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise addition with column of given table.
Definition at line 4926 of file table.cpp.
void TTable::ColAdd | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs addition of column values and given Num
.
Definition at line 5040 of file table.cpp.
void TTable::ColConcat | ( | const TStr & | Attr1, |
const TStr & | Attr2, | ||
const TStr & | Sep = "" , |
||
const TStr & | ResAttr = "" |
||
) |
Concatenates two string columns.
Definition at line 5060 of file table.cpp.
void TTable::ColConcat | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | Sep = "" , |
||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Concatenates string column with column of given table.
Definition at line 5094 of file table.cpp.
void TTable::ColConcatConst | ( | const TStr & | Attr1, |
const TStr & | Val, | ||
const TStr & | Sep = "" , |
||
const TStr & | ResAttr = "" |
||
) |
Concatenates column values with given string value.
Definition at line 5159 of file table.cpp.
Performs columnwise division. See TTable::ColGenericOp.
Definition at line 4805 of file table.cpp.
void TTable::ColDiv | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise division with column of given table.
Definition at line 4941 of file table.cpp.
void TTable::ColDiv | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs division of column values and given Num
.
Definition at line 5052 of file table.cpp.
void TTable::ColGenericOp | ( | const TStr & | Attr1, |
const TStr & | Attr2, | ||
const TStr & | ResAttr, | ||
TArithOp | op | ||
) |
Performs columnwise arithmetic operation.
Performs Attr1 OP Attr2 and stores the result in Attr1. If ResAttr != "", the result is stored in a new column ResAttr.
Definition at line 4729 of file table.cpp.
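The in-place versus new-column behaviour described above can be illustrated with a small standalone sketch. It does not use SNAP types: `Columns` and `ArithOp` are hypothetical stand-ins, every column is a vector of doubles for simplicity, and the function below only mimics the documented rule for `ResAttr`.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for TArithOp.
enum ArithOp { OpAdd, OpSub, OpMul, OpDiv };

// Hypothetical table: column name -> column values (all doubles here).
typedef std::map<std::string, std::vector<double> > Columns;

// Applies Attr1 OP Attr2 row by row. The result overwrites Attr1 when
// ResAttr is empty; otherwise it is stored in column ResAttr.
void ColGenericOp(Columns& T, const std::string& Attr1, const std::string& Attr2,
                  const std::string& ResAttr, ArithOp Op) {
  assert(T.count(Attr1) == 1 && T.count(Attr2) == 1);
  // Copy the operands so that overwriting Attr1 cannot affect the inputs.
  const std::vector<double> A = T[Attr1];
  const std::vector<double> B = T[Attr2];
  assert(A.size() == B.size());
  std::vector<double> Res(A.size());
  for (std::size_t i = 0; i < A.size(); i++) {
    switch (Op) {
      case OpAdd: Res[i] = A[i] + B[i]; break;
      case OpSub: Res[i] = A[i] - B[i]; break;
      case OpMul: Res[i] = A[i] * B[i]; break;
      case OpDiv: Res[i] = A[i] / B[i]; break;
    }
  }
  T[ResAttr.empty() ? Attr1 : ResAttr] = Res;
}
```

Wrappers such as addition or multiplication would then be one-line calls that fix the `Op` argument.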
void TTable::ColGenericOp | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr, | ||
TArithOp | op, | ||
TBool | AddToFirstTable | ||
) |
Performs columnwise arithmetic operation with column of given table.
Definition at line 4821 of file table.cpp.
void TTable::ColGenericOp | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResAttr, | ||
TArithOp | op, | ||
const TBool | floatCast | ||
) |
Performs arithmetic op of column values and given Num
.
Definition at line 4952 of file table.cpp.
void TTable::ColGenericOpMP | ( | TInt | ArgColIdx1, |
TInt | ArgColIdx2, | ||
TAttrType | ArgType1, | ||
TAttrType | ArgType2, | ||
TInt | ResColIdx, | ||
TArithOp | op | ||
) |
Definition at line 4685 of file table.cpp.
void TTable::ColGenericOpMP | ( | const TInt & | ColIdx1, |
const TInt & | ColIdx2, | ||
TAttrType | ArgType, | ||
const TFlt & | Num, | ||
TArithOp | op, | ||
TBool | ShouldCast | ||
) |
Definition at line 5009 of file table.cpp.
Performs max of two columns. See TTable::ColGenericOp.
Definition at line 4817 of file table.cpp.
Performs min of two columns. See TTable::ColGenericOp.
Definition at line 4813 of file table.cpp.
Performs columnwise modulus. See TTable::ColGenericOp.
Definition at line 4809 of file table.cpp.
void TTable::ColMod | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise modulus with column of given table.
Definition at line 4946 of file table.cpp.
void TTable::ColMod | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs modulus of column values and given Num
.
Definition at line 5056 of file table.cpp.
Performs columnwise multiplication. See TTable::ColGenericOp.
Definition at line 4801 of file table.cpp.
void TTable::ColMul | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise multiplication with column of given table.
Definition at line 4936 of file table.cpp.
void TTable::ColMul | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs multiplication of column values and given Num
.
Definition at line 5048 of file table.cpp.
Performs columnwise subtraction. See TTable::ColGenericOp.
Definition at line 4797 of file table.cpp.
void TTable::ColSub | ( | const TStr & | Attr1, |
TTable & | Table, | ||
const TStr & | Attr2, | ||
const TStr & | ResAttr = "" , |
||
TBool | AddToFirstTable = true |
||
) |
Performs columnwise subtraction with column of given table.
Definition at line 4931 of file table.cpp.
void TTable::ColSub | ( | const TStr & | Attr1, |
const TFlt & | Num, | ||
const TStr & | ResultAttrName = "" , |
||
const TBool | floatCast = false |
||
) |
Performs subtraction of column values and given Num
.
Definition at line 5044 of file table.cpp.
|
staticprotected |
|
inlineprotected |
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics).
Definition at line 3044 of file table.cpp.
|
inlineprotected |
Returns positive value if R1 is bigger, negative value if R2 is bigger, and 0 if they are equal (strcmp semantics).
Definition at line 3068 of file table.cpp.
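The strcmp-style contract of the two CompareRows overloads can be sketched as follows. This is an illustration under assumptions, not the library code: it handles integer columns only, `IntColumns`, `CompareByCol`, and `CompareByCols` are hypothetical names, and flipping the sign when `Asc` is false is an assumed reading of the `Asc` parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical columnar storage: Cols[c][r] is the value of column c at row r.
typedef std::vector<std::vector<int> > IntColumns;

// Three-way comparison of rows R1 and R2 on a single column.
// Returns > 0 if R1 is bigger, < 0 if R2 is bigger, 0 if equal.
// Assumption: with Asc == false the sign is flipped.
int CompareByCol(const IntColumns& Cols, int R1, int R2, int ColIdx, bool Asc = true) {
  assert(ColIdx >= 0 && static_cast<std::size_t>(ColIdx) < Cols.size());
  const int V1 = Cols[ColIdx][R1];
  const int V2 = Cols[ColIdx][R2];
  if (V1 == V2) { return 0; }
  const int Sign = (V1 > V2) ? 1 : -1;
  return Asc ? Sign : -Sign;
}

// Lexicographic comparison over several columns: the first column on which
// the rows differ decides the result.
int CompareByCols(const IntColumns& Cols, int R1, int R2,
                  const std::vector<int>& ColIdxs, bool Asc = true) {
  for (std::size_t i = 0; i < ColIdxs.size(); i++) {
    const int Res = CompareByCol(Cols, R1, R2, ColIdxs[i], Asc);
    if (Res != 0) { return Res; }
  }
  return 0;
}
```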
|
inlineprotected |
Appends all rows of T
to this table, and recalculates indices.
Definition at line 693 of file table.h.
Counts number of unique elements.
Counts the number of appearances of each distinct element of the given column. Records results in column CountCol.
Definition at line 1782 of file table.cpp.
void TTable::Defrag | ( | ) |
Releases memory of deleted rows, and defrags.
Also updates meta-data, as row indices have changed. Needs some liveness analysis of columns.
Definition at line 3291 of file table.cpp.
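To make the effect of defragmentation concrete, here is a hedged standalone sketch of compacting a table whose rows are chained through a `Next` vector. `MiniTable` and `DefragRows` are hypothetical names, the table has a single integer column, and keeping the surviving rows in their physical order is an assumption of this sketch rather than a statement about table.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int kLast = -1;     // mirrors TTable::Last
const int kInvalid = -2;  // mirrors TTable::Invalid

// Hypothetical one-column table with logical ordering.
struct MiniTable {
  std::vector<int> Col;   // data column
  std::vector<int> Next;  // successor of each physical row
  int FirstValidRow;
};

// Drops logically removed rows and remaps all row indices.
void DefragRows(MiniTable& T) {
  assert(T.Col.size() == T.Next.size());
  const std::size_t OldRows = T.Next.size();
  // NewIdx[i] is the new physical index of old row i, or kInvalid if dropped.
  std::vector<int> NewIdx(OldRows, kInvalid);
  int Kept = 0;
  for (std::size_t i = 0; i < OldRows; i++) {
    if (T.Next[i] != kInvalid) { NewIdx[i] = Kept++; }
  }
  std::vector<int> Col(Kept), Next(Kept);
  for (std::size_t i = 0; i < OldRows; i++) {
    if (NewIdx[i] == kInvalid) { continue; }
    Col[NewIdx[i]] = T.Col[i];
    Next[NewIdx[i]] = (T.Next[i] == kLast) ? kLast : NewIdx[T.Next[i]];
  }
  if (T.FirstValidRow != kLast) { T.FirstValidRow = NewIdx[T.FirstValidRow]; }
  T.Col.swap(Col);
  T.Next.swap(Next);
}
```

Because physical indices change, anything keyed by them (indexes, row id maps) would have to be rebuilt afterwards, which matches the note about updating meta-data.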
|
inlineprotected |
Adds column with name ColName
and type ColType
to the ColTypeMap.
Definition at line 671 of file table.h.
|
protected |
Removes suffix from column names in the Schema.
Definition at line 4642 of file table.cpp.
void TTable::Dump | ( | FILE * | OutF = stdout | ) | const |
Prints table contents to a text file.
Definition at line 867 of file table.cpp.
|
inline |
Gets iterator to the last valid row of the table.
Definition at line 1234 of file table.h.
|
inline |
Gets iterator with remove to the last valid row.
Definition at line 1238 of file table.h.
Fills RowIdBuckets with sets of row ids.
Fill RowIdBuckets with sets of row ids, partitioned on the value of the column SplitAttr, according to the intervals specified by SplitIntervals. Called by ToVarGraphSequence and ToVarGraphSequenceIterator.
Definition at line 3577 of file table.cpp.
|
protected |
Fills RowIdBuckets with sets of row ids.
Fill RowIdBuckets with sets of row ids partitioned on the value of the column SplitAttr, according to the windows specified by JumpSize and WindowSize. Called by ToGraphSequence and ToGraphSequenceIterator.
Definition at line 3527 of file table.cpp.
Gets index of column ColName
among columns of the same type in the schema.
Definition at line 1004 of file table.h.
Gets set of row ids of rows common with table T
.
Definition at line 3991 of file table.cpp.
Gets type of column ColName
.
Definition at line 1218 of file table.h.
Gets column type and index of ColName
.
Definition at line 676 of file table.h.
|
inline |
|
inlineprotected |
Gets the Key of the Context StringVals pool. Used by ToGraph method in conv.cpp.
Definition at line 632 of file table.h.
TSize TTable::GetContextMemUsedKB | ( | ) |
Returns approximate memory used by table context in [KB].
Definition at line 3946 of file table.cpp.
|
inline |
Gets the name of the column to be used as dst nodes in the graph.
Definition at line 1156 of file table.h.
TStrV TTable::GetDstNodeFltAttrV | ( | ) | const |
Gets dst node float attribute name vector.
Definition at line 1029 of file table.cpp.
TStrV TTable::GetDstNodeIntAttrV | ( | ) | const |
Gets dst node int attribute name vector.
Definition at line 996 of file table.cpp.
TStrV TTable::GetDstNodeStrAttrV | ( | ) | const |
Gets dst node str attribute name vector.
Definition at line 1062 of file table.cpp.
TStrV TTable::GetEdgeFltAttrV | ( | ) | const |
Gets edge float attribute name vector.
Definition at line 1040 of file table.cpp.
TStrV TTable::GetEdgeIntAttrV | ( | ) | const |
Gets edge int attribute name vector.
Definition at line 1007 of file table.cpp.
TStrV TTable::GetEdgeStrAttrV | ( | ) | const |
Gets edge str attribute name vector.
Definition at line 1074 of file table.cpp.
|
static |
Extracts edge TTable from PNEANet.
Definition at line 3719 of file table.cpp.
|
static |
Extracts edge TTable from parallel graph PNGraphMP.
Definition at line 3777 of file table.cpp.
|
protected |
Gets the start index to a chunk of empty rows of size NewRows
.
Definition at line 4353 of file table.cpp.
Returns the first graph of the sequence.
Return the first graph of the sequence corresponding to the sets of row ids in RowIdBuckets. This is used by the ToGraph*Iterator functions.
Definition at line 3606 of file table.cpp.
|
static |
Extracts node and edge property TTables from THash.
Definition at line 3830 of file table.cpp.
Gets the rows containing Val in flt column ColName
.
Returns the RowIdxs in the float column given by ColName which have value Val, as a Vector. (If no such value is found, returns an empty vector.) Uses an index created by RequestIndex method if it exists, else loops over the entire table (which can be slow, so it is recommended to request an index if multiple queries must be made).
Definition at line 5430 of file table.cpp.
Gets the value of float attribute ColName
at row RowIdx
.
Definition at line 1015 of file table.h.
Gets the float value at column ColIdx
and row RowIdx
.
Definition at line 1111 of file table.h.
Returns a sequence of graphs.
Return a sequence of graphs, each constructed from the set of row ids corresponding to a particular bucket in RowIdBuckets.
Definition at line 3594 of file table.cpp.
|
inlineprotected |
Gets name of the id column of this table.
Definition at line 646 of file table.h.
Gets the rows containing Val in int column ColName
.
Returns the RowIdxs in the integer column given by ColName which have value Val, as a Vector. (If no such value is found, returns an empty vector.) Uses an index created by RequestIndex method if it exists, else loops over the entire table (which can be slow, so it is recommended to request an index if multiple queries must be made).
Definition at line 5387 of file table.cpp.
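The trade-off described above (use an index if one exists, otherwise scan) can be sketched in a few lines. This is a standalone illustration, not the SNAP implementation: `IntColumn`, `BuildIndex`, and `GetRowIdxByVal` are hypothetical names, and `std::map` stands in for the hash tables used by the library.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical single integer column with an optional value -> rows index.
struct IntColumn {
  std::vector<int> Vals;
  std::map<int, std::vector<int> > Index;
  bool HasIndex;
  IntColumn() : HasIndex(false) {}
};

// Builds the index once; later lookups avoid a full scan.
void BuildIndex(IntColumn& C) {
  C.Index.clear();
  for (std::size_t Row = 0; Row < C.Vals.size(); Row++) {
    C.Index[C.Vals[Row]].push_back(static_cast<int>(Row));
  }
  C.HasIndex = true;
}

// Returns the row indices holding Val, or an empty vector if there are none.
std::vector<int> GetRowIdxByVal(const IntColumn& C, int Val) {
  std::vector<int> Rows;
  if (C.HasIndex) {
    // An index can never have more distinct keys than there are rows.
    assert(C.Index.size() <= C.Vals.size());
    std::map<int, std::vector<int> >::const_iterator It = C.Index.find(Val);
    if (It != C.Index.end()) { Rows = It->second; }
    return Rows;
  }
  // No index: fall back to a linear scan over the whole column.
  for (std::size_t Row = 0; Row < C.Vals.size(); Row++) {
    if (C.Vals[Row] == Val) { Rows.push_back(static_cast<int>(Row)); }
  }
  return Rows;
}
```

Both paths return the same rows; the index only pays off when several queries are made against the same column.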
Gets the value of integer attribute ColName
at row RowIdx
.
Definition at line 1011 of file table.h.
Gets the integer value at column ColIdx
and row RowIdx
.
Definition at line 1107 of file table.h.
|
protected |
Gets the id of the last valid row of the table.
|
inlinestatic |
Gets sequence of Hits tables from given GraphSeq
.
Definition at line 1518 of file table.h.
|
inlinestatic |
Gets sequence of PageRank tables from given GraphSeq
.
Definition at line 1510 of file table.h.
TSize TTable::GetMemUsedKB | ( | ) |
Returns approximate memory used by table in [KB].
Definition at line 3918 of file table.cpp.
|
inlinestatic |
Definition at line 537 of file table.h.
|
protected |
Returns the next graph in sequence corresponding to RowIdBuckets.
This is used to iterate over the graph sequence by constructing one graph at a time. Called by NextGraphIterator().
Definition at line 3612 of file table.cpp.
|
static |
Extracts node TTable from PNEANet.
Definition at line 3667 of file table.cpp.
|
inline |
|
inline |
Gets number of valid, i.e. not deleted, rows in this table.
Definition at line 1225 of file table.h.
Partitions the table into NumPartitions
and populates Partitions
with the ranges.
Definition at line 1157 of file table.cpp.
|
protected |
Gets pivot element for QSort.
Definition at line 3090 of file table.cpp.
Definition at line 5315 of file table.cpp.
Gets a map of logical to physical row ids.
Definition at line 1228 of file table.h.
Returns pointer to a new table created from given Table
, with name set to TableName
.
Automatically detects the Schema of an input file (data is assumed to be in TSV format).
Definition at line 435 of file table.cpp.
|
inline |
Gets the schema of this table.
Definition at line 1116 of file table.h.
|
inline |
Gets the name of the column to be used as src nodes in the graph.
Definition at line 1149 of file table.h.
TStrV TTable::GetSrcNodeFltAttrV | ( | ) | const |
Gets src node float attribute name vector.
Definition at line 1018 of file table.cpp.
TStrV TTable::GetSrcNodeIntAttrV | ( | ) | const |
Gets src node int attribute name vector.
Definition at line 985 of file table.cpp.
TStrV TTable::GetSrcNodeStrAttrV | ( | ) | const |
Gets src node str attribute name vector.
Definition at line 1051 of file table.cpp.
Gets the string with KeyId
.
Definition at line 1100 of file table.h.
Gets the integer mapping of the string at column ColIdx
at row RowIdx
.
Definition at line 1024 of file table.h.
Gets the integer mapping of the string at column ColName
at row RowIdx
.
Definition at line 1029 of file table.h.
Gets the rows containing int mapping Map in str column ColName
.
Returns the RowIdxs in the string column given by ColName which have the string with integer mapping Map, as a Vector. (If no such value is found, returns an empty vector.) Uses an index created by RequestIndex method if it exists, else loops over the entire table (which can be slow, so it is recommended to request an index if multiple queries must be made).
Definition at line 5408 of file table.cpp.
Gets the value in column with id ColIdx
at row RowIdx
.
Definition at line 636 of file table.h.
Gets the value of string attribute ColName
at row RowIdx
.
Definition at line 1019 of file table.h.
Gets the value of the string attribute at column ColIdx
at row RowIdx
.
Definition at line 1034 of file table.h.
Gets the value of the string attribute at column ColName
at row RowIdx
.
Definition at line 1039 of file table.h.
void TTable::Group | ( | const TStrV & | GroupBy, |
const TStr & | GroupColName, | ||
TBool | Ordered = true , |
||
TBool | UsePhysicalIds = true |
||
) |
Groups rows depending on values of GroupBy
columns.
Specifies the columns to group by, the name of the group column in the new table, and whether to treat the columns as ordered. If the column name is an empty string, no column is created.
Definition at line 1549 of file table.cpp.
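A grouping step of this kind can be sketched as a map from composite key to group id. The code below is a hypothetical standalone illustration restricted to integer columns: `GroupRows` is not a SNAP function, and the reading of the ordered flag (when false, the key values are sorted so that (a, b) and (b, a) share a group) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Cols[c][r] is the value of group-by column c at row r.
// Returns one group id per row; ids are assigned in order of first appearance.
std::vector<int> GroupRows(const std::vector<std::vector<int> >& Cols, bool Ordered = true) {
  assert(!Cols.empty());
  const std::size_t NumRows = Cols[0].size();
  std::map<std::vector<int>, int> KeyToGroup;
  std::vector<int> GroupIds(NumRows);
  for (std::size_t Row = 0; Row < NumRows; Row++) {
    // Build the composite key for this row.
    std::vector<int> Key;
    for (std::size_t c = 0; c < Cols.size(); c++) { Key.push_back(Cols[c][Row]); }
    // Assumption: unordered grouping ignores the order of the key values.
    if (!Ordered) { std::sort(Key.begin(), Key.end()); }
    std::map<std::vector<int>, int>::iterator It = KeyToGroup.find(Key);
    if (It == KeyToGroup.end()) {
      const int NewId = static_cast<int>(KeyToGroup.size());
      It = KeyToGroup.insert(std::make_pair(Key, NewId)).first;
    }
    GroupIds[Row] = It->second;
  }
  return GroupIds;
}
```

The returned vector plays the role of the new group column; a caller that passes an empty column name would simply discard it.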
|
protected |
Helper function for grouping.
If KeepUnique is true, UniqueVec will be modified to contain a row from each group. If KeepUnique is false, then normal grouping is done and a new column is added depending on whether GroupColName is empty.
Definition at line 1302 of file table.cpp.
|
protected |
Groups/hashes by a single column with float values. Returns hash table with grouping.
Definition at line 1633 of file table.h.
|
protected |
Groups/hashes by a single column with integer values.
Group/hash by a single column with integer values. Returns hash table with grouping. IndexSet tells what rows to consider (vector of physical row ids). It is used only if All == true. Note that the IndexSet option is currently not used anywhere.
Definition at line 1605 of file table.h.
void TTable::GroupByIntColMP | ( | const TStr & | GroupBy, |
THashMP< TInt, TIntV > & | Grouping, | ||
TBool | UsePhysicalIds = true |
||
) | const |
Groups/hashes by a single column with integer values, using OpenMP multi-threading.
Definition at line 1205 of file table.cpp.
|
protected |
Groups/hashes by a single column with string values. Returns hash table with grouping.
Definition at line 1660 of file table.h.
|
protected |
Checks if grouping key exists and matches given attr type.
Definition at line 1195 of file table.cpp.
|
protected |
Increments the next vector and sets last, NumRows and NumValidRows.
Definition at line 2235 of file table.cpp.
Initializes an empty table for the join of this table with the given table.
Definition at line 1896 of file table.cpp.
void TTable::InitIds | ( | ) |
Adds explicit row ids and initializes the hash set mapping ids to physical rows.
Definition at line 1863 of file table.cpp.
|
protected |
Initializes the RowIdBuckets vector which will be used for the graph sequence creation.
Definition at line 3515 of file table.cpp.
Returns intersection of this table with given Table.
Definition at line 4544 of file table.cpp.
Definition at line 1413 of file table.h.
|
protected |
|
protected |
Definition at line 656 of file table.h.
TBool TTable::IsLastGraphOfSequence | ( | ) |
Checks if the end of the graph sequence is reached.
Definition at line 3663 of file table.cpp.
PTable TTable::IsNextK | ( | const TStr & | OrderCol, |
TInt | K, | ||
const TStr & | GroupBy, | ||
const TStr & | RankColName = "" |
||
) |
Distance based filter.
Creates a table T' whose rows are joined rows (T[r1], T[r2]) such that r2 is one of the successive rows to r1 when this table is ordered by OrderCol, and both r1 and r2 have the same value in the GroupBy column.
Definition at line 3869 of file table.cpp.
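The distance-based filter described above can be pictured with a small standalone sketch. It assumes that "successive rows" means the K rows that directly follow r1 within the same group; `Rec` and `NextKPairs` are hypothetical names and the code does not use the SNAP API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical record: the GroupBy value and the OrderCol value of one row.
struct Rec {
  int Group;
  int Order;
};

// Returns pairs (r1, r2) of row indices such that r2 is among the K rows that
// directly follow r1 in OrderCol order and shares r1's GroupBy value.
std::vector<std::pair<int, int>> NextKPairs(const std::vector<Rec>& Recs, int K) {
  assert(K >= 0);
  std::vector<int> Idx(Recs.size());
  std::iota(Idx.begin(), Idx.end(), 0);
  // Sort row indices by group first, then by the ordering column.
  std::stable_sort(Idx.begin(), Idx.end(), [&Recs](int A, int B) {
    if (Recs[A].Group != Recs[B].Group) { return Recs[A].Group < Recs[B].Group; }
    return Recs[A].Order < Recs[B].Order;
  });
  std::vector<std::pair<int, int>> Pairs;
  const std::size_t Span = static_cast<std::size_t>(K);
  for (std::size_t I = 0; I < Idx.size(); ++I) {
    for (std::size_t J = I + 1; J < Idx.size() && J - I <= Span; ++J) {
      if (Recs[Idx[J]].Group != Recs[Idx[I]].Group) { break; }  // left the group
      Pairs.emplace_back(Idx[I], Idx[J]);
    }
  }
  return Pairs;
}
```

Each returned pair corresponds to one joined row of T'.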
|
protected |
Performs insertion sort on given vector V.
Definition at line 3076 of file table.cpp.
Definition at line 5298 of file table.cpp.
|
inlineprotected |
Checks if RowIdx corresponds to a valid (i.e. not deleted) row.
Definition at line 811 of file table.h.
Performs equijoin.
Performs an equi-join on the given columns, i.e. keeps tuple pairs where this->Col1 == Table->Col2. Implementation: hash join. Builds a hash out of the smaller table, then hashes the larger table and checks for collisions.
Definition at line 2252 of file table.cpp.
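The hash-join strategy described above (build on the smaller input, probe with the larger one) can be sketched in standalone C++. `HashJoin` is a hypothetical helper over two plain integer columns, not the library's code; it returns matching row-index pairs rather than a joined table.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Returns all (row in Col1, row in Col2) pairs with equal values.
std::vector<std::pair<int, int>> HashJoin(const std::vector<int>& Col1,
                                          const std::vector<int>& Col2) {
  // Build the hash table on the smaller column, probe with the larger one.
  const bool BuildOnFirst = Col1.size() <= Col2.size();
  const std::vector<int>& Build = BuildOnFirst ? Col1 : Col2;
  const std::vector<int>& Probe = BuildOnFirst ? Col2 : Col1;

  std::unordered_map<int, std::vector<int>> Hash;  // value -> build-side rows
  for (std::size_t B = 0; B < Build.size(); ++B) {
    Hash[Build[B]].push_back(static_cast<int>(B));
  }

  std::vector<std::pair<int, int>> Out;
  for (std::size_t P = 0; P < Probe.size(); ++P) {
    auto It = Hash.find(Probe[P]);
    if (It == Hash.end()) { continue; }  // no matching key on the build side
    const int PRow = static_cast<int>(P);
    for (int B : It->second) {
      // Always emit pairs as (Col1 row, Col2 row), whichever side was built.
      if (BuildOnFirst) { Out.emplace_back(B, PRow); }
      else { Out.emplace_back(PRow, B); }
    }
  }
  return Out;
}
```

Building on the smaller side keeps the hash table small; the larger side is only streamed through once.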
Definition at line 1351 of file table.h.
|
protected |
Removes all rows that are not mentioned in the SORTED vector KeepV.
Definition at line 1132 of file table.cpp.
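One reason a sorted KeepV is useful is that the rows to keep can be found in a single merge-style pass. The sketch below illustrates that idea on plain vectors; `KeepSortedRows` here is a hypothetical free function, and the assumption that the table's row ids are also visited in ascending order is made only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the row ids from RowIds that also appear in KeepV.
// Both inputs are assumed to be sorted in ascending order.
std::vector<int> KeepSortedRows(const std::vector<int>& RowIds,
                                const std::vector<int>& KeepV) {
  assert(std::is_sorted(RowIds.begin(), RowIds.end()));
  assert(std::is_sorted(KeepV.begin(), KeepV.end()));
  std::vector<int> Kept;
  std::size_t K = 0;  // cursor into KeepV, only ever moves forward
  for (int Row : RowIds) {
    while (K < KeepV.size() && KeepV[K] < Row) { ++K; }
    if (K < KeepV.size() && KeepV[K] == Row) { Kept.push_back(Row); }
  }
  return Kept;
}
```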
|
inlinestatic |
Loads table from a binary format.
TTableContext Context must be provided as a parameter and loaded separately from the table load, as it can be shared among multiple tables. Context can be loaded either before or after the table load, but must be available for operations that require string values (as opposed to string references).
Definition at line 970 of file table.h.
|
static |
Loads table from a spreadsheet (TSV, CSV, etc.). Note: HasTitleLine = true is not supported. Please comment out title lines instead.
Definition at line 775 of file table.cpp.
|
static |
Loads table from a spreadsheet, but only loads the columns specified by RelevantCols. Note: HasTitleLine = true is not supported. Please comment out title lines instead.
Definition at line 737 of file table.cpp.
|
staticprotected |
Loads data in parallel from the input file at InFNm into NewTable. Only works when NewTable has no string columns.
Definition at line 487 of file table.cpp.
|
staticprotected |
Sequentially loads data from input file at InFNm into NewTable.
Definition at line 649 of file table.cpp.
|
protected |
Helper function for parallel QSort.
Definition at line 3158 of file table.cpp.
Returns table with rows that are present in this table but not in given Table.
Definition at line 4569 of file table.cpp.
Definition at line 1416 of file table.h.
PNEANet TTable::NextGraphIterator | ( | ) |
Calls to this must be preceded by a call to one of the above ToGraph*Iterator functions.
Definition at line 3659 of file table.cpp.
Adds suffix to column name if it doesn't exist.
Definition at line 549 of file table.h.
void TTable::Order | ( | const TStrV & | OrderBy, |
TStr | OrderColName = "" , |
||
TBool | ResetRankByMSC = false , |
||
TBool | Asc = true |
||
) |
Orders the rows according to the values in columns of OrderBy (in descending lexicographic order).
Definition at line 3220 of file table.cpp.
|
protected |
Partitions vector for QSort.
Definition at line 3106 of file table.cpp.
Definition at line 5332 of file table.cpp.
void TTable::PrintContextSize | ( | ) |
Definition at line 3937 of file table.cpp.
void TTable::PrintSize | ( | ) |
Definition at line 3908 of file table.cpp.
Returns table with only the columns in ProjectCols.
Definition at line 4592 of file table.cpp.
void TTable::ProjectInPlace | ( | const TStrV & | ProjectCols | ) |
Keeps only the columns specified in ProjectCols.
Definition at line 5216 of file table.cpp.
|
protected |
Performs QSort on given vector V.
Definition at line 3134 of file table.cpp.
Definition at line 5355 of file table.cpp.
|
protected |
Performs QSort in parallel on given vector V.
Definition at line 3186 of file table.cpp.
Reads values of entire float column into Result.
Definition at line 5198 of file table.cpp.
Reads values of entire int column into Result.
Definition at line 5189 of file table.cpp.
Reads values of entire string column into Result.
Definition at line 5207 of file table.cpp.
|
protected |
Reinitializes row ids.
Registers (caches) the result of a grouping statement by a single group-by attribute. T is a hash table mapping a key x to rows keyed by x. => DISABLED FOR NOW
Definition at line 1869 of file table.cpp.
|
protected |
Removes first valid row of the table.
Definition at line 1102 of file table.cpp.
Removes row with id RowIdx.
Definition at line 1115 of file table.cpp.
Renames a column.
Definition at line 1085 of file table.cpp.
Creates Index for Flt Column ColName.
Creates an Index on float column ColName. The index is hash-based, going from the column value to a vector of RowIdxs in the table that correspond to the value. If it exists, the index is used by the Get*RowIdxByVal functions; else, those functions will loop over the entire table. The index is NOT updated automatically when the table is modified; it is the user's responsibility to call RequestIndex after modifying the table if the index is necessary.
Definition at line 5472 of file table.cpp.
Creates Index for Int Column ColName.
Creates an Index on integer column ColName. The index is hash-based, going from the column value to a vector of RowIdxs in the table that correspond to the value. If it exists, the index is used by the Get*RowIdxByVal functions; else, those functions will loop over the entire table. The index is NOT updated automatically when the table is modified; it is the user's responsibility to call RequestIndex after modifying the table if the index is necessary.
Definition at line 5453 of file table.cpp.
Creates Index for Str Column ColName.
Creates an Index on string column given by ColName. The index is hash-based, going from the column value (that is, the integer mapping of the string value) to a vector of RowIdxs in the table that correspond to the value. If it exists, the index is used by the Get*RowIdxByVal functions; else, those functions will loop over the entire table. The index is NOT updated automatically when the table is modified; it is the user's responsibility to call RequestIndex after modifying the table if the index is necessary.
Definition at line 5491 of file table.cpp.
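The behaviour described for the RequestIndex functions (a hash from column value to row indices, used by lookups when present and not refreshed automatically) can be illustrated with a standalone sketch. `IntColumn` and its members are hypothetical stand-ins written for this example; they mirror the description above rather than the library's actual data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one integer column plus an optional hash index.
struct IntColumn {
  std::vector<int> Vals;                            // column values by row
  std::unordered_map<int, std::vector<int>> Index;  // value -> row indices
  bool HasIndex = false;

  // Builds (or rebuilds) the index from the current column contents.
  void RequestIndex() {
    Index.clear();
    for (std::size_t Row = 0; Row < Vals.size(); ++Row) {
      Index[Vals[Row]].push_back(static_cast<int>(Row));
    }
    HasIndex = true;
  }

  // Returns the rows holding Val: index lookup if built, else a full scan.
  std::vector<int> GetRowIdxByVal(int Val) const {
    std::vector<int> Rows;
    if (HasIndex) {
      auto It = Index.find(Val);
      if (It != Index.end()) { Rows = It->second; }
      return Rows;  // empty if the value is not indexed
    }
    for (std::size_t Row = 0; Row < Vals.size(); ++Row) {
      if (Vals[Row] == Val) { Rows.push_back(static_cast<int>(Row)); }
    }
    return Rows;
  }
};
```

In this sketch, appending to `Vals` after `RequestIndex()` leaves the index stale until `RequestIndex()` is called again, which is the situation the note about modifying the table warns about.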
|
protected |
Resizes the table to hold RowCount rows.
Definition at line 4307 of file table.cpp.
void TTable::Save | ( | TSOut & | SOut | ) |
Saves table schema and content to a binary format.
Note that TTableContext must be saved separately as it can be shared among multiple tables.
Definition at line 834 of file table.cpp.
void TTable::SaveBin | ( | const TStr & | OutFNm | ) |
Saves table schema and content to a binary file.
Definition at line 829 of file table.cpp.
void TTable::SaveSS | ( | const TStr & | OutFNm | ) |
Saves table schema and content to a TSV file.
Definition at line 780 of file table.cpp.
void TTable::Select | ( | TPredicate & | Predicate, |
TIntV & | SelectedRows, | ||
TBool | Remove = true |
||
) |
Selects rows that satisfy given Predicate.
Select. Has two modes of operation:
Definition at line 2730 of file table.cpp.
|
inline |
Definition at line 1257 of file table.h.
void TTable::SelectAtomic | ( | const TStr & | Col1, |
const TStr & | Col2, | ||
TPredComp | Cmp, | ||
TIntV & | SelectedRows, | ||
TBool | Remove = true |
||
) |
Selects rows using atomic compare operation.
Optimized cases of select for a predicate of atomic form: compare an attribute to another attribute, or compare an attribute to a constant.
Definition at line 2793 of file table.cpp.
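The two atomic forms mentioned above, attribute versus attribute and attribute versus constant, can be sketched on plain integer columns. The `Cmp` enum and both helper functions are hypothetical and do not reproduce the library's TPredComp values or signatures; the sketch only collects the indices of matching rows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical comparison operators; the names do not come from the library.
enum class Cmp { Lt, Le, Eq, Ne, Ge, Gt };

// Evaluates "A Op B" for a single pair of values.
bool Compare(int A, int B, Cmp Op) {
  switch (Op) {
    case Cmp::Lt: return A < B;
    case Cmp::Le: return A <= B;
    case Cmp::Eq: return A == B;
    case Cmp::Ne: return A != B;
    case Cmp::Ge: return A >= B;
    case Cmp::Gt: return A > B;
  }
  return false;  // unreachable for valid Op values
}

// Attribute vs. attribute: rows where Col1[row] Op Col2[row] holds.
std::vector<int> SelectColCol(const std::vector<int>& Col1,
                              const std::vector<int>& Col2, Cmp Op) {
  assert(Col1.size() == Col2.size());
  std::vector<int> Rows;
  for (std::size_t R = 0; R < Col1.size(); ++R) {
    if (Compare(Col1[R], Col2[R], Op)) { Rows.push_back(static_cast<int>(R)); }
  }
  return Rows;
}

// Attribute vs. constant: rows where Col[row] Op Val holds.
std::vector<int> SelectColConst(const std::vector<int>& Col, int Val, Cmp Op) {
  std::vector<int> Rows;
  for (std::size_t R = 0; R < Col.size(); ++R) {
    if (Compare(Col[R], Val, Op)) { Rows.push_back(static_cast<int>(R)); }
  }
  return Rows;
}
```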
Definition at line 1269 of file table.h.
void TTable::SelectAtomicConst | ( | const TStr & | Col, |
const TPrimitive & | Val, | ||
TPredComp | Cmp, | ||
TIntV & | SelectedRows, | ||
PTable & | SelectedTable, | ||
TBool | Remove = true , |
||
TBool | Table = true |
||
) |
Selects rows where the value of Col matches given primitive Val.
Definition at line 2853 of file table.cpp.
|
inline |
Definition at line 1281 of file table.h.
|
inline |
Definition at line 1287 of file table.h.
Definition at line 1314 of file table.h.
|
inline |
Definition at line 1317 of file table.h.
Definition at line 1300 of file table.h.
|
inline |
Definition at line 1303 of file table.h.
Definition at line 1307 of file table.h.
|
inline |
Definition at line 1310 of file table.h.
void TTable::SelectFirstNRows | ( | const TInt & | N | ) |
Selects first N rows from the table.
Definition at line 3337 of file table.cpp.
Joins table with itself, on values of Col.
Definition at line 1357 of file table.h.
|
inline |
Definition at line 1358 of file table.h.
PTable TTable::SelfSimJoinPerGroup | ( | const TStr & | GroupAttr, |
const TStr & | SimCol, | ||
const TStr & | DistanceColName, | ||
const TSimType & | SimType, | ||
const TFlt & | Threshold | ||
) |
Performs join if the distance between two rows is less than the specified threshold.
Returns table with schema (GroupId1, GroupId2, Similarity).
Definition at line 2074 of file table.cpp.
PTable TTable::SelfSimJoinPerGroup | ( | const TStrV & | GroupBy, |
const TStr & | SimCol, | ||
const TStr & | DistanceColName, | ||
const TSimType & | SimType, | ||
const TFlt & | Threshold | ||
) |
Performs join if the distance between two rows is less than the specified threshold.
SimJoinPerGroup performs SimJoin based on a set of attributes. Performs the grouping internally and returns a projection of the columns on which groupby was performed along with the similarity.
Definition at line 2160 of file table.cpp.
|
inline |
Sets the columns to be used as both src and dst node attributes.
Definition at line 1179 of file table.h.
|
inline |
Sets the name of the column to be used as dst nodes in the graph.
Definition at line 1158 of file table.h.
|
inlineprotected |
Sets the first valid row of the TTable.
Definition at line 821 of file table.h.
Definition at line 4129 of file table.cpp.
|
inlinestatic |
Definition at line 536 of file table.h.
|
inline |
Sets the name of the column to be used as src nodes in the graph.
Definition at line 1151 of file table.h.
PTable TTable::SimJoin | ( | const TStrV & | Cols1, |
const TTable & | Table, | ||
const TStrV & | Cols2, | ||
const TStr & | DistanceColName, | ||
const TSimType & | SimType, | ||
const TFlt & | Threshold | ||
) |
Performs join if the distance between two rows is less than the specified threshold.
Returns a similarity-based join of two tables, given a distance metric and a threshold. Returned records (r1, r2) satisfy the criterion d(r1, r2) <= Threshold.
Definition at line 1974 of file table.cpp.
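The criterion d(r1, r2) <= Threshold can be illustrated with a naive nested-loop sketch. Euclidean distance is chosen here purely as an example metric; `Point`, `SimPair`, `L2Dist` and `SimJoinL2` are hypothetical names, and the library's own implementation and supported TSimType values may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical types: a row's join columns as a point, and one output record.
using Point = std::vector<double>;
struct SimPair {
  int Row1;
  int Row2;
  double Dist;
};

// Euclidean (L2) distance between two points of equal dimension.
double L2Dist(const Point& A, const Point& B) {
  assert(A.size() == B.size());
  double Sum = 0.0;
  for (std::size_t I = 0; I < A.size(); ++I) {
    const double D = A[I] - B[I];
    Sum += D * D;
  }
  return std::sqrt(Sum);
}

// Keeps every pair (r1, r2) whose distance does not exceed Threshold.
std::vector<SimPair> SimJoinL2(const std::vector<Point>& T1,
                               const std::vector<Point>& T2, double Threshold) {
  std::vector<SimPair> Out;
  for (std::size_t R1 = 0; R1 < T1.size(); ++R1) {
    for (std::size_t R2 = 0; R2 < T2.size(); ++R2) {
      const double D = L2Dist(T1[R1], T2[R2]);
      if (D <= Threshold) {
        Out.push_back(SimPair{static_cast<int>(R1), static_cast<int>(R2), D});
      }
    }
  }
  return Out;
}
```

The `Dist` field corresponds to the value that would be stored under DistanceColName.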
Splices table into subtables according to a grouping statement.
Definition at line 1788 of file table.cpp.
Adds entire flt column to table.
Definition at line 4081 of file table.cpp.
|
protected |
Parallel helper function for grouping. Parallel grouping by complex keys is currently not supported.
Stores column for a group. Physical row ids have to be passed.
Definition at line 1290 of file table.cpp.
Adds entire int column to table.
Definition at line 4064 of file table.cpp.
Adds entire str column to table.
Definition at line 4098 of file table.cpp.
|
inlinestatic |
|
inlinestatic |
PTable TTable::ThresholdJoin | ( | const TStr & | KeyCol1, |
const TStr & | JoinCol1, | ||
const TTable & | Table, | ||
const TStr & | KeyCol2, | ||
const TStr & | JoinCol2, | ||
TInt | Threshold, | ||
TBool | PerJoinKey = false |
||
) |
Definition at line 2624 of file table.cpp.
|
protected |
Definition at line 2486 of file table.cpp.
|
protected |
Definition at line 2537 of file table.cpp.
|
protected |
Definition at line 2458 of file table.cpp.
|
protected |
Definition at line 2588 of file table.cpp.
|
protected |
Definition at line 2602 of file table.cpp.
Creates a sequence of graphs based on grouping specified by GroupAttr.
Definition at line 3640 of file table.cpp.
Creates the graph sequence one at a time.
Create the graph sequence one at a time, to allow efficient use of memory. A call to this function must be followed by subsequent calls to NextGraphIterator().
Definition at line 3654 of file table.cpp.
TVec< PNEANet > TTable::ToGraphSequence | ( | TStr | SplitAttr, |
TAttrAggr | AggrPolicy, | ||
TInt | WindowSize, | ||
TInt | JumpSize, | ||
TInt | StartVal = TInt::Mn , |
||
TInt | EndVal = TInt::Mx |
||
) |
Creates a sequence of graphs based on values of column SplitAttr and windows specified by JumpSize and WindowSize.
Definition at line 3629 of file table.cpp.
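To make the role of WindowSize and JumpSize concrete, here is a standalone sketch of one plausible windowing scheme. It assumes that window i covers the half-open value range [Start + i*Jump, Start + i*Jump + Window); that interpretation, and the names `WindowBuckets`, `Start`, `End`, `Window` and `Jump`, are assumptions for illustration and may not match the library's exact bucketing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assigns row indices to windows over the values of a split column.
// Window B is assumed to cover [Start + B*Jump, Start + B*Jump + Window).
// A row can fall into several windows when Window > Jump.
std::vector<std::vector<int>> WindowBuckets(const std::vector<int>& Vals,
                                            int Start, int End,
                                            int Window, int Jump) {
  assert(Window > 0 && Jump > 0 && Start <= End);
  const int NumBuckets = (End - Start) / Jump + 1;
  std::vector<std::vector<int>> Buckets(NumBuckets);
  for (std::size_t Row = 0; Row < Vals.size(); ++Row) {
    const int V = Vals[Row];
    if (V < Start || V > End) { continue; }  // outside the requested range
    for (int B = 0; B < NumBuckets; ++B) {
      const int Lo = Start + B * Jump;
      if (V >= Lo && V < Lo + Window) {
        Buckets[B].push_back(static_cast<int>(Row));
      }
    }
  }
  return Buckets;
}
```

Under this scheme each bucket would supply the rows for one graph in the sequence.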
PNEANet TTable::ToGraphSequenceIterator | ( | TStr | SplitAttr, |
TAttrAggr | AggrPolicy, | ||
TInt | WindowSize, | ||
TInt | JumpSize, | ||
TInt | StartVal = TInt::Mn , |
||
TInt | EndVal = TInt::Mx |
||
) |
Creates the graph sequence one at a time.
Create the graph sequence one at a time, to allow efficient use of memory. A call to this function must be followed by subsequent calls to NextGraphIterator().
Definition at line 3644 of file table.cpp.
TVec< PNEANet > TTable::ToVarGraphSequence | ( | TStr | SplitAttr, |
TAttrAggr | AggrPolicy, | ||
TIntPrV | SplitIntervals | ||
) |
Creates a sequence of graphs based on values of column SplitAttr and intervals specified by SplitIntervals.
Definition at line 3635 of file table.cpp.
PNEANet TTable::ToVarGraphSequenceIterator | ( | TStr | SplitAttr, |
TAttrAggr | AggrPolicy, | ||
TIntPrV | SplitIntervals | ||
) |
Creates the graph sequence one at a time.
Create the graph sequence one at a time, to allow efficient use of memory. A call to this function must be followed by subsequent calls to NextGraphIterator().
Definition at line 3649 of file table.cpp.
Returns union of this table with given Table.
Definition at line 4508 of file table.cpp.
Definition at line 1404 of file table.h.
Returns union of this table with given Table, preserving duplicates.
Definition at line 4488 of file table.cpp.
Definition at line 1407 of file table.h.
void TTable::UnionAllInPlace | ( | const TTable & | Table | ) |
Same as TTable::ConcatTable.
Definition at line 4501 of file table.cpp.
|
inline |
Definition at line 1410 of file table.h.
void TTable::Unique | ( | const TStr & | Col | ) |
Removes rows with duplicate values in given column.
Definition at line 1246 of file table.cpp.
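Removing rows with duplicate values in one column can be sketched with a hash set of values already seen. Which row of each duplicate group survives is not stated above; the hypothetical `UniqueRows` helper below keeps the first occurrence, which is an assumption of this example.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Returns the indices of the rows to keep: the first row for each distinct
// value in Col (keeping the first occurrence is an assumption).
std::vector<int> UniqueRows(const std::vector<int>& Col) {
  std::unordered_set<int> Seen;
  std::vector<int> Keep;
  for (std::size_t Row = 0; Row < Col.size(); ++Row) {
    // insert() reports whether the value was new.
    if (Seen.insert(Col[Row]).second) {
      Keep.push_back(static_cast<int>(Row));
    }
  }
  return Keep;
}
```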
Removes rows with duplicate values in given columns.
Definition at line 1278 of file table.cpp.
void TTable::UpdateFltFromTable | ( | const TStr & | KeyAttr, |
const TStr & | UpdateAttr, | ||
const TTable & | Table, | ||
const TStr & | FKeyAttr, | ||
const TStr & | ReadAttr, | ||
TFlt | DefaultFltVal = 0.0 |
||
) |
Definition at line 4219 of file table.cpp.
void TTable::UpdateFltFromTableMP | ( | const TStr & | KeyAttr, |
const TStr & | UpdateAttr, | ||
const TTable & | Table, | ||
const TStr & | FKeyAttr, | ||
const TStr & | ReadAttr, | ||
TFlt | DefaultFltVal = 0.0 |
||
) |
Definition at line 4151 of file table.cpp.
|
protected |
|
protected |
|
protected |
Updates table state after adding one or more rows.
Definition at line 4117 of file table.cpp.
|
protected |
Partitioning of row ids into buckets corresponding to different graph objects when generating a sequence of graphs.
Example: <T_1.age,T_2.age, age> - T_1.age is a src node attribute, T_2.age is a dst node attribute. However, since all nodes refer to the same universe of entities (users) we just do one assignment of age per node, and call that attribute 'age'. This list should be very small.