Modern Text Mining Framework for R


Documentation for package ‘text2vec’ version 0.5.0

Help Pages

text2vec-package text2vec
as.lda_c Converts a sparse document-term matrix to 'lda_c' format
char_tokenizer Simple tokenization functions for string splitting
check_analogy_accuracy Checks accuracy of word embeddings on the analogy task
collect Collects a distributed data structure to the master process
collect.RowDistributedMatrix Collects a distributed data structure to the master process
Collocations Collocations model.
create_dtm Document-term matrix construction
create_dtm.itoken Document-term matrix construction
create_dtm.itoken_parallel Document-term matrix construction
create_dtm.list Document-term matrix construction
create_tcm Term-co-occurrence matrix construction
create_tcm.itoken Term-co-occurrence matrix construction
create_tcm.itoken_parallel Term-co-occurrence matrix construction
create_vocabulary Creates a vocabulary of unique terms
create_vocabulary.character Creates a vocabulary of unique terms
create_vocabulary.itoken Creates a vocabulary of unique terms
create_vocabulary.itoken_parallel Creates a vocabulary of unique terms
create_vocabulary.list Creates a vocabulary of unique terms
dist2 Pairwise Distance Matrix Computation
distances Pairwise Distance Matrix Computation
fit Fits model to data
fit.Matrix Fits model to data
fit.matrix Fits model to data
fit_transform Fit model to data, then transform it
fit_transform.Matrix Fit model to data, then transform it
fit_transform.matrix Fit model to data, then transform it
GlobalVectors Creates Global Vectors word-embeddings model.
GloVe Creates Global Vectors word-embeddings model.
glove Fit a GloVe word-embedding model
hash_vectorizer Vocabulary and hash vectorizers
idir Creates iterator over text files from the disk
ifiles Creates iterator over text files from the disk
ifiles_parallel Creates iterator over text files from the disk
itoken Iterators (and parallel iterators) over input objects
itoken.character Iterators (and parallel iterators) over input objects
itoken.iterator Iterators (and parallel iterators) over input objects
itoken.list Iterators (and parallel iterators) over input objects
itoken_parallel Iterators (and parallel iterators) over input objects
itoken_parallel.character Iterators (and parallel iterators) over input objects
itoken_parallel.ifiles_parallel Iterators (and parallel iterators) over input objects
LatentDirichletAllocation Creates Latent Dirichlet Allocation model.
LatentDirichletAllocationDistributed Creates Latent Dirichlet Allocation model.
LatentSemanticAnalysis Latent Semantic Analysis model
LDA Creates Latent Dirichlet Allocation model.
LSA Latent Semantic Analysis model
movie_review IMDB movie reviews
normalize Matrix normalization
pdist2 Pairwise Distance Matrix Computation
perplexity Perplexity of a topic model
prepare_analogy_questions Prepares list of analogy questions
prune_vocabulary Prune vocabulary
psim2 Pairwise Similarity Matrix Computation
regexp_tokenizer Simple tokenization functions for string splitting
RelaxedWordMoversDistance Creates a model for computing the "Relaxed Word Mover's Distance".
RWMD Creates a model for computing the "Relaxed Word Mover's Distance".
sim2 Pairwise Similarity Matrix Computation
similarities Pairwise Similarity Matrix Computation
space_tokenizer Simple tokenization functions for string splitting
split_into Split a vector for parallel processing
text2vec text2vec
TfIdf TfIdf
tokenizers Simple tokenization functions for string splitting
transform Transforms Matrix-like object using 'model'
transform.Matrix Transforms Matrix-like object using 'model'
transform.matrix Transforms Matrix-like object using 'model'
vectorizers Vocabulary and hash vectorizers
vocabulary Creates a vocabulary of unique terms
vocab_vectorizer Vocabulary and hash vectorizers
word_tokenizer Simple tokenization functions for string splitting